This is not a recommendation, but just a couple of days ago someone linked to this project, which claims goals similar to Lua's, great performance, and gradual typing:
I can't tell you what it's actually like, though.
A more established, proven option is Haxe. Haxe has a lot of libraries, but I think it's specifically designed to be batteries-optional. This Haxe VM in particular looks pretty impressive:
Haxe has the distinction of having been used to ship loads of successful games made by small teams with custom engines.
Another option, designed for simplicity, low complexity, and easy embedding, is Wren:
The implementation is apparently only 4000 lines.