Executing Lua Bytecode From Shell

If we want to execute a Lua file, let's say a file named a.lua, we simply need to provide its path to the Lua interpreter:

$ lua a.lua
42

This is not Lua's invention. In fact, this pattern is so common and intuitive that there is a special mechanism that invokes it. Chances are you already know it: the shebang. If a file starts with #!, a magic sequence, then the remainder of the first line is split into an interpreter path and a single optional argument. For Lua that would be, for example:

#!/usr/bin/lua -E -l pl

First comes the magic sequence #!. Then the program path, pointing at the Lua interpreter: /usr/bin/lua. Finally, everything else is one argument: -E -l pl. Yes, everything here is passed to the Lua interpreter as a single argument, whitespace (other than the newline) included. To work around this, utilities like env(1) have a mechanism like -S that splits such a string into separate arguments.

Afterwards, the path of the file that contains the shebang is appended as a separate argument to the invocation. Assume a.lua contains the above shebang. If we call it:

$ ./a.lua

The effective invocation is as if we called:

$ /usr/bin/lua "-E -l pl" ./a.lua
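
The splitting rule is easy to get wrong, so here is a small shell sketch that mimics it. It does not ask the kernel anything; it only re-implements the rule as described above, and for brevity it treats only spaces as blanks. The function name shebang_argv is made up for this example.

```shell
# Print the argv that would be built for a script with a shebang,
# one element per line: interpreter, optional single argument, script path.
# This is a simulation of the rule described in the text, not a kernel query.
shebang_argv() {
    script=$1
    IFS= read -r line < "$script" || [ -n "$line" ]
    case $line in
        '#!'*) line=${line#'#!'} ;;
        *) echo "no shebang in $script" >&2; return 1 ;;
    esac
    line=${line#"${line%%[! ]*}"}     # drop spaces before the interpreter path
    interp=${line%% *}                # first word is the interpreter
    rest=${line#"$interp"}
    rest=${rest#"${rest%%[! ]*}"}     # drop spaces between path and argument
    rest=${rest%"${rest##*[! ]}"}     # drop trailing spaces
    printf '%s\n' "$interp"
    if [ -n "$rest" ]; then
        printf '%s\n' "$rest"         # everything else stays ONE argument
    fi
    printf '%s\n' "$script"
}

# Demo on a throwaway file that starts with the shebang from the text.
demo=$(mktemp)
printf '#!/usr/bin/lua -E -l pl\nprint(42)\n' > "$demo"
shebang_argv "$demo"
rm -f "$demo"
```

The demo prints three lines: the interpreter path, then -E -l pl as one element, then the path of the temporary script.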

Lua supports compiling source code into Lua VM bytecode with the luac(1) command. The result of compilation is a file in a Lua-specific format. The format itself has caveats that are briefly discussed in the manual page. If we pass the path of such a luac-compiled file to the Lua interpreter, it will recognize it as bytecode and execute it:

$ luac -o a a.lua
$ lua a
42

However, I can't just open the file and prepend a shebang, because the file will break:

$ vi a
...
$ lua a
42

Wait... what?

Lua treats a first line starting with # in a very special way. Very early in luaL_loadfilex(), before it checks for the bytecode signature, it calls skipcomment(), which skips that line if it is present. This means it is technically possible to prepend a shebang to a luac file. That makes it a "somewhat unreadable not-a-script thing", whatever that means. Can we make luac output executable from the shell without modifying luac's output?
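
For completeness, the shebang route does not need an editor. The following is a minimal sketch; prepend_shebang is a name I made up, and it writes to a new file rather than touching luac's output in place.

```shell
# Write "#!<interpreter>" plus the untouched contents of <in> to <out>
# and mark the result executable.
prepend_shebang() {
    interp=$1 in=$2 out=$3
    { printf '#!%s\n' "$interp"; cat "$in"; } > "$out" && chmod +x "$out"
}

# Usage, assuming ./a was produced by luac as shown earlier:
#   prepend_shebang /usr/bin/lua a a.run && ./a.run
```

The bytecode itself is copied byte for byte; only one extra line is placed in front of it, which is exactly the line Lua skips.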

a hat for a wizard

A bit earlier than just "recently", someone recommended FEX to me, an x86 emulator for ARM64 Linux devices. I hardly use any ARM64 devices that would need x86 emulation. Nonetheless, I explored its source code and wiki. After all, I enjoy reading what someone else wrote down, be it in a natural or a programming language.

Wine is a compatibility layer that runs Windows applications on Linux (and more). On most modern Linux distributions, if you install it and make a Windows "exe" file executable, the file will automatically run via Wine. I have used it for a very long time. I knew that there is a mechanism in Linux that recognizes the format of binary executables and that it does so via magic sequences.

Of course, I only learned the name of this mechanism when reviewing the FEX documentation: binfmt_misc. It allows registering arbitrary binary formats via a file system interface. Formats are recognized either by a magic sequence at the beginning of a file or by a matching file extension. systemd provides binfmt.d(5) for defining such binary formats in, among other locations, /etc/binfmt.d.

Files produced by luac have a recognizable header: the magic LUA_SIGNATURE, the Lua version, the format, and a few other important things. They are written by dumpHeader(). Of course, we can register this as a magic sequence in binfmt_misc on Linux.
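
To look at the first three header fields of a concrete file, something like the following shell sketch could be used. The function name lua_bytecode_info is hypothetical. It assumes the layout described above (four signature bytes, one version byte, one format byte) and that the version byte packs major and minor as two hex digits, as in \x54 for Lua 5.4.

```shell
# Print version and format of a luac output file, or fail if the
# four-byte signature ESC "Lua" (1b 4c 75 61) is missing.
lua_bytecode_info() {
    # First six bytes as one lowercase hex string, e.g. 1b4c75615400.
    hex=$(head -c 6 "$1" | od -An -v -tx1 | tr -d ' \n')
    case $hex in
        1b4c7561????) ;;
        *) echo "$1: not Lua bytecode" >&2; return 1 ;;
    esac
    ver=${hex#1b4c7561}    # version byte followed by format byte
    fmt=${ver#??}          # last two hex digits: format
    ver=${ver%??}          # first two hex digits: version
    printf 'version=%s.%s format=%s\n' "${ver%?}" "${ver#?}" "$fmt"
}

# Usage, assuming ./a was produced by luac:
#   lua_bytecode_info a
```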

binfmt_misc, and thus binfmt.d, consume configuration formatted as follows:

:name:type:offset:magic:mask:interpreter:flags

Most of these are self-explanatory. Refer to the binfmt_misc documentation for those that are not.

For Lua, we'll start at the beginning of the file and, at the very least, match the signature, version, and format, for example:

:lua:M::\x1bLua\x00\x00:\xff\xff\xff\xff\x00\xff:/usr/bin/lua:
:lua54:M::\x1bLua\x54\x00::/usr/bin/lua5.4:

This can be extended to other versions, and the catch-all first rule can be removed. Names are arbitrary but must be unique. The M type makes the rule match a magic sequence instead of a file extension. An empty mask means every provided byte has to match. That's it. We can save it as /etc/binfmt.d/lua.conf, let systemd load it (for example by restarting systemd-binfmt.service or rebooting), compile fresh luac bytecode, make it executable, and run it:

$ luac -o a a.lua
$ chmod +x a
$ ./a
42
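
If ./a refuses to run, it may be worth checking whether the rules were actually registered. Registered formats show up as files named after the rule under the binfmt_misc mount, normally /proc/sys/fs/binfmt_misc. Below is a small sketch; binfmt_entry and the BINFMT_DIR override are names I made up so that the check can also be tried against a fake directory.

```shell
# Show a registered binfmt_misc entry by name, or fail if it is missing.
# BINFMT_DIR is a hypothetical override; by default the usual mount is used.
binfmt_entry() {
    dir=${BINFMT_DIR:-/proc/sys/fs/binfmt_misc}
    if [ -f "$dir/$1" ]; then
        cat "$dir/$1"    # the kernel's own description of the entry
    else
        echo "$1: not registered" >&2
        return 1
    fi
}

# Typical use on a systemd machine (the restart needs root):
#   systemctl restart systemd-binfmt.service
#   binfmt_entry lua54
```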

file(1) recognizes Lua bytecode files in the same way, so instead of going through all Lua versions by hand, we can refer to its magic definitions.

To avoid running Lua files built with different size definitions, we may choose to match more than just the first three parts of the Lua header. After them comes LUAC_DATA (previously known as LUAC_TAIL, added in 5.2), which is used to catch conversion errors, as well as information about the sizes of int, Instruction, lua_Integer, and lua_Number. Because the data/tail varies, there is no lazy way of matching it that would work with all Lua versions. However, one can check the header size for each Lua version, dump files with the respective luac, and use the first N bytes to make sure that the bytecode is compatible with the current machine.
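
Turning the first N bytes of a dumped file into a rule can be scripted. The sketch below is one way to do it; binfmt_rule_from_file is a made-up name, and N is whatever header size you determined for your Lua version. Every byte is written as \xHH, so a ':' or a NUL inside the header cannot clash with the rule syntax.

```shell
# Print a binfmt.d rule of type M whose magic is the first <nbytes>
# bytes of <file>, with an empty offset, mask, and flags.
binfmt_rule_from_file() {
    name=$1 file=$2 nbytes=$3 interp=$4
    magic=
    # od prints the bytes as space-separated hex pairs; the unquoted
    # command substitution splits them into one word per byte.
    for byte in $(head -c "$nbytes" "$file" | od -An -v -tx1); do
        magic="$magic\\x$byte"
    done
    printf ':%s:M::%s::%s:\n' "$name" "$magic" "$interp"
}

# Usage, assuming ./a was produced by the luac that matches /usr/bin/lua5.4
# and N holds the header size for that version:
#   binfmt_rule_from_file lua54 a "$N" /usr/bin/lua5.4
```

With only the first six bytes of a 5.4 file, this produces the same match as the lua54 rule shown earlier, just with every byte spelled out in hex.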