If we want to execute a Lua file,
let's say a file named a.lua, we simply need to pass its path
to the Lua interpreter:
$ lua a.lua
42
This is not Lua's invention. In fact, this pattern is
so common and intuitive that there is a special mechanism
that invokes it. Chances are you already know it: it's the shebang. If a
file starts with #!, a magic sequence,
then the remainder of the first line is split into an interpreter path
and a single argument. For Lua that would be, for example:
#!/usr/bin/lua -E -l pl
First comes the magic sequence #!. Then the program
path, which points at the Lua interpreter, /usr/bin/lua.
Finally, everything else is one argument: -E -l
pl. Yes, everything here is passed to the Lua interpreter
as a single argument, whitespace (other than the newline) included. To work
around this, utilities like env(1) provide options such as -S
that split the string into separate arguments.
Afterwards, the path of the file that contains the shebang is appended
as a separate argument to the invocation. Assume a.lua contains
the above shebang. If we call it:
$ ./a.lua
The effective invocation is as if we called:
$ /usr/bin/lua "-E -l pl" ./a.lua
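The splitting described above can be sketched in a few lines of Python. This is only an illustration of the behavior described in this post, not the kernel's actual code; the function name shebang_argv is made up for this example.

```python
def shebang_argv(first_line: str, script_path: str) -> list[str]:
    """Build the argument list a shebang line would produce (illustrative only)."""
    if not first_line.startswith("#!"):
        raise ValueError("not a shebang line")
    # Drop the magic sequence, the trailing newline, and surrounding blanks.
    rest = first_line[2:].strip(" \t\n")
    # Split at most once: interpreter path first, everything else as ONE argument.
    parts = rest.split(None, 1)
    if not parts:
        raise ValueError("shebang line has no interpreter")
    argv = list(parts)
    # The path of the script itself is appended as a separate, final argument.
    argv.append(script_path)
    return argv


print(shebang_argv("#!/usr/bin/lua -E -l pl\n", "./a.lua"))
# ['/usr/bin/lua', '-E -l pl', './a.lua']
```

Note how "-E -l pl" stays a single list element, which matches the quoted invocation shown above.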
Lua supports compiling source code into bytecode for Lua's VM
with the luac(1) command. The result of compilation is a file in a
Lua-specific format. The format itself has caveats that are briefly
discussed in the manual page. If we pass the path of such a luac-compiled
file to the Lua interpreter, it will recognize it as bytecode and execute it:
$ luac -o a a.lua
$ lua a
42
However, I can't just open the file and prepend a shebang, because the file
will break:
$ vi a
...
$ lua a
42
Wait... what?
Lua treats a first line starting with # in
a very special way. It skips that line very early, in luaL_loadfilex(),
which calls skipcomment() to drop the first line if it is a comment.
This means it is technically possible to prepend a shebang to a luac file.
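The first-line skipping can be modeled roughly like this. It is a simplified Python sketch of the behavior described above, not a translation of Lua's C code; the function names and the placeholder payload are invented for the example, and only the signature bytes \x1bLua are taken from the real format.

```python
LUA_SIGNATURE = b"\x1bLua"  # first bytes of a luac-compiled file


def skip_first_comment(data: bytes) -> bytes:
    """Drop a leading '#...' line, roughly as described for skipcomment()."""
    if data.startswith(b"#"):
        newline = data.find(b"\n")
        # No newline at all means the whole input was the comment line.
        return b"" if newline == -1 else data[newline + 1:]
    return data


def looks_like_bytecode(data: bytes) -> bool:
    """Check for the bytecode signature after the optional first-line skip."""
    return skip_first_comment(data).startswith(LUA_SIGNATURE)


# Placeholder payload: a signature plus dummy bytes, not a real compiled chunk.
plain = LUA_SIGNATURE + b"\x54\x00dummy"
with_shebang = b"#!/usr/bin/lua\n" + plain

print(looks_like_bytecode(plain), looks_like_bytecode(with_shebang))
# True True
```

Both inputs are detected as bytecode, which is why the file with the prepended shebang still ran.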
A luac file with a shebang prepended becomes a "somewhat unreadable
not-a-script thing", whatever that means. Can we make luac output executable
from the shell without modifying luac's output?
A bit longer ago than just "recently", someone recommended FEX to me, an x86 emulator for
ARM64 Linux devices. I hardly use any ARM64 devices that would need
x86 emulation. Nonetheless, I explored its source code and wiki.
After all, I enjoy reading what someone else wrote down, be it in a natural
or a programming language.
Wine is a compatibility layer
that runs Windows applications on Linux (and more). On most modern Linux
distributions, if you install it and make a Windows "exe" file executable,
it will automatically run via Wine. I have used it for a very long time.
I knew that Linux has a mechanism that recognizes the format of
binary executables, and that it does so via magic sequences.
Of course, I only learned the name of
this mechanism when reviewing the FEX documentation: binfmt_misc.
It allows registering arbitrary binary formats via a file system
interface. Formats are recognized either by a magic sequence at the
beginning of a file or by a matching file extension. systemd provides
binfmt.d(5) to define such binary formats in, among other
locations, /etc/binfmt.d.
Files produced by luac have a recognizable header: the magic LUA_SIGNATURE,
the Lua version, the format, and a few other important things. They are written by dumpHeader().
Of course, we can register this header as a magic sequence in binfmt_misc on Linux.
binfmt_misc, and thus binfmt.d, consume configuration formatted
as follows:
:name:type:offset:magic:mask:interpreter:flags
Most of these fields are self-explanatory. Refer to the binfmt_misc
documentation for those that are not.
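To make the field layout concrete, here is a small Python sketch that splits such a line into its seven named fields. It assumes ':' is the delimiter and that no field contains a literal ':'; the BinfmtRule type and parse_rule function are made up for this illustration, and the sample rule uses the Lua 5.4 header bytes and interpreter path discussed in this post.

```python
from typing import NamedTuple


class BinfmtRule(NamedTuple):
    """The seven fields of a rule, in the order of the format line above."""
    name: str
    type: str
    offset: str
    magic: str
    mask: str
    interpreter: str
    flags: str


def parse_rule(line: str) -> BinfmtRule:
    """Split a ':'-delimited rule into named fields (illustrative only)."""
    if not line.startswith(":"):
        raise ValueError("rule must start with ':'")
    fields = line[1:].split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    return BinfmtRule(*fields)


rule = parse_rule(r":lua54:M::\x1bLua\x54\x00::/usr/bin/lua5.4:")
print(rule.name, rule.type, repr(rule.offset), rule.interpreter)
# lua54 M '' /usr/bin/lua5.4
```

Empty fields, such as the offset, mask, and flags here, simply come out as empty strings.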
For Lua, we'll start at the beginning of the file and, at the very
least, match the signature, version, and format, for example:
:lua:M::\x1bLua\x00\x00:\xff\xff\xff\xff\x00\xff:/usr/bin/lua:
:lua54:M::\x1bLua\x54\x00::/usr/bin/lua5.4:
This can be extended to other versions, and the catch-all can
be removed. Names are arbitrary but must be unique. The M
type makes it a magic sequence match instead of an extension match. An empty mask
means every provided byte will be matched. That's it. We can save
it as /etc/binfmt.d/lua.conf, compile fresh luac bytecode,
make it executable, and run it:
$ luac -o a a.lua
$ chmod +x a
$ ./a
42
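To see why the catch-all rule accepts any version byte while the lua54 rule does not, the magic/mask comparison can be sketched in Python. This is a minimal model under the assumption that a byte matches when it equals the magic byte in every bit the mask selects; decode and matches are hypothetical helper names, and the headers below are just the first six bytes, not complete bytecode files.

```python
def decode(field: str) -> bytes:
    r"""Turn a rule field such as '\x1bLua\x00' into raw bytes."""
    return field.encode("latin-1").decode("unicode_escape").encode("latin-1")


def matches(header: bytes, magic: bytes, mask: bytes = b"") -> bool:
    """Compare header bytes against magic, ignoring bits cleared in mask."""
    if not mask:
        # An empty mask means every provided byte is compared in full.
        mask = b"\xff" * len(magic)
    if len(mask) != len(magic) or len(header) < len(magic):
        return False
    return all(((h ^ m) & k) == 0 for h, m, k in zip(header, magic, mask))


catch_all = (decode(r"\x1bLua\x00\x00"), decode(r"\xff\xff\xff\xff\x00\xff"))
lua54 = (decode(r"\x1bLua\x54\x00"), b"")

header_54 = b"\x1bLua\x54\x00"  # signature, version 5.4, format 0
header_53 = b"\x1bLua\x53\x00"  # signature, version 5.3, format 0

print(matches(header_54, *catch_all), matches(header_53, *catch_all))
# True True
print(matches(header_54, *lua54), matches(header_53, *lua54))
# True False
```

The \x00 in the fifth mask position is what makes the version byte a wildcard in the catch-all rule.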
file(1) recognizes Lua bytecode files in the same way,
so instead of going through all the Lua versions ourselves, we can refer to its
definitions.
To avoid running Lua files built with different size
definitions, we may choose to match more than just the first
three parts of the Lua header. After them comes LUAC_DATA
(previously known as LUAC_TAIL,
added in 5.2), which is used to catch conversion
errors, along with information about the sizes of int, Instruction,
lua_Integer,
and lua_Number.
Because the data/tail varies, there is no lazy way of matching it that would
work with all Lua versions. However, one can check the header size for each
Lua version, dump files with the respective luac, and use the first N bytes to
make sure that the bytecode is compatible with the current machine.
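Turning the first N bytes of a dumped file into a rule could look something like the following Python sketch. The helper names, the rule name lua54-exact, and the six-byte sample are all assumptions made for illustration; the real N depends on the header size of each Lua version and has to be looked up as described above.

```python
def magic_field(header: bytes, n: int) -> str:
    r"""Render the first n bytes as \xNN escapes for the magic field."""
    if len(header) < n:
        raise ValueError(f"need {n} bytes, got {len(header)}")
    # Escaping every byte also covers NUL bytes and the ':' delimiter.
    return "".join(f"\\x{byte:02x}" for byte in header[:n])


def rule_line(name: str, magic: str, interpreter: str) -> str:
    """Assemble a magic-type rule with empty offset, mask, and flags."""
    return f":{name}:M::{magic}::{interpreter}:"


def read_header(path: str, n: int) -> bytes:
    """Read the first n bytes of a file produced by luac."""
    with open(path, "rb") as handle:
        return handle.read(n)


# Sample bytes standing in for read_header("a", n); n = 6 is a placeholder.
sample = b"\x1bLua\x54\x00"
print(rule_line("lua54-exact", magic_field(sample, 6), "/usr/bin/lua5.4"))
# :lua54-exact:M::\x1b\x4c\x75\x61\x54\x00::/usr/bin/lua5.4:
```

With a real luac output file, replacing sample with read_header("a", N) for the correct N would pin the rule to the sizes of the machine that compiled it.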