If you are curious about how the Linux binary compatibility works, this is the section you want to read. Most of what follows is based heavily on an email written to the FreeBSD chat mailing list by Terry Lambert <tlambert@primenet.com> (Message ID: <199906020108.SAA07001@usr09.primenet.com>).
FreeBSD has an abstraction called an "execution class loader". This is a wedge into the execve(2) system call.
FreeBSD has a list of loaders, instead of a single loader with a fallback to the #! loader for running shell interpreters or shell scripts.
Historically, the only loader on the UNIX® platform examined the magic number (generally the first 4 or 8 bytes of the file) to see if it was a binary known to the system, and if so, invoked the binary loader.
If it was not the binary type for the system, the execve(2) call returned a failure, and the shell attempted to start executing it as shell commands.
The assumption was a default of "whatever the current shell is".
Later, a hack was made for sh(1) to examine the first two characters, and if they were :\n, it invoked the csh(1) shell instead (we believe SCO first made this hack).
What FreeBSD does now is go through a list of loaders. Next to last in the list is a generic #! loader, which takes the characters following #! up to the next whitespace as the interpreter to run; last comes a fallback to /bin/sh.
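The loader list described above can be modeled in a few lines. This is only a sketch in Python of the idea, not the kernel's C code: the function names, the return strings, and the `LOADERS` list are all hypothetical, and only the ordering (binary loaders first, the #! loader next to last, /bin/sh last) is taken from the text.

```python
# Illustrative model only; these names are hypothetical, not kernel symbols.

def elf_loader(header):
    """Claim the image if it starts with the ELF magic number."""
    return "elf" if header.startswith(b"\x7fELF") else None


def shebang_loader(header):
    """Claim the image if it starts with #!; the interpreter is the
    run of characters after #! up to the next whitespace."""
    if not header.startswith(b"#!"):
        return None
    first_line = header[2:].split(b"\n", 1)[0]
    words = first_line.split()
    return "interp:" + words[0].decode() if words else None


def shell_fallback(header):
    """Last resort: hand the file to /bin/sh."""
    return "interp:/bin/sh"


# Order matters: the #! loader is next to last, the /bin/sh fallback last.
LOADERS = [elf_loader, shebang_loader, shell_fallback]


def pick_loader(header):
    """Walk the list and return the result of the first loader that claims the image."""
    for loader in LOADERS:
        result = loader(header)
        if result is not None:
            return result
    raise OSError("exec format error")
```

Calling `pick_loader()` with the first bytes of a file shows which loader would claim it in this model; a real kernel would have more binary loaders ahead of the #! entry.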
For the Linux ABI support, FreeBSD sees the magic number as an ELF binary (it makes no distinction between FreeBSD, Solaris™, Linux, or any other OS which has an ELF image type, at this point).
The ELF loader looks for a specialized brand, which is a comment section in the ELF image, and which is not present on SVR4/Solaris™ ELF binaries.
For Linux binaries to function, they must be branded as type Linux using brandelf(1):
# brandelf -t Linux file
When this is done, the ELF loader will see the Linux brand on the file.
When the ELF loader sees the Linux brand, it replaces a pointer in the proc structure. All system calls are indexed through this pointer (in a traditional UNIX® system, this would be the sysent[] structure array, containing the system calls). In addition, the process is flagged for special handling of the trap vector for the signal trampoline code, and several other (minor) fix-ups that are handled by the Linux kernel module.
The Linux system call vector contains, among other things, a list of sysent[] entries whose addresses reside in the kernel module.
When the Linux binary makes a system call, the trap code dereferences the system call function pointer off the proc structure, and gets the Linux, not the FreeBSD, system call entry points.
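The per-process dispatch just described can be sketched as follows. This is a toy model in Python, not kernel code: the `Proc` class, the table contents, and the system call number 1 are made up for illustration. The only point carried over from the text is that the trap code indexes whatever table the process points at, so swapping that pointer at exec time changes which entry points a binary reaches.

```python
# Toy model; the class, tables, and call numbers are hypothetical.

def freebsd_glue(proc):
    """Stand-in for a FreeBSD system call entry point."""
    return ("freebsd glue", proc.pid)


def linux_glue(proc):
    """Stand-in for a Linux system call entry point in the kernel module."""
    return ("linux glue", proc.pid)


# Two system call tables; number 1 is an arbitrary illustrative index.
FREEBSD_SYSENT = {1: freebsd_glue}
LINUX_SYSENT = {1: linux_glue}


class Proc:
    """Minimal stand-in for the proc structure: it carries a table pointer."""

    def __init__(self, pid):
        self.pid = pid
        self.sysent = FREEBSD_SYSENT


def exec_image(proc, brand):
    """Model of the ELF loader replacing the pointer when it sees the brand."""
    proc.sysent = LINUX_SYSENT if brand == "Linux" else FREEBSD_SYSENT


def syscall(proc, number):
    """Model of the trap code: index through the process's own table."""
    return proc.sysent[number](proc)
```

In this model the same call number resolves to different glue functions before and after `exec_image(proc, "Linux")`, with no branching in `syscall()` itself.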
In addition, the Linux mode dynamically reroots lookups; this is, in effect, what the union option to file system mounts (not the unionfs file system type!) does. First, an attempt is made to look up the file in the /compat/linux/original-path directory; only if that fails is the lookup done in the /original-path directory. This makes sure that binaries that require other binaries can run (e.g., the Linux toolchain can all run under Linux ABI support). It also means that Linux binaries can load and execute FreeBSD binaries if there are no corresponding Linux binaries present, and that you could place a uname(1) command in the /compat/linux directory tree to ensure that Linux binaries could not tell they were not running on Linux.
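The two-step lookup can be illustrated with a small sketch. It is written in Python and is only a model of the rule stated above: the function name `reroot_lookup` and its `exists` parameter are hypothetical, and the real lookup happens inside the kernel's name resolution, not in user code.

```python
import posixpath


def reroot_lookup(path, exists, compat_root="/compat/linux"):
    """Return the path a Linux-mode lookup would resolve to in this model.

    The copy under compat_root wins if it exists; otherwise the
    original path is used. `exists` is a caller-supplied predicate
    so the sketch does not depend on a real file system.
    """
    candidate = posixpath.join(compat_root, path.lstrip("/"))
    return candidate if exists(candidate) else path


# Example file system contents, assumed purely for illustration.
present = {"/compat/linux/bin/uname", "/bin/uname", "/bin/ls"}

print(reroot_lookup("/bin/uname", present.__contains__))
print(reroot_lookup("/bin/ls", present.__contains__))
```

With the assumed contents, the first lookup lands on the Linux copy of uname under /compat/linux, while the second falls through to the native /bin/ls.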
In effect, there is a Linux kernel in the FreeBSD kernel; the various underlying functions that implement all of the services provided by the kernel are the same for both the FreeBSD system call table entries and the Linux system call table entries: file system operations, virtual memory operations, signal delivery, System V IPC, and so on. The only difference is that FreeBSD binaries get the FreeBSD glue functions, and Linux binaries get the Linux glue functions (most older OSes only had their own glue functions: addresses of functions in a static global sysent[] structure array, instead of addresses of functions dereferenced off a dynamically initialized pointer in the proc structure of the process making the call).
Which one is the native FreeBSD ABI? It does not matter. Basically, the only difference is that the FreeBSD glue functions are statically linked into the kernel, while the Linux glue functions can either be statically linked or be accessed via a kernel module (this is how things stand currently; it could easily be changed in a future release, and probably will be after this one).
Yeah, but is this really emulation? No. It is an ABI implementation, not an emulation. There is no emulator (or simulator, to cut off the next question) involved.
So why is it sometimes called "Linux emulation"? To make it hard to sell FreeBSD! Really, it is because the historical implementation was done at a time when there was no other word to describe what was going on; saying that FreeBSD ran Linux binaries was not true if you did not compile the code in or load a module, and there needed to be a word to describe what was being loaded, hence "the Linux emulator".