Decoding ELF: How Linux Actually Loads Your Binary
You type ./a.out. The kernel reads four magic bytes, parses a header, maps a few memory regions, jumps to an entry point, and your program is running. "Executable" is not magic — it's a file format with very specific structure.
That format is ELF, the Executable and Linkable Format.
The Four Bytes
Every ELF file starts with \x7fELF. Run this on any binary:
xxd /bin/ls | head -1
# 00000000: 7f45 4c46 0201 0103 0000 0000 0000 0000 .ELF............
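The kernel checks these four bytes when you execve(). It offers the file to each registered binary-format handler in turn: the ELF handler claims files that start with this magic, the script handler claims files that start with a shebang (#!), and if no handler accepts the file, execve() fails with ENOEXEC.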
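The dispatch on leading bytes can be sketched in a few lines of Python. This is an illustration only, not kernel code: `classify_executable` is a hypothetical helper, and the fixed if/else chain stands in for the kernel's list of format handlers.

```python
ELF_MAGIC = b"\x7fELF"  # the four bytes every ELF file starts with
SHEBANG = b"#!"         # marks an interpreter script


def classify_executable(head: bytes) -> str:
    """Classify a file by its leading bytes (hypothetical helper)."""
    if head[:4] == ELF_MAGIC:
        return "elf"
    if head[:2] == SHEBANG:
        return "script"
    # Nothing recognized the file: execve() would report ENOEXEC here.
    return "unknown"


# Synthetic inputs, so the example does not depend on any file on disk.
for sample in (b"\x7fELF\x02\x01\x01\x00", b"#!/bin/sh\n", b"MZ\x90\x00"):
    print(sample[:4], "->", classify_executable(sample))
```

To try it on a real file, pass the first few bytes read in binary mode, for example `open(path, "rb").read(4)`.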
Two Views of the Same File
ELF was designed to serve both the linker and the loader, so it has two parallel tables:
- Section headers (the linker's view): .text, .data, .bss, .rodata, .symtab, .rela.dyn, etc. Used to combine and relocate.
- Program headers (the loader's view): LOAD, DYNAMIC, INTERP, GNU_STACK, NOTE. Tells the kernel what to map.
The loader doesn't care about sections. It only reads program headers.
ELF File
├─ ELF header (e_entry, e_phoff, e_shoff)
├─ Program headers → used by execve()
├─ Sections
│ ├─ .text
│ ├─ .rodata
│ ├─ .data
│ └─ .bss
└─ Section headers → used by ld, gdb, objdump
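The three header fields in the diagram (e_entry, e_phoff, e_shoff) sit at fixed offsets in the first 64 bytes of a 64-bit file. Here is a minimal sketch of decoding them, assuming a 64-bit little-endian ELF; `ElfHeader` and `parse_elf64_header` are names invented for this example.

```python
import struct
from typing import NamedTuple

# ELF64 little-endian file header: 16-byte e_ident, then 13 fixed-width
# fields, 64 bytes in total.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


class ElfHeader(NamedTuple):
    e_type: int   # 2 = EXEC, 3 = DYN
    e_entry: int  # virtual address of the first instruction
    e_phoff: int  # file offset of the program header table
    e_phnum: int  # number of program headers
    e_shoff: int  # file offset of the section header table
    e_shnum: int  # number of section headers


def parse_elf64_header(data: bytes) -> ElfHeader:
    """Decode the header fields that locate the two tables."""
    if len(data) < ELF64_HEADER.size:
        raise ValueError("too short to hold an ELF64 header")
    (ident, e_type, _machine, _version, e_entry, e_phoff, e_shoff, _flags,
     _ehsize, _phentsize, e_phnum, _shentsize, e_shnum, _shstrndx
     ) = ELF64_HEADER.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian files")
    return ElfHeader(e_type, e_entry, e_phoff, e_phnum, e_shoff, e_shnum)
```

On a Linux machine, feeding it the first 64 bytes of /bin/ls should give the same entry point and table offsets that readelf -h reports.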
Inspecting Headers
readelf -h /bin/ls
# Type: DYN (Position-Independent Executable)
# Entry point address: 0x6ab0
readelf -l /bin/ls
# LOAD 0x000000 R E 0x18000
# LOAD 0x018000 RW 0x06000
# INTERP /lib64/ld-linux-x86-64.so.2
readelf -l prints the program headers — exactly what the kernel will read.
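To see what readelf -l is decoding, here is a rough Python sketch that walks the program header table of a 64-bit little-endian ELF held in memory. The function names and the dictionary layout are assumptions made for this example; the field offsets and type numbers come from the ELF format itself.

```python
import struct

# Segment type numbers from the ELF specification (GNU_* are GNU extensions).
PT_NAMES = {
    1: "LOAD", 2: "DYNAMIC", 3: "INTERP", 4: "NOTE", 6: "PHDR", 7: "TLS",
    0x6474E550: "GNU_EH_FRAME", 0x6474E551: "GNU_STACK", 0x6474E552: "GNU_RELRO",
}
# One ELF64 program header: p_type, p_flags, then six 8-byte fields (56 bytes).
PHDR64 = struct.Struct("<IIQQQQQQ")


def flags_to_str(p_flags: int) -> str:
    """Render permission bits in a fixed three-character field, like 'R E'."""
    return "".join(ch if p_flags & bit else " "
                   for ch, bit in (("R", 4), ("W", 2), ("E", 1)))


def program_headers(data: bytes) -> list:
    """Return one dict per program header of a 64-bit little-endian ELF."""
    if len(data) < 64 or data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF file")
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    rows = []
    for i in range(e_phnum):
        (p_type, p_flags, p_offset, p_vaddr, _paddr,
         p_filesz, p_memsz, _align) = PHDR64.unpack_from(data, e_phoff + i * e_phentsize)
        rows.append({
            "type": PT_NAMES.get(p_type, hex(p_type)),
            "flags": flags_to_str(p_flags),
            "offset": p_offset,
            "vaddr": p_vaddr,
            "filesz": p_filesz,
            "memsz": p_memsz,
        })
    return rows
```

The sketch reads only the program header table and never looks at the section headers, which mirrors the loader's view of the file.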
What execve() Does
A simplified pseudocode of the kernel side:
execve("/bin/ls")
    read 64 bytes: ELF header
    validate magic, class, machine
    read program headers
    for each PT_LOAD:
        mmap(file, offset, vaddr, size, prot)
    if PT_INTERP exists:
        load /lib64/ld-linux-x86-64.so.2 the same way
        jump to dynamic linker's entry point
    else:
        jump to e_entry
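No bytes are copied into RAM eagerly. mmap only sets up the mapping; pages fault in lazily when first accessed. That's why a 50MB binary starts about as fast as a small one: startup cost tracks the pages the program actually touches, not the size of the file, and a page fault reaches the disk only when the page is not already in the page cache.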
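The mmap line in the pseudocode hides one detail: mmap works on whole pages, so a segment's address and file offset have to be rounded down to a page boundary and its length rounded up. The arithmetic looks roughly like this, assuming 4 KiB pages; `mmap_args` is a hypothetical helper, not a kernel function.

```python
PAGE_SIZE = 4096  # assumed; query os.sysconf("SC_PAGE_SIZE") on a real system


def mmap_args(p_offset: int, p_vaddr: int, p_filesz: int,
              page_size: int = PAGE_SIZE) -> tuple:
    """Round a PT_LOAD segment out to page boundaries.

    Returns (address, file offset, length) suitable for a page-granular mapping.
    """
    # A loadable segment must sit at the same position within a page
    # both in the file and in memory.
    if p_offset % page_size != p_vaddr % page_size:
        raise ValueError("p_offset and p_vaddr must be congruent modulo the page size")
    slack = p_vaddr % page_size           # bytes between page start and segment start
    addr = p_vaddr - slack                # round the address down
    offset = p_offset - slack             # move the file offset back by the same amount
    length = (p_filesz + slack + page_size - 1) // page_size * page_size  # round up
    return addr, offset, length


# A segment that starts 0xa10 bytes into a page:
print([hex(v) for v in mmap_args(0x18A10, 0x19A10, 0x5000)])
# → ['0x19000', '0x18000', '0x6000']
```

Nothing here reads the file. The result only describes which file range backs which address range, and the contents arrive later, one page fault at a time.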
Static vs Dynamic
Statically linked: no PT_INTERP segment. The kernel jumps straight to the binary's entry point, conventionally _start. Your binary contains every function it calls.
Dynamically linked: a PT_INTERP segment names the dynamic linker (typically ld-linux.so). The kernel maps the dynamic linker alongside your binary and hands control to it first; the linker then loads libc.so.6, libm.so.6, etc., resolves symbols, and only then jumps to your code.
The dynamic linker is itself an ELF, and a clever one: it has no interpreter of its own, so it has to relocate itself and bootstrap before any libraries are available.
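The static/dynamic distinction comes down to whether a PT_INTERP entry exists. Below is a small sketch that extracts the interpreter path from a 64-bit little-endian ELF held in memory; `interpreter_path` is a name made up for this example.

```python
import struct
from typing import Optional

PT_INTERP = 3  # segment type that holds the dynamic linker's path


def interpreter_path(data: bytes) -> Optional[str]:
    """Return the PT_INTERP path, or None if the file has no such segment."""
    if len(data) < 64 or data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF file")
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", data, base)
        if p_type == PT_INTERP:
            (p_offset,) = struct.unpack_from("<Q", data, base + 8)
            (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
            raw = data[p_offset:p_offset + p_filesz]
            # The path is stored as a NUL-terminated string.
            return raw.split(b"\0", 1)[0].decode()
    return None
```

A result of None means only that no PT_INTERP segment was found, which is what a statically linked binary looks like from the loader's side.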
The .bss Trick
Uninitialized globals don't exist in the file:
int big_buffer[1024 * 1024]; // 4 MB of zeros
This takes ~zero bytes on disk. The program header records a memory size (p_memsz) larger than the file size (p_filesz); the kernel backs the difference with anonymous, zero-filled pages that materialize on first access. Your file stays small; your runtime gets the full 4MB.
Compare:
int also_big[1024 * 1024] = { 1 }; // initialized
This goes in .data and is in the file — 4MB on disk.
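The size difference shows up directly in a segment's two size fields, p_filesz (bytes stored in the file) and p_memsz (bytes occupied in memory). Here is a small worked example with made-up segment sizes, assuming a 4-byte int as on x86-64 Linux; `zero_filled_bytes` is a hypothetical helper.

```python
INT_SIZE = 4  # sizeof(int) on x86-64 Linux; an assumption of this sketch


def zero_filled_bytes(p_filesz: int, p_memsz: int) -> int:
    """Bytes of a segment that exist only in memory (the .bss part)."""
    if p_memsz < p_filesz:
        raise ValueError("p_memsz must be at least p_filesz")
    return p_memsz - p_filesz


array_bytes = 1024 * 1024 * INT_SIZE  # size of the 1M-element int array

# Hypothetical data segment with 0x1000 bytes of other initialized data.
uninitialized = zero_filled_bytes(p_filesz=0x1000, p_memsz=0x1000 + array_bytes)
initialized = zero_filled_bytes(p_filesz=0x1000 + array_bytes,
                                p_memsz=0x1000 + array_bytes)

print(uninitialized)  # → 4194304  (array lives in .bss, nothing stored on disk)
print(initialized)    # → 0        (array lives in .data, all of it on disk)
```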
PIE and ASLR
Most modern distribution toolchains build Position-Independent Executables by default. The ELF type is DYN, not EXEC. The base address is randomized at load time by ASLR, so every run sees /bin/ls at a different virtual address.
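This works because code on x86-64 reaches its own functions and data through RIP-relative addressing, so the instructions themselves need no patching. The absolute pointers that remain in data (GOT entries, initialized pointer variables) are fixed up at load time: relative relocations such as R_X86_64_RELATIVE tell the dynamic linker to add the load base to each one.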
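The effect of the randomized base is plain addition: addresses in a DYN file are offsets from wherever the image lands. Below is a sketch using the 0x6ab0 entry point from the readelf output earlier and two invented load bases; `runtime_address` is a hypothetical helper.

```python
ET_EXEC = 2  # fixed-address executable
ET_DYN = 3   # position-independent executable or shared object


def runtime_address(e_type: int, load_base: int, vaddr: int) -> int:
    """Map an address from the file to where it ends up in the process."""
    if e_type == ET_EXEC:
        return vaddr               # addresses in the file are already final
    if e_type == ET_DYN:
        return load_base + vaddr   # addresses are offsets from the chosen base
    raise ValueError("not a loadable executable type")


# Two made-up bases standing in for two runs under ASLR.
for base in (0x555555554000, 0x5607D3A00000):
    print(hex(runtime_address(ET_DYN, base, 0x6AB0)))
# → 0x55555555aab0
# → 0x5607d3a06ab0
```

On Linux, the actual base chosen for a running process can be read from /proc/<pid>/maps.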
Common Inspection Commands
| Command | What it shows |
|---|---|
| file ./a.out | Quick summary (ELF, dynamic, stripped) |
| readelf -h | ELF header |
| readelf -l | Program headers (loader view) |
| readelf -S | Section headers (linker view) |
| readelf -d | Dynamic section (runtime deps, RPATH) |
| nm / objdump -t | Symbol table |
| ldd ./a.out | Resolved shared library paths |
| objdump -D | Full disassembly |
| strip ./a.out | Remove symbols and debug info |
A Single-Page Mental Model
execve("./hello")
→ kernel reads ELF header
→ maps PT_LOAD segments into the process
→ if PT_INTERP, also maps ld.so
→ jumps to entry (ld.so or _start)
ld.so
→ maps libc, resolves symbols
→ jumps to _start in your binary
_start → __libc_start_main → main
That's the whole pipeline. Everything else is detail.
Takeaways
- ELF is a file format, not an abstraction — four magic bytes and a header.
- The loader only reads program headers; sections are for the linker.
- mmap-and-fault is what makes binary startup fast.
- Dynamic linking is a bootstrapping exercise: ld.so is an ELF that loads ELFs.
- readelf is your microscope. Use it.