A.out
a.out is the "original" binary format for Unix machines. It is considered obsolete today because of several shortcomings, and it has been superseded by System V ELF. However, because it is extremely simple and supported by many compilers and assemblers, it can be a good choice if you want a starting point for developing your own format, or if your bootloader needs more information than a raw binary provides.
Structure
An a.out binary file consists of up to 7 sections. In order, these sections are:
Section | Description |
---|---|
exec header | Contains parameters used by the kernel to load a binary file into memory and execute it, and by the link editor ld to combine a binary file with other binary files. This section is the only mandatory one. |
text segment | Contains machine code and related data that are loaded into memory when a program executes. Should be loaded read-only. |
data segment | Contains initialized data; always loaded into writable memory. |
text relocations | Contains records used by the link editor to update pointers in the text segment when combining binary files. |
data relocations | Like the text relocation section, but for data segment pointers. |
symbol table | Contains records used by the link editor to cross reference the addresses of named variables and functions ("symbols") between binary files. |
string table | Contains the character strings corresponding to the symbol names. |
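Because the sections appear in a fixed order, the file offset of each one can be derived from the sizes recorded in the exec header. The following is a minimal sketch, assuming the 32-byte header and sections stored back to back with no padding (some a.out variants pad or page-align the text segment, which this does not handle). The names `aout_sizes`, `aout_layout`, and `aout_layout_from_sizes` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Size of the 32-bit exec header: eight 32-bit fields. */
enum { AOUT_HDR_SIZE = 32 };
static_assert(AOUT_HDR_SIZE == 8 * sizeof(uint32_t),
              "exec header is eight 32-bit fields");

/* Section sizes in bytes, as recorded in the exec header (hypothetical type).
 * The bss size is not needed here: bss occupies no space in the file. */
struct aout_sizes {
    uint32_t text, data, trsize, drsize, syms;
};

/* File offsets of each section (hypothetical type). */
struct aout_layout {
    uint32_t text_off, data_off, trel_off, drel_off, sym_off, str_off;
};

/* Assumption: sections follow the header back to back, in the order
 * text, data, text relocations, data relocations, symbols, strings. */
static struct aout_layout aout_layout_from_sizes(struct aout_sizes s)
{
    struct aout_layout l;
    l.text_off = AOUT_HDR_SIZE;
    l.data_off = l.text_off + s.text;
    l.trel_off = l.data_off + s.data;
    l.drel_off = l.trel_off + s.trsize;
    l.sym_off  = l.drel_off + s.drsize;
    l.str_off  = l.sym_off  + s.syms;
    return l;
}
```

A section whose size is zero is simply absent; its offset then coincides with that of the next section.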
The exec header
Every binary file begins with an exec structure, which is 32 bytes in size. Some later versions of a.out were modified to allow 64-bit code execution; their header is 64 bytes in size.
32-bit executable header:
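#include <stdint.h>

struct exec
{
    uint32_t a_midmag;  /* flags, machine ID and magic number */
    uint32_t a_text;    /* size of the text segment in bytes */
    uint32_t a_data;    /* size of the data segment in bytes */
    uint32_t a_bss;     /* size of the bss segment in bytes */
    uint32_t a_syms;    /* size of the symbol table in bytes */
    uint32_t a_entry;   /* address of the entry point */
    uint32_t a_trsize;  /* size of the text relocation table in bytes */
    uint32_t a_drsize;  /* size of the data relocation table in bytes */
};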
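The 64-byte variant is not shown in the source. Assuming it simply widens every field to 64 bits while keeping the same order (which matches the 64-bit byte positions listed in the field table), it could be declared as sketched here. The name `exec64` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit header: same fields and order as struct exec,
 * each widened to 64 bits (an assumption based on the 64-byte size). */
struct exec64
{
    uint64_t a_midmag;
    uint64_t a_text;
    uint64_t a_data;
    uint64_t a_bss;
    uint64_t a_syms;
    uint64_t a_entry;
    uint64_t a_trsize;
    uint64_t a_drsize;
};

/* Compile-time checks against the documented size and positions. */
static_assert(sizeof(struct exec64) == 64,
              "64-bit exec header must be 64 bytes");
static_assert(offsetof(struct exec64, a_entry) == 40,
              "a_entry must start at byte 40");
```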
The fields have the following functions:
Byte offset (32-bit) | Byte offset (64-bit) | Field | Description |
---|---|---|---|
0-3 | 0-7 | a_midmag | This field is stored in network byte-order so that binaries for machines with alternative byte orders can be distinguished. It has a number of sub-components accessed by the macros N_GETFLAG(), N_GETMID(), and N_GETMAGIC(), and set by the macro N_SETMAGIC(). |
4-7 | 8-15 | a_text | Contains the size of the text segment in bytes. |
8-11 | 16-23 | a_data | Contains the size of the data segment in bytes. |
12-15 | 24-31 | a_bss | Contains the number of bytes in the "bss segment". The kernel loads the program so that this amount of writable memory appears to follow the data segment and initially reads as zeroes. |
16-19 | 32-39 | a_syms | Contains the size in bytes of the symbol table section. |
20-23 | 40-47 | a_entry | Contains the address in memory of the entry point of the program after the kernel has loaded it; the kernel starts the execution of the program from the machine instruction at this address. |
24-27 | 48-55 | a_trsize | Contains the size in bytes of the text relocation table. |
28-31 | 56-63 | a_drsize | Contains the size in bytes of the data relocation table. |
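As an illustration of the table above, the 32-bit header can be decoded from a raw byte buffer without relying on the host's byte order or struct padding. This is a sketch under two assumptions that the text does not spell out: that a_midmag splits into a 16-bit magic number (low bits), a 10-bit machine ID and a 6-bit flag field (the split used by BSD-style N_GETMAGIC(), N_GETMID() and N_GETFLAG() macros, as far as I recall), and that the remaining fields are little-endian, as on x86. The names `aout_header` and `aout_parse_header` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded header (hypothetical type for this example). */
struct aout_header {
    uint32_t flags, mid, magic;
    uint32_t text, data, bss, syms, entry, trsize, drsize;
};

/* Read a 32-bit big-endian (network order) value. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read a 32-bit little-endian value (assumed target byte order). */
static uint32_t read_le32(const uint8_t *p)
{
    return  (uint32_t)p[0]        | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short for a header. */
static int aout_parse_header(const uint8_t *buf, size_t len,
                             struct aout_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 32)
        return -1;

    uint32_t midmag = read_be32(buf);     /* a_midmag is in network order */
    out->magic  =  midmag        & 0xffff; /* assumed: low 16 bits */
    out->mid    = (midmag >> 16) & 0x03ff; /* assumed: next 10 bits */
    out->flags  = (midmag >> 26) & 0x003f; /* assumed: top 6 bits */

    out->text   = read_le32(buf + 4);
    out->data   = read_le32(buf + 8);
    out->bss    = read_le32(buf + 12);
    out->syms   = read_le32(buf + 16);
    out->entry  = read_le32(buf + 20);
    out->trsize = read_le32(buf + 24);
    out->drsize = read_le32(buf + 28);
    return 0;
}
```

Reading field by field keeps the parser independent of the compiler's struct layout; for a big-endian target, the `read_le32` calls would be swapped for `read_be32`.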
The relocation info structure
Relocation records have a standard format which is described by the relocation_info structure:
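struct relocation_info
{
    int32_t  r_address;         /* offset of the pointer to relocate */
    uint32_t r_symbolnum : 24,  /* index into the symbol table */
             r_pcrel     : 1,   /* pc-relative pointer */
             r_length    : 2,   /* log2 of the pointer size in bytes */
             r_extern    : 1,   /* requires an external reference */
             r_baserel   : 1,   /* relocate to an offset into the GOT */
             r_jmptable  : 1,   /* relocate to an offset into the PLT */
             r_relative  : 1,   /* relative to the image load address */
             r_copy      : 1;   /* copy symbol contents to r_address */
};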
The fields have the following functions:
Bit position (32-bit format) | Field | Description |
---|---|---|
0-31 | r_address | Contains the byte offset of a pointer that needs to be link-edited. Text relocation offsets are reckoned from the start of the text segment, and data relocation offsets from the start of the data segment. The link editor adds the value that is already stored at this offset to the new value that it computes using this relocation record. |
32-55 | r_symbolnum | Contains the ordinal number of a symbol structure in the symbol table (it is not a byte offset). |
56 | r_pcrel | If this is set, the link editor assumes that it is updating a pointer that is part of a machine code instruction using pc-relative addressing. The address of the relocated pointer is implicitly added to its value when the running program uses it. |
57-58 | r_length | Contains the log base 2 of the length of the pointer in bytes. |
59 | r_extern | Set if this relocation requires an external reference. |
60 | r_baserel | If set, the symbol is to be relocated to an offset into the global offset table (GOT). |
61 | r_jmptable | If set, the symbol is to be relocated to an offset into the procedure linkage table (PLT). |
62 | r_relative | If set, this relocation is relative to the load address of the image this object file is going to be a part of. |
63 | r_copy | If set, this relocation record identifies a symbol whose contents should be copied to the location given in r_address. |
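The bit positions above can be turned into a small decoder. Since the layout of C bit-fields depends on the compiler and byte order, this sketch unpacks the 8-byte record by hand, assuming a little-endian file in which bit 32 is the least significant bit of the second 32-bit word. It also shows the addition described for r_address: the value already stored in the segment is treated as an addend. Computing the value to add (a symbol address or a load offset, and any pc-relative adjustment) is left to the caller. The names `aout_reloc`, `aout_decode_reloc` and `aout_apply_reloc` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unpacked relocation record (hypothetical type for this example). */
struct aout_reloc {
    int32_t  address;
    uint32_t symbolnum;
    unsigned pcrel, length, ext, baserel, jmptable, relative, copy;
};

static uint32_t read_le32(const uint8_t *p)
{
    return  (uint32_t)p[0]        | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one 8-byte record; assumes little-endian, LSB-first bit order. */
static struct aout_reloc aout_decode_reloc(const uint8_t rec[8])
{
    uint32_t w = read_le32(rec + 4);      /* bits 32-63 of the record */
    struct aout_reloc r;
    r.address   = (int32_t)read_le32(rec);
    r.symbolnum =  w        & 0xffffff;   /* bits 32-55 */
    r.pcrel     = (w >> 24) & 1;          /* bit 56 */
    r.length    = (w >> 25) & 3;          /* bits 57-58 */
    r.ext       = (w >> 27) & 1;          /* bit 59 */
    r.baserel   = (w >> 28) & 1;          /* bit 60 */
    r.jmptable  = (w >> 29) & 1;          /* bit 61 */
    r.relative  = (w >> 30) & 1;          /* bit 62 */
    r.copy      = (w >> 31) & 1;          /* bit 63 */
    return r;
}

/* Add `value` to the pointer stored at r->address inside `seg`.
 * Handles pointers of 1, 2 or 4 bytes; returns -1 on a bad record. */
static int aout_apply_reloc(uint8_t *seg, size_t seg_size,
                            const struct aout_reloc *r, uint32_t value)
{
    assert(seg != NULL && r != NULL);
    size_t width = (size_t)1 << r->length;
    if (r->address < 0 || width > 4 || seg_size < width ||
        (size_t)r->address > seg_size - width)
        return -1;

    uint8_t *p = seg + r->address;
    uint32_t v = 0;
    for (size_t i = 0; i < width; i++)    /* read the stored addend */
        v |= (uint32_t)p[i] << (8 * i);
    v += value;
    for (size_t i = 0; i < width; i++)    /* write the relocated pointer */
        p[i] = (uint8_t)(v >> (8 * i));
    return 0;
}
```

A linker or loader would typically walk the text relocation table against the text segment and the data relocation table against the data segment, calling a routine like this once per record.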