A.out

a.out is the "original" binary format for Unix machines. It is now considered obsolete because of several shortcomings and has been superseded by System V ELF. However, because it is extremely simple and supported by many compilers and assemblers, it can still be a good choice if you want a starting point for designing your own format, or if your bootloader needs more information than a raw binary provides.

Structure

An a.out binary file consists of up to 7 sections. In order, these sections are:

| Section | Description |
| --- | --- |
| exec header | Contains parameters used by the kernel to load a binary file into memory and execute it, and by the link editor ld to combine a binary file with other binary files. This section is the only mandatory one. |
| text segment | Contains machine code and related data that are loaded into memory when a program executes. Should be loaded read-only. |
| data segment | Contains initialized data; always loaded into writable memory. |
| text relocations | Contains records used by the link editor to update pointers in the text segment when combining binary files. |
| data relocations | Like the text relocation section, but for data segment pointers. |
| symbol table | Contains records used by the link editor to cross-reference the addresses of named variables and functions ('symbols') between binary files. |
| string table | Contains the character strings corresponding to the symbol names. |
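Because the sections appear in a fixed order, their file offsets can be derived from the sizes stored in the exec header. The following is a minimal sketch, assuming a 32-byte header and sections packed back to back with no padding; some a.out variants align or pad the text segment, which this does not handle. The names aout_sizes, aout_layout and aout_compute_layout are hypothetical helpers, not part of any standard header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Section sizes as read from the exec header (same field names). */
struct aout_sizes {
    uint32_t a_text, a_data, a_trsize, a_drsize, a_syms;
};

/* File offset of each section (hypothetical helper type). */
struct aout_layout {
    uint32_t text_off, data_off, trel_off, drel_off, sym_off, str_off;
};

/* Assumption: 32-bit header, sections follow it with no padding. */
enum { AOUT_HDR_SIZE = 32 };

static struct aout_layout aout_compute_layout(const struct aout_sizes *s)
{
    struct aout_layout l;

    assert(s != NULL);
    l.text_off = AOUT_HDR_SIZE;              /* text follows the header  */
    l.data_off = l.text_off + s->a_text;     /* data follows text        */
    l.trel_off = l.data_off + s->a_data;     /* text relocations         */
    l.drel_off = l.trel_off + s->a_trsize;   /* data relocations         */
    l.sym_off  = l.drel_off + s->a_drsize;   /* symbol table             */
    l.str_off  = l.sym_off  + s->a_syms;     /* string table comes last  */
    return l;
}
```

A section whose size is zero is simply absent: its offset then coincides with that of the next section.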

The exec header

Every binary file begins with an exec structure, which is 32 bytes in size. Some later versions of a.out were modified to support 64-bit code; in those versions the header is 64 bytes in size.

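32-bit executable header:

#include <stdint.h>

struct exec
{
	uint32_t   a_midmag;   /* magic number, machine ID and flags */
	uint32_t   a_text;     /* size of the text segment in bytes */
	uint32_t   a_data;     /* size of the data segment in bytes */
	uint32_t   a_bss;      /* size of the bss segment in bytes */
	uint32_t   a_syms;     /* size of the symbol table in bytes */
	uint32_t   a_entry;    /* entry point address */
	uint32_t   a_trsize;   /* size of the text relocation table in bytes */
	uint32_t   a_drsize;   /* size of the data relocation table in bytes */
};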

The fields have the following functions:

| Position (32-bit) | Position (64-bit) | Field | Description |
| --- | --- | --- | --- |
| 0-3 | 0-7 | a_midmag | This field is stored in network byte-order so that binaries for machines with alternative byte orders can be distinguished. It has a number of sub-components accessed by the macros N_GETFLAG(), N_GETMID(), and N_GETMAGIC(), and set by the macro N_SETMAGIC(). |
| 4-7 | 8-15 | a_text | Contains the size of the text segment in bytes. |
| 8-11 | 16-23 | a_data | Contains the size of the data segment in bytes. |
| 12-15 | 24-31 | a_bss | Contains the number of bytes in the 'bss segment'. The kernel loads the program so that this amount of writable memory appears to follow the data segment and initially reads as zeroes. |
| 16-19 | 32-39 | a_syms | Contains the size in bytes of the symbol table section. |
| 20-23 | 40-47 | a_entry | Contains the address in memory of the entry point of the program after the kernel has loaded it; the kernel starts the execution of the program from the machine instruction at this address. |
| 24-27 | 48-55 | a_trsize | Contains the size in bytes of the text relocation table. |
| 28-31 | 56-63 | a_drsize | Contains the size in bytes of the data relocation table. |
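The field positions above can be turned into a small header parser. The sketch below reads a 32-bit header from a raw byte buffer, so it does not depend on the host's byte order or structure padding. It rests on several assumptions that are not stated on this page: the fields other than a_midmag are taken to be little-endian (as for an i386 target), and a_midmag is split BSD-style into a 16-bit magic number, a 10-bit machine ID and 6 flag bits. The magic values 0407, 0410 and 0413 are the classic OMAGIC, NMAGIC and ZMAGIC numbers; all aout_* names and the AOUT_* constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct exec {
    uint32_t a_midmag, a_text, a_data, a_bss;
    uint32_t a_syms, a_entry, a_trsize, a_drsize;
};

/* Classic magic numbers (octal); the AOUT_ prefix is our own. */
#define AOUT_OMAGIC 0407u
#define AOUT_NMAGIC 0410u
#define AOUT_ZMAGIC 0413u

/* Read a 32-bit big-endian (network order) value. */
static uint32_t rd_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read a 32-bit little-endian value (assumed target byte order). */
static uint32_t rd_le32(const uint8_t *p)
{
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* Returns 0 on success, -1 if the buffer is shorter than the header. */
static int aout_parse_header(const uint8_t *buf, size_t len, struct exec *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 32)
        return -1;
    out->a_midmag = rd_be32(buf);        /* network byte order        */
    out->a_text   = rd_le32(buf + 4);
    out->a_data   = rd_le32(buf + 8);
    out->a_bss    = rd_le32(buf + 12);
    out->a_syms   = rd_le32(buf + 16);
    out->a_entry  = rd_le32(buf + 20);
    out->a_trsize = rd_le32(buf + 24);
    out->a_drsize = rd_le32(buf + 28);
    return 0;
}

/* Assumed BSD-style split of a_midmag: flags(6) | mid(10) | magic(16). */
static uint32_t aout_magic(const struct exec *e) { return e->a_midmag & 0xffffu; }
static uint32_t aout_mid(const struct exec *e)   { return (e->a_midmag >> 16) & 0x3ffu; }
static uint32_t aout_flags(const struct exec *e) { return (e->a_midmag >> 26) & 0x3fu; }

/* Memory needed for the loaded image: text, data, then zero-filled bss. */
static uint64_t aout_memory_size(const struct exec *e)
{
    return (uint64_t)e->a_text + e->a_data + e->a_bss;
}
```

A loader built on this would copy a_text and a_data bytes from the file, zero a_bss bytes directly after the data, and then jump to a_entry. Checking aout_magic() against the known values first is a cheap way to reject files that are not a.out at all.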

The relocation info structure

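Relocation records have a standard format, which is described by the relocation_info structure:

#include <stdint.h>

struct relocation_info
{
	int32_t  r_address;          /* offset of the pointer to relocate */

	uint32_t r_symbolnum : 24,   /* ordinal of the symbol in the symbol table */
	         r_pcrel     : 1,    /* pc-relative addressing */
	         r_length    : 2,    /* log2 of the pointer length in bytes */
	         r_extern    : 1,    /* requires an external reference */
	         r_baserel   : 1,    /* relocate to an offset into the GOT */
	         r_jmptable  : 1,    /* relocate to an offset into the PLT */
	         r_relative  : 1,    /* relative to the image load address */
	         r_copy      : 1;    /* copy symbol contents to r_address */
};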

The fields have the following functions:

| Position in bits (32-bit) | Field | Description |
| --- | --- | --- |
| 0-31 | r_address | Contains the byte offset of a pointer that needs to be link-edited. Text relocation offsets are reckoned from the start of the text segment, and data relocation offsets from the start of the data segment. The link editor adds the value that is already stored at this offset to the new value that it computes using this relocation record. |
| 32-55 | r_symbolnum | Contains the ordinal number of a symbol structure in the symbol table (it is not a byte offset). |
| 56 | r_pcrel | If this is set, the link editor assumes that it is updating a pointer that is part of a machine code instruction using pc-relative addressing. The address of the relocated pointer is implicitly added to its value when the running program uses it. |
| 57-58 | r_length | Contains the log base 2 of the length of the pointer in bytes. |
| 59 | r_extern | Set if this relocation requires an external reference. |
| 60 | r_baserel | If set, the symbol is to be relocated to an offset into the GOT. |
| 61 | r_jmptable | If set, the symbol is to be relocated to an offset into the PLT. |
| 62 | r_relative | If set, this relocation is relative to the load address of the image this object file is going to be a part of. |
| 63 | r_copy | If set, this relocation record identifies a symbol whose contents should be copied to the location given in r_address. |
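C leaves the bit order of bit-fields to the compiler, so a tool that reads relocation records on a different host may prefer to unpack the second word by hand. The sketch below follows the bit positions in the table above, assuming bit 32 is the least significant bit of the second 32-bit word (as on a little-endian target) and that pointers are stored little-endian. It then applies a record the way the r_address description says: the value already stored at the offset is added to the newly computed value. The names reloc, reloc_unpack and reloc_apply are hypothetical, and the caller is expected to compute the value itself, including any pc-relative adjustment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unpacked view of one relocation record (hypothetical helper type). */
struct reloc {
    uint32_t r_address;    /* offset into the text or data segment */
    uint32_t r_symbolnum;  /* ordinal in the symbol table          */
    unsigned r_pcrel;      /* 1 if pc-relative                     */
    unsigned r_length;     /* log2 of the pointer size in bytes    */
    unsigned r_extern;     /* 1 if an external reference is needed */
};

/* Assumption: `word` is the second 32-bit word, already in host order. */
static struct reloc reloc_unpack(uint32_t address, uint32_t word)
{
    struct reloc r;

    r.r_address   = address;
    r.r_symbolnum = word & 0x00ffffffu;   /* bits 32-55 */
    r.r_pcrel     = (word >> 24) & 1u;    /* bit 56     */
    r.r_length    = (word >> 25) & 3u;    /* bits 57-58 */
    r.r_extern    = (word >> 27) & 1u;    /* bit 59     */
    return r;
}

/*
 * Add `value` to the pointer stored at r_address inside `seg`.
 * Returns 0 on success, -1 if the pointer is wider than 4 bytes
 * or does not fit inside the segment.
 */
static int reloc_apply(uint8_t *seg, size_t seg_len,
                       const struct reloc *r, uint32_t value)
{
    size_t width = (size_t)1 << r->r_length;
    uint32_t cur = 0;
    size_t i;

    assert(seg != NULL && r != NULL);
    if (width > 4 || r->r_address > seg_len || seg_len - r->r_address < width)
        return -1;

    for (i = 0; i < width; i++)           /* read the existing addend */
        cur |= (uint32_t)seg[r->r_address + i] << (8 * i);
    cur += value;                         /* addend + computed value  */
    for (i = 0; i < width; i++)           /* write the result back    */
        seg[r->r_address + i] = (uint8_t)(cur >> (8 * i));
    return 0;
}
```

The remaining flags (r_baserel, r_jmptable, r_relative, r_copy) occupy bits 60 to 63 and can be extracted the same way with shifts of 28 to 31.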
