ELF: Difference between revisions

{{FirstPerson}}
{{You}}
{{File formats}}
ELF (Executable and Linkable Format) is a file format designed by Unix System Laboratories while working with Sun Microsystems on SVR4 (UNIX System V Release 4.0). Consequently, ELF first appeared in Solaris 2.0 (aka SunOS 5.0), which is based on SVR4. The format is specified in the [[System V ABI]].
 
A very versatile file format, it was later picked up by many other operating systems for use both as executable files and as shared library files. It distinguishes between TEXT, DATA and BSS.
 
Today, ELF is considered the standard format on Unix-like systems. While it has some drawbacks (e.g., using up one of the scarce general purpose registers of the IA-32 when using position-independent code), it is well supported and documented.
 
==File Structure==
ELF is a format for storing programs or fragments of programs (see the ELF Header table for the file types) on disk, created as a result of compiling and linking. An ELF file can contain sections, segments, or both. For an executable program, an ELF header and one segment are the bare minimum; sections are optional, though it is common for an executable to have a ".text" section for the code, a ".data" section for initialized data and a ".rodata" section that usually contains constant strings. Relocatable object files, which exist only for linking, have sections but no segments. Sections and segments are described by their respective headers, which record their sizes, required alignment and how they should be placed in memory.
 
Note that depending on whether your file is a linkable or an executable file, the headers in the ELF file won't be the same:
process.o, result of gcc -c process.c $SOME_FLAGS
 
<pre>
C32/kernel/bin/.process.o
CONTENTS, READONLY
</pre>
 
The 'flags' will tell you what's actually available in the ELF file. Here, we have [[Symbol_Table|symbol tables]] and relocations: all that we need to link the file against another, but virtually no information about how to load the file in memory (even if that could be guessed). We don't have the program entry point, for instance, and we have a section header table rather than a program header table.
 
{| {{wikitable}}
|-
| debugging symbols & similar information.
|}
 
/bin/bash, a real executable file
 
<pre>
/bin/bash: file format elf32-i386
40014000-40015000 rw-p 00013000 03:06 27304 /lib/ld-2.3.2.so
</pre>
 
We can recognize our 'code bits' and 'data bits'. By stating that the second one should be loaded at 0x080bd'''120''' and that it starts in the file at 0x00074'''120''', we actually preserve the page-to-disk-block mapping (e.g. if page 0x080be000 is missing, just fetch file blocks from 0x75000). That means, however, that a part of the code is mapped twice, but with different permissions. I suggest you give them different physical pages too if you don't want to end up with modifiable code.
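
The page-to-file mapping described above is plain arithmetic. The following is a minimal sketch, assuming 4 KiB pages; the helper name <code>file_offset_for_page</code> and the constant <code>ELF_PAGE_SIZE</code> are made up for illustration and do not come from any specification.

```c
#include <stdint.h>

#define ELF_PAGE_SIZE 0x1000u  /* assumption: 4 KiB pages, as on IA-32 */

/* File offset that backs the page containing fault_addr, for a segment
 * that maps file offset p_offset at virtual address p_vaddr.
 * This relies on p_vaddr and p_offset having the same low 12 bits. */
static uint32_t file_offset_for_page(uint32_t fault_addr,
                                     uint32_t p_vaddr, uint32_t p_offset)
{
    uint32_t page = fault_addr & ~(ELF_PAGE_SIZE - 1u);
    return page - (p_vaddr - p_offset);
}
```

With the values quoted above (segment at 0x080bd120, file offset 0x00074120), a fault anywhere in page 0x080bd000 resolves to file offset 0x74000.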
 
==Loading ELF Binaries==
[[Image:Elfdiagram.png|frame|Executable image and ELF binary can be mapped onto each other]]
The ELF header contains all of the relevant information required to load an ELF executable. The format of this header is described in the [http://www.skyfree.org/linux/references/ELF_Format.pdf ELF Specification]. The most relevant sections for this purpose are 1.1 to 1.4 and 2.1 to 2.7. Instructions on loading an executable are contained within section 2.7.
 
The following is a rough outline of the steps that an ELF executable loader must perform:
 
* Verify that the file starts with the ELF magic number (4 bytes) as described in figure 1-4 (and subsequent table) on page 11 in the ELF specification.
* Read the ELF Header. The ELF header is always located at the very beginning of an ELF file. The ELF header contains information about how the rest of the file is laid out. An executable loader is only concerned with the program headers.
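
The magic-number check from the first step could be sketched as follows. The function names are hypothetical; the byte values (0x7F, 'E', 'L', 'F', and the class byte at offset 4 with 1 = 32-bit, 2 = 64-bit) are taken from the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if buf starts with the ELF magic (0x7F 'E' 'L' 'F'), else 0. */
static int elf_has_magic(const uint8_t *buf, size_t len)
{
    return len >= 4 && buf[0] == 0x7F && buf[1] == 'E' &&
           buf[2] == 'L' && buf[3] == 'F';
}

/* Returns 32 or 64 according to the class byte at offset 4,
 * or 0 if the buffer is not a recognizable ELF file. */
static int elf_class_bits(const uint8_t *buf, size_t len)
{
    if (len < 5 || !elf_has_magic(buf, len))
        return 0;
    if (buf[4] == 1)
        return 32;
    if (buf[4] == 2)
        return 64;
    return 0;
}
```

A loader would typically reject the file early if either check fails, before trusting any other header field.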
 
==Relocation==
 
Relocation becomes handy when you need to load, for example, modules or drivers. It's possible to use the "-r" option to ld to permit you to have multiple object files linked into one big one, which means easier coding and faster testing.
 
# Go through all sections resolving external references against the kernel symbol table
# If all succeeded, you can use the "e_entry" field of the header as the offset from the load address to call the entry point (if one was specified), or do a symbol lookup, or just return a success code.
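
Once a symbol has been resolved, patching a reference is a small computation. Below is a rough sketch for two common i386 relocation types, R_386_32 (S + A) and R_386_PC32 (S + A - P), using REL-style entries where the addend is stored in the field being patched. The function name, parameter list and <code>SKETCH_</code> constants are assumptions made for this example, and it assumes the host is little-endian like the target.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* i386 relocation type numbers (values from the i386 supplement). */
enum { SKETCH_R_386_32 = 1, SKETCH_R_386_PC32 = 2 };

/* Apply one REL-style relocation.
 *  image, image_len : loaded section bytes
 *  load_addr        : address that image[0] occupies at run time
 *  r_offset         : offset of the 32-bit field to patch within image
 *  r_info           : symbol index (high 24 bits) and type (low 8 bits)
 *  sym_value        : resolved address of the referenced symbol (S)
 * Returns 0 on success, -1 on a bad offset or an unsupported type. */
static int apply_rel_i386(uint8_t *image, size_t image_len,
                          uint32_t load_addr, uint32_t r_offset,
                          uint32_t r_info, uint32_t sym_value)
{
    uint32_t addend, result;
    uint32_t place = load_addr + r_offset;              /* P */

    if (image_len < 4 || r_offset > image_len - 4)
        return -1;
    memcpy(&addend, image + r_offset, sizeof addend);   /* A, stored in place */

    switch (r_info & 0xFF) {
    case SKETCH_R_386_32:   result = sym_value + addend;         break;
    case SKETCH_R_386_PC32: result = sym_value + addend - place; break;
    default:                return -1;
    }
    memcpy(image + r_offset, &result, sizeof result);
    return 0;
}
```

A module loader would call something like this for every entry of every relocation section, after looking up the symbol named by the high bits of <code>r_info</code>.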
 
Once you can relocate ELF objects you'll be able to have drivers loaded when needed instead of at startup - which is always a Good Thing (tm).
 
== Tables ==
=== ELF Header ===
The ELF header is always found at the start of the ELF file.
{| {{wikitable}}
|-
| 6
| ELF header version
|-
| 7
| 7
| 16-17
| 16-17
| Type (1 = relocatable, 2 = executable, 3 = shared, 4 = core)
|-
| 18-19
| 20-23
| 20-23
| ELF Version (currently 1)
|-
| 24-27
| 24-31
| Program entry position (e_entry, a virtual address)
|-
| 28-31
| 32-39
| Program header table offset
|-
| 32-35
| 40-47
| Section header table offset
|-
| 36-39
| 40-41
| 52-53
| ELF Header size
|-
| 42-43
| 50-51
| 62-63
| Index of the section header table entry for the string table that holds the section names
|}
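As an illustration of the 32-bit column of this table, the sketch below pulls the fields a loader usually needs out of a raw byte buffer. The struct and function names are hypothetical. Offsets for rows not listed in the table (such as the program header entry size at bytes 42-43 and entry count at bytes 44-45) are taken from the ELF32 specification. Only little-endian 32-bit files are accepted here, which is an assumption of the example, not a limit of the format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the ELF32 header fields a loader needs. */
struct elf32_info {
    uint16_t type;       /* bytes 16-17 */
    uint16_t machine;    /* bytes 18-19 */
    uint32_t entry;      /* bytes 24-27 */
    uint32_t phoff;      /* bytes 28-31 */
    uint32_t shoff;      /* bytes 32-35 */
    uint16_t phentsize;  /* bytes 42-43 */
    uint16_t phnum;      /* bytes 44-45 */
    uint16_t shstrndx;   /* bytes 50-51 */
};

/* Little-endian reads that do not depend on host byte order or alignment. */
static uint16_t rd16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if buf is too short or is not a
 * 32-bit little-endian ELF file. */
static int elf32_parse_header(const uint8_t *buf, size_t len,
                              struct elf32_info *out)
{
    if (len < 52)                       /* the ELF32 header is 52 bytes */
        return -1;
    if (buf[0] != 0x7F || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return -1;
    if (buf[4] != 1 || buf[5] != 1)     /* class: 32-bit, data: little-endian */
        return -1;

    out->type      = rd16le(buf + 16);
    out->machine   = rd16le(buf + 18);
    out->entry     = rd32le(buf + 24);
    out->phoff     = rd32le(buf + 28);
    out->shoff     = rd32le(buf + 32);
    out->phentsize = rd16le(buf + 42);
    out->phnum     = rd16le(buf + 44);
    out->shstrndx  = rd16le(buf + 50);
    return 0;
}
```

Reading byte by byte avoids relying on struct packing or host endianness, at the cost of a little more code than casting the buffer to a struct.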
 
The flags entry can probably be ignored for x86 ELFs, as no flags are actually defined.
 
|-
| No Specific
| 0x00
|-
| [[:Category:Sparc | Sparc]]
| 0x02
|-
| '''[[x86]]'''
| 0x03
|-
| [[:Category:MIPS | MIPS]]
| 0x08
|-
| [[PowerPC]]
|}
The most common architectures are in bold.
 
=== Program header ===
This is an array of N (given in the main header) entries in the following format. Make sure to use the correct version depending on whether the file is 32 bit or 64 bit as the tables are quite different.
|-
| 12-15
| Reserved for the segment's physical address (p_paddr); its contents are unspecified under the System V ABI
|-
| 16-19
|-
| 20-23
| Size of the segment in memory (p_memsz, at least as big as p_filesz)
|-
| 24-27
|-
| 28-31
| The required alignment for this segment (p_align, usually a power of 2)
|}
 
64 bit version:
{| {{wikitable}}
|-
| 24-31
| Reserved for the segment's physical address (p_paddr); its contents are unspecified under the System V ABI
|-
| 32-39
|-
| 40-47
| Size of the segment in memory (p_memsz, at least as big as p_filesz)
|-
| 48-55
| The required alignment for this segment (p_align, usually a power of 2)
|}
Segment types: 0 = null - ignore the entry; 1 = load - clear p_memsz bytes at p_vaddr to 0, then copy p_filesz bytes from p_offset to p_vaddr; 2 = dynamic - requires dynamic linking; 3 = interp - contains a file path to an executable to use as an interpreter for the following segment; 4 = note section. There are more values, but mostly contain architecture/environment specific information, which is probably not required for the majority of ELF files.
 
Flags: 1 = executable, 2 = writable, 4 = readable.
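
The handling of a type 1 (load) segment and the flag bits can be sketched as below. This is only an illustration: <code>struct phdr32</code>, <code>load_segment</code> and <code>flags_to_string</code> are hypothetical names, and the destination is an ordinary buffer standing in for the address space starting at <code>mem_base</code>, whereas a real loader would map pages with the right permissions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The 32-bit program header fields needed for loading. */
struct phdr32 {
    uint32_t p_type, p_offset, p_vaddr, p_filesz, p_memsz, p_flags;
};

enum { SEG_LOAD = 1, SEG_X = 1, SEG_W = 2, SEG_R = 4 };

/* Load one segment into mem, which stands in for the virtual address
 * range [mem_base, mem_base + mem_len). Entries that are not of type
 * load are skipped. Returns 0 on success, -1 on a malformed header. */
static int load_segment(const uint8_t *file, size_t file_len,
                        const struct phdr32 *ph,
                        uint8_t *mem, uint32_t mem_base, size_t mem_len)
{
    size_t dst;

    if (ph->p_type != SEG_LOAD)
        return 0;
    if (ph->p_filesz > ph->p_memsz)
        return -1;
    if (ph->p_offset > file_len || ph->p_filesz > file_len - ph->p_offset)
        return -1;
    if (ph->p_vaddr < mem_base)
        return -1;
    dst = ph->p_vaddr - mem_base;
    if (dst > mem_len || ph->p_memsz > mem_len - dst)
        return -1;

    memset(mem + dst, 0, ph->p_memsz);                     /* clear p_memsz bytes */
    memcpy(mem + dst, file + ph->p_offset, ph->p_filesz);  /* copy p_filesz bytes */
    return 0;
}

/* Render p_flags as an "rwx"-style string; out must hold 4 chars. */
static void flags_to_string(uint32_t flags, char out[4])
{
    out[0] = (flags & SEG_R) ? 'r' : '-';
    out[1] = (flags & SEG_W) ? 'w' : '-';
    out[2] = (flags & SEG_X) ? 'x' : '-';
    out[3] = '\0';
}
```

The bytes between p_filesz and p_memsz stay zero after the copy, which is how the BSS of a program ends up cleared.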
 
=== Dynamic Linking ===
{{Main|Dynamic Linker}}
 
Dynamic linking is when the OS supplies a program with the shared libraries it needs at run time. The libraries are located on the system and bound to the program while it is running, whereas static linking binds the libraries '''before''' the program is run. The main advantages are that programs take up less memory and are smaller in file size. The main disadvantage is that the program becomes less portable, because it depends on many different shared libraries.
 
In order to implement this, you need to have proper scheduling in place, a library, and a program to use that library.
You can create a library with GCC:
 
<pre>
myos-gcc -c -fPIC -o oneobject.o oneobject.c
myos-gcc -shared -fPIC -Wl,-soname,nameofmylib oneobject.o anotherobject.o -o mylib.so
</pre>
 
This library should be treated as a file, which is loaded when the OS detects its attempted usage. You will need to implement this "[[Dynamic Linker]]" somewhere in your kernel, for example in the memory management or task management code. When the ELF program is run, the system should load the shared object data into a region of memory obtained with malloc() and redirect the program's calls into the library to that region. Once the program is finished, the region can be given back to the OS with a call to free().
 
That should be a good starting point to writing a dynamic linker.
 
== See Also ==
=== Articles ===
* [[Modular Kernel]]
* [[DWARF]]
 
=== External Links ===
* [http://www.skyfree.org/linux/references/ELF_Format.pdf The ELF file format] in detail
* [http://www.sco.com/developers/gabi/latest/contents.html System V ABI] about ELF
* [http://www.linuxfoundation.org/en/Specifications LSB specifications]<br />See (generic or platform-specific) 'Core' specifications for additional ELF information.
* [https://en.wikipedia.org/wiki/Executable_and_Linkable_Format Executable and Linkable Format on Wikipedia], which contains a detailed list of ELF references
<!--
Link is dead.
*ftp://tsx.mit.edu/pub/linux/packages/GCC/ELF.doc.tar.gz
-->
* [http://downloads.openwatcom.org/ftp/devel/docs/elf-64-gen.pdf The ELF file format (64-bit)] ELF 64-Bit, General extension to ELF32.
* [http://www.x86-64.org/documentation/abi.pdf x86-64 ABI] Documented x86-64 specific extensions with ELF64.
* [http://www.robinhoksbergen.com/papers/howto_elf.html Manually Creating an ELF Executable] (dead, [https://web.archive.org/web/20140130143820/http://www.robinhoksbergen.com/papers/howto_elf.html link from archive.org]) Detailed guide on how to create ELF binaries from scratch.
* [https://www.youtube.com/playlist?list=PLZCIHSjpQ12woLj0sjsnqDH8yVuXwTy3p Handmade Linux x86 executables] Youtube playlist about Linux x86 executables, explains ELF binary structure
[[Category:ABI]]
[[Category:Executable Formats]]