X86-64: Difference between revisions

2,046 bytes added ,  25 days ago
m (add a link to the swapgs article)
m (Fix lint errors)
 
(13 intermediate revisions by 10 users not shown)
Line 20:
=== Further information ===
 
:''This feature overview is incomplete. Please see the [http://en.wikipedia.org/wiki/X86-64 Wikipedia article on x86-64] for more information.''
 
==Setting up==
Line 33:
* Disable paging
* Set the PAE enable bit in CR4
* Load CR3 with the physical address of the PML4 (Page Map Level 4) table
* Enable long mode by setting the EFER.LME flag (bit 8) in MSR 0xC0000080 (aka EFER)
* Enable paging
 
''Reference: [https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3a-part-1-manual.pdf Intel 64 and IA-32 Architectures Software Developer's Manual], Section 9.8.5''
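
The bits touched by the steps above can be written down as constants. The following is only a minimal sketch: the helper names are hypothetical, and actually reading and writing CR0, CR4 and the EFER MSR needs privileged instructions (<code>mov</code> to/from control registers, <code>rdmsr</code>/<code>wrmsr</code>), which are left out here. It only shows which bit each step changes.

```c
#include <stdint.h>

#define CR0_PG    (1u << 31)    /* CR0.PG: paging enable */
#define CR4_PAE   (1u << 5)     /* CR4.PAE: physical address extension */
#define MSR_EFER  0xC0000080u   /* MSR number of EFER */
#define EFER_LME  (1u << 8)     /* EFER.LME: long mode enable */

/* Hypothetical helpers: each takes the current register value and
   returns the value that should be written back for that step. */
static inline uint32_t cr0_disable_paging(uint32_t cr0) { return cr0 & ~CR0_PG; }
static inline uint32_t cr4_enable_pae(uint32_t cr4)     { return cr4 | CR4_PAE; }
static inline uint64_t efer_enable_lme(uint64_t efer)   { return efer | EFER_LME; }
static inline uint32_t cr0_enable_paging(uint32_t cr0)  { return cr0 | CR0_PG; }
```

Loading CR3 with the PML4 address is not shown, since it is a plain register write with no bit manipulation involved.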
 
At this point the CPU is in long mode, but in its compatibility submode, so instructions are still decoded as 32-bit code. To enter 64-bit mode, CS must be loaded (for example with a far jump) with a GDT code segment descriptor whose D/B bit (bit 22 of the second 32-bit value) is clear, as it would be for a 16-bit code segment, and whose L bit (bit 21 of the second 32-bit value) is set. Once that is done, the CPU is in 64-bit long mode.
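
As an illustration of the D/B and L bits described above, here is a minimal sketch that builds a code segment descriptor with L set and D/B clear. The macro and function names are made up for this example, and the base and limit fields are simply left at zero, on the assumption that they are not needed for a flat 64-bit code segment.

```c
#include <stdint.h>

/* Bits of the second (high) 32-bit value of a segment descriptor. */
#define SEG_DB        (1u << 22)  /* D/B: must stay clear for 64-bit code */
#define SEG_L         (1u << 21)  /* L: 64-bit code segment */
#define SEG_PRESENT   (1u << 15)  /* P: segment present */
#define SEG_CODE_DATA (1u << 12)  /* S: code/data (not system) descriptor */
#define SEG_EXEC      (1u << 11)  /* executable (code) segment */
#define SEG_READABLE  (1u << 9)   /* code segment is readable */

/* Hypothetical helper: returns a ring 0, 64-bit code segment descriptor. */
static uint64_t make_code64_descriptor(void)
{
    uint32_t low  = 0;  /* base and limit left at zero */
    uint32_t high = SEG_L | SEG_PRESENT | SEG_CODE_DATA
                  | SEG_EXEC | SEG_READABLE;  /* SEG_DB deliberately not set */
    return ((uint64_t)high << 32) | low;
}
```

The resulting value would be placed in a GDT slot, and CS reloaded with the matching selector.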
Line 163 ⟶ 165:
| LP64
|}
 
=== Text Segment Types ===
 
Another thing to keep in mind is that although the address space (and with it every pointer) is 64 bits wide, the addresses encoded in the generated code in the text segment most likely are not. That is because, by default, gcc emits the ordinary "mov" instruction, which only takes a 32-bit immediate. This means that, by default, the code and static data of a 64-bit program are limited to 2 GB, much like 32-bit programs.
 
If you have ever seen an error message like this:
<syntaxhighlight lang="bash">
relocation truncated to fit: R_X86_64_32 against symbol
</syntaxhighlight>
then your code hit this barrier. In assembly, you must use the "movabs" instruction instead of "mov"; with gcc, you need to select a different code model with the "-mcmodel" argument.
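
To make the barrier concrete, the sketch below mimics the range check the linker applies to 32-bit absolute relocations. The function names are hypothetical; the assumption, based on the System V x86-64 ABI, is that <code>R_X86_64_32</code> requires the value to zero-extend back to the original 64-bit address and <code>R_X86_64_32S</code> requires it to sign-extend back.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr survives truncation to 32 bits and zero-extension,
   i.e. it lies in the low 4 GB (the R_X86_64_32 case). */
static bool fits_r_x86_64_32(uint64_t addr)
{
    return addr <= UINT32_MAX;
}

/* True if addr survives truncation to 32 bits and sign-extension,
   i.e. it lies in the low 2 GB or the top 2 GB (the R_X86_64_32S case). */
static bool fits_r_x86_64_32s(uint64_t addr)
{
    return addr <= 0x7FFFFFFFull || addr >= 0xFFFFFFFF80000000ull;
}
```

For example, an address such as 0xFFFFFFFF80000000 fails the first check but passes the second, which is why code linked in the top 2 GB can still use sign-extended 32-bit immediates.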
 
{| {{wikitable}}
! Flag
! Text Segment Addressing
|-
| -mcmodel=small
| The program and its symbols must be linked in the lower 2 GB of the address space (this is the default model)
|-
| -mcmodel=large
| This model makes no assumptions about addresses and sizes of sections.
|-
| -mcmodel=medium
| The program is linked in the lower 2 GB of the address space. Small symbols are also placed there. Symbols with sizes larger than -mlarge-data-threshold are put into large data or bss sections and can be located above 2 GB.
|-
| -mcmodel=kernel
| The kernel runs in the negative 2 GB (that is, the top 2 GB) of the address space. This model has to be used for Linux kernel code.
|}
It is worth noting that code models differ between architectures, as they are tied to the instruction encoding. For example, AArch64 also has "-mcmodel=tiny", which allows 1 MB addressing and does not exist on x86_64. On AArch64, "-mcmodel=small" has a 4 GB limit, not 2 GB as on x86_64.
 
== See Also ==
Line 168 ⟶ 198:
* [[EM64T|Intel EM64T]]
* [[Creating a 64-bit kernel]]
* [[BOOTBOOT|BOOTBOOT bootloader]]
* [[Limine|Limine bootloader]]
* [[X86-64 Instruction Encoding]]
* [[Setting_Up_Long_Mode|Setting up long mode]]
Line 182 ⟶ 214:
 
[[Category:X86 CPU]]
[[Category:X86-64]]
[[Category:Operating Modes]]