Page Tables

Latest revision as of 20:27, 10 July 2023

The page tables (or page map levels) are used to map each virtual page to a corresponding physical page. Zero or more virtual pages can correspond to the same physical page. The size of a page depends on the processor mode (protected, compatibility or long mode), the extensions used (e.g. PAE) and the virtual address bits supported by the processor (most current AMD64 processors support 48-bit virtual addresses).

Determining your page size

To determine the ideal page size, you can't just look at the overhead of the paging structures alone.

Typically (for user space), each process has at least 3 areas with different characteristics: executable, read-only no-execute and read/write no-execute. When paging is used to enforce this, you end up with padding from the actual end of each area up to the next page boundary. You can assume that on average this padding will be 50% of the page size. For example, with 4096-byte pages and 3 areas you'd expect 6 KiB of RAM wasted per process due to padding, and with 2 MiB pages you'd expect 3 MiB of RAM wasted per process.

Typically (for OSs like Windows and Linux) there are around 50 processes, where most use very little RAM and some use lots (X, browser, etc). If you assume 50 processes that use an average of 10 MiB of RAM each, then (for 4 KiB pages in long mode) each process (on average) would use a PML4, PDPT, PD and 5 page tables, or 32 KiB for all paging structures. For 2 MiB pages each process (on average) would use a PML4, PDPT and a PD, or 12 KiB for all paging structures.

That gives us some rough figures for comparison. For 50 processes where each process is an average of 10 MiB and has 3 different areas:

  • 4 KiB paging will cost about 6 KiB for padding and 32 KiB for paging structures, or about 38 KiB of overhead per process
  • 2 MiB paging will cost about 3 MiB for padding and 12 KiB for paging structures, or about 3084 KiB of overhead per process
  • 4 KiB paging will cost about 1.86 MiB of overhead for all 50 processes combined
  • 2 MiB paging will cost about 150.6 MiB of overhead for all 50 processes combined
  • With 500 MiB of RAM actually used by all processes; 1.86 MiB of total overhead works out to about 0.37% more, which is almost nothing
  • With 500 MiB of RAM actually used by all processes; 150.6 MiB of total overhead works out to about 30.1% more (about 23.15% of the combined 650.6 MiB), which is massive
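
The figures in the list above can be reproduced with a few lines of arithmetic. The following is only a sketch of that estimate: the function name and its parameters are made up for this illustration, and the inputs (3 areas, half a page of padding per area, 4 KiB per paging structure, 50 processes of 10 MiB each) are the assumptions stated in the text. It prints the overhead both relative to the RAM actually used and relative to used RAM plus overhead, since the two ratios differ noticeably for 2 MiB pages.

```python
KIB = 1024
MIB = 1024 * KIB

def overhead_per_process(page_size, table_count, areas=3, table_size=4 * KIB):
    """Estimated bytes of overhead for one process (hypothetical helper)."""
    padding = areas * page_size // 2        # on average half a page is wasted per area
    structures = table_count * table_size   # every paging structure occupies 4 KiB
    return padding + structures

PROCESSES = 50
USED = PROCESSES * 10 * MIB                 # 500 MiB actually used

# 4 KiB pages: PML4 + PDPT + PD + 5 page tables = 8 structures
# 2 MiB pages: PML4 + PDPT + PD                 = 3 structures
for label, page_size, tables in (("4 KiB", 4 * KIB, 8), ("2 MiB", 2 * MIB, 3)):
    per_process = overhead_per_process(page_size, tables)
    total = PROCESSES * per_process
    print(f"{label} pages: {per_process // KIB} KiB per process, "
          f"{total / MIB:.2f} MiB in total, "
          f"{100 * total / USED:.2f}% relative to used RAM, "
          f"{100 * total / (USED + total):.2f}% of used RAM plus overhead")
```

Changing `PROCESSES`, the average process size or the number of areas shows how sensitive the comparison is to those guesses.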

For performance (e.g. TLB misses) it's much harder to estimate the likely cost, as it depends on how much of the paging structures remain in the cache (and how much has to be fetched from RAM), how large the TLBs are (for both small pages and large pages), whether the CPU caches higher-level structures (modern CPUs do), the working set of each process and its access pattern, how often you switch between processes, etc.

However (for typical OSs with typical loads), I'd assume it's very unlikely that the performance gains you get from using 2 MiB pages are going to justify having roughly 23% of the RAM in use wasted.

Basically, 4 KiB pages (with 4 levels of paging structures) are starting to get a little small, but the next step up (2 MiB pages with 3 levels of paging structures) is far too big to be practical for most things.

To reduce the number of levels of paging structures, a better idea would be to also increase the size of page directories, PDPTs, etc. For example, for 55-bit virtual addresses you could have 64 KiB pages, 64 KiB page tables, 64 KiB page directories and 64 KiB PDPTs. Unfortunately we have to wait for Intel (or AMD) to do something like that though.

~ Brendan

Luckily, though, you can freely mix 4 KiB, 2 MiB and 1 GiB pages; you don't have to use a uniform page size for everything. A process with 9 MiB of data can get four 2 MiB pages and make up the remaining 1 MiB with 4 KiB pages. That saves paging structures, improves TLB usage and does not increase the padding overhead at all.
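
As a rough illustration of mixing page sizes, the sketch below greedily covers a region with the largest pages that fit. The function name is invented for this example, and it deliberately ignores alignment: on real hardware a large page must start on a boundary of its own size, both virtually and physically, so an allocator would also have to account for the start address.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

PAGE_SIZES = (GIB, 2 * MIB, 4 * KIB)   # largest first

def split_into_pages(length):
    """Cover `length` bytes greedily; returns {page_size: page_count}."""
    # Round up to a whole number of 4 KiB pages before splitting.
    remaining = -(-length // (4 * KIB)) * (4 * KIB)
    counts = {}
    for size in PAGE_SIZES:
        counts[size], remaining = divmod(remaining, size)
    return counts

layout = split_into_pages(9 * MIB)
print(layout[2 * MIB], "pages of 2 MiB and", layout[4 * KIB], "pages of 4 KiB")
```

For 9 MiB this yields four 2 MiB pages plus 256 pages of 4 KiB, so a single page table is enough for the tail instead of the five page tables needed when everything uses 4 KiB pages.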

Recursive mapping

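To make it easier to change the current address space's page map, you can make one entry of the highest page map level point to that page map itself. The paging structures then show up as ordinary memory at fixed virtual addresses.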
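
To see why a self-referencing entry exposes the paging structures, it can help to simulate the table walk. The sketch below models 32-bit non-PAE paging with 4 KiB pages; the frame numbers, the `memory` dictionary and the function names are all hypothetical and only stand in for physical memory and the MMU.

```python
ENTRIES = 1024      # entries per table in 32-bit non-PAE paging
PRESENT = 0x1

# Hypothetical physical memory: frame number -> table contents.
PD_FRAME, PT_FRAME, DATA_FRAME = 0x10, 0x11, 0x12
memory = {PD_FRAME: [0] * ENTRIES, PT_FRAME: [0] * ENTRIES}

def make_entry(frame):
    """Build a present entry pointing at the given physical frame."""
    return (frame << 12) | PRESENT

memory[PD_FRAME][0] = make_entry(PT_FRAME)             # PD[0]    -> page table
memory[PT_FRAME][5] = make_entry(DATA_FRAME)           # PT[5]    -> data page
memory[PD_FRAME][ENTRIES - 1] = make_entry(PD_FRAME)   # PD[1023] -> PD itself

def translate(vaddr, cr3_frame=PD_FRAME):
    """Walk the PD and then the PT the way the MMU would."""
    pd_index = (vaddr >> 22) & 0x3FF
    pt_index = (vaddr >> 12) & 0x3FF
    pde = memory[cr3_frame][pd_index]
    assert pde & PRESENT, "page directory entry not present"
    pte = memory[pde >> 12][pt_index]   # with pd_index 1023 this is the PD again
    assert pte & PRESENT, "page table entry not present"
    return ((pte >> 12) << 12) | (vaddr & 0xFFF)

print(hex(translate(0x00005123)))   # ordinary access: lands in the data frame
print(hex(translate(0xFFC00000)))   # physical address of the page table behind PD[0]
print(hex(translate(0xFFFFF000)))   # physical address of the page directory itself
```

Because the walk passes through the page directory twice for addresses in the last 4 MiB, the "page" it finally reaches is a page table, or the directory itself when both indices are 1023.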

Recursive mapping wastes some virtual address space. This table shows the relative space used by the recursively mapped page table:

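| Mode | Page size | Max page map size | Used virtual space | Total virtual space | Ratio |
| --- | --- | --- | --- | --- | --- |
| Protected mode (non-PAE) | 4 KiB | 4 MiB | 4 MiB | 4 GiB | 1/1024 (0.1%) |
| Protected mode (non-PAE) | 4 MiB | 4 KiB | 4 MiB | 4 GiB | 1/1024 (0.1%) |
| Protected mode (PAE) | 4 KiB | 8 MiB | 1 GiB | 4 GiB | 1/4 (25%) |
| Protected mode (PAE) | 2 MiB | 16 KiB | 1 GiB | 4 GiB | 1/4 (25%) |
| Long mode (48-bit) | 4 KiB | 512 GiB | 512 GiB | 256 TiB | 1/512 (0.2%) |
| Long mode (48-bit) | 2 MiB | 1 GiB | 512 GiB | 256 TiB | 1/512 (0.2%) |
| Long mode (48-bit) | 1 GiB | 2 MiB | 512 GiB | 256 TiB | 1/512 (0.2%) |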
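
The "Used virtual space" and "Ratio" columns follow from the fact that one entry of the top-level table is given up for the recursive mapping. A small sketch of that arithmetic, with a made-up function name and the top-level entry counts (1024, 4 and 512) taken from the per-mode tables:

```python
from fractions import Fraction

# mode -> (virtual address bits, entries in the top-level table)
MODES = {
    "Protected mode (non-PAE)": (32, 1024),
    "Protected mode (PAE)": (32, 4),
    "Long mode (48-bit)": (48, 512),
}

def recursive_window(address_bits, top_level_entries):
    """Return (bytes consumed by one top-level entry, fraction of the address space)."""
    total = 1 << address_bits
    return total // top_level_entries, Fraction(1, top_level_entries)

for mode, (bits, entries) in MODES.items():
    used, ratio = recursive_window(bits, entries)
    print(f"{mode}: {used:#x} bytes, {ratio} ({float(ratio):.1%})")
```

This makes the PAE case stand out: with only four top-level entries, reserving one of them costs a quarter of the address space.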

The Recursive mapping column in the tables below shows the base address and offset to use to get to a particular page map level when the page map is recursively mapped as the last entry.

Protected/compatibility mode (32-bit) page map

In protected mode, the virtual address space is 32-bit (4 GiB) in size, regardless of whether PAE is enabled or not.

Non-PAE mode

Without enabling PAE, the page map can reference physical pages with addresses up to 32-bit (4 GiB).

4 KiB pages

| Level | Table | Size | Range | Bits | Entries | Pages | Recursive mapping |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | (page) | - | 0x1000 (4 KiB) | 12 bits | - | 0x1 (1) | - |
| 1 | PT | 0x1000 (4 KiB) | 0x40 0000 (4 MiB) | 10 bits | 1024 | 0x400 (1024) | 0xFFC0 0000 + 0x1000 * PDi |
| 2 | PD | 0x1000 (4 KiB) | 0x1 0000 0000 (4 GiB) | 10 bits | 1024 | 0x10 0000 (1048576) | 0xFFFF F000 |
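
The "Bits" and "Recursive mapping" columns translate directly into address arithmetic. The sketch below assumes the recursive entry is the last page directory entry (index 1023), as the table does, and reads PDi as the page directory index of the address being looked up; the function names are hypothetical. Entries are 4 bytes wide in non-PAE mode.

```python
RECURSIVE_BASE = 0xFFC00000   # window created by PD entry 1023
ENTRY_SIZE = 4                # non-PAE entries are 32 bits wide

def split_vaddr(vaddr):
    """Split a 32-bit address into (PD index, PT index, page offset): 10 + 10 + 12 bits."""
    return (vaddr >> 22) & 0x3FF, (vaddr >> 12) & 0x3FF, vaddr & 0xFFF

def page_table_vaddr(pd_index):
    """Virtual address of the page table referenced by PD[pd_index]."""
    return RECURSIVE_BASE + 0x1000 * pd_index

def page_directory_vaddr():
    """The page directory is the 'page table' behind the recursive entry."""
    return page_table_vaddr(1023)

def pte_vaddr(vaddr):
    """Virtual address of the page table entry that maps `vaddr`."""
    pd_index, pt_index, _ = split_vaddr(vaddr)
    return page_table_vaddr(pd_index) + ENTRY_SIZE * pt_index

def pde_vaddr(vaddr):
    """Virtual address of the page directory entry that covers `vaddr`."""
    pd_index, _, _ = split_vaddr(vaddr)
    return page_directory_vaddr() + ENTRY_SIZE * pd_index

print(hex(page_directory_vaddr()))   # 0xfffff000
print(hex(pte_vaddr(0xC0101234)))    # 0xfff00404
```

In a kernel the same arithmetic would typically be done with pointer casts; Python is used here only to keep the numbers easy to check.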

4 MiB pages

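| Level | Table | Size | Range | Bits | Entries | Pages | Recursive mapping |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | (page) | - | 0x40 0000 (4 MiB) | 22 bits | - | 0x1 (1) | - |
| 2 | PD | 0x1000 (4 KiB) | 0x1 0000 0000 (4 GiB) | 10 bits | 1024 | 0x400 (1024) | 0xFFC0 0000 |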

PAE mode

With PAE, the page map can reference physical pages with addresses up to 36-bit (64 GiB).

4 KiB pages

| Level | Table | Size | Range | Bits | Entries | Pages | Recursive mapping |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | (page) | - | 0x1000 (4 KiB) | 12 bits | - | 0x1 (1) | - |
| 1 | PT | 0x1000 (4 KiB) | 0x20 0000 (2 MiB) | 9 bits | 512 | 0x200 (512) | 0xC000 0000 + 0x20 0000 * PDi + 0x1000 * PTi |
| 2 | PD | 0x1000 (4 KiB) | 0x4000 0000 (1 GiB) | 9 bits | 512 | 0x4 0000 (262144) | 0xC060 0000 + 0x1000 * PDi |
| 3 | PDP | 0x20 (32 bytes) | 0x1 0000 0000 (4 GiB) | 2 bits | 4 | 0x10 0000 (1048576) | 0xC060 3000 |
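
The PAE formulas can be written out the same way. This sketch takes the table's addresses at face value and does not check whether a given CPU accepts a self-referencing PDP entry. It assumes that PDi in the table selects one of the four page directories (address bits 30-31) and that PTi selects a page table within that directory (bits 21-29); that reading, and all function names, are assumptions made for this example.

```python
PAE_RECURSIVE_BASE = 0xC0000000   # window created by the last PDP entry (index 3)

def split_vaddr_pae(vaddr):
    """Split a 32-bit address into (PDP index, PD index, PT index, offset): 2 + 9 + 9 + 12 bits."""
    return ((vaddr >> 30) & 0x3, (vaddr >> 21) & 0x1FF,
            (vaddr >> 12) & 0x1FF, vaddr & 0xFFF)

def pae_page_table_vaddr(pdp_index, pd_index):
    """Page table selected by PDP[pdp_index] and then PD[pd_index]."""
    return PAE_RECURSIVE_BASE + 0x200000 * pdp_index + 0x1000 * pd_index

def pae_page_directory_vaddr(pdp_index):
    """Page directory selected by PDP[pdp_index]: 0xC0600000 + 0x1000 * index."""
    return pae_page_table_vaddr(3, pdp_index)

def pae_pdp_vaddr():
    """The PDP itself appears as the 'directory' behind the recursive entry."""
    return pae_page_directory_vaddr(3)

print(hex(pae_page_directory_vaddr(0)))   # 0xc0600000
print(hex(pae_pdp_vaddr()))               # 0xc0603000
```

Each level is reached by going through the recursive entry one more time, which is why the directory and PDP addresses are special cases of the page table formula.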

2 MiB pages

| Level | Table | Size | Range | Bits | Entries | Pages | Recursive mapping |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | (page) | - | 0x20 0000 (2 MiB) | 21 bits | - | 0x1 (1) | - |
| 2 | PD | 0x1000 (4 KiB) | 0x4000 0000 (1 GiB) | 9 bits | 512 | 0x200 (512) | 0xC000 0000 + 0x20 0000 * PDi |
| 3 | PDP | 0x20 (32 bytes) | 0x1 0000 0000 (4 GiB) | 2 bits | 4 | 0x800 (2048) | 0xC060 0000 |


Long mode (64-bit) page map

In long mode, the virtual address space could in theory be 64-bit (16 EiB) in size, but individual processors allow only a portion of that space to be addressed. The currently most common processor implementations allow a 48-bit (256 TiB) address space. For such virtual addresses bits 48-63 must be a copy of bit 47 (similar to sign extension), and this splits the virtual address space into a higher half and a lower half.
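
The rule that bits 48-63 must copy bit 47 can be expressed as a sign extension. A minimal sketch, with hypothetical function names and Python integers standing in for 64-bit registers:

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def canonicalize(vaddr, bits=48):
    """Copy bit (bits - 1) into all higher bits of a 64-bit address."""
    low_mask = (1 << bits) - 1
    vaddr &= low_mask
    if vaddr & (1 << (bits - 1)):
        vaddr |= ~low_mask & MASK64   # set bits 48-63 for the higher half
    return vaddr

def is_canonical(vaddr, bits=48):
    """True if the upper bits already match bit (bits - 1)."""
    return canonicalize(vaddr, bits) == vaddr

print(hex(canonicalize(0x0000FF8000000000)))   # 0xffffff8000000000
print(is_canonical(0x0000800000000000))        # False: bit 47 set, upper bits clear
```

The lower half therefore ends at 0x0000 7FFF FFFF FFFF and the higher half starts at 0xFFFF 8000 0000 0000.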

48-bit virtual address space

4 KiB pages

| Level | Table | Size | Range | Bits | Entries | Pages | Recursive mapping |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | (page) | - | 0x1000 (4 KiB) | 12 bits | - | 0x1 (1) | - |
| 1 | PT | 0x1000 (4 KiB) | 0x20 0000 (2 MiB) | 9 bits | 512 | 0x200 (512) | 0xFFFF FF80 0000 0000 + 0x4000 0000 * PDPi + 0x20 0000 * PDi + 0x1000 * PTi |
| 2 | PD | 0x1000 (4 KiB) | 0x4000 0000 (1 GiB) | 9 bits | 512 | 0x4 0000 (262144) | 0xFFFF FFFF C000 0000 + 0x20 0000 * PDPi + 0x1000 * PDi |
| 3 | PDP | 0x1000 (4 KiB) | 0x80 0000 0000 (512 GiB) | 9 bits | 512 | 0x800 0000 (134217728) | 0xFFFF FFFF FFE0 0000 + 0x1000 * PDPi |
| 4 | PML4 | 0x1000 (4 KiB) | 0x1 0000 0000 0000 (256 TiB) | 9 bits | 512 | 0x10 0000 0000 (68719476736) | 0xFFFF FFFF FFFF F000 |
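
All four base addresses in this table come from the same construction: fill the leading 9-bit index fields with the recursive slot (511), put the path to the wanted table in the remaining fields, and sign-extend bit 47. The sketch below assumes the recursive entry is PML4 entry 511 and reads PDPi, PDi and PTi as the indices that select the PDP, PD and PT along the path; the function name is hypothetical.

```python
RECURSIVE_SLOT = 511
SHIFTS = (39, 30, 21, 12)   # bit positions of the PML4, PDP, PD and PT index fields

def recursive_table_vaddr(*path):
    """Virtual address of the table reached by following `path` down from the PML4.

    ()        -> the PML4 itself
    (a,)      -> the PDP behind PML4[a]
    (a, b)    -> the PD behind PDP[b] of that PDP
    (a, b, c) -> the PT behind PD[c] of that PD
    """
    if len(path) > 3:
        raise ValueError("a path has at most three indices")
    slots = [RECURSIVE_SLOT] * (4 - len(path)) + list(path)
    vaddr = 0
    for index, shift in zip(slots, SHIFTS):
        if not 0 <= index < 512:
            raise ValueError("each index must be in the range 0..511")
        vaddr |= index << shift
    if vaddr & (1 << 47):       # make the address canonical
        vaddr |= 0xFFFF << 48
    return vaddr

print(hex(recursive_table_vaddr()))          # 0xfffffffffffff000
print(hex(recursive_table_vaddr(1, 2, 3)))   # 0xffffff8040403000
```

An entry within a table is then found by adding 8 times its index, since long mode entries are 8 bytes wide.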


2 MiB pages

| Level | Table | Size | Range | Bits | Entries | Pages | Recursive mapping |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | (page) | - | 0x20 0000 (2 MiB) | 21 bits | - | 0x1 (1) | - |
| 2 | PD | 0x1000 (4 KiB) | 0x4000 0000 (1 GiB) | 9 bits | 512 | 0x200 (512) | 0xFFFF FF80 0000 0000 + 0x4000 0000 * PDPi + 0x20 0000 * PDi |
| 3 | PDP | 0x1000 (4 KiB) | 0x80 0000 0000 (512 GiB) | 9 bits | 512 | 0x4 0000 (262144) | 0xFFFF FFFF C000 0000 + 0x20 0000 * PDPi |
| 4 | PML4 | 0x1000 (4 KiB) | 0x1 0000 0000 0000 (256 TiB) | 9 bits | 512 | 0x800 0000 (134217728) | 0xFFFF FFFF FFE0 0000 |


1 GiB pages

| Level | Table | Size | Range | Bits | Entries | Pages | Recursive mapping |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | (page) | - | 0x4000 0000 (1 GiB) | 30 bits | - | 0x1 (1) | - |
| 3 | PDP | 0x1000 (4 KiB) | 0x80 0000 0000 (512 GiB) | 9 bits | 512 | 0x200 (512) | 0xFFFF FF80 0000 0000 + 0x4000 0000 * PDPi |
| 4 | PML4 | 0x1000 (4 KiB) | 0x1 0000 0000 0000 (256 TiB) | 9 bits | 512 | 0x4 0000 (262144) | 0xFFFF FFFF C000 0000 |
