Paging: Difference between revisions

general improvement
(Fixed some formatting, organization and content issues.)
(general improvement)
Line 7:
Once an Operating System has paging, it can also make use of other benefits and workarounds, such as linear framebuffer simulation for memory-mapped IO and paging out to disk, where disk storage space is used to free up physical RAM.
 
== MMU ==
 
Paging is achieved through the use of the MMU (temporary: [[MMU|article 1]], [[Memory Management Unit|article 2]]). On the x86, the MMU maps memory through a series of tables, two to be exact: the page directory (PD) and the page table (PT).
Line 13:
Both tables contain 1024 4-byte entries, making them 4 KiB each. In the page directory, each entry points to a page table. In the page table, each entry points to a physical address that is then mapped to the virtual address found by calculating the offset within the directory and the offset within the table. This can be done as the entire table system represents a linear 4-GiB virtual memory map.
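
As a rough illustration of the two offsets described above, the following sketch splits a 32-bit virtual address into its directory index, table index and byte offset (10 + 10 + 12 bits). The names <code>vaddr_parts</code>, <code>split_vaddr</code> and <code>join_vaddr</code> are made up for this example and are not part of any existing API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the three fields of a 32-bit virtual address
   when 4-KiB pages are used. */
struct vaddr_parts {
    uint32_t pd_index;  /* bits 31..22: entry within the page directory */
    uint32_t pt_index;  /* bits 21..12: entry within the page table */
    uint32_t offset;    /* bits 11..0:  byte within the 4-KiB page */
};

/* Split a virtual address into its paging fields. */
struct vaddr_parts split_vaddr(uint32_t vaddr)
{
    struct vaddr_parts p;
    p.pd_index = vaddr >> 22;
    p.pt_index = (vaddr >> 12) & 0x3FFu;
    p.offset   = vaddr & 0xFFFu;
    return p;
}

/* Rebuild the virtual address from its fields. */
uint32_t join_vaddr(struct vaddr_parts p)
{
    assert(p.pd_index < 1024 && p.pt_index < 1024 && p.offset < 4096);
    return (p.pd_index << 22) | (p.pt_index << 12) | p.offset;
}
```

For instance, <code>split_vaddr(0xC0000000)</code> yields directory index 768, table index 0 and offset 0.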
 
=== Page Directory ===
The topmost paging structure is the page directory. It is essentially an array of page directory entries that take the following form.
 
Line 33:
'''Note: With 4-MiB pages, bits 21 through 12 are reserved!''' Thus, the physical address must also be 4-MiB-aligned.
 
=== Page Table ===
[[Image:Page table.png|frame|A Page Table Entry]]
 
Line 40:
''Note: Only explanations of the bits unique to the page table are below.''
 
The first item is, once again, a 4-KiB-aligned physical address. Unlike previously, however, the address is not that of a page table, but instead that of a 4-KiB block of physical memory that is then mapped to that location in the page table and directory.
 
The Global flag ('G' above), if set, prevents the [[TLB]] from updating the address in its cache if CR3 is reset. Note that the page global enable bit in CR4 must be set to enable this feature.
Line 48:
The 'C' bit is the same as the 'D' bit above.
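
A page table entry is simply the frame address combined with its flag bits. The sketch below shows one possible way to build and take apart such an entry; the macro and function names are invented for illustration, and only a few of the flags are listed.

```c
#include <assert.h>
#include <stdint.h>

/* A few flag bits of a 32-bit page table entry (names are hypothetical). */
#define PTE_PRESENT  0x001u  /* bit 0 */
#define PTE_WRITABLE 0x002u  /* bit 1 */
#define PTE_USER     0x004u  /* bit 2 */
#define PTE_GLOBAL   0x100u  /* bit 8, only honoured when CR4.PGE is set */

/* Combine a 4-KiB-aligned physical frame address with flag bits. */
uint32_t make_pte(uint32_t frame_addr, uint32_t flags)
{
    assert((frame_addr & 0xFFFu) == 0);  /* frame must be 4-KiB aligned */
    assert((flags & ~0xFFFu) == 0);      /* flags live in the low 12 bits */
    return frame_addr | flags;
}

/* Extract the physical frame address from an entry. */
uint32_t pte_frame(uint32_t pte)
{
    return pte & 0xFFFFF000u;
}
```

Mapping the frame at 0x100000 as present and writable would give the entry value 0x00100003.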
 
=== Example ===
 
Say I loaded my kernel to 0x100000. However, I want it mapped to 0xC0000000. After loading my kernel, I initiate paging and set up the appropriate tables (see [[Higher Half Kernel]]). After [[Identity Paging]] the first megabyte, I start to create my second table (i.e. at entry #768 in my directory) to map 0x100000 to 0xC0000000. My code could be like: <br /><br />
 
<source lang="asm">
Line 67:
</source>
 
== Enabling ==
Enabling paging is actually very simple. All that is needed is to load CR3 with the address of the page directory and to set the paging bit of CR0.
 
Line 89:
</source>
 
== Physical Address Extension ==
All Intel processors since the Pentium Pro (with the exception of the Pentium M at 400 MHz) and all AMD processors since the Athlon series implement the [[PAE|Physical Address Extension]] (PAE). This feature allows you to access up to 64 GiB (2^36 bytes) of RAM. You can check for this feature using CPUID. Once checked, you can activate it by setting bit 5 in CR4. Once active, the CR3 register points to a table of four 64-bit entries, each one pointing to a page directory of 4096 bytes (like in normal paging) divided into 512 64-bit entries, each pointing to a 4096-byte page table divided into 512 64-bit page entries.
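
With PAE and 4-KiB pages, the table sizes above imply that a 32-bit virtual address is split into 2 + 9 + 9 + 12 bits. The following is a minimal sketch of that split; the structure and function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* 4 directory pointers x 512 directory entries x 512 table entries x 4 KiB
   still cover exactly the 4-GiB virtual address space. */
static_assert(4ull * 512 * 512 * 4096 == 1ull << 32, "PAE covers 4 GiB");

/* Hypothetical helper type for the fields of a virtual address under PAE. */
struct pae_parts {
    uint32_t pdpt_index;  /* bits 31..30: one of the four entries CR3 points to */
    uint32_t pd_index;    /* bits 29..21: entry within the page directory */
    uint32_t pt_index;    /* bits 20..12: entry within the page table */
    uint32_t offset;      /* bits 11..0:  byte within the 4-KiB page */
};

struct pae_parts split_vaddr_pae(uint32_t vaddr)
{
    struct pae_parts p;
    p.pdpt_index = vaddr >> 30;
    p.pd_index   = (vaddr >> 21) & 0x1FFu;
    p.pt_index   = (vaddr >> 12) & 0x1FFu;
    p.offset     = vaddr & 0xFFFu;
    return p;
}
```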
 
== Usage ==
Due to the simplicity in the design of paging, it has many uses.
 
=== Virtual Address Spaces ===
In a paged system, each process may execute in its own 4 GiB area of memory, without any chance of affecting any other process's memory, or the kernel's.
[[Image:Virtual memory.png|frame|none|paging illustrated: two processes with different views of the same physical memory]]
 
=== Virtual Memory ===
Because paging allows for the dynamic handling of unallocated page tables, an OS can swap entire pages that are not currently in use to the hard drive, where they wait until they are needed. In the meantime, the physical memory that they were using can be used elsewhere. In this way, the OS can make programs appear to have more RAM than is physically present.
 
''More...''
 
== Manipulation ==
 
The CR3 value, that is, the value containing the address of the page directory, is a physical address. Once the computer is in paging mode and only recognizes virtual addresses mapped in the paging tables, how can the tables be edited and dynamically changed?
 
Many prefer to map the last PDE to itself. The page directory will then look like a page table to the system. Getting the physical address of any virtual address in the range 0x00000000-0xFFFFF000 is then just a matter of:
 
<source lang="C">
void * get_physaddr(void * virtualaddr)
Line 126 ⟶ 127:
 
Mapping a virtual address to a physical address can be done as follows:
 
<source lang="C">
void map_page(void * physaddr, void * virtualaddr, unsigned int flags)
Line 149 ⟶ 151:
}
</source>
 
Unmapping an entry is essentially the same as above, but instead of assigning the <code>pt[ptindex]</code> a value, you set it to 0x00000000 (i.e. not present). When the entire page table is empty, you may want to remove it and mark the page directory entry 'not present'. Of course you don't need the 'flags' or 'physaddr' for unmapping.
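
When the last PDE is mapped to the page directory itself, the paging structures appear at fixed virtual addresses in the top 4 MiB of the address space. The sketch below only computes those addresses as plain integers so that it can be tried outside a kernel; the function names are hypothetical, and it assumes the self-mapping uses entry 1023.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: page directory entry 1023 points back at the page directory. */
#define RECURSIVE_BASE 0xFFC00000u  /* start of the last 4 MiB */
#define RECURSIVE_PD   0xFFFFF000u  /* the page directory itself */

/* The "page table" reached through index 1023 is the page directory. */
static_assert(RECURSIVE_BASE + 1023u * 0x1000u == RECURSIVE_PD,
              "self-map places the directory in the last page");

/* Virtual address of the page table that covers vaddr. */
uint32_t recursive_pt_addr(uint32_t vaddr)
{
    return RECURSIVE_BASE + (vaddr >> 22) * 0x1000u;
}

/* Virtual address of the 4-byte page directory entry for vaddr. */
uint32_t recursive_pde_addr(uint32_t vaddr)
{
    return RECURSIVE_PD + (vaddr >> 22) * 4u;
}

/* Virtual address of the 4-byte page table entry for vaddr. */
uint32_t recursive_pte_addr(uint32_t vaddr)
{
    return RECURSIVE_BASE + (vaddr >> 12) * 4u;
}
```

In a kernel these values would be cast to pointers to 32-bit entries; reading a PTE this way is only valid if the corresponding PDE is present.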
 
== Page Faults ==
A page fault exception is raised when a process tries to access an area of virtual memory that is not mapped to any physical memory, when a write is attempted on a read-only page, when a PTE or PDE with a reserved bit set is accessed, or when permissions are inadequate.
 
=== Handling ===
The CPU pushes an error code on the stack before firing a page fault exception. The exception handler must analyze the error code to determine how to handle the exception. On the original 80386 only the bottom 3 bits of the error code are used and bits 3-31 are reserved; later processors define additional bits, such as bit 3 for reserved-bit violations.
Bit 0 (P) is the Present flag.
Line 172 ⟶ 175:
When the CPU fires a page-not-present exception, the CR2 register is populated with the linear address that caused the exception. The upper 10 bits specify the page directory entry (PDE) and the middle 10 bits specify the page table entry (PTE). First check the PDE and see if its present bit is set. If it is not, set up a page table, point the PDE to the base address of the page table, set the present bit and iretd. If the PDE is present, then the present bit of the PTE will be cleared. In that case you'll need to map some physical memory in the page table, set the present bit and then iretd to continue processing.
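
The decoding a handler performs on the error code and CR2 can be sketched as follows. This assumes the usual meaning of the low three error code bits (bit 0 present, bit 1 write, bit 2 user mode); the structure and function names are made up for this example, and a real handler would obtain <code>error_code</code> from the stack and <code>cr2</code> from the register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of a page fault. */
struct pf_info {
    bool present;       /* bit 0: set = protection violation, clear = not present */
    bool write;         /* bit 1: set = caused by a write, clear = by a read */
    bool user;          /* bit 2: set = fault occurred in user mode */
    uint32_t pd_index;  /* upper 10 bits of CR2 */
    uint32_t pt_index;  /* middle 10 bits of CR2 */
};

struct pf_info decode_page_fault(uint32_t error_code, uint32_t cr2)
{
    struct pf_info f;
    f.present  = (error_code & 0x1u) != 0;
    f.write    = (error_code & 0x2u) != 0;
    f.user     = (error_code & 0x4u) != 0;
    f.pd_index = cr2 >> 22;
    f.pt_index = (cr2 >> 12) & 0x3FFu;
    return f;
}

/* Example: a user-mode write to the unmapped address 0xC0001234. */
void page_fault_example(void)
{
    struct pf_info f = decode_page_fault(0x6u, 0xC0001234u);
    assert(!f.present && f.write && f.user);
    assert(f.pd_index == 768 && f.pt_index == 1);
}
```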
 
== INVLPG ==
 
INVLPG is an instruction available since the i486 that invalidates the TLB entry for a single page. Intel notes that this instruction may be implemented differently on future processors, but that this alternate behavior must be explicitly enabled. INVLPG modifies no flags.
 
NASM example:
 
<source lang="asm">
invlpg [0]
</source>
 
Inline assembly for GCC (from Linux kernel source):
 
<source lang="C">
static inline void __native_flush_tlb_single(unsigned long addr)
Line 189 ⟶ 194:
</source>
 
== Paging Tricks ==
The processor always fires a page fault exception when the present bit is cleared in the PDE or PTE, regardless of the rest of the entry. This means the remaining bits of a not-present PTE or PDE can be used to record where the page was saved on mass storage, so that it can be found and loaded quickly. When a page gets swapped to disk, store its location in the paging file in the entry and set the present bit to 0.
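
One possible encoding of this trick is sketched below. The layout (bit 0 clear, swap slot number in bits 31..1, slot 0 reserved so that an all-zero entry still means "never mapped") is purely an assumption for this example, as are the function names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT 0x1u

/* Build a not-present entry that remembers a slot in the paging file.
   Hypothetical layout: bit 0 = 0, bits 31..1 = slot number (never 0). */
uint32_t make_swapped_pte(uint32_t slot)
{
    assert(slot != 0 && slot < (1u << 31));
    return slot << 1;
}

/* True if the entry is not present but refers to a swapped-out page. */
bool pte_is_swapped(uint32_t pte)
{
    return (pte & PTE_PRESENT) == 0 && pte != 0;
}

/* Recover the paging-file slot from a swapped-out entry. */
uint32_t swapped_pte_slot(uint32_t pte)
{
    assert(pte_is_swapped(pte));
    return pte >> 1;
}
```

A page fault handler could then check <code>pte_is_swapped()</code> on the faulting entry, read the page back from the recorded slot, and replace the entry with a normal present mapping.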
 
== See Also ==
 
=== Articles ===
*[[Identity Paging]]
*[[Page Frame Allocation]]
Line 199 ⟶ 205:
*[[Page Tables]]
 
=== External Links ===
*[http://forum.osdev.org/viewtopic.php?f=1&t=18222 INVLPG thread]
*[http://www.dumaisnet.ca/index.php?article=ff3b7adb128cb438ac1e306b3fbe37e7 Process Context ID]
Anonymous user