Anonymous user
Context Switching: Difference between revisions
m
minor grammar correction
[unchecked revision] | [unchecked revision] |
No edit summary |
m (minor grammar correction) |
||
(17 intermediate revisions by 14 users not shown) | |||
Line 9:
==Software Context Switching==
Software context switching can be used on all CPUs, and can be used to save and reload only the state that needs to be changed. The basic idea is to provide a function that saves the current [[Stack|stack pointer]] (ESP) and reloads a new stack pointer (SS:ESP). When the function is called EIP would be stored on the old stack and a new EIP would be popped off the new stack when the function returns. Of course the operating system usually needs to change much more than just the stack and EIP.
Note how preemption occurs in an interrupt handler. If your OS saves register state for interrupt handlers, then if you stick a setjmp / longjmp in your scheduler, you can jump into the interrupt handler of the IRQ that preempted the process you're switching to. Then just return.
Eflags, the general registers and any data segment registers should also be pushed on the old stack and popped off the new stack. If the paging structures need to be changed, CR3 will also need to be reloaded.
Line 15 ⟶ 17:
The FPU/MMX and SSE state could be saved and reloaded, but the CPU can also be tricked into generating an exception the first time that an FPU/MMX or SSE instruction is used by copying the hardware context switch mechanism (setting the TS flag in CR0).
===Details===
If a context switch also entails a change in IO port permissions, a different TSS may be loaded for each [[wikipedia:Process_(computing)|Process]]. When running virtual 8086 tasks, the IO permission map in the TSS isn't checked to provide I/O port protection. IO protection can be implemented by setting the IO Permission Level to 0. This will generate a [[Exceptions#General_Protection_Fault|General Protection Fault]] when a process in ring 3 attempts to write to or read from an IO port. The GP fault handler can then check permissions and carry out the port IO on behalf of the user-mode code.
===Other Possibilities===
During a context switch the operating system can do additional work that isn't strictly part of the context switch. One common thing is calculating the amount of time the last thread/task/process used so that software (and the end user) can determine where all the CPU time is going. Another possibility would be dynamically changing thread/task/process
===
Translating a virtual address to a physical address is expensive. The processor must access the pages table structures, which usually have 3-4 levels. Thus, a single memory access actually requires 4-5 memory accesses.
To mitigate this issue, most modern processors cache virtual-to-physical translations in a translation lookaside buffer ([[TLB]]). The TLB is part of the MMU and is (mostly) transparent to the system developer and users.
▲During a context switch the operating system can do additional work that isn't strictly part of the context switch. One common thing is calculating the amount of time the last thread/task/process used so that software (and the end user) can determine where all the CPU time is going. Another possibility would be dynamically changing thread/task/process priorites.
When virtual memory is updated -- for instance, when one process's address space is replaced with another's during a software context switch -- the TLB suddenly contains "stale" translations that are no longer valid. These translations must be flushed for correct behavior. Writing to CR3 will flush the TLB. However, by writing to CR3, you also eliminate all translations for the kernel, in addition to the last user process. This is less than ideal, as the next few operations must wait for the slow virtual-to-physical translations.
Recent Intel and AMD processors sport a tagged TLB, which allow you to tag a given translation with a certain address space configuration. In this scheme TLB entries never get "stale", and thus there is no need to flush the TLB.
==Hardware Context Switching==
Line 36 ⟶ 43:
The hardware context switching mechanism (called Hardware Task Switching in the CPU manuals) can be used to change all of the CPU's state except for the FPU/MMX and SSE state. To use the hardware mechanism you need to tell the CPU where to save the existing CPU state, and where to load the new CPU state. The CPU state is always stored in a special data structure called a TSS (Task State Segment).
To trigger a context switch and tell the CPU where to load
The CPU has a register called the "TR" (or Task Register) which tells which TSS will receive the old CPU state. When the TR register is loaded with an "
===A step further with Hardware Switches ...===
Line 44 ⟶ 51:
In addition to the CALL and JMP instructions, a context switch can be triggered by a using a Task-Gate Descriptor. Unlike TSS Descriptors, task-gate descriptors can be in the GDT, LDT or IDT. Normally, task-gate descriptors are used in the IDT, so that an exception (or IRQ) can cause a context switch, which is the only way of handling a double fault exception with complete reliability.
The design of the basic hardware mechanism is limited by the number of usable entries in the GDT because TSS descriptors can be in the GDT only (theoretical limit is 8190 tasks). However, it is possible to avoid this restriction by dynamically changing TSS descriptor/s, by setting the TSS descriptor's base before each context switch. Care must be taken when using this approach when task-gate descriptors in the IDT are also used (the TSS descriptors
If the FPU/MMX and SSE state also needs to be changed during a context switch there are a few options. The data could be explicitly saved by any code that causes a context switch, or the CPU can generate an exception the first time an FPU/MMX or SSE instruction is used. With the second option, the exception handlers would save the old FPU/MMX/SSE state and reload the new state. This option may prevent this data from being changed when it's not necessary (for e.g. when no tasks or only one task is using them), but fails to work correctly in a
===Performance Considerations===
Because the hardware mechanism saves almost all of the CPU state it can be slower than is necessary. For example, when the CPU loads new segment registers it does all of the access and permission checks that are involved. As most modern operating systems don't use segmentation, loading the segment registers during context switches may be not be required, so for performance reasons these operating systems tend not to use the hardware context switching mechanism. Due to it not being used as much CPU manufacturers don't
[[Category:Processes and Threads]]
[[Category:Multitasking]]
|