Brendan's Multi-tasking Tutorial

Once you've decided how your scheduler will work, you should only need to modify your "schedule()" and "unblock_task()" functions to make it happen.


==Adding User-Space Support==

Sooner or later you're probably going to want user-space tasks (e.g. processes and threads). This doesn't have much to do with the scheduler itself - it just means that while a kernel task is running it can switch between kernel and user-space where appropriate. It's important to understand that when the CPU is running user-space code, something (system call, IRQ, exception, ...) has to happen to cause the CPU to switch to kernel code, and only after kernel code is already running can the kernel decide to do a task switch. Because kernel code is always running before a task switch happens, kernel code will also be running after a task switch happens. In other words, task switches only ever happen between tasks running kernel code and other tasks running kernel code, and never involve a task running user-space code. This also means that the "switch_to_task()" code described in this tutorial (which is very fast because it doesn't save or load much) doesn't need to be changed when you add user-space support (at least until you start supporting things like FPU, MMX, SSE and AVX).
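
To illustrate, here's a minimal sketch of a blocking system call handler. Everything except the idea is an assumption for this example: "SYSCALL_READ" and "WAITING_FOR_DATA" are illustrative names, and "block_task()" is assumed to block the current task and call "schedule()" internally (as described earlier in the tutorial):

<source lang="c">
#include <stdint.h>

#define SYSCALL_READ      0    /* illustrative syscall number (assumption) */
#define WAITING_FOR_DATA  1    /* illustrative "reason for blocking" (assumption) */

extern void block_task(int reason);   /* blocks current task, calls schedule() */

/* Called by an assembly stub after the CPU has already switched from
   user-space to kernel code (and to the task's kernel stack) */
void syscall_handler(uint64_t number) {
    if (number == SYSCALL_READ) {
        /* We're running kernel code now, so a task switch is allowed;
           block_task() won't return until unblock_task() makes this
           task run again - still in kernel code */
        block_task(WAITING_FOR_DATA);
    }
    /* The assembly stub's IRET returns the current task to user-space */
}
</source>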

Typically, the main difficulty of adding user-space support is that you need to create a new virtual address space, have an executable loader (which might start the executable file directly, or might start a "user-space executable loader" that starts the executable file), create a new "first task for the process" that includes a user-space stack, etc. I prefer to create a kernel task first and then do all this other work (to set up the process) from within that kernel task, because this works a lot better with task priorities: if a very high priority task creates a very low priority process, then most of the work needed to set up the low priority process is done by a low priority task, and the high priority task can continue without waiting for all that work.
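
To make the "kernel task first" idea concrete, here's a rough sketch. The idea is from the paragraph above; all the helper names ("create_kernel_task()", "create_address_space()", "load_executable()", "create_user_stack()", "enter_user_space()") are hypothetical:

<source lang="c">
#include <stddef.h>

/* Hypothetical helpers - not part of the tutorial's code */
extern int   create_kernel_task(void (*entry)(void *), void *arg, int priority);
extern void  create_address_space(void);
extern void *load_executable(const char *path);
extern void *create_user_stack(void);
extern void  enter_user_space(void *entry, void *user_stack);
extern void *kmalloc(size_t size);
extern void  kfree(void *ptr);

typedef struct {
    const char *path;     /* executable for the new process */
} spawn_request_t;

/* Runs as the new process' first (kernel) task, at the new process'
   priority - so a high priority caller doesn't do this work itself */
static void process_startup(void *arg) {
    spawn_request_t *request = arg;
    create_address_space();                   /* new virtual address space */
    void *entry = load_executable(request->path);
    void *stack = create_user_stack();
    kfree(request);
    enter_user_space(entry, stack);           /* first switch to user-space */
}

int spawn_process(const char *path, int priority) {
    spawn_request_t *request = kmalloc(sizeof(*request));
    if (request == NULL) {
        return -1;
    }
    request->path = path;
    /* Create the kernel task first; it does the remaining setup itself */
    return create_kernel_task(process_startup, request, priority);
}
</source>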

Note: There are two common ways that processes can be started - spawning a new process from nothing, and "fork()". Forking is horribly inefficient and far more complicated, as you need to clone an existing process (typically including making all of the existing process' pages "copy on write") and then reverse all that work very soon after, when the new child process calls "exec()". For beginners I recommend not having "fork()" and just implementing a "spawn_process()". (For people who are not beginners I also recommend not having "fork()", because it's awful and is a risk for security problems - e.g. malicious code using "fork()" and then providing its own implementation of "exec()" designed to inject malware into the new child process.)

Once you have the ability to start a new process (including starting an initial task/thread for that process), adding a way to spawn a new task/thread for an existing process should be easy.
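
A sketch of why this is easy (all types and helpers here are hypothetical; "new_task_state()" is assumed to work like the kernel-task version from earlier in the tutorial):

<source lang="c">
/* Hypothetical types/helpers for illustration only */
typedef struct { void *address_space; } process_t;
typedef struct thread {
    void *address_space;
    void *user_stack;
    /* ...registers, kernel stack, etc. (as for kernel tasks)... */
} thread_t;

extern thread_t *new_task_state(void (*entry)(void), int priority);
extern void     *create_user_stack(void);
extern void      ready_task(thread_t *task);

/* Spawning a thread reuses the process' existing address space instead
   of creating a new one - that's most of the difference */
thread_t *spawn_thread(process_t *process, void (*entry)(void), int priority) {
    thread_t *task = new_task_state(entry, priority);  /* like a kernel task */
    task->address_space = process->address_space;      /* share, don't create */
    task->user_stack = create_user_stack();            /* per-thread user stack */
    ready_task(task);
    return task;
}
</source>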


==Adding FPU/MMX/SSE/AVX Support (80x86 only)==

Originally (for single-CPU) Intel designed the FPU so that an OS can avoid saving/loading FPU state during task switches. The general idea was to keep track of an "FPU owner" and use a flag ("TS" in the CR0 control register) to indicate when the currently running task isn't the FPU owner. If the CPU executes an instruction that uses the FPU but the current task isn't the FPU owner, then the CPU raises a "device not available" exception (because "TS" is set), and the exception handler saves the FPU state (belonging to a different task) and loads the FPU state for the currently running task. This can (in some cases) improve performance a lot - for example, if you have 100 tasks running where only one uses the FPU, you'd never need to save or load FPU state. Intel continued this original idea when newer extensions (MMX, SSE, AVX) were added (although for AVX the implementation is significantly different).
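
Here's a sketch of this "FPU owner" scheme for single-CPU. The names are illustrative: "current_task" stands for whatever your kernel uses to find the currently running task's data, and each task is assumed to have a 16-byte aligned 512-byte "fpu_state" area for FXSAVE:

<source lang="c">
#include <stddef.h>

typedef struct thread {
    unsigned char fpu_state[512] __attribute__((aligned(16)));
    /* ...everything else in the task's data... */
} thread_t;

extern thread_t *current_task;     /* assumed per-CPU variable */
static thread_t *fpu_owner = NULL; /* whose state is in the FPU right now */

static inline void fxsave(void *area)  { __asm__ volatile("fxsave %0"  : "=m"(*(char (*)[512])area)); }
static inline void fxrstor(void *area) { __asm__ volatile("fxrstor %0" : : "m"(*(char (*)[512])area)); }
static inline void clear_ts(void)      { __asm__ volatile("clts"); }
static inline void set_ts(void) {
    unsigned long cr0;
    __asm__ volatile("mov %%cr0, %0" : "=r"(cr0));
    __asm__ volatile("mov %0, %%cr0" : : "r"(cr0 | (1UL << 3)));  /* CR0.TS is bit 3 */
}

/* "Device not available" exception handler (vector 7) - raised when an
   FPU/MMX/SSE instruction is executed while CR0.TS is set */
void device_not_available_handler(void) {
    clear_ts();                          /* allow FPU instructions again */
    if (fpu_owner != NULL) {
        fxsave(fpu_owner->fpu_state);    /* save the old owner's state */
    }
    fxrstor(current_task->fpu_state);    /* load this task's state */
    fpu_owner = current_task;
    /* returning retries the faulting instruction, which now succeeds */
}

/* During a task switch, only set TS if the next task doesn't own the FPU */
void switch_fpu_lazy(thread_t *next) {
    if (next == fpu_owner) { clear_ts(); } else { set_ts(); }
}
</source>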

However, when almost all tasks are using FPU/MMX/SSE/AVX state, it makes performance worse (due to the extra cost of the inevitable exceptions); and it doesn't work well for multi-CPU (where the currently running task's FPU state may still be in a completely different CPU's registers).

The simplest solution is to always save and load FPU/MMX/SSE/AVX state during all task switches (and that would be a recommended starting point). There are also multiple more complex (and more efficient) ways of doing it, including: always saving the previous task's FPU/MMX/SSE/AVX state during task switches if you know it was used, but still postponing the work of loading the next task's state until the task actually needs it; and keeping track of "never used, sometimes used, always used" per task, and pre-loading the FPU/MMX/SSE/AVX state when you know a task always uses it (or uses it often enough).
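
For the simple approach, the only change is an unconditional save/load during the task switch. A sketch, reusing the "fxsave()"/"fxrstor()" wrappers and the per-task "fpu_state" area from the previous sketch (for AVX you'd need XSAVE/XRSTOR and a larger save area instead):

<source lang="c">
/* Call this from switch_to_task(), before the general registers are swapped */
void switch_fpu_state(thread_t *previous, thread_t *next) {
    fxsave(previous->fpu_state);    /* always save the outgoing task's state */
    fxrstor(next->fpu_state);       /* always load the incoming task's state */
}
</source>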