James Molloy's Tutorial Known Bugs: Difference between revisions

m
[unchecked revision][unchecked revision]
m (renamed real mode to user mode)
 
(14 intermediate revisions by 9 users not shown)
Line 1:
Several sources - including this Wiki - point to [http://www.jamesmolloy.co.uk/tutorial_html/ James Molloy's Roll your own toy UNIX-clone OS] Tutorial as a starting point. This is wellfine, but the tutorial has some well-known weak points that cause trouble for people again and again. It's not uncommon that well-established members traced back mysterious bugs to early parts of their operating systems based on this tutorial. Nonetheless, it's one of the best introductory tutorials out there even if it has the occasional landmine. This article is meant to preempt issues arising from following the tutorial and to aid those that have encountered such problems. It is generally recommended to be sceptical of its advise on how to design your kernel and compare its information against this wiki. Some issues are quite subtle and only experts will recognize them.
 
== Before you follow the tutorial ==
Line 13:
== Problem: __cdecl calling convention ==
 
The tutorial states that the <tt>__cdecl</tt> calling convention is used. This is, however, a Windows term. and yourYour cross-compiler uses a similar calling convention but it is called the System V ABI for i386. It is advisable to understand this calling convention in depth, especially parts about how the parameters on the stack are clobbered and how structure parameters are passed. This will be very useful later and will help you avoid a later subtle bug. The function call example in 2.3 neglects to add 12 to esp following the call instruction, andso the three parameters are never thus popped.
 
== Problem: CFLAGS ==
Line 46:
== Problem: Missing functions ==
 
The gccGCC documentation mentions that the <tt>memset</tt>, <tt>memcpy</tt>, <tt>memmove</tt> and <tt>memcmp</tt> functions must always be present. The compiler uses these automatically for certain optimization purposes and even code that doesn't use them can automatically generate calls to them. You should add them at your earliest convenience.
 
== Problem: Interrupt handlers corrupt interrupted state ==
Line 53:
There are two ways around this. The most practical method is to pass the structure as a pointer instead, which allows you to explicitly edit the register state when needed - very useful for system calls, without having the compiler randomly doing it for you. The compiler can still edit the pointer on the stack when it's not specifically needed. The second option is to make another copy the structure and pass that.
 
== Problem: ISR 17 and 3021 have error codes ==
 
The interrupt handling code in the downloadable code have a bug where it handles ISR 17 and 3021 by pushing a fake error code, but the CPU does push error codes here.
 
== Problem: struct registers::esp is useless ==
Line 63:
== Problem: __attribute__((packed)) ==
 
This attribute packs the associated structure. This is useful in a few cases, such as the idtIDT and gdtGDT code (actually just the idtIDT, gdtGDT and tssTSS pointers). However, the tutorial tends to randomly attach it to every struct parameter, even where it isn't even needed. It is only needed where you badly want aligned structure members, it doesn't do anything if all the structure members were already naturally aligned. Otherwise, the compiler will automatically insert gaps between structure members so each begins at its own natural alignment.
 
== Problem: cli and sti in interrupt handlers ==
Line 74:
 
The <tt>kmalloc</tt> function in 6.4.1 only 1-byte aligns or page-aligns its memory address. This means you can only reliably use it allocate memories for chars (size 1), but not any larger types unless you use page-alignment. A proper malloc implementation returns pointers that are aligned such that they are suitable for all the common types, for instance it could be 64-bit (8-byte) aligned. You'll also want to modify the parameters such that it uses <tt>size_t</tt> appropriately rather than <tt>u32int</tt>.
 
Additionally the check if the address is page aligned is wrong.
<nowiki>
if (align == 1 && (placement_address & 0xFFFFF000)) // If the address is not already page-aligned</nowiki>
should be
<nowiki>
if (align == 1 && (placement_address & 0x00000FFF)) // If the address is not already page-aligned</nowiki>
 
== Problem: Paging Code ==
 
The paging code isn't terribly good and it is worth it to fully understand paging and design it all yourself. Paging code tends to be quite ugly, but it'll probably be decent after your fifth design revision. There is no need to always re-enable paging in <tt>switch_page_directory</tt>, it is likely best to have a special function the first time paging is enabled. The inlineInline assemblyAssembly in 6.4.5. doesn't need to be volatile as it is simply reading a memory value, which has no side-effects and it is acceptable if the compiler optimizes it away if the value is never used.
 
== Problem: Heap Code ==
 
It is probably best that you write your own heap implementation.
 
There is an operator precedence bug in <tt>find_smallest_hole()</tt> that will cause bad allocations and memory overwrites if attempting to fork in user mode later on.
 
To fix the problem, change this:
<nowiki>
if (((location+sizeof(header_t)) & 0xFFFFF000) != 0)</nowiki>
 
to this:
<nowiki>
if ((location+sizeof(header_t) & 0xFFFFF000) != 0)</nowiki>
 
See the section on user mode below for more details.
 
== Problem: VFS Code ==
Line 90 ⟶ 109:
 
It's advisable to get a copy of <tt>multiboot.h</tt> from the GRUB source code rather than copied from the tutorial. Beware, the copy in the GRUB documentation is out of date, use one from an official release.
 
 
== Problem: Multitasking ==
Line 95 ⟶ 115:
It is strongly recommended that you write your own implementation of this and disregard the tutorial. The tutorial attempts to implement forking kernel threads by searching for magic values on the stack, which is insanity. If you wish to create a new kernel thread, simply decide which registers it should have and point its stack pointer at its freshly allocated stack. It will then start executing at your desired entry point. The part where it disables paging is bad and you should just map the source and destination physical frames at appropriate virtual addresses and memcpy with paging on at all times. Section 9.3 in particular is insanity and has blown up at least one well-established hobby operating system.
 
 
=== Inline assemblyAssembly optimiser problem with gccGCC 4.8 ===
As mentioned above, writing inlineInline assemblyAssembly can be tricky. The original inlineInline assemblyAssembly is this:
 
<nowiki>
Line 121 ⟶ 142:
10388c: ff e1 jmp *%ecx</nowiki>
 
Note how the eaxEAX register is assigned to the ecxECX register. However, later on the ecxECX register is assigned to ebpEBP register. The reason for this is that the optimizer was using the eaxEAX register to store the eipEIP variable and the ecxECX register to store the ebpEBP variable. This results in the eipEIP variable being assigned to the ecxECX register ''as well as'' the ebpEBP register. This leads to a subsequent '''ret''' statement sending the cpuCPU to some invalid memory location.
 
A way to fix this is to remove the inlineInline assemblyAssembly by, for example adding this to '''process.cs''':
<nowiki>
; Here we:
Line 131 ⟶ 152:
; * Set the base and stack pointers
; * Set the page directory
; * Put a dummy value (0x12345) in EAX so that above we can recogniserecognize that we've just
; switched task.
; * Restart interrupts. The STI instruction has a delay - it doesn't take effect until after
Line 140 ⟶ 161:
perform_task_switch:
cli;
mov ecx, [esp+4] ; eipEIP
mov eax, [esp+8] ; physical address of current directory
mov ebp, [esp+12] ; ebpEBP
mov esp, [esp+16] ; espESP
mov cr3, eax ; set the page directory
mov eax, 0x12345 ; magic number to detect a task switch
Line 152 ⟶ 173:
<nowiki>
extern void perform_task_switch(u32int, u32int, u32int, u32int);</nowiki>
and replace the inlineInline assemblyAssembly with:
<nowiki>
perform_task_switch(eip, current_directory->physicalAddr, ebp, esp);</nowiki>
Line 185 ⟶ 206:
%endmacro</nowiki>
 
=== Problem 2: Don't forget to allow interrupts in user mode in idt_set_gate ===
 
Find this comment in chapter 4 and uncomment the code:
Line 213 ⟶ 234:
This fixes a problem where the syscall hander won't get called.
 
=== Problem 4: MisingMissing documentation around set_kernel_stack ===
 
<tt>KERNEL_STACK_SIZE</tt> needs defining in <tt>task.h</tt>:
Line 237 ⟶ 258:
set_kernel_stack(current_task->kernel_stack+KERNEL_STACK_SIZE);</nowiki>
 
=== Problem 5: Operator precedencefind_smallest_hole() bug in find_smallest_holeheap code causing fork() to page fault ===
 
There is an operator precedenceThis bug infrom <tt>find_smallest_hole()</tt>the thatheap willchapter causemay badnot allocationshit andyou memoryuntil overwritesnow. ifThe attempting to fork. Thisbug results in the newly allocated <tt>kernel_stack</tt> messing up the page directory, causing <tt>clone_directory()</tt> to fail in <tt>fork()</tt>. See the heap section above for the details.
 
To fix the problem, change this:
<nowiki>
if ((location+sizeof(header_t) & 0xFFFFF000) != 0)</nowiki>
 
to this:
<nowiki>
if (((location+sizeof(header_t)) & 0xFFFFF000) != 0)</nowiki>
 
== Conclusion ==
73

edits