James Molloy's Tutorial Known Bugs: Difference between revisions
[unchecked revision] | [unchecked revision] |
Content deleted Content added
Created page with "Several sources - including this Wiki - point to [http://www.jamesmolloy.co.uk/tutorial_html/ James Molloy's Roll your own toy UNIX-clone OS] Tutorial as a starting point. Thi..." |
|||
(33 intermediate revisions by 16 users not shown) | |||
Line 1:
Several sources - including this Wiki - point to [http://www.jamesmolloy.co.uk/tutorial_html/ James Molloy's Roll your own toy UNIX-clone OS] Tutorial as a starting point. This is
== Before you follow the tutorial ==
{{Main|Bare Bones}}
It is recommended that you follow the this wiki's standard tutorial [[Bare Bones]] before you begin with the tutorial. This ensures you get the a proper cross-compiler and use the proper compilation options. If you have already followed the tutorial, please compare your current build environment against the recommended
== Problem: Not using a cross-compiler ==
Line 13:
== Problem: __cdecl calling convention ==
The tutorial states that the <tt>__cdecl</tt> calling convention is used. This is, however, a Windows term.
== Problem: CFLAGS ==
The tutorial recommends using these compilation options <tt>-nostdlib -nostdinc -fno-builtin -fno-stack-protector</tt>, but this is not the recommended practice. The tutorial
== Problem: Not using libgcc ==
Line 23:
{{Main|libgcc}}
The tutorial disables libc and [[libgcc]] through the <tt>-nodefaultlibs</tt> option (implied by <tt>-nostdlib</tt>) but
== Problem: Not setting a stack ==
{{main|Stack#Setup the stack}}
The tutorial
== Problem: main function ==
Line 35 ⟶ 36:
== Problem: Data types ==
The tutorial uses non-standard data types such as <tt>u32int</tt> while the international C standard (99 revision) introduces standard fixed-width data types like <tt>uint32_t</tt> that you should use instead. Simply include <tt><stdint.h></tt> which comes with your cross-compiler and works even in freestanding mode. This is the reason you should not pass the <tt>-nostdinc</tt> option.
== Problem: Inline Assembly ==
Line 45 ⟶ 46:
== Problem: Missing functions ==
The
== Problem: Interrupt handlers corrupt interrupted state ==
This article previously told you to know the ABI. If you do you will see a huge problem in the interrupt.s suggested by the tutorial: It breaks the ABI for structure passing! It creates an instance of the <tt>struct registers</tt> on the stack and then passes it by value to the <tt>isr_handler</tt> function and then assumes the structure is intact afterwards. However, the function parameters on the stack belongs to the function and it is allowed to trash these values as it sees fit (if you need to know whether the compiler actually does this, you are thinking the wrong way, but it actually does)
There are two ways around this. The most practical method is to pass the structure as a pointer instead, which allows you to explicitly edit the register state when needed - very useful for system calls, without having the compiler randomly doing it for you. The compiler can still edit the pointer on the stack when it's not specifically needed. The second option is to make another copy the structure and pass that.
== Problem: ISR 17 and 21 have error codes ==
The interrupt handling code in the downloadable code have a bug where it handles ISR 17 and 21 by pushing a fake error code, but the CPU does push error codes here.
== Problem: struct registers::esp is useless ==
Line 57 ⟶ 63:
== Problem: __attribute__((packed)) ==
This attribute packs the associated structure. This is useful in a few cases, such as the
== Problem: cli and sti in interrupt handlers ==
The interrupt.s file invokes the <tt>cli</tt> and <tt>sti</tt> in the interrupt handler to disable and enable interrupts, as if the author didn't know whether the interrupt handlers were run with interrupts on or off. You can control whether they are run with interrupts on and off by simply deciding it in your IDT entry for the particular interrupt. The <tt>sti</tt> during the interrupt handler end is also useless as <tt>iret</tt> loads the eflags value from the stack, which contains a bit telling whether interrupts are on or off; in other words the interrupt handler automatically restores interrupts whether or not interrupts were enabled before this interrupt.
== Problem: kmalloc isn't properly aligned ==
Each data type in C has its own natural alignment. For instance, on the ABI that you are using an int is a signed 32-bit value that must be 32-bit aligned in memory (4 byte alignment). The same applies for structures, where the alignment of the whole structure is the maximum alignment of all its members. It is undefined behavior to access an unaligned value. For instance, you could decide you want an int at a particular unaligned (for an int) memory address and construct a pointer to it. When you attempt to write an int value to that pointer, undefined behavior happens. Furthermore, SIMD registers have alignment needs that are bigger than their individual components.
The <tt>kmalloc</tt> function in 6.4.1 only 1-byte aligns or page-aligns its memory address. This means you can only reliably use it allocate memories for chars (size 1), but not any larger types unless you use page-alignment. A proper malloc implementation returns pointers that are aligned such that they are suitable for all the common types, for instance it could be 64-bit (8-byte) aligned. You'll also want to modify the parameters such that it uses <tt>size_t</tt> appropriately rather than <tt>u32int</tt>.
Additionally the check if the address is page aligned is wrong.
<nowiki>
if (align == 1 && (placement_address & 0xFFFFF000)) // If the address is not already page-aligned</nowiki>
should be
<nowiki>
if (align == 1 && (placement_address & 0x00000FFF)) // If the address is not already page-aligned</nowiki>
== Problem: Paging Code ==
The paging code isn't terribly good and it is worth it to fully
== Problem: Heap Code ==
It is probably best that you write your own heap implementation.
There is an operator precedence bug in <tt>find_smallest_hole()</tt> that will cause bad allocations and memory overwrites if attempting to fork in user mode later on.
To fix the problem, change this:
<nowiki>
if ((location+sizeof(header_t)) & 0xFFFFF000 != 0)</nowiki>
to this:
<nowiki>
if ((location+sizeof(header_t) & 0xFFFFF000) != 0)</nowiki>
See the section on user mode below for more details.
== Problem: VFS Code ==
Line 83 ⟶ 108:
== Problem: multiboot.h ==
It's advisable to get a copy of <tt>multiboot.h</tt> from the GRUB source code rather than copied from the tutorial. Beware, the copy in the GRUB documentation is out of date, use one from an official release.
== Problem: Multitasking ==
It is strongly recommended that you write your own implementation of this and disregard the
=== Inline Assembly optimiser problem with GCC 4.8 ===
As mentioned above, writing Inline Assembly can be tricky. The original Inline Assembly is this:
<nowiki>
asm volatile(" \
cli; \
mov %0, %%ecx; \
mov %1, %%esp; \
mov %2, %%ebp; \
mov %3, %%cr3; \
mov $0x12345, %%eax; \
sti; \
jmp *%%ecx "
: : "r"(eip), "r"(esp), "r"(ebp), "r"(current_directory->physicalAddr)); </nowiki>
Everything works fine when using gcc-4.2.4. However, the gcc-4.8.4 optimizer produces the following assembly (produced with '''objdump -d src/kernel'''):
<nowiki>
10387c: fa cli
10387d: 89 c1 mov %eax,%ecx
10387f: 89 d4 mov %edx,%esp
103881: 89 cd mov %ecx,%ebp
103883: 0f 22 db mov %ebx,%cr3
103886: b8 45 23 01 00 mov $0x12345,%eax
10388b: fb sti
10388c: ff e1 jmp *%ecx</nowiki>
Note how the EAX register is assigned to the ECX register. However, later on the ECX register is assigned to EBP register. The reason for this is that the optimizer was using the EAX register to store the EIP variable and the ECX register to store the EBP variable. This results in the EIP variable being assigned to the ECX register ''as well as'' the EBP register. This leads to a subsequent '''ret''' statement sending the CPU to some invalid memory location.
A way to fix this is to remove the Inline Assembly by, for example adding this to '''process.s''':
<nowiki>
; Here we:
; * Stop interrupts so we don't get interrupted.
; * Temporarily put the new EIP location in ECX.
; * Temporarily put the new page directory's physical address in EAX.
; * Set the base and stack pointers
; * Set the page directory
; * Put a dummy value (0x12345) in EAX so that above we can recognize that we've just
; switched task.
; * Restart interrupts. The STI instruction has a delay - it doesn't take effect until after
; the next instruction.
; * Jump to the location in ECX (remember we put the new EIP in there).
[GLOBAL perform_task_switch]
perform_task_switch:
cli;
mov ecx, [esp+4] ; EIP
mov eax, [esp+8] ; physical address of current directory
mov ebp, [esp+12] ; EBP
mov esp, [esp+16] ; ESP
mov cr3, eax ; set the page directory
mov eax, 0x12345 ; magic number to detect a task switch
sti;
jmp ecx</nowiki>
Then edit '''task.c''' and add this to the top of the file:
<nowiki>
extern void perform_task_switch(u32int, u32int, u32int, u32int);</nowiki>
and replace the Inline Assembly with:
<nowiki>
perform_task_switch(eip, current_directory->physicalAddr, ebp, esp);</nowiki>
== Problem: User mode ==
There are several problems. The downloadable code has everything fixed except a <tt>find_smallest_hole()</tt> page-aligned heap allocation bug.
=== Problem 1: nasm byte keyword causing <tt>0x80</tt> to become <tt>0xffffff80</tt> ===
This code is at fault:
<nowiki>
%macro ISR_NOERRCODE 1 ; define a macro, taking one parameter
[GLOBAL isr%1] ; %1 accesses the first parameter.
isr%1:
cli
push byte 0
push byte %1
jmp isr_common_stub
%endmacro</nowiki>
It should be:
<nowiki>
%macro ISR_NOERRCODE 1 ; define a macro, taking one parameter
[GLOBAL isr%1] ; %1 accesses the first parameter.
isr%1:
cli
push byte 0
push %1
jmp isr_common_stub
%endmacro</nowiki>
=== Problem 2: Don't forget to allow interrupts in user mode in idt_set_gate ===
Find this comment in chapter 4 and uncomment the code:
<nowiki>
// We must uncomment the OR below when we get to using user-mode.
// It sets the interrupt gate's privilege level to 3.
idt_entries[num].flags = flags /* | 0x60 */;</nowiki>
=== Problem 3: regs var must called by reference instead of by value in the irq and isr handlers ===
Various changes are needed.
Change this in <tt>page_fault()</tt>:
<nowiki>
void page_fault(registers_t regs)
// to
void page_fault(registers_t *regs)</nowiki>
And in
<tt>isr_handler()</tt> and <tt>irq_handler()</tt>, change
<nowiki>
handler(regs);</nowiki>
to
<nowiki>
handler(®s);</nowiki>
This fixes a problem where the syscall hander won't get called.
=== Problem 4: Missing documentation around set_kernel_stack ===
<tt>KERNEL_STACK_SIZE</tt> needs defining in <tt>task.h</tt>:
<nowiki>
#define KERNEL_STACK_SIZE 2048 // Use a 2kb kernel stack.</nowiki>
Also, some code needs to be added to four sections in <tt>task.c</tt>
In <tt>initialise_tasking()</tt>
<nowiki>
current_task->kernel_stack = kmalloc_a(KERNEL_STACK_SIZE);</nowiki>
In <tt>fork()</tt>
<nowiki>
current_task->kernel_stack = kmalloc_a(KERNEL_STACK_SIZE);</nowiki>
In <tt>switch_task()</tt>
<nowiki>
set_kernel_stack(current_task->kernel_stack+KERNEL_STACK_SIZE);</nowiki>
In <tt>switch_to_user_mode()</tt>
<nowiki>
set_kernel_stack(current_task->kernel_stack+KERNEL_STACK_SIZE);</nowiki>
=== Problem 5: find_smallest_hole() bug in heap code causing fork() to page fault ===
This bug from the heap chapter may not hit you until now. The bug results in the newly allocated <tt>kernel_stack</tt> messing up the page directory, causing <tt>clone_directory()</tt> to fail in <tt>fork()</tt>. See the heap section above for the details.
== Conclusion ==
The tutorial isn't bad as an example, but its design
[[Category:OS Development]]
[[Category:Troubleshooting]]
[[Category:FAQ]]
|