C++ Bare Bones

From OSDev.wiki
Difficulty level: Medium

In this tutorial we will compile a simple C++ kernel and boot it. The theory can be found in C++.

WAIT! Have you read Getting Started, Beginner Mistakes, and some of the related OS theory?

Preface

This tutorial assumes you have a compiler/assembler/linker toolchain capable of handling ELF files. On a Windows machine, you are strongly encouraged to set up a GCC Cross-Compiler, as it removes all the various toolchain-specific issues you might have ("PE operation on a non-PE file", "unsupported file format", and a number of others). While *nix machines already have ELF capable toolchains, you are still encouraged to build a cross-compiler, as it keeps you from relying on things you shouldn't (header and library files, for example).

To make starting an OS easy, we will use a lot of existing parts: GRUB will be the bootloader, and the kernel will be in ELF format. GRUB (a Multiboot-compliant boot loader) puts the system in the correct state for your kernel to start executing. This includes enabling the A20 line (to give you access to all available memory addresses) and switching the system to 32-bit Protected Mode, giving you access to a theoretical 4 GiB of memory. We will not use a flat binary but a kernel in ELF format, so that we can tell GRUB where in memory to load each part.

Overview

Even when using GRUB, some setup is required before entering a main() type function. The most basic setup to get an ELF format kernel to be booted by GRUB consists of three files:

  • loader.s - assembler "glue" between bootloader and kernel
  • kmain.cpp - your actual kernel routines
  • linker.ld - for linking the above files

C++ specifics

loader.s takes over control from the Multiboot bootloader, and jumps into the kernel proper.

Using C++ requires you to take a few extra sections into account when compared to the C variant of this tutorial:

  • .ctors (C++ static/global constructors) - Constructors for static objects used in C++ (must be explicitly called, see below)
  • .dtors (C++ static/global destructors) - Destructors for static objects used in C++ (ditto, though the usefulness is open for discussion)
  • .gnu.linkonce (GCC vague linkages) - Sections dedicated to GCC's vague linking (see the documentation for more information)

Because there is no environment executing your kernel (you can't expect the bootloader to do this), you have to execute your own constructors (and possibly destructors). Both are described below. Note that you should call the destructors in reverse order of construction to make sure objects depending on the existence of other objects do not try to access already destroyed objects.
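
The same walk can also be written in C++ instead of assembly. The following is only a sketch: func_ptr, run_constructors and run_destructors are hypothetical names that do not appear elsewhere in this tutorial, and the functions take the table bounds as parameters so they can be tried out on an ordinary array. In a kernel you would pass the start_ctors/end_ctors and start_dtors/end_dtors symbols from the linker script.

```cpp
// Each entry in the constructor and destructor tables is assumed to be a
// pointer to a function taking no arguments and returning nothing.
using func_ptr = void (*)();

// Call every entry in [begin, end) from front to back.
inline void run_constructors(const func_ptr *begin, const func_ptr *end)
{
    for (const func_ptr *entry = begin; entry < end; ++entry)
        (*entry)();
}

// Call every entry in [begin, end) from back to front, so that objects are
// torn down in the reverse order of their construction.
inline void run_destructors(const func_ptr *begin, const func_ptr *end)
{
    for (const func_ptr *entry = end; entry > begin; )
        (*--entry)();
}
```

In a kernel, the bounds could be declared as extern "C" func_ptr start_ctors[], end_ctors[]; (again an assumption about how you choose to expose the linker symbols to C++).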

The vague linkage input sections should be distributed across the regular output sections: .gnu.linkonce.t* goes into .text, .gnu.linkonce.r* into .rodata, .gnu.linkonce.d* into .data and .gnu.linkonce.b* into .bss. You will only need the linker script later on, but have a look at it now to understand the other source files (tip: you can use 'i586-elf-ld --verbose' to see LD's standard linker script for comparison):

ENTRY(_start)

SECTIONS
{
    . = 0x00100000;

    .text ALIGN(0x1000) :
    {
        *(.text)
        *(.gnu.linkonce.t*)
    }

    .rodata ALIGN(0x1000) :
    {
        start_ctors = .;
        *(SORT(.ctors*))  /* Note the "SORT" */
        end_ctors = .;

        start_dtors = .;
        *(SORT(.dtors*))
        end_dtors = .;

        *(.rodata*)
        *(.gnu.linkonce.r*)
    }

    .data ALIGN(0x1000) :
    {
        *(.data)
        *(.gnu.linkonce.d*)
    }

    .bss :
    {
        sbss = .;
        *(COMMON)
        *(.bss)
        *(.gnu.linkonce.b*)
        ebss = .;
    }

    /DISCARD/ :
    {
        *(.comment)
        *(.eh_frame) /* discard this, unless you are implementing runtime support for C++ exceptions. */
    }
}

The script might require modification to suit your kernel's needs. Note that forgetting to add the vague linking sections might result in GRUB randomly not being able to load your kernel anymore (e.g. after modifying the most trivial code). It might also explain sudden rises in executable size by 50 kB and other issues.

Note that the .ctors and .dtors sections need to be properly aligned. The linker script shown here does that by placing them at the beginning of the .rodata section, which is aligned at a page boundary.

loader.s

loader.s takes over control from the Multiboot bootloader, calls the constructors, and jumps into the kernel proper.

NASM

global _start                           ; making entry point visible to linker
global magic                            ; we will use this in kmain
global mbd                              ; we will use this in kmain

extern kmain                            ; kmain is defined in kmain.cpp

extern start_ctors                      ; beginning and end
extern end_ctors                        ; of the respective
extern start_dtors                      ; ctors and dtors section,
extern end_dtors                        ; declared by the linker script

; setting up the Multiboot header - see GRUB docs for details
MODULEALIGN equ  1<<0                   ; align loaded modules on page boundaries
MEMINFO     equ  1<<1                   ; provide memory map
FLAGS       equ  MODULEALIGN | MEMINFO  ; this is the Multiboot 'flag' field
MAGIC       equ    0x1BADB002           ; 'magic number' lets bootloader find the header
CHECKSUM    equ -(MAGIC + FLAGS)        ; checksum required

section .text

align 4
    dd MAGIC
    dd FLAGS
    dd CHECKSUM

; reserve initial kernel stack space
STACKSIZE equ 0x4000                    ; that's 16k.

_start:
    mov  esp, stack + STACKSIZE         ; set up the stack
    mov  [magic], eax                   ; Multiboot magic number
    mov  [mbd], ebx                     ; Multiboot info structure

    mov  ebx, start_ctors               ; call the constructors
    jmp  .ctors_until_end
.call_constructor:
    call [ebx]
    add  ebx,4
.ctors_until_end:
    cmp  ebx, end_ctors
    jb   .call_constructor

    call kmain                          ; call kernel proper

    mov  ebx, end_dtors                 ; call the destructors
    jmp  .dtors_until_end
.call_destructor:
    sub  ebx, 4
    call [ebx]
.dtors_until_end:
    cmp  ebx, start_dtors
    ja   .call_destructor

    cli
.hang:
    hlt                                 ; halt machine should kernel return
    jmp  .hang

section .bss

alignb 4                                ; alignb, as .bss holds only reserved space
magic: resd 1
mbd:   resd 1
stack: resb STACKSIZE                   ; reserve 16k stack on a doubleword boundary

Assemble using:

nasm -f elf -o loader.o loader.s
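
The CHECKSUM line in the header works because the three header fields must add up to zero in 32-bit arithmetic. As a small illustration, the same calculation can be checked at compile time in C++. The names kMagic, kFlags and multiboot_checksum are made up for this sketch; the values are the ones used in loader.s.

```cpp
#include <cstdint>

// Values copied from loader.s above.
constexpr std::uint32_t kMagic = 0x1BADB002;
constexpr std::uint32_t kFlags = (1u << 0) | (1u << 1); // MODULEALIGN | MEMINFO

// The checksum is whatever makes magic + flags + checksum wrap around to 0.
constexpr std::uint32_t multiboot_checksum(std::uint32_t magic, std::uint32_t flags)
{
    return static_cast<std::uint32_t>(0u - (magic + flags));
}

// Fails to compile if the three fields do not sum to zero modulo 2^32.
static_assert(static_cast<std::uint32_t>(
                  kMagic + kFlags + multiboot_checksum(kMagic, kFlags)) == 0u,
              "Multiboot header fields must sum to zero");
```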

GAS

.global _start                          # making entry point visible to linker

# setting up the Multiboot header - see GRUB docs for details
.set ALIGN,    1<<0                     # align loaded modules on page boundaries
.set MEMINFO,  1<<1                     # provide memory map
.set FLAGS,    ALIGN | MEMINFO          # this is the Multiboot 'flag' field
.set MAGIC,    0x1BADB002               # 'magic number' lets bootloader find the header
.set CHECKSUM, -(MAGIC + FLAGS)         # checksum required

.align 4
.long MAGIC
.long FLAGS
.long CHECKSUM

# reserve initial kernel stack space
stack_bottom:
.skip 16384                             # reserve 16 KiB stack
stack_top:
.comm  mbd, 4                           # we will use this in kmain
.comm  magic, 4                         # we will use this in kmain

_start:
    movl  $stack_top, %esp               # set up the stack, stacks grow downwards
    movl  %eax, magic                   # Multiboot magic number
    movl  %ebx, mbd                     # Multiboot data structure

    mov  $start_ctors, %ebx             # call the constructors
    jmp  2f
1:
    call *(%ebx)
    add  $4, %ebx
2:
    cmp  $end_ctors, %ebx
    jb   1b

    call kmain                          # call kernel proper

    mov  $end_dtors, %ebx               # call the destructors
    jmp  4f
3:
    sub  $4, %ebx
    call *(%ebx)
4:
    cmp  $start_dtors, %ebx
    ja   3b

    cli
hang:
    hlt                                 # halt machine should kernel return
    jmp   hang

Assemble using:

i586-elf-as -o loader.o loader.s
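
Both loaders above store the value GRUB passes in EBX in mbd. It points to the Multiboot information structure. If you prefer not to use multiboot.h, the beginning of that structure can be described by hand. The sketch below follows the field order of the Multiboot 1 specification as far as boot_loader_name; the struct name multiboot_info_prefix and the helper has_boot_loader_name are hypothetical, and the layout should be checked against the specification before you rely on it.

```cpp
#include <cstddef>
#include <cstdint>

// Leading fields of the Multiboot 1 information structure. Every field is
// 32 bits wide, so the byte offsets are simply multiples of four.
struct multiboot_info_prefix
{
    std::uint32_t flags;            // offset  0: which fields below are valid
    std::uint32_t mem_lower;        // offset  4
    std::uint32_t mem_upper;        // offset  8
    std::uint32_t boot_device;      // offset 12
    std::uint32_t cmdline;          // offset 16
    std::uint32_t mods_count;       // offset 20
    std::uint32_t mods_addr;        // offset 24
    std::uint32_t syms[4];          // offset 28: symbol table information
    std::uint32_t mmap_length;      // offset 44
    std::uint32_t mmap_addr;        // offset 48
    std::uint32_t drives_length;    // offset 52
    std::uint32_t drives_addr;      // offset 56
    std::uint32_t config_table;     // offset 60
    std::uint32_t boot_loader_name; // offset 64: physical address of a C string
};

static_assert(offsetof(multiboot_info_prefix, boot_loader_name) == 64,
              "boot_loader_name is expected at byte offset 64");

// Bit 9 of flags tells whether boot_loader_name holds a valid address.
constexpr bool has_boot_loader_name(const multiboot_info_prefix &info)
{
    return (info.flags & (1u << 9)) != 0;
}
```

Offset 64 divided by the four-byte field size gives index 16 into an array of 32-bit words.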

kmain.cpp

This is not exactly your average int main() (which is one of the reasons why you should not name it main). Most notably, you only have the freestanding part of the standard library available, not the hosted one. The freestanding headers are <ciso646>, <cstddef>, <cfloat>, <limits>, <climits>, <cstdint>, <cstdlib>, <new>, <typeinfo>, <exception>, <initializer_list>, <cstdalign>, <cstdarg>, <cstdbool>, <type_traits>, and <atomic>. Welcome to kernel land.

You also need to declare C linkage for the kernel entry function, so that its name is not mangled the C++ way and you can call it from loader.s.

#include <cstdint>

/* Defined in loader.s. Linkage specifications are only allowed at */
/* namespace scope, so the declarations live outside kmain(). */
extern "C" std::uint32_t magic;
/* Uncomment the following if you want to be able to access the */
/* Multiboot information structure */
//extern "C" void *mbd;

extern "C" void kmain(void)
{
   if (magic != 0x2BADB002)
   {
      /* Something went not according to specs. Print an error */
      /* message and halt, but do *not* rely on the multiboot */
      /* data structure. */
   }

   /* You could either use multiboot.h */
   /* (http://www.gnu.org/software/grub/manual/multiboot/multiboot.html#multiboot_002eh) */
   /* or do your offsets yourself. The following is merely an example. */
   //char * boot_loader_name =(char*) ((long*)mbd)[16];

   /* Print a letter to screen to see everything is working. */
   /* volatile keeps the compiler from optimizing the stores away. */
   volatile std::uint8_t *videoram = (volatile std::uint8_t*) 0xb8000;
   videoram[0] = 65; /* character 'A' */
   videoram[1] = 0x07; /* light grey (7) on black (0). */
}
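
The two byte stores at the end of kmain() fill one cell of the VGA text buffer: the low byte holds the character, the high byte holds the attribute, with the foreground colour in the low nibble and the background colour in the high nibble. As a sketch of how this could be wrapped up, the helpers below build such a cell and write a whole string. vga_entry and vga_write are hypothetical names, and the buffer is passed in as a parameter so the code can be tried against a plain array; in the kernel you would pass a pointer to 0xb8000.

```cpp
#include <cstddef>
#include <cstdint>

// Combine a character with a foreground/background colour into one 16-bit
// text-mode cell (character in the low byte, attribute in the high byte).
constexpr std::uint16_t vga_entry(char c, std::uint8_t fg, std::uint8_t bg)
{
    return static_cast<std::uint16_t>(
        ((((bg & 0x0F) << 4) | (fg & 0x0F)) << 8) | static_cast<std::uint8_t>(c));
}

// Write a NUL-terminated string as light grey on black, starting at the
// first cell. Stops at the terminator or after 'capacity' cells, whichever
// comes first, and returns the number of cells written.
inline std::size_t vga_write(volatile std::uint16_t *cells, std::size_t capacity,
                             const char *text)
{
    std::size_t i = 0;
    for (; i < capacity && text[i] != '\0'; ++i)
        cells[i] = vga_entry(text[i], 7, 0);
    return i;
}
```

On a little-endian x86 machine, vga_entry('A', 7, 0) stores the same two bytes as the videoram[0] and videoram[1] assignments in kmain().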

The options for g++ are slightly more numerous than when building a C kernel.

i586-elf-g++ -o kmain.o -c kmain.cpp -Wall -Wextra \
    -nostdlib -fno-exceptions -fno-rtti -fno-stack-protector -ffreestanding

Notes

  • The flags -Wall -Wextra are not exactly required, but using them will certainly help you later on. They might seem to be a pest, but remember: The compiler is your friend!
  • You may be able to use g++ instead of the i586-elf-g++ from the cross-compiler, but this does not work out of the box on at least Windows and 64-bit Linux.
  • Exceptions, RTTI and the stack protector are disabled because they usually require runtime support (which you don't have in a basic kernel). For a more thorough explanation, see the C++ article. There is also an article about the stack smashing protector if you want to enable -fstack-protector.
  • The option -nostdlib is equivalent to passing both -nodefaultlibs and -nostartfiles. You just need to pass -nostdlib.
  • Previous versions of this tutorial suggested passing the -fno-builtin option, but this is actually harmful: you almost certainly want compiler builtins (without the __builtin_ prefix), because they let the compiler understand your code better.

linker.ld

The contents of this file were already presented above.

Link using:

i586-elf-g++ -T linker.ld -o kernel.bin -ffreestanding -nostdlib -fno-exceptions -fno-rtti -fno-stack-protector loader.o kmain.o -lgcc

Note: We are using the compiler to do the linking, as this allows it to do additional processing. We are linking against libgcc, as gcc will emit calls to that library (whether you like that or not). You also need to pass the compilation flags on the link line, as they may also be used during linking (such as -nostdlib).

The file kernel.bin is now your kernel (all other files are no longer needed).

Refer to Bare Bones#Booting the kernel for further instructions.

Questions

Are those .ctors and .dtors compiler-specific or is there an ABI for them?
It's defined in the ELF ABI for System V platforms, but it's used by most Unices. The concept of using a constructor/destructor list for bootup is not so much specified by any C++ ABI, but it is used in most implementations.
Does anyone know how to control the order of the static ctors?
The name mangling algorithm, combined with the "SORT" directive in the linker script, determines the order. Thanks to SORT, the order is deterministic and depends on the lexicographical order of class names. This order doesn't necessarily coincide with source code order or the programmer's intention, however. For this exact reason, relying on the order of static constructors should be avoided at all costs.
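
One way to avoid depending on constructor order is to give objects explicit init functions and call them in the intended order from kmain. The following is only an illustration of that idea: Console, Logger and init_subsystems are invented for this sketch and are not part of the tutorial's kernel.

```cpp
// Two hypothetical subsystems, where Logger needs a working Console.
// Neither has a user-written constructor, so nothing important happens
// before the explicit init calls.
struct Console
{
    bool ready = false;
    void init() { ready = true; }
};

struct Logger
{
    Console *out = nullptr;

    // Refuses to initialize if the console has not been set up yet.
    bool init(Console &console)
    {
        if (!console.ready)
            return false;
        out = &console;
        return true;
    }
};

// The order is spelled out in the source instead of being left to the
// linker. Returns true if every subsystem came up.
inline bool init_subsystems(Console &console, Logger &logger)
{
    console.init();
    return logger.init(console);
}
```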
