Raspberry Pi Bare Bones

Revision as of 21:06, 19 January 2013 by osdev>Mrvn
This page is a work in progress.
This page may thus be incomplete. Its content may be changed in the near future.

Intro

This is a tutorial on bare-metal OS development on the Raspberry Pi. This tutorial is written specifically for the Raspberry Pi Model B Rev 2 because the author has no other hardware to test on. But so far the models are basically identical for the purpose of this tutorial (Rev 1 has 256MB RAM, Model A has no Ethernet).

This is the author's very first ARM system and we learn as we write, without any prior knowledge about ARM. Experience with the GNU toolchain (gcc, make, ... very important) and the C language (incredibly important, including how to use inline asm) is assumed and required. This is not a tutorial about how to build a kernel but a simple intro on how to get started on the RPi.

Bare minimum kernel

Let's start with a minimum of 4 files. The kernel is going to be written in C. The main function will be in main.c. Before the main function can be called, though, some things have to be set up using assembly. This will be placed in boot.S. On top of that we also need a linker script and a Makefile to build the kernel, and we need to create an include directory for later use.

main.c

/* main.c - the entry point for the kernel */

#include <stdint.h>

#define UNUSED(x) (void)(x)

// kernel main function, it all begins here
void kernel_main(uint32_t r0, uint32_t r1, uint32_t atags) {
    UNUSED(r0);
    UNUSED(r1);
    UNUSED(atags);
}

This defines an empty kernel_main function that simply returns. The GPU bootloader passes arguments to the kernel via r0-r2 and boot.S makes sure to preserve those 3 registers. They are the first 3 arguments in a C function call. Following the ARM Linux boot convention, r0 is 0 and r1 holds the machine type ID, while r2 contains the address of the ATAGs (more about them later). For now the UNUSED() macro makes sure the compiler doesn't complain about unused variables.

boot.S

/* boot.S - assembly startup code */

// To keep this in the first portion of the binary.
.section ".text.boot"

// Make Start global.
.globl Start

// Entry point for the kernel.
// r15 -> should begin execution at 0x8000.
// r0 -> 0x00000000
// r1 -> 0x00000C42
// r2 -> 0x00000100 - start of ATAGS
// preserve these registers as argument for kernel_main
Start:
	// Setup the stack.
	mov	sp, #0x8000

	// Clear out bss.
	ldr	r4, =_bss_start
	ldr	r9, =_bss_end
	mov	r5, #0
	mov	r6, #0
	mov	r7, #0
	mov	r8, #0
1:
	// store multiple at r4.
	stmia	r4!, {r5-r8}

	// If we're still below bss_end, loop.
	cmp	r4, r9
	blo	1b

	// Call kernel_main
	ldr	r3, =kernel_main
	blx	r3

	// halt
halt:
	wfe
	b	halt

The section ".text.boot" will be used in the linker script to place boot.S as the very first thing in our kernel image. The code initializes a minimum C environment, which means having a stack and zeroing the BSS segment, before calling the kernel_main function. Note that the code avoids using r0-r2 so they remain valid for the kernel_main call.
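To make the BSS loop easier to follow, here is a rough C model of what the assembly does. This is only a sketch: clear_words is a hypothetical name that does not appear in the tutorial sources, and unlike the assembly it checks the bound before the first store. It assumes the region length is a multiple of 16 bytes, which the page alignment in the linker script guarantees.

```c
#include <stdint.h>

/* Illustrative model of the BSS loop in boot.S (hypothetical helper).
 * Zeroes [start, end) four 32-bit words per iteration, mirroring
 * "stmia r4!, {r5-r8}" followed by "cmp r4, r9; blo 1b".
 * Assumes (end - start) is a multiple of 4 words (16 bytes). */
void clear_words(uint32_t *start, uint32_t *end) {
    uint32_t *p = start;
    while (p < end) {
        p[0] = 0;
        p[1] = 0;
        p[2] = 0;
        p[3] = 0;
        p += 4;   /* same effect as the "!" write-back on r4 */
    }
}
```

In the real kernel this has to stay in assembly, because C code may not run before the stack and BSS are set up.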

link-arm-eabi.ld

/* link-arm-eabi.ld - linker script for arm eabi */
ENTRY(Start)

SECTIONS
{
    /* Starts at LOADER_ADDR. */
    . = 0x8000;
    _start = .;

    _text_start = .;
    .text : {
        KEEP(*(.text.boot))
        *(.text)
    }
    . = ALIGN(4096); /* align to page size */
    _text_end = .;

    _rodata_start = .;
    .rodata : {
        *(.rodata)
    }
    . = ALIGN(4096); /* align to page size */
    _rodata_end = .;

    _data_start = .;
    .data : {
        *(.data)
    }
    . = ALIGN(4096); /* align to page size */
    _data_end = .;

    _bss_start = .;
    .bss : {
        bss = .;
        *(.bss)
    }
    . = ALIGN(4096); /* align to page size */
    _bss_end = .;

    _end = .;
}

There is a lot of text here but don't despair. The script is rather simple if you look at it bit by bit.

ENTRY(Start) declares the entry point for the kernel image. That symbol was defined in the boot.S file. Since we are actually booting a raw binary image, I think the entry point is completely irrelevant at boot time. But it has to be there in the ELF file we build as an intermediate file. Declaring it makes the linker happy.

SECTIONS declares, well, sections. It decides where the bits and pieces of our code and data go and also sets a few symbols that help us track the size of each section.

    . = 0x8000;
    _start = .;

The "." denotes the current address so the first line tells the linker to set the current address to 0x8000, where the kernel starts. The current address is automatically incremented when the linker adds data. The second line then creates a symbol "_start" and sets it to the current address.

After that, sections are defined for text (code), read-only data, read-write data and BSS (0 initialized memory). Other than the name the sections are identical, so let's just look at one of them:

    _text_start = .;
    .text : {
        KEEP(*(.text.boot))
        *(.text)
    }
    . = ALIGN(4096); /* align to page size */
    _text_end = .;

The first line creates a _text_start symbol for the section. The second line opens a .text section for the output file, which gets closed in the fifth line. Lines 3 and 4 declare what sections from the input files will be placed inside the output .text section. In our case ".text.boot" is to be placed first, followed by the more general ".text". ".text.boot" is only used in boot.S and ensures that it ends up at the beginning of the kernel image. ".text" then contains all the remaining code. Any data added by the linker automatically increments the current address ("."). In line 6 we explicitly increment it so that it is aligned to a 4096 byte boundary (which is the page size for the RPi). And last, line 7 creates a _text_end symbol so we know where the section ends.
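The rounding that ALIGN(4096) performs on the current address can be sketched in C. The helper name align_up is an assumption made for this illustration, and the formula only works when the alignment is a power of two.

```c
#include <stdint.h>

/* Hypothetical helper: round addr up to the next multiple of align.
 * align must be a power of two (4096 for the page size used here). */
uint32_t align_up(uint32_t addr, uint32_t align) {
    return (addr + align - 1u) & ~(align - 1u);
}
```

An address that is already aligned stays where it is, so align_up(0x8000, 4096) is 0x8000, while 0x8001 moves up to 0x9000.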

What are the _text_start and _text_end for and why use page alignment? The 2 symbols can be used in the kernel source and the linker will then place the correct addresses into the binary. As an example the _bss_start and _bss_end are used in boot.S. But you can also use the symbols from C by declaring them extern first. While not required I made all sections aligned to page size. This later allows mapping them in the page tables with executable, read-only and read-write permissions without having to handle overlaps (2 sections in one page).

    _end = .;

After all sections are declared the _end symbol is created. If you ever want to know how large your kernel is at runtime you can use _start and _end to find out.
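As a sketch of how the linker symbols might be used from C: a common idiom is to declare them as arrays, so that the symbol name itself evaluates to the address. The helper region_size below is a hypothetical name for this example; the extern declarations are shown in a comment because they only resolve when linking with the script above.

```c
#include <stddef.h>
#include <stdint.h>

/* In the kernel the symbols would be declared like this (assumption:
 * declaring them as arrays, so no & is needed to get the address):
 *
 *     extern uint8_t _start[], _end[];
 *     size_t kernel_size = region_size(_start, _end);
 */

/* Hypothetical helper: size in bytes of the region [start, end). */
size_t region_size(const uint8_t *start, const uint8_t *end) {
    return (size_t)(end - start);
}
```

The same helper works for any start/end pair from the script, for example _bss_start and _bss_end.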

Makefile

# Makefile - build script

# build environment
PREFIX ?= /usr/local/cross
ARMGNU ?= $(PREFIX)/bin/arm-none-eabi

# source files
SOURCES_ASM := $(wildcard *.S)
SOURCES_C   := $(wildcard *.c)

# object files
OBJS        := $(patsubst %.S,%.o,$(SOURCES_ASM))
OBJS        += $(patsubst %.c,%.o,$(SOURCES_C))

# Build flags
DEPENDFLAGS := -MD -MP
INCLUDES    := -I include
BASEFLAGS   := -O2 -fpic -pedantic -pedantic-errors -nostdlib
BASEFLAGS   += -nostartfiles -ffreestanding -nodefaultlibs
BASEFLAGS   += -fno-builtin -fomit-frame-pointer -mcpu=arm1176jzf-s
WARNFLAGS   := -Wall -Wextra -Wshadow -Wcast-align -Wwrite-strings
WARNFLAGS   += -Wredundant-decls -Winline
WARNFLAGS   += -Wno-attributes -Wno-deprecated-declarations
WARNFLAGS   += -Wno-div-by-zero -Wno-endif-labels -Wfloat-equal
WARNFLAGS   += -Wformat=2 -Wno-format-extra-args -Winit-self
WARNFLAGS   += -Winvalid-pch -Wmissing-format-attribute
WARNFLAGS   += -Wmissing-include-dirs -Wno-multichar
WARNFLAGS   += -Wredundant-decls -Wshadow
WARNFLAGS   += -Wno-sign-compare -Wswitch -Wsystem-headers -Wundef
WARNFLAGS   += -Wno-pragmas -Wno-unused-but-set-parameter
WARNFLAGS   += -Wno-unused-but-set-variable -Wno-unused-result
WARNFLAGS   += -Wwrite-strings -Wdisabled-optimization -Wpointer-arith
WARNFLAGS   += -Werror
ASFLAGS     := $(INCLUDES) $(DEPENDFLAGS) -D__ASSEMBLY__
ASFLAGS     += -mcpu=arm1176jzf-s
CFLAGS      := $(INCLUDES) $(DEPENDFLAGS) $(BASEFLAGS) $(WARNFLAGS)
# gnu99 rather than c99: the sources use the asm keyword
CFLAGS      += -std=gnu99

# build rules
all: kernel.img

include $(wildcard *.d)

kernel.elf: $(OBJS) link-arm-eabi.ld
	$(ARMGNU)-ld $(OBJS) -Tlink-arm-eabi.ld -o $@

kernel.img: kernel.elf
	$(ARMGNU)-objcopy kernel.elf -O binary kernel.img

clean:
	$(RM) -f $(OBJS) kernel.elf kernel.img

dist-clean: clean
	$(RM) -f *.d

# C.
%.o: %.c Makefile
	$(ARMGNU)-gcc $(CFLAGS) -c $< -o $@

# AS.
%.o: %.S Makefile
	$(ARMGNU)-gcc $(ASFLAGS) -c $< -o $@

And there you go. Try building it. A minimum kernel that does absolutely nothing.

Hello World kernel

Let's make the kernel do something. Let's say hello to the world using the serial port.

main.c

/* main.c - the entry point for the kernel */

#include <stdint.h>

#include <uart.h>

#define UNUSED(x) (void)(x)

const char hello[] = "\r\nHello World\r\n";
const char halting[] = "\r\n*** system halting ***";

// kernel main function, it all begins here
void kernel_main(uint32_t r0, uint32_t r1, uint32_t atags) {
    UNUSED(r0);
    UNUSED(r1);
    UNUSED(atags);

    uart_init();
    
    uart_puts(hello);

    // Wait a bit
    for(volatile int i = 0; i < 10000000; ++i) { }

    uart_puts(halting);
}

include/mmio.h

/* mmio.h - access to MMIO registers */

#ifndef MMIO_H
#define MMIO_H

#include <stdint.h>

// write to MMIO register
static inline void mmio_write(uint32_t reg, uint32_t data) {
    uint32_t *ptr = (uint32_t*)reg;
    asm volatile("str %[data], [%[reg]]"
	     : : [reg]"r"(ptr), [data]"r"(data));
}

// read from MMIO register
static inline uint32_t mmio_read(uint32_t reg) {
    uint32_t *ptr = (uint32_t*)reg;
    uint32_t data;
    asm volatile("ldr %[data], [%[reg]]"
		 : [data]"=r"(data) : [reg]"r"(ptr));
    return data;
}

#endif // #ifndef MMIO_H


include/uart.h

/* uart.h - UART initialization & communication */

#ifndef UART_H
#define UART_H

#include <stdint.h>

/*
 * Initialize UART0.
 */
void uart_init();

/*
 * Transmit a byte via UART0.
 * uint8_t Byte: byte to send.
 */
void uart_putc(uint8_t byte);

/*
 * print a string to the UART one character at a time
 * const char *str: 0-terminated string
 */
void uart_puts(const char *str);

#endif // #ifndef UART_H

uart.c

/* uart.c - UART initialization & communication */
/* Reference material:
 * http://www.raspberrypi.org/wp-content/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
 * Chapter 13: UART
 */

#include <stdbool.h>
#include <stdint.h>
#include <mmio.h>
#include <uart.h>

enum {
    // The GPIO registers base address.
    GPIO_BASE = 0x20200000,

    // The offsets for each register.

    // Controls actuation of pull up/down to ALL GPIO pins.
    GPPUD = (GPIO_BASE + 0x94),

    // Controls actuation of pull up/down for specific GPIO pin.
    GPPUDCLK0 = (GPIO_BASE + 0x98),

    // The base address for UART.
    UART0_BASE = 0x20201000,

    // The offsets for each register for the UART.
    UART0_DR     = (UART0_BASE + 0x00),
    UART0_RSRECR = (UART0_BASE + 0x04),
    UART0_FR     = (UART0_BASE + 0x18),
    UART0_ILPR   = (UART0_BASE + 0x20),
    UART0_IBRD   = (UART0_BASE + 0x24),
    UART0_FBRD   = (UART0_BASE + 0x28),
    UART0_LCRH   = (UART0_BASE + 0x2C),
    UART0_CR     = (UART0_BASE + 0x30),
    UART0_IFLS   = (UART0_BASE + 0x34),
    UART0_IMSC   = (UART0_BASE + 0x38),
    UART0_RIS    = (UART0_BASE + 0x3C),
    UART0_MIS    = (UART0_BASE + 0x40),
    UART0_ICR    = (UART0_BASE + 0x44),
    UART0_DMACR  = (UART0_BASE + 0x48),
    UART0_ITCR   = (UART0_BASE + 0x80),
    UART0_ITIP   = (UART0_BASE + 0x84),
    UART0_ITOP   = (UART0_BASE + 0x88),
    UART0_TDR    = (UART0_BASE + 0x8C),
};

/*
 * delay function
 * int32_t count: number of loop iterations to delay
 *
 * This just loops <count> times in a way that the compiler
 * won't optimize away. count is a read-write operand because
 * the loop decrements it, and subs changes the condition flags.
 */
static void delay(int32_t count) {
    asm volatile("1: subs %[count], %[count], #1\n\tbne 1b"
	     : [count]"+r"(count) : : "cc");
}
    
/*
 * Initialize UART0.
 */
void uart_init() {
    // Disable UART0.
    mmio_write(UART0_CR, 0x00000000);
    // Setup the GPIO pin 14 && 15.
    
    // Disable pull up/down for all GPIO pins & delay for 150 cycles.
    mmio_write(GPPUD, 0x00000000);
    delay(150);

    // Disable pull up/down for pin 14,15 & delay for 150 cycles.
    mmio_write(GPPUDCLK0, (1 << 14) | (1 << 15));
    delay(150);

    // Write 0 to GPPUDCLK0 to make it take effect.
    mmio_write(GPPUDCLK0, 0x00000000);
    
    // Clear pending interrupts.
    mmio_write(UART0_ICR, 0x7FF);

    // Set integer & fractional part of baud rate.
    // Divider = UART_CLOCK/(16 * Baud)
    // Fraction part register = (Fractional part * 64) + 0.5
    // UART_CLOCK = 3000000; Baud = 115200.

    // Divider = 3000000/(16 * 115200) = 1.627 = ~1.
    // Fractional part register = (.627 * 64) + 0.5 = 40.6 = ~40.
    mmio_write(UART0_IBRD, 1);
    mmio_write(UART0_FBRD, 40);

    // Enable FIFO & 8 bit data transmission (1 stop bit, no parity).
    mmio_write(UART0_LCRH, (1 << 4) | (1 << 5) | (1 << 6));

    // Mask all interrupts.
    mmio_write(UART0_IMSC, (1 << 1) | (1 << 4) | (1 << 5) |
		    (1 << 6) | (1 << 7) | (1 << 8) |
		    (1 << 9) | (1 << 10));

    // Enable UART0, receive & transmit part of UART.
    mmio_write(UART0_CR, (1 << 0) | (1 << 8) | (1 << 9));
}

/*
 * Transmit a byte via UART0.
 * uint8_t Byte: byte to send.
 */
void uart_putc(uint8_t byte) {
    // wait for UART to become ready to transmit
    while(true) {
        if (!(mmio_read(UART0_FR) & (1 << 5))) {
	    break;
	}
    }
    mmio_write(UART0_DR, byte);
}

/*
 * print a string to the UART one character at a time
 * const char *str: 0-terminated string
 */
void uart_puts(const char *str) {
    while(*str) {
        uart_putc(*str++);
    }
}
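The baud rate arithmetic in the comments of uart_init() can be checked with a small helper. This is a minimal sketch, not part of the tutorial sources: uart_divisor_t and uart_divisor are hypothetical names, and the code assumes uart_clock * 8 fits in 32 bits. It is meant for verifying the hand-computed constants, for example on a host machine, since the kernel as built here has no division support library.

```c
#include <stdint.h>

/* Hypothetical result type: the values for UART0_IBRD and UART0_FBRD. */
typedef struct {
    uint32_t ibrd;  /* integer part of the divider */
    uint32_t fbrd;  /* fractional part, in 1/64 steps */
} uart_divisor_t;

/* Divider = uart_clock / (16 * baud).
 * scaled is the divider times 64, rounded to nearest, which matches
 * "(fractional part * 64) + 0.5" from the comments in uart_init().
 * Assumes uart_clock * 8 does not overflow 32 bits. */
uart_divisor_t uart_divisor(uint32_t uart_clock, uint32_t baud) {
    uint32_t scaled = (uart_clock * 8u / baud + 1u) / 2u;
    uart_divisor_t d;
    d.ibrd = scaled / 64u;
    d.fbrd = scaled % 64u;
    return d;
}
```

For the values used above (3000000 and 115200) this gives 1 and 40, the same constants written to UART0_IBRD and UART0_FBRD.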

Booting the kernel

Do you still have the SD card with the original Raspbian image on it from when you were testing the hardware? Great. Then you already have an SD card with a boot partition and the required files. If not, download one of the original Raspberry Pi boot images and copy it to the SD card.

Now mount the first partition from the SD card and look at it:

bootcode.bin  fixup.dat     kernel.img            start.elf
cmdline.txt   fixup_cd.dat  kernel_cutdown.img    start_cd.elf
config.txt    issue.txt     kernel_emergency.img

Simplified: when the RPi powers up, the ARM CPU is halted and the GPU runs. The GPU loads the bootloader from ROM and executes it. That then finds the SD card and loads bootcode.bin. The bootcode handles config.txt and cmdline.txt (or does start.elf read those?) and then runs start.elf. start.elf loads kernel.img, and at last the ARM CPU is started, running that kernel image.

So now we replace the original kernel.img with our own, sync, unmount, stick the SD card into the RPi and turn the power on. Your minicom should then show the following:

Hello World                                                                     
                                                                                
*** system halting ***

Echo kernel

The hello world kernel shows how to do output to the UART. Next let's do some input. And to see that the input works we simply echo the input as output. A simple echo kernel.

In include/uart.h add the following:

/*
 * Receive a byte via UART0.
 *
 * Returns:
 * uint8_t: byte received.
 */
uint8_t uart_getc();

In uart.c add the following:

/*
 * Receive a byte via UART0.
 *
 * Returns:
 * uint8_t: byte received.
 */
uint8_t uart_getc() {
    // wait for UART to have received something
    while(true) {
	if (!(mmio_read(UART0_FR) & (1 << 4))) {
	    break;
	}
    }
    return mmio_read(UART0_DR);
}
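Both uart_putc() and uart_getc() poll single bits of the flag register UART0_FR. As a sketch, those bits can be given names; UART_FR_RXFE and UART_FR_TXFF follow the field names in the UART chapter of the BCM2835 peripherals document, while the two helper functions are hypothetical and not part of the tutorial sources.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    UART_FR_RXFE = 1 << 4,  /* receive FIFO empty: nothing to read yet */
    UART_FR_TXFF = 1 << 5   /* transmit FIFO full: cannot write yet */
};

/* Hypothetical helpers taking a value read from UART0_FR. */
bool uart_rx_empty(uint32_t fr) {
    return (fr & UART_FR_RXFE) != 0;
}

bool uart_tx_full(uint32_t fr) {
    return (fr & UART_FR_TXFF) != 0;
}
```

With these, the wait loop in uart_getc() would read as "loop while uart_rx_empty(mmio_read(UART0_FR))", and the one in uart_putc() as "loop while uart_tx_full(mmio_read(UART0_FR))".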

And last in main.c put the following:

const char hello[] = "\r\nHello World, feel the echo\r\n";

// kernel main function, it all begins here
void kernel_main(uint32_t r0, uint32_t r1, uint32_t atags) {
    UNUSED(r0);
    UNUSED(r1);
    UNUSED(atags);

    uart_init();

    uart_puts(hello);

    // echo every received byte back, forever
    while(1) {
	uart_putc(uart_getc());
    }
}