Kernel Debugging

From OSDev.wiki
Revision as of 14:13, 14 September 2014 by Columbus (talk | contribs) (Added to Category Troubleshooting)

Humans make mistakes. Some of these mistakes may end up being part of your OS. Since bugs are more difficult to find than to fix, this page provides a list of common techniques that can be used to isolate bugs in your OS.

Debug statements and log files

The first solution is probably the easiest, and depends on what kind of information you want to get back from your debugger.

The problem with using a debugger such as ddd or gdb is that it requires an OS to run, which is of little use when the OS itself is what you want to debug.

Debugging is essentially being able to probe the contents of a variable at a specific breakpoint. When your program hits the breakpoint, you can probe the variable.

This can also be achieved without using a debugger, by instead inserting a line of code to write to the screen or to a log of some kind. This gives you the contents of the variable that you are interested in - but it means knowing in advance what variable to check, and when, and implies recompiling the kernel every time you want to check a different set of variables... but it is the simplest solution.
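As a minimal sketch of such a debug statement, the helper below formats a variable's name and value into a buffer without relying on a C library, so it could be used in a freestanding kernel. The name debug_format_u32 and its signature are made up for illustration; how the resulting string reaches the screen or a log depends on the output routines your kernel already has.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from any existing kernel): writes
 * "name=0xXXXXXXXX" into buf, truncating if the buffer is too small.
 * Returns the number of characters written, excluding the final NUL. */
size_t debug_format_u32(char *buf, size_t cap, const char *name, uint32_t value)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t n = 0;

    if (cap == 0)
        return 0;

    /* Copy the variable name. */
    while (*name && n + 1 < cap)
        buf[n++] = *name++;

    /* Append the "=0x" separator. */
    for (const char *p = "=0x"; *p && n + 1 < cap; p++)
        buf[n++] = *p;

    /* Append eight hex digits, most significant nibble first. */
    for (int shift = 28; shift >= 0 && n + 1 < cap; shift -= 4)
        buf[n++] = digits[(value >> shift) & 0xF];

    buf[n] = '\0';
    return n;
}
```

The formatted line can then be handed to whatever print or logging function you use. Changing which variables are reported still means editing the call sites and recompiling.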

Pseudo-Breakpoints

In places where a full print or logging function is not feasible (such as when trying to isolate a single erroneous assembly language instruction), you can create a kind of 'pseudo-breakpoint' by inserting a HLT instruction into the code. These can be used to perform a binary space isolation (often referred to as a 'binary chop') through the code. The idea is to place the halt instruction at a point roughly halfway through the part of the code suspected to be at fault. If the CPU halts before the error occurs, then you know that the error is after the breakpoint; otherwise, it must be in the code before the breakpoint. Repeat this procedure, halving the suspect region each time, until the error is isolated. Unfortunately, this only works if the result of the error can be distinguished from the effect of the halt instruction itself, and it does little in the case of a problem that occurs more than one repetition into a loop, such as an array overrun.

Use a virtual machine

A virtual machine is a program that simulates another computer (Java coders should be familiar with the concept).

There are a number of virtual machines that can simulate x86 machines; my favorite is Bochs (http://bochs.sourceforge.net). Bochs is capable of setting breakpoints in any kind of software (even if it is compiled without debugging info!), and provides an additional "debugging out port" you can easily access from within your kernel code to print debug messages.

The main downside to using a virtual machine like this is that all the code is displayed in assembler (or binary depending on what machine you choose) - instead of the C/C++ source you originally wrote. Also, simulating a virtual machine is slower than an actual machine, and the VM might not even behave exactly like the "real" hardware.

That being said, there are also a lot of other advantages to using a VM. For example, you don't have to reboot to test your new OS, you just start the VM.

Another virtual machine called Simics (https://www.simics.net) is capable not only of setting breakpoints and displaying register information, but also of opening a port that a debugger such as DDD can connect to (the Simics command is 'gdb-remote'). Using this combination, it is possible to see your C source code as you step through the OS! However, Bochs executes the OS much faster than Simics and is therefore the better virtual machine for simply running the OS, while Simics is the better debugger for those hard-to-find problems.

Using the serial port

Writing logfiles with QEMU

QEMU allows you to redirect everything that you send to the COM1 port to a file on your host computer. To enable this feature, you have to add the following flag when launching QEMU:

-serial file:serial.log

... where "serial.log" is the path to the output file. Once you have this feature enabled, you can write log entries by simply writing characters to the COM1 port (reading from the file over the serial port is not supported).
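A rough sketch of writing a log entry this way is shown below. It assumes COM1 sits at its conventional I/O base of 0x3F8 and takes the port-output primitive as a parameter, so the same logic can be tried on a host; in a kernel that primitive would wrap the x86 out instruction. The names serial_log, port_out_fn and capture_out are invented for this example. UART initialization and waiting for the transmitter to become ready are deliberately left out, and real hardware will most likely need both.

```c
#include <stdint.h>

#define COM1_DATA_PORT 0x3F8  /* conventional I/O base of COM1 (assumption) */

/* Assumed shape of a port-output primitive. In a kernel this would be a
 * thin wrapper around the x86 `out` instruction. */
typedef void (*port_out_fn)(uint16_t port, uint8_t value);

/* Send a NUL-terminated message to COM1, one byte per port write.
 * Sketch only: no UART setup and no transmit-ready polling. */
void serial_log(port_out_fn out, const char *msg)
{
    for (; *msg; msg++)
        out(COM1_DATA_PORT, (uint8_t)*msg);
}

/* Host-side stand-in for experimenting without hardware: it records the
 * bytes addressed to COM1 instead of performing port I/O. */
static uint8_t captured[64];
static unsigned captured_len;

static void capture_out(uint16_t port, uint8_t value)
{
    if (port == COM1_DATA_PORT && captured_len < sizeof captured)
        captured[captured_len++] = value;
}
```

Passing the output routine in as a parameter is only one possible design; a kernel would more commonly call its port-output function directly.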

On real hardware

When your real computer resets due to a programming error, anything you might have put on the screen will instantly vanish. If you're tampering with the video card, you will often find yourself with no visual debugging method at all. If you have a pair of computers connected with a null-modem cable, you can instead send all debug statements over the serial port and record them on your development machine, which is more stable. Using an actual serial terminal works just as well. It requires a bit of additional cabling, but it is fairly simple to set up and can prove to be a very good replacement for a VM log.

With remote debugger / GDB

Since serial works two ways, you can also control your kernel remotely in case of problems. This can be a simple interface, but you can also attach GDB onto the serial port and potentially get a full blown debugger running.

This is, however, rather tricky, since it requires additional hardware and special support coded into your kernel. You might want to read the kernel hacking how-to and (at minimum) chapter 20 of the GDB manual, and it is likely that your debugger support will introduce even more bugs at first.

Use gdb with Qemu

You can run Qemu to listen for a "gdb connection" before it starts executing any code to debug it.

qemu -s -S <harddrive.img>

...will setup Qemu to listen on port 1234 and wait for a gdb connection to it. Then, from a remote or local shell:

gdb 
(gdb) target remote localhost:1234 

(Replace localhost with the remote IP address or hostname if necessary.) Then start execution:

(gdb) continue

But that's not all: you can compile your source code under gcc with debugging symbols using "-g". This will add all the debugging symbols to the kernel image itself (thus making it bigger). There is also a way to put all of the debugging information in a separate file using the "objcopy" tool, which is part of the GNU binutils package.

objcopy --only-keep-debug kernel.elf kernel.sym

This will put the debugging information into a file called "kernel.sym". After that, to strip your executable of debugging information, you can do

objcopy --strip-debug kernel.elf

Or alternatively, if you are using a flat binary as your kernel image, you can do

objcopy -O binary kernel.elf kernel.bin

This produces a flat binary which can be debugged using the previously extracted debug information.

You can import the symbols in gdb by pointing gdb to the file containing the debug information. In the command below, kernel.elf is the actual unstripped kernel image; a separately extracted file such as kernel.sym can be given instead.

(gdb) symbol-file kernel.elf

From there, you can see the actual C source code as it runs, line by line! (Use the step or next command in gdb to execute the code one source line at a time; stepi advances by a single machine instruction instead.)

Example :

$ qemu -s -S c.img
warning: could not open /dev/net/tun: no virtual network emulation
Waiting gdb connection on port 1234 

(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0x0000fff0 in ?? ()
(gdb) symbol-file kernel.b
Reading symbols from kernel.b...done.

(gdb) break kmain                        ; This will add a break point to any function in your kernel code.
Breakpoint 1 at 0x101800: file kernel/kernel.c, line 12.

(gdb) continue

Breakpoint 1, kmain (mdb=0x341e0, magic=0) at kernel/kernel.c:12
12      {

The above starts code execution and stops at kmain, as specified by the "break kmain" command. You can view the registers at any time with this command:

(gdb) info registers

I won't start explaining all the nice things about gdb, but as you can see, it is a very powerful tool for debugging OSes.

GUI frontends

While GDB provides a text-based user interface (available via the `-tui` command line option or by entering `wh` at the GDB prompt), you might want to use one of the available GUI frontends to GDB. These include but are not limited to:

* kdbg
* insight
* ddd
* VisualKernel

Attaching to a QEMU session works similarly to the command line GDB session described above.

Develop in hosted environment

Another possibility, which is also a great architectural exercise, is to code every software module in a hosted environment like Linux, and then port it to your OS. You can do this for kernel code too, not just usermode programs.

Suppose you want to develop your VFS interface implementation. You have already created the interface for block devices (it doesn't matter whether you have already implemented it in your kernel). In this case, you can implement your block device interface as a set of wrappers that adapt your interface to POSIX calls. You then implement your VFS interface (i.e., the code that will manage the filesystem drivers in your kernel) on top of those wrappers. You test and debug your implementation entirely in the hosted environment, and when it is mature, you link it into your real kernel in place of the hosted wrappers. Finally, you test the newly introduced code in the freestanding environment to ensure it works there as well.
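The wrapper idea might look roughly like the sketch below. The block device interface (struct block_device with a read_block callback), the 512-byte block size and all names are hypothetical and stand in for whatever interface your kernel defines. The hosted backend simply treats an ordinary file as the disk and reads from it with the POSIX pread() call.

```c
#define _XOPEN_SOURCE 700   /* expose pread() when compiling with strict -std=c11 */
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define BLK_SIZE 512        /* assumed block size for this example */

/* Hypothetical block device interface, as the VFS layer would see it. */
struct block_device {
    /* Reads one block into buf; returns 0 on success, -1 on failure. */
    int (*read_block)(struct block_device *dev, uint64_t lba, void *buf);
    void *priv;             /* backend-specific state */
};

/* Hosted backend state: the "disk" is a file descriptor on the host. */
struct hosted_disk {
    int fd;
};

static int hosted_read_block(struct block_device *dev, uint64_t lba, void *buf)
{
    struct hosted_disk *disk = dev->priv;
    ssize_t got = pread(disk->fd, buf, BLK_SIZE, (off_t)(lba * BLK_SIZE));

    /* Short reads (for example past the end of the image) count as errors. */
    return got == BLK_SIZE ? 0 : -1;
}

/* Bind the generic interface to the hosted backend. */
void hosted_disk_attach(struct block_device *dev, struct hosted_disk *disk, int fd)
{
    disk->fd = fd;
    dev->priv = disk;
    dev->read_block = hosted_read_block;
}
```

Code written against struct block_device never sees the POSIX calls, so the same VFS code can later be linked against the kernel's real driver, provided that driver fills in the same callback.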

Now, the Pros. First of all, you can use your favourite debugger. You can also use unit testing, for example, which is far better than testing software by hand, if you use the right method.

There are some Cons to this approach. For example, you are far from your target environment when you code like this. This is further aggravated by the fact that so-called freestanding environments are dramatically more sensitive to undefined behaviour, especially uninitialized variables. You can work around this limitation by asking the compiler to perform aggressive optimization while testing hosted, which makes the software more sensitive to undefined behaviour, too. However, as the best debug environment is the final target environment, you will still want to test your code when you introduce it into your real kernel.

Another Con that will probably scare most people is that this approach requires you to consistently plan your interfaces beforehand. Depending on your specific requirements, you may still be able to avoid an overly long planning phase. For example, if you want to throw away the hosted implementations once you get the modules working properly, then you don't have to bother maintaining the same interfaces forever.

Using an IDE

You can debug Linux kernel modules with Visual Studio if you use the VisualKernel plugin. Here's a tutorial showing a normal debugging session: http://visualkernel.com/tutorials/kgdb/

Related Threads