Monday, September 21, 2009

Debugging the Linux Kernel

If you've ever tried to debug your own Linux kernel, you have my deepest sympathy. It isn't fun, nor is it easy. Thankfully, there is a great tool that can help with this: VMware Workstation lets you record and replay a virtual machine and all interaction with it while a debugger is attached to it. Simply install your custom Linux kernel inside a virtual machine and debug it from your desktop.

The blog Debugging the Virtual World is a great source of information on how to set up your system for kernel debugging. Unfortunately, its guide is based on the older 6.0 version of Workstation, and the changes in 6.5 have made parts of it unnecessary. There is other material out there to help you use VMware to debug applications from a Windows development environment, but Linux support is a bit lacking.

So for those who use and love Linux, I hope to make this debugging process a bit easier. I'll also present a few extra tricks of my own that ease the pain.

Update: Recently VMware Workstation 7.0 was released. Once I manage to get a license for it, I'll present an updated how-to.

Basic Setup

First, you will need a copy of VMware Workstation 6.5. I haven't tried other versions, and I doubt these exact steps carry over to them. You will also need to be proficient with GDB. I use Emacs or DDD to drive it, but that's just my preference. Socat is a tool I use to read Unix domain sockets (named sockets), which VMware uses for serial output.

You will need to extract the Linux source tree on the host machine, since that is where you will be debugging. Let's say you stash it in a directory $LINUX. I use rsync to copy this tree into the VM, where it is built and installed. After the build, I need to copy some files back out to the host, and I put them in a directory $VMSHARE, but more on this later.

VMware

Workstation 6.5 is already configured to record and replay. The other setup required is to enable the built-in GDB debug stub. This lets an external GDB process control Workstation like a remote process. To enable this, add the following to the virtual machine's .vmx definition file:

debugStub.listen.guest32 = "TRUE"

That's for 32-bit guests. If you run 64-bit, try:

debugStub.listen.guest64 = "TRUE"
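
If you'd rather not edit the .vmx by hand, a small shell function can append the setting for you. This is only a sketch: the name enable_debug_stub and its arguments are my own invention, not something VMware ships, and it assumes the VM is powered off while the file is being changed.

```shell
# enable_debug_stub VMX_FILE [BITS]
# Append debugStub.listen.guest32 (or guest64 when BITS is 64) to a .vmx
# file, unless a line mentioning that key already exists.
# Hypothetical helper; BITS defaults to 32.
enable_debug_stub() {
    vmx_file=$1
    bits=${2:-32}
    key="debugStub.listen.guest${bits}"
    # -F treats the key as a fixed string, so its dots are not wildcards.
    if ! grep -qF "$key" "$vmx_file"; then
        printf '%s = "TRUE"\n' "$key" >> "$vmx_file"
    fi
}
```

Call it as enable_debug_stub /path/to/machine.vmx, adding 64 as a second argument for a 64-bit guest. The connect command later in this post attaches to localhost:8832, which is the 32-bit stub. I believe the 64-bit stub listens on port 8864 instead, but treat that as an assumption and verify it on your own setup.
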

I also like to configure my system to dump kernel output to a serial port, so that if the kernel panics the full report is saved. To do this, make sure the VM has a serial port, and configure it as a named pipe that runs from server to an application. On a Linux host this really means "to a named socket," which I dump to the console with:

socat unix-connect:/path/to/serial stdio
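
Since the whole point is to keep the panic report, it can help to save the serial output to a file as well as watch it scroll by. Here is a minimal sketch of that idea. The function name log_serial and the log file argument are hypothetical, and it assumes socat and tee are installed on the host.

```shell
# log_serial SOCKET LOGFILE
# Copy everything the guest writes to its serial port to the terminal and
# append it to LOGFILE. Hypothetical helper; needs socat and tee.
log_serial() {
    # -u makes socat one-directional: read from the socket, write to stdout.
    socat -u "unix-connect:$1" stdout | tee -a "$2"
}
```

For example, log_serial /path/to/serial serial.log keeps a running copy in serial.log. Because the transfer is one-directional, you can't type into the guest's serial console this way. Use the plain socat command above when you need that.
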

Configuring Linux in the VM

I'll assume you're doing this because you have a hacked kernel that you built yourself and wish to debug. When configuring it, make sure you select:

General setup --->
  Configure standard kernel features (for small system) --->
    [*] Load all symbols for debugging/ksymoops
    [*]   Include all symbols in kallsyms
    [*]   Do an extra kallsyms pass
Kernel hacking --->
  [*] Compile the kernel with debug info
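
The menu entries above correspond to symbols in the kernel's .config file, so you can double-check them after configuring. The sketch below does that. The function name check_debug_config is made up for this post, and the symbol names are the ones I understand those menu entries to set (CONFIG_KALLSYMS, CONFIG_KALLSYMS_ALL, CONFIG_KALLSYMS_EXTRA_PASS and CONFIG_DEBUG_INFO), so confirm them against your own tree.

```shell
# check_debug_config CONFIG_FILE
# Report whether each debugging-related option is set to "y" in a kernel
# .config file. Returns non-zero if any of them is missing.
# Hypothetical helper for illustration.
check_debug_config() {
    config_file=$1
    missing=0
    for opt in CONFIG_KALLSYMS CONFIG_KALLSYMS_ALL \
               CONFIG_KALLSYMS_EXTRA_PASS CONFIG_DEBUG_INFO; do
        # Anchor the match so CONFIG_KALLSYMS does not match CONFIG_KALLSYMS_ALL.
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "ok       $opt"
        else
            echo "MISSING  $opt"
            missing=1
        fi
    done
    return $missing
}
```

Inside the VM, something like check_debug_config $HOME/linux-2.6.26/.config before running make saves a long build that turns out to have no symbols.
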

You probably want to add other kernel hacking options, but that depends on your needs. Once the kernel is built, copy the uncompressed vmlinux binary into your host's $VMSHARE directory. I do all of this with a simple build script in my VM that looks something like this:

#!/bin/sh
cd $HOME/linux-2.6.26 &&
rsync -auvC -e ssh --include core  host:$LINUX $HOME  &&
make &&
sudo make install &&
scp vmlinux host:$VMSHARE &&
echo "Done!"

To configure Linux to send its output to the serial port, add console parameters to your kernel command line in /boot/grub/menu.lst, so that it looks something like this:

title Debug Kernel
  kernel /vmlinuz ... console=ttyS0,115200n8 console=tty0
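
After rebooting into the new entry, it's worth confirming that the console parameters actually reached the kernel. A minimal sketch, assuming you run it inside the VM. The function name list_consoles is hypothetical.

```shell
# list_consoles CMDLINE
# Print each console= parameter found in a kernel command line, one per line.
# Hypothetical helper for illustration.
list_consoles() {
    # Turn runs of blanks into newlines, then keep only console= entries.
    printf '%s\n' "$1" | tr -s ' \t' '\n\n' | grep '^console='
}
```

Run it as list_consoles "$(cat /proc/cmdline)". With the menu.lst entry above it should list console=ttyS0,115200n8 and console=tty0.
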

Using GDB

Once you have this system set up, you still have to work with GDB, which is not the easiest thing to do. I've put together a script that defines some extra commands that make GDB more usable for me. This script lives in my source tree on the host as $LINUX/.gdbinit, so it gets loaded whenever I launch GDB from that directory.

The following code is heavily tied to the 32-bit x86 architecture. If you're working on a different architecture, let these examples inspire you to do something similar.

I assume you know how to use GDB already; if not, there are bound to be some useful guides out there on the net somewhere. I'm mostly going to go over my own script and the new commands it defines.

Basic commands

First, to start debugging after loading GDB, you need to tell it to connect to VMware. You do this with the connect command (defined in the script below) while the virtual machine is running. This also prints out the kernel build number and date, so you can make sure you're debugging the correct build. To release the virtual machine from the debugger, use the detach command.

Eventually, you'll want to examine the kernel's current variable. Unfortunately, on x86 current is a macro implemented in assembly, so GDB can't evaluate it. You can, however, use the convenience variable $current, which is refreshed every time the debugger stops.
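
The script further down finds $current by masking $esp with 0xfffff000 and reading the task pointer stored at that address. As far as I understand it, this works because 32-bit x86 kernels of this era keep a thread_info structure at the bottom of each kernel stack, and that mask assumes 4 KB stacks. The sketch below just shows the arithmetic so you can work out the mask for a different stack size. The function name thread_info_base is hypothetical.

```shell
# thread_info_base ESP [STACK_SIZE]
# Print the address of the bottom of the kernel stack that contains ESP,
# i.e. ESP rounded down to a multiple of STACK_SIZE (default 4096).
# Hypothetical helper for illustration; STACK_SIZE must be a power of two.
thread_info_base() {
    stack_size=${2:-4096}
    # Clear the low bits of ESP; the final mask keeps the result 32 bits wide.
    printf '0x%x\n' $(( ($1) & ~(${stack_size} - 1) & 0xffffffff ))
}
```

For example, thread_info_base 0xc1235f80 prints 0xc1235000, while thread_info_base 0xc1235f80 8192 prints 0xc1234000. If your kernel is built with 8 KB stacks, the mask in set-current would need to be 0xffffe000 instead.
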

While debugging, you can use the loc command to have GDB print out information on the current process and, if you are debugging a recording, the timestamp of your current location. The same information is printed whenever you stop after a continue command. It consists of the current process's PID and some relevant flags. You can change which flags are shown by adjusting the __tsk_flags command. Just follow the example to add other flags (TIF_* flags are in thread_info_32.h, and PF_* flags are in sched.h). When working correctly, it gives you a status line that looks something like this:

Current pid=(2734) flags=(TIF_NEED_RESCHED ) VM position=632864
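
The two kinds of flags are tested differently, which is why the script has two helpers: TIF_* values are bit numbers, while PF_* values are bit masks. Here is a small sketch of that difference outside GDB. The function names are hypothetical, and the values (bit 2 for TIF_NEED_RESCHED, mask 0x00000004 for PF_EXITING) are simply the ones used in my script, so check them against your own kernel headers.

```shell
# has_bit VALUE BIT: succeed if bit number BIT is set in VALUE (TIF_* style).
has_bit() { [ $(( (($1) >> ($2)) & 1 )) -eq 1 ]; }

# has_mask VALUE MASK: succeed if any bit of MASK is set in VALUE (PF_* style).
has_mask() { [ $(( ($1) & ($2) )) -ne 0 ]; }

# describe_flags THREAD_FLAGS TASK_FLAGS
# Print the flag names in the same format as the loc status line.
# Hypothetical helper for illustration.
describe_flags() {
    out=""
    if has_bit "$1" 2; then out="${out}TIF_NEED_RESCHED "; fi
    if has_mask "$2" 0x00000004; then out="${out}PF_EXITING "; fi
    printf 'flags=(%s)\n' "$out"
}
```

For example, describe_flags 0x4 0x0 prints flags=(TIF_NEED_RESCHED ), matching the status line above. Adding another flag means adding one more line of the matching kind, just as in __tsk_flags.
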

When you are replaying, the monitor x commands can be useful. First, monitor offset displays your location within a replay (also shown by loc). Knowing that location, you can quickly jump back to the same point on a subsequent replay by using monitor stopat.

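And now, the full $LINUX/.gdbinit script. In the first line, $VMSHARE stands for the directory described under Basic Setup, so substitute its actual path. Make sure that you copy and paste the rest correctly. All lines that look like they end in "\" really have a trailing space that must be preserved, and GDB is very picky about this!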

file $VMSHARE/vmlinux

# Recompute $current at every stop.  Valid only inside kernel.
define set-current
  set $current=*((struct task_struct**)((unsigned long)$esp&0xfffff000))
end
define hook-stop
  set-current
end

# Reconnect to the VM after issuing "detach."
define connect
  target remote localhost:8832
  echo Kernel version =
  output/s init_uts_ns.name.version[0]@36
  echo \n
end

# Print extra information about the current process and replay location
define loc
  if $esp >= 0xc0000000
    # Extra information if in kernel
    echo Current \ 
    echo pid=(
    output $current->pid
    echo ) \ 
    echo flags=(
    __tsk_flags $current
    echo ) \ 
  end
  echo VM position=
  monitor position
end
define hookpost-continue
  loc
end

# helper
define __test_tsk_thread_flag
  if (*((unsigned long)($arg0)->stack + 0x8) & (1U << ($arg1))) != 0
    echo $arg2\ \ 
  end
end
# helper
define __test_tsk_flag
  if (($arg0)->flags & ($arg1)) != 0
    echo $arg2\ \ 
  end
end

# Print out the task flags I'm interested in
define __tsk_flags
  __test_tsk_thread_flag ($arg0) 2 TIF_NEED_RESCHED
  __test_tsk_flag ($arg0) 0x00000004 PF_EXITING
end

# Print flags for the given task_struct
define tsk_flags
  echo flags=(
  __tsk_flags $arg0
  echo )\n
end

Hopefully, this collection of GDB functions helps ease the pain of debugging, if only a little bit.
