Wednesday, December 2, 2009

Debugging Kernel Modules with GDB 7

When you're debugging your Linux kernel, every try to browse into a module? That only works well if you can read assembly easily. GDB has debug info for the linked-in part of the kernel, but being separately-compiled entities, modules are a big unknown. There is a command in GDB to fix this: add-symbol-file, which informs GDB of the symbols in the given file. You just have to tell GDB where that file lives in the process's address space.

Figuring that out isn't necessarily straightforward. VMware's solution (from their tech note Replay Debugging on Linux) isn't very satisfactory. To paraphrase, you need to go into the guest, examine files in /sys/modules/<foo>/sections to extract the module's segment offsets, and then build an appropriate GDB command line from that. That's fine, unless your load addresses change every now and then (preventing you from hard-coding this into your GDB init script). And of course the guest system isn't exactly accessible when a debugger is attached.

I thought I'd get around this issue by having GDB look at the running kernel to figure out where it should locate the module. Linux keeps a linked list of all loaded modules in a list named modules. So GDB just needs to traverse this list until it finds the named module, then extract the info and build its own command line. Sounds easy, right?

I fought with GDB's scripting language to implement this, and it won. The killer: I couldn't figure out how to do a string compare in it. Anyway, I shelved this idea and waited. Since then, GDB 7 has been released with support for Python scripting, and I couldn't help but take another look at this problem.

The script below adds the new GDB command add-kernel-module. It takes one argument, the file name of the kernel module to load. The code does about what I mentioned above: it walks kernel data structures, extracts relevant info, and builds a GDB command line from it. It's not very pretty code, but it works.

Be sure to name this "vmlinux-gdb.py" so that it can be automatically loaded when you debug your kernel.

import os.path

long_type = gdb.lookup_type('unsigned long')

# These functions are counterparts to the Linux kernel macros
def offset(gdb_type, member):
    '''Return the offset of a member within a structure.'''
    ptr = gdb.Value(0).cast(gdb_type.pointer())
    return ptr[member].address.cast(long_type)

def list_entry(list_head_ptr, gdb_type, off):
    '''Return the base element that holds a list_head struct.'''
    base = list_head_ptr.cast(long_type) - off
    return base.cast(gdb_type.pointer())

def listhead_iter(list_head, typename, member):
    '''Iterate over elements of a list_head.  The first argument should
be a *reference* to the main list_head.'''
    type = gdb.lookup_type(typename)
    off = offset(type, member)
    pos = list_head['next']
    while pos != list_head.address:
        yield list_entry(pos, type, off)
        pos = pos['next']

class AddKernModCmd(gdb.Command):
    '''Load symbols for a currently-running kernel module.'''

    def __init__(self):
        super(AddKernModCmd, self).__init__('add-kernel-module',
                                            gdb.COMMAND_FILES,
                                            gdb.COMPLETE_FILENAME)

    def invoke(self, filename, from_tty):
        if filename == '':
            print 'USAGE: add-kernel-module FILE.ko'
            return
        if not os.path.isfile(filename):
            if os.path.isfile(filename + '.ko'):
                filename += '.ko'
            else:
                print 'Could not find module file %s' % filename
                return
        base = os.path.splitext(os.path.basename(filename))[0]
        try:
            frame = gdb.selected_frame()
            modules = frame.read_var("modules")
        except:
            print 'A running kernel must be attached in order to get section information'
            return
        for mod in listhead_iter(modules, 'struct module', 'list'):
            if mod['name'].string() == base:
                # Found it!  Now, get the section attributes
                sect = mod['sect_attrs']
                num_sect = int(str(sect['nsections']))
                attrs = [ sect['attrs'][i] for i in range(num_sect) ]
                adict = dict([ (a['name'].string(), str(a['address']))
                               for a in attrs ])
                cmd = 'add-symbol-file %s %s' %(filename,adict['.text'])
                for sec in ('.bss', '.data', '.init.text'):
                    if sec in adict:
                        cmd += ' -s %s %s' % (sec, adict[sec])
                gdb.execute(cmd)
                return
        print 'Module %s not found' % base

AddKernModCmd()

Tuesday, December 1, 2009

"Fixing" my PowerBook's Lid

I love my little PowerBook G4, but it's starting to show its age. In particular, the case itself has taken some wear over the years. Dings in the aluminum don't matter that much, but one piece of damage has seriously affected its usability.

Macs have always worked well with power management. You close the lid and it goes to sleep. When you open the lid again, everything just pops up. It's power management without thinking. Macs were doing this before Windows thought to copy them and before Linux knew how to resume. (Suspend is easy; it's brining things back that's tricky.)

Anyway, this handy feature worked fine, up until my Mac wouldn't stay shut. There is a little spring-loaded magnetic hook that would pop down from the top of the case when the lid was close to the base, and that would hold the laptop shut. That hook broke off earlier this year. So the case will naturally rest right on the edge of the open/shut sensor. If I travel with it in my backpack, it goes in and out of sleep every time I shift my weight. It's strange picking up my bag and then hearing the beep that tells me I have new mail. You can imagine what this does to my battery life.

I finally figured out how to "fix" my PowerBook without disassembling the case. There's a simple command line that will prevent the laptop from waking up when the case is opened:

sudo pmset -a lidwake 0

After running this, my laptop won't wake unless I hit the power button on the keyboard. The pmset command is a utility for changing power settings. It performs the same changes that the Energy Saver preference pane does, plus a few. The "lidwake" setting doesn't show up in Energy Saver, but you can still set it directly.

Thursday, October 22, 2009

UMich LDAP for Thunderbird

At the University of Michigan, setting up Thunderbird to query the university's LDAP directory has always been a mystery. The basic directory server settings for Thunderbird are easy enough to get right:

Hostname: ldap.itd.umich.edu
Base DN: dc=umich,dc=edu

But for some reason, Thunderbird would still refuse to work. I've started looking a bit more closely at OpenLDAP lately, and now I finally figured out how to get it partially working. First, some background...

To find someone in an LDAP directory, a client constructs a query that tells the directory what entry to look for. For those not steeped in LDAP, a query looks something like "(cn=Benjamin*)" to search for records showing a common name that starts with 'Benjamin'. Other attributes you could search for are given name, surname, or mail address, to name a few.

The default query for Thunderbird, when you type someone's name into the Address Book search box, looks something like: "(|(mail=*benjamin*)(cn=*benjamin*)(givenName=*benjamin*)(sn=*benjamin*))". Anything that has "benjamin" as a substring in any of those four attributes will match. That's a pretty wide net to cast.

And our administrators don't allow it. Querying on the "mail" attribute when there is a "*" in the parameter, or on "givenName" for any parameter will result in an "Administrative limit exceeded" error. So when you OR all those together, of course the server rejects it.

The solution is to have Thunderbird change how it constructs its queries. I haven't found out how to do this for the search box, but it is possible to get address auto-completion working. According to an ancient email, you can do this by adding the following line to Thunderbird's "prefs.js" file:

user_pref("ldap_2.servers.UMich.autoComplete.filterTemplate",
          "(|(cn=*%v*)(mail=%v)(uid=%v))");

Then enable auto-complete via LDAP in the Preferences | Composition | Addressing dialog. You should change "UMich" to reflect whatever Thunderbird has internally named your LDAP server profile (look through prefs.js to figure that out). This directs Thunderbird to search for substrings of common names and exact matches for your mail address or uniqname. The latter two parts are of limited use, but the first seems to return good results. You still hit an admin limit if your search is too generic though (i.e. "ben").

I bet there's a similar hidden preference setting to adjust the main search box, but I doubt I'll find it unless I start reading through more XPCOM code than I care to do.

Sunday, October 4, 2009

Nothing is New

I have read of a gentleman who owned a so fine house in London, and when he went for months of summer to Switzerland and lock up his house, some burglar come and broke window at back and got in. Then he went and made open the shutters in front and walk out and in through the door, before the very eyes of the police. Then he have an auction in that house, and advertise it, and put up big notice. And when the day come he sell off by a great auctioneer all the goods of that other man who own them. Then he go to a builder, and he sell him that house, making an agreement that he pull it down and take all away within a certain time. And your police and other authority help him all they can. And when that owner come back from his holiday in Switzerland he find only an empty hole where his house had been.

I found this passage in Bram Stoker's Dracula, published in 1897. It reminded me of a scam I first heard of being used on Craigslist. Someone looking to rent a room in another town finds a nice place online, sends in a deposit, and then shows up to find that the owner of the house knows nothing about a room for rent. The poster, who is not the owner of the house, gets away with whatever deposit was paid. It looks like pretending to own some property and selling it off is not by any measure a new fraud!

Tuesday, September 22, 2009

Algebra Economics

Consider a TI-83+ calculator. Made in 1999, a minor update to a 1996 product. 6 MHz processor. While I don't have a good source for historical price data, one web site claims that a 1999 price for this hardware is about $90. I figure that's about right; I remember buying a different model graphing calculator for something near that.

In the last decade, processors have gone from 300 MHz Pentiums with a feature size of 800 nm to 3 GHz 8-threaded Core i7s having 45nm features. A typical amount of RAM for a desktop system has gone from 128 MB of PC100 to 2 GB of DDR3-1066. And with all of this progress, the price of desktops has slowly been going down. Something like this ran through my head as I saw that the TI-83+ in my hand cost $110 today.

There is one feature of the TI-83+ that suddenly became relevant to me: you can bring this model into ACT, SAT, and AP exams. This is one of the fixed list of calculators certified appropriate for use on these all-important measures of a high-school student's worth. High school math programs typically mandate which calculator all its students purchase, and I have a suspicion that a significant percentage of all US schools require that exact model.

Since students are not free to select a calculator based on open competition for features and usability, why would the price ever drop? I should have remembered my own high school classes, particularly economics. Price reflects what people are willing to pay for a good, not how much it costs to make it.

Monday, September 21, 2009

Debugging the Linux Kernel

If you've ever tried to debug your own Linux kernel, you have my deepest sympathy. It isn't fun, nor is it easy. Thankfully, there is a great tool that can help with this: VMware Workstation lets you record and replay a virtual machine and all interaction with it while a debugger is attached to it. Simply install your custom Linux kernel inside a virtual machine and debug it from your desktop.

The blog Debugging the Virtual World is a great source for information on how to set up your system for kernel debugging. Unfortunately, this guide is based on the older 6.0 version of Workstation, and updates to 6.5 have made parts of the document unnecessary. There is other material out there to help you use VMware to debug applications from your Windows development environment. Linux support is a bit lacking.

So for those who use and love Linux, I hope to make this debugging process a bit easier. Plus, I have a few additional ways I'd like to present to ease the debugging process.

Update: Recently VMware Workstation 7.0 was released. Once I manage to get a license for it, I'll present an updated how-to.

Basic Setup

First, you will need a copy of VMware Workstation 6.5. I haven't tried other versions, but I'd be doubtful. You will also need to be proficient with GDB for debugging. I use Emacs or DDD to drive, but that's just my preference. Socat is a tool I use to access unix named sockets, which VMware uses for serial output.

You will need to extract the Linux source tree on the host machine, since you will be debugging there. Let's say you stash it in a directory $LINUX. I use rsync to copy this tree inside the VM where it is built and installed. After this happens, I need to copy some files back out, and I put them in a directory $VMSHARE, but more on this later.

VMware

Workstation 6.5 is already configured to record and replay. The other setup required is to enable the built-in GDB debug stub. This lets an external GDB process control Workstation like a remote process. To enable this, add the following to the virtual machine's .vmx definition file:

debugStub.listen.guest32 = "TRUE"
That's for 32-bit guests. If you run 64-bit, try:
debugStub.listen.guest64 = "TRUE"
I also like to configure my system to dump kernel output to a serial port, so that if the kernel panics the full report is saved. To do this, make sure the VM has a serial port, and configure it to output from server to an application. This really means "to a named socket," which I dump to the console with:

socat unix-connect:/path/to/serial stdio

Configuring Linux in the VM

I'll assume you're doing this because you have a hacked kernel you built yourself and you wish to debug. Make sure when configuring you select:

General setup --->
  Configure standard kernel features (for small system) --->
    [*] Load all symbols for debugging/ksymoops
    [*]   Include all symbols in kallsyms
    [*]   Do an extra kallsyms pass
Kernel hacking --->
  [*] Compile the kernel with debug info

You probably want to add other kernel hacking options, but that's up to your need. Once the kernel is built, copy the uncompressed vmlinux binary into your host's $VMSHARE directory. I do this all with a simple build script in my VM that looks something like this:

#!/bin/sh
cd $HOME/linux-2.6.26 &&
rsync -auvC -e ssh --include core  host:$LINUX $HOME  &&
make &&
sudo make install &&
scp vmlinux host:$VMSHARE &&
echo "Done!"

To configure Linux to send its output to the serial port, add some lines to your kernel command line in /boot/grub/menu.lst, so that it looks something like this:

title Debug Kernel
  kernel /vmlinuz ... console=ttyS0,115200n8 console=tty0

Using GDB

Once you have this system set up, you still have to work with GDB: not the easiest thing to do. I've put together a script that defines some extra commands that make GDB more usable for me. This script lives in my source tree on the host as $LINUX/.gdbinit so it gets loaded whenever I launch GDB.

The following code is heavily tied to the 32-bit x86 architecture. If you're working on a different architecture, let these examples inspire you to do something similar.

I assume you know how to use GDB already; if not, there are bound to be some useful guides out there on the net somewhere. I'm mostly going to go over my own script and the new commands it defines.

Basic commands

First, to start debugging after loading GDB, you need to tell it to connect to VMware. You do this with the connect command when the virtual machine is running. This also prints out the kernel build number and date, so you can make sure you're debugging the correct build. To release control of the debugger, use the detach command.

Eventually, you'll want to examine the kernel's current variable. Unfortunately for x86 architectures, current is a macro written in assembly, so GDB can't access it. You can however use the convenience variable $current, which is refreshed every time the debugger stops.

While debugging, you can use the loc command to have GDB print out information on the current process, and if you are debugging a recording, the timestamp of your current location. This also gets printed out when you stop after a continue command. The information printed is the current process's PID and some relevant flags. You change exactly which flags you want to look at by adjusting the __tsk_flags command as necessary. Just follow the example to add other flags: TIF_* flags are in thread_info_32.h, and PF_* flags are in sched.h). When working correctly, it gives you a status line that looks something like this:

Current pid=(2734) flags=(TIF_NEED_RESCHED ) VM position=632864

When you are replaying, the monitor x commands can be useful. First, monitor offset displays your location within a replay (also shown by loc). Knowing that location, you can quickly jump back to the same point on a subsequent replay by using monitor stopat.

And now, the full $LINUX/.gdbinit script. Make sure that you copy and paste this correctly. All lines that look like they end in "\" really have a trailing space that must be preserved, and GDB is very picky about this!

file $VMSHARE/vmlinux

# Recompute $current at every stop.  Valid only inside kernel.
define set-current
  set $current=*((struct task_struct**)((unsigned long)$esp&0xfffff000))
end
define hook-stop
  set-current
end

# Reconnect to the VM after issuing "detach."
define connect
  target remote localhost:8832
  echo Kernel version =
  output/s init_uts_ns.name.version[0]@36
  echo \n
end

# Print extra information about the current process and replay location
define loc
  if $esp >= 0xc0000000
    # Extra information if in kernel
    echo Current \ 
    echo pid=(
    output $current->pid
    echo ) \ 
    echo flags=(
    __tsk_flags $current
    echo ) \ 
  end
  echo VM position=
  monitor position
end
define hookpost-continue
  loc
end

# helper
define __test_tsk_thread_flag
  if (*((unsigned long)($arg0)->stack + 0x8) & (1U << ($arg1))) != 0
    echo $arg2\ \ 
  end
end
# helper
define __test_tsk_flag
  if (($arg0)->flags & ($arg1)) != 0
    echo $arg2\ \ 
  end
end

# Print out the task flags I'm interested in
define __tsk_flags
  __test_tsk_thread_flag ($arg0) 2 TIF_NEED_RESCHED
  __test_tsk_flag ($arg0) 0x00000004 PF_EXITING
end

# Print flags for the given task_struct
define tsk_flags
  echo flags=(
  __tsk_flags $arg0
  echo )\n
end

Hopefully, this collection of GDB functions helps ease the pain of debugging, if only a little bit.

Further Reading

Tuesday, August 18, 2009

Courgette

Google has recently publicized a new method for compressing Chrome patches, which they call "Courgette". They argue that the generic bsdiff tool isn't efficient enough for them, and they can save a lot of bandwidth by reducing patch size. That makes it more practical for them to patch more often, automatically.

They offer one example of the effect of this new method for compression. For a 9.9MB full update, the bsdiff patch is 6.8% of that size (688KB), and a Courgette update is 0.76% (77KB). This number is certainly impressive, but it is only one example. I'd like to know if it is reasonable to expect this level of compression on all updates. Existing work in this area does consider updates to many different packages, across many types of updates, and the particular update does make a difference.

Though they compare against bsdiff, their technique sounds a lot like Exediff. The Exediff algorithm goes something like this. First, try to match instructions in the original executable against instructions in the updated version. All unmatched instructions are "primary differences," and these changes must be encoded in the patch file. Then, based on these changes, predict how the matched instructions should have changed. When the change is predictable, it doesn't need to be encoded in the update. Otherwise, the change is a "secondary difference" and must be include in the update. Exediff uses knowledge of the instruction set to predict how matched instructions will change, if at all. Despite this, the author of bsdiff showed that across many different types of patches, Exediff and bsdiff have similar performance.

So, both Exediff and Courgette appear to do the same thing. How then does Courgette do so much better at compressing? It looks like the "adjust" step is where the magic happens. Courgette changes the updated binary so that it has fewer differences, letting the patch be smaller. That updated binary is then reconstructed by the client, not the original update.

This leads to some interesting questions, the first concerning repeated patching. For instance, if I have a base version A which I patch to version B, how should I generate the patch to version C? Do I generate the binary for C so that the patch B -> C is minimized, or should the patch A -> C be minimized? What happens for patch M, or Z, or AAA?

Deciding which patch to optimize would seem to depend on which upgrade path is more popular. With the goal of decreasing total bandwidth usage, you could use server logs (or other usage data) to predict an initial distribution of versions currently in use. Based on historical data, you could then predict which users are going to update their software before the next patch is released. The aggregate of this data gives you a distribution of patch downloads. Pick the most popular one, and optimize that. Possibly, the others will be similarly small. Or, given such a distribution, can that information be used in the adjustment process to generate a binary that minimizes changes over a set of patches, rather than one in particular?

My guess is that Courgette optimizes each patch against only the latest version, assuming that most everyone will be running it. With auto-updating software, a user would have to fail to open the program for an extended amount of time, longer than it takes to push out two patches. This may not be likely.

I also wonder what side effects the binary modifications might have on the updated binary. By reordering or moving code, could it get slower?

These are interesting points that I hope Google engineers may address if they ever decide to publish a paper describing Courgette in greater depth. I hope they do.

Thursday, June 18, 2009

SSH Master

I just discovered (via blog) an amazing setting for OpenSSH: the MasterController. You can now designate an SSH invocation to run a "master" session. Then if I run SSH again connecting to the same computer, a new connection will be opened over that master session, without making a new TCP connection, establishing a new session key, or reauthenticating. This greatly speeds up connection speeds, making things like distcc a lot more practical.

I've know this was possible from using the commercial SSH for Windows on school computers, I just had no clue how to get it working from the Unix world until now.

Wednesday, June 17, 2009

Chrome parallelism

If you haven't used Google's new Chrome browser, it's definitely worth looking at. The team at Google has written a browser that tries to address some of the shortcomings of other modern browsers. Each part of a web page, the renderer, the JavaScript engine, plugins, etc. all operate in their own processes. This lets each component be isolated from one another in separate address spaces. So when Flash crashes, or Acrobat gets exploited, other tabs keep on going. (I believe the latest Internet Explorer also does something similar to this.) Having separate processes also helps reduce memory fragmentation.

Compare this with the dominant method for writing parallel programs: shared memory with locks and condition variables. That system is fast, but plagued with problems. Shared memory doesn't enforce exclusive access: the programmer is responsible for getting the right locks. When she doesn't, data races happen. If she gets the ordering of some locks wrong, deadlocks could happen. If locks are too fine-grained, she might introduce atomicity violations. And if one thread fails, what can you do if it's holding locks and has left shared state inconsistent?

A simpler life exists: do not share memory, and use explicit communication between concurrent processes. Instantly, locking goes away. No deadlocks, no data races. Failure handling becomes simpler, since no state can be corrupted.

This kind of model is used in the Erlang language, and microkernels have featured these benefits for years. Erlang was developed to be exceptionally fault-tolerant, and still drives critial hardware systems (in addition to Facebook's IM application). These systems haven't been used extensively in part because shared memory is fast. (And perhaps because of Erlang's syntax.) As a result, we researchers try to replay data races quickly and design fault-tolerant wrappers for Linux kernel modules.

I'm glad to see a significant desktop application decide that fault-tolerance and isolation is more important than raw speed.

Tuesday, June 16, 2009

Restaurant deals and BBQ

This week in Ann Arbor, restaurants all along main street will be wooing townies to sample their meals by offering fixed-price menus. Waltz in and be treated to a selection of the restaurant's finest three-course meal for only $25. It's all part of Ann Arbor Restaurant Week.

My wife and I were looking through the menus online when we began to ask ourselves, "Just how much are we saving by eating out for this special event?" After searching through a couple of menus, it turns out to be wrong for us. Restaurants aren't offering special menu items, so we just priced the same meal. Some places offered choices, so we chose the most expensive items to compare.

We found that most places only save us a couple of dollars off the equivalent meal. In one case, the fixed price dinner (with no options) would cost $1 more than normal. But it's worse than that. When we go out, we would order less food than 2 fixed meals: we typically split appetizers and deserts. Taking that into account, there wasn't a single restaurant we looked at that presented us a true savings.

So what is the use of Restaurant Week? It's all just advertising, pure and simple.

Usually, I hate letting myself be influenced by advertising, but this time, the advertisers were successful. Not in the way they intended though.

We had been trying to pick out our favorite place for BBQ baby-back ribs in Ann Arbor, and we thought this would be a good time to experiment. We headed home with a half-rack of ribs from our two top-runners: Grizzly Peak and The Blue Tractor. They're owned by the same parent company, but they do serve different food. We'd been going back an forth on this issue for a while now, so we finally made the direct comparison.

The winner: Blue Tractor sauce on Grizzly Peak meat. Grizzly Peak's meat was more tender and better quality, but the sauce was really too sweet. Blue Tractor had a nice spicy sauce, but its meat was a little tough. Since meat quality could vary by the night and by chef, we can't definitively say which place serves better ribs, but our bet's on Blue Tractor.

Saturday, May 30, 2009

Dead Camera

Today, I picked up my camera from Best Buy. During my honeymoon a couple months ago, it took a dip into a melted snow cone, and it hasn't turned on since. I bought an extended warranty with the camera, so I figured I would see what Best Buy could do.

After dropping off the camera, it took Best Buy (through their Geek Squad service) two weeks to ship the camera to a central repair location before once looked at it. I imagine it took a competent person about 10 minutes to look at it before deciding that "corrosive damage" wasn't covered by my warranty and the cost to repair it exceeded its value. Three weeks later, I get my camera back.

All right, it was questionable whether accidental damage was ever meant to be covered. But still, what is the point of the extended warranty?

As for Geek Squad, seriously, taking weeks just to tell me that my damage is not covered is unacceptable. If it wasn't covered, it would have been nice to tell me that when I tried to get it repaired. It may be that centralized repair is the only way Best Buy can function, but I would hardly call this "functioning." There is a better way: ignore the Geek Squad and stay local.

Now, I get to see just how difficult it is to disassemble my camera and clean it myself!

Friday, May 29, 2009

Hello World

Hi all, and welcome to my blog.

This site isn't about much in particular, just whatever I feel like sharing at the time. I'm a graduate student in computer science living in Michigan, so you're likely to see a very geeky theme throughout all these posts. That's me!

The title comes from three parts of my life, though not of equal importance. I love π. It's official: I even have a t-shirt proclaiming it. I like apples, both the kind you make cider out of (yay autumn in Michigan) and the kind you program. I'm also a big foodie, finding solace in making a nice bowl of chili. This blog is kind of a big mixture for all the interesting things going in my life.