Wasted time, but fun

[ from the arrrgh-damn-versioning-we-meet-again dept. ]

I just spent 2 days trying to fix a potential bug in the PCMCIA driver for my PRISM2 based wifi card (Sitecom GmBH) in my Apple Powerbook, only to discover it was somewhat a non-bug. As it would turn out, I had a few loadable modules (LKMs) kicking around which were compiled with slightly different set of options (bizzarely though, nothing which I would have thought would have needed a rebuild anyway) which resulted in somewhat random wait_queue corruption in khelper.

I’ve been experiencing a few problems with Powerbook Linux, and as part of my resolutions to make myself a better person – I’ve decided to actually fix stuff that really bothers me. Like having PCMCIA die on any cardctl eject event – or randomly when putting the Powerbook to sleep. Generally I would recommend running Linux on one of these since the support is usually pretty good – but sometimes things do go wrong, and for those who care, here’s the “411″ on this recent non-bug.

Linux has various parts of the hotplug subsystem and events subsystem, which work together when a new device is inserted before it becomes available. Linux has a permanent kernel_thread helper running called khelper. This thread is occasionally used as a schedulable context for stuff that call_usermode_helper wants to run – like /sbin/hotplug or modprobe. It’s a pretty cool solution to the problem of calling out to userspace for help. khelper is responsible for running up hotplug when I insert my wifi card, which then configures the netdevice and hooks up ifplugd to watch for net events. Incidentally, for those who don’t have ifplugd watching their network sockets – where have you been?

I have recently been seeing kernel oopses on card removal, similar to this one:

Oops: kernel access of bad area, sig: 11 [#1]
PREEMPT
NIP: 4800024C LR: C001A3F4 SP: CFF97ED0 REGS: cff97e20 TRAP: 0400    Not tainted
MSR: 40001032 EE: 0 PR: 0 FP: 0 ME: 1 IR/DR: 11
TASK = c0570660[4] 'khelper' THREAD: cff96000Last syscall: 120
GPR00: 4800024D CFF97ED0 C0570660 C00305BC 00000003 00000000 00000000 00000000
GPR08: 00009032 C00305C8 C1275D80 CFF96000 CFF96000 00000000 00000000 00000000
GPR16: 00000000 00000000 00000000 00000000 00000000 CFFA68A0 CFF96000 CFFA68B8
GPR24: 00000003 00000000 00000000 00000001 C1275D84 38A5A768 7C00F028 CFF97ED0
NIP [4800024c] 0x4800024c
LR [c001a3f4] __wake_up_common+0x58/0xa4
Call trace:
 [c001a5c4] complete+0x58/0x98
 [c0030800] __call_usermodehelper+0x78/0x88
 [c0030ca0] worker_thread+0x1ac/0x27c
 [c0035b0c] kthread+0xb8/0xc0
 [c000b490] kernel_thread+0x44/0x60
note: khelper[4] exited with preempt_count 1

Note how the LR (Link Register) contains c001a3f4, the address of __wake_up_common+0x58 (hex offset 0×58 bytes), a function which is apprently 0xa4 hex bytes in length. On PowerPC, the Link Register serves as a saved instruction pointer one can return to in a function epilogue by doing a blr, or similar. So the cause of the exception was whatever happened at that address. It turns out that this was a branch instruction. Here’s the offending function:

/*
 * The core wakeup function.  Non-exclusive wakeups (nr_exclusive == 0) just
 * wake everything up.  If it's an exclusive wakeup (nr_exclusive == small +ve
 * number) then we wake all the non-exclusive tasks and one exclusive task.
 *
 * There are circumstances in which we can try to wake a task which has already
 * started to run but is not in state TASK_RUNNING.  try_to_wake_up() returns
 * zero in this (rare) case, and we handle it by continuing to scan the queue.
 */
static void __wake_up_common(wait_queue_head_t *q, unsigned int mode,
                             int nr_exclusive, int sync, void *key)
{
        struct list_head *tmp, *next;

        list_for_each_safe(tmp, next, &q->task_list) {
                wait_queue_t *curr;
                unsigned flags;
                curr = list_entry(tmp, wait_queue_t, task_list);
                flags = curr->flags;
                if (curr->func(curr, mode, sync, key) &&
                    (flags & WQ_FLAG_EXCLUSIVE) &&
                    !--nr_exclusive)
                        break;
        }
}

The offending line turns out to be the call to curr->func(curr, mode, sync, key) since this dereferences an entry in the waitqueue and attempts to run the func element. func is the wakeup function, which is usually default_wake_function but presumably might be changed for some special applications (example sought) . Suppose this wait queue entry existed on a dud stackframe somewhere or was otherwise trashed and the function pointed off in to space, as is the case with this oops. Then the kernel would get very unhappy calling it.

What followed was over a day of exploring with the PCMCIA subsystem and going through the source to cs.c and ds.c in an attempt to understand what might have registered itself for a wakeup. Eventually I did a full recompile with spinlock and wait queue debugging, which failed to run properly without recompiled modules – that was a fairly obvious mistake – but when I did re-install modules, I eventually figured it must have been that I had missed this step on a previous build.

So that’ll teach me to always be sure I do a make modules_install step on any new kernel.

Jon.

One Response to “Wasted time, but fun”

  1. Paul Reiber says:

    Hardly “Wasted Time” at all – thank you for the write-up, Jon!

    This tour through a bit of the kernel and how you debugged things is a gem.

    Regards,
    -pbr

Leave a Reply