Ludicrously Technical – Kernel ABI Tracking

Part of the “Ludicrously Technical” series from Jon Masters.

The Linux kernel is, fundamentally, a collection of functions that happen to live in privileged memory and have certain magical abilities not of the regular application variety. But the kernel is also extensible at runtime, through loadable modules. These .ko files are simply complex ELF objects, containing a variety of code (that we dynamically link into the running kernel memory) and metadata. The metadata includes symbol dependency information: checksums (modversions) for individual kernel symbols provided and/or used by a given module.

You can use GNU nm to visualize symbolic dependencies:

[jcm@jcmlaptop ~]$ nm -gnu /lib/modules/2.6.18-8.1.4.el5/extra/ipw3945/ipw3945.ko | sort -k 2
U alloc_ieee80211
U __alloc_skb
U autoremove_wake_function
U __const_udelay
U __create_workqueue
U _ctype
U del_timer_sync
U destroy_workqueue
U dev_kfree_skb_any

These undefined symbols are needed by the Intel IPW3945 WiFi driver, in order for it to be loaded into the RHEL5 kernel on my funky Intel laptop. Each of these has a checksum, which the driver requires:

[jcm@jcmlaptop ~]$ /sbin/modprobe --dump-modversions /lib/modules/2.6.18-8.1.4.el5/extra/ipw3945/ipw3945.ko | sort -k 2
0x1757d1f7 alloc_ieee80211
0x9aebf873 __alloc_skb
0xc8b57c27 autoremove_wake_function
0xeae3dfd6 __const_udelay
0x4efd93a9 __create_workqueue
0x8d3894f2 _ctype
0x0c659d5a del_timer_sync
0x0b1ddd1b destroy_workqueue
0x149a799f dev_kfree_skb_any

So, one of the jobs of module-init-tools, in concert with the kernel’s in-kernel linker (actually, these days, it’s mostly Rusty’s in-kernel magic, but I like to think I’m involved…) is to handle all of these symbol dependencies, and match them against the running kernel…before we try linking the module into the running kernel. The aim is to ensure binary compatibility between a module and a Linux kernel. Because binary compatibility results in fewer cases of non-functional WiFi. And this is a good thing when you’re packaging your WiFi driver on a CD as an RPM package for folks to add to their systems.
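To make the matching step a little more concrete, here is a minimal Python sketch of the idea (not the real module-init-tools or kernel code, which is C and works on ELF sections rather than text). It takes two listings in the "checksum symbol" format shown above and reports every symbol whose checksum is missing or different. The function names are my own invention, and the "kernel" listing is hypothetical data made up for illustration.

```python
def parse_modversions(text):
    """Parse lines of the form '0x1757d1f7 alloc_ieee80211' into {symbol: crc}."""
    versions = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        crc, symbol = fields[0], fields[1]
        versions[symbol] = int(crc, 16)
    return versions


def find_mismatches(required, provided):
    """Return {symbol: (wanted_crc, kernel_crc_or_None)} for each unmet dependency."""
    problems = {}
    for symbol, wanted in required.items():
        have = provided.get(symbol)
        if have != wanted:
            problems[symbol] = (wanted, have)
    return problems


# Checksums the module wants, copied from the modprobe output above.
MODULE_WANTS = """\
0x1757d1f7 alloc_ieee80211
0x9aebf873 __alloc_skb
0xc8b57c27 autoremove_wake_function
"""

# HYPOTHETICAL kernel exports: one checksum altered, one symbol missing.
KERNEL_PROVIDES = """\
0x1757d1f7 alloc_ieee80211
0xdeadbeef __alloc_skb
"""

if __name__ == "__main__":
    required = parse_modversions(MODULE_WANTS)
    provided = parse_modversions(KERNEL_PROVIDES)
    for symbol, (wanted, have) in sorted(find_mismatches(required, provided).items()):
        have_text = "missing" if have is None else f"0x{have:08x}"
        print(f"{symbol}: module wants 0x{wanted:08x}, kernel has {have_text}")
```

Any entry in the result means the module and the kernel disagree about that symbol, which is exactly the case where you do not want to link the module in.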

ABI compatibility is funky stuff. Enterprise Linux vendors typically ensure that the ABI on their kernel (at least, the part visible to third parties) won’t change enough to break certain third party modules. And that means tracking all of those symbolic dependencies, and making sure that they don’t constantly change. Welcome to my world. You’re going to miss having so much hair on your head, well, maybe you will. ABI checksum information (modversions) is generated during the kernel build process, using genksyms. The magic happens in the scripts/Makefile.build file:

# When module versioning is enabled the following steps are executed:
# o compile a .tmp_<file>.o from <file>.c
# o if .tmp_<file>.o doesn't contain a __ksymtab version, i.e. does
#   not export symbols, we just rename .tmp_<file>.o to <file>.o and
#   are done.
# o otherwise, we calculate symbol versions using the good old
#   genksyms on the preprocessed source and postprocess them in a way
#   that they are usable as a linker script
# o generate <file>.o from .tmp_<file>.o using the linker to
#   replace the unresolved symbols __crc_exported_symbol with
#   the actual value of the checksum generated by genksyms

cmd_cc_o_c = $(CC) $(c_flags) -c -o $(@D)/.tmp_$(@F) $<
cmd_modversions = \
    if $(OBJDUMP) -h $(@D)/.tmp_$(@F) | grep -q __ksymtab; then \
        $(CPP) -D__GENKSYMS__ $(c_flags) $< \
        | $(GENKSYMS) $(if $(KBUILD_SYMTYPES), \
            -T $(@D)/$(@F:.o=.symtypes)) -a $(ARCH) \
        > $(@D)/.tmp_$(@F:.o=.ver); \
        \
        $(LD) $(LDFLAGS) -r -o $@ $(@D)/.tmp_$(@F) \
            -T $(@D)/.tmp_$(@F:.o=.ver); \
        rm -f $(@D)/.tmp_$(@F) $(@D)/.tmp_$(@F:.o=.ver); \
    else \
        mv -f $(@D)/.tmp_$(@F) $@; \
    fi;
endif

In English (version 1.0, the original kind), this means that if we have a kernel that uses modversioning metadata, then we’ll end up compiling each C source file in the kernel, looking for exported symbols. During a kernel compile, the EXPORT_SYMBOL family of macros records export information for exported kernel symbols in ELF sections whose names begin with “__ksymtab” (how do I know this? I just do, and you can know this stuff too, if you spend too much time on the kernel build process), and we can spot these using objdump -h (which lists an object’s section headers) when we check for exported symbols. If a given file doesn’t export any symbols, we don’t care any more and we just move it into a finished state. Done. But we care about exports.
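If you prefer your English with indentation, the decision that Makefile fragment makes can be sketched in a few lines of Python. This is only a model of the control flow: the function names and the example section lists are hypothetical, and nothing here runs a compiler or linker. Since objdump -h prints an object's section headers, the grep amounts to asking whether any section name contains "__ksymtab".

```python
def exports_symbols(section_names):
    """Model of `objdump -h obj | grep -q __ksymtab` over a list of section names."""
    return any("__ksymtab" in name for name in section_names)


def build_steps(stem, section_names):
    """Describe, as plain strings, what the modversions rule does for <stem>.c."""
    tmp = f".tmp_{stem}.o"
    steps = [f"compile {stem}.c -> {tmp}"]
    if exports_symbols(section_names):
        ver = f".tmp_{stem}.ver"
        # Exported symbols present: compute checksums and link them in.
        steps.append(f"preprocess {stem}.c with -D__GENKSYMS__ | genksyms > {ver}")
        steps.append(f"ld -r -o {stem}.o {tmp} -T {ver}")
        steps.append(f"rm -f {tmp} {ver}")
    else:
        # Nothing exported: the temporary object is already the final object.
        steps.append(f"mv -f {tmp} {stem}.o")
    return steps


if __name__ == "__main__":
    # HYPOTHETICAL section lists, for illustration only.
    with_exports = [".text", ".data", "__ksymtab", "__ksymtab_strings"]
    without_exports = [".text", ".data", ".bss"]
    for step in build_steps("ieee80211_module", with_exports):
        print(step)
    print("---")
    for step in build_steps("some_helper", without_exports):
        print(step)
```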

Kernel compiled code that contains exported symbols (via EXPORT_SYMBOL, EXPORT_SYMBOL_GPL, and their friends) needs a genksyms run to generate checksum data. That’s what the call to $(CPP) (the C compiler’s pre-processor) is used for. We get GCC to spew out a bunch of horrible crap, which we’ll then shove into genksyms, as we generate magical checksum metadata. Here’s what that $(CPP) call might expand to, on a typical Linux system (RHEL5 in my case, because I run RHEL5 on my laptop…and you should too…RHEL5 rocks my world) when building the ieee80211 core module, used by IPW3945:

[jcm@jcmlaptop linux-2.6.18.i686]$ gcc -m32 -E -D__GENKSYMS__ -nostdinc -isystem /usr/lib/gcc/i386-redhat-linux/4.1.1/include -D__KERNEL__ -Iinclude -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Wstrict-prototypes -Wundef -Werror-implicit-function-declaration -Os -pipe -msoft-float -fno-builtin-sprintf -fno-builtin-log2 -fno-builtin-puts -mpreferred-stack-boundary=2 -march=i686 -mtune=generic -mtune=generic -mregparm=3 -ffreestanding -Iinclude/asm-i386/mach-generic -Iinclude/asm-i386/mach-default -fomit-frame-pointer -g -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ieee80211_module)" -D"KBUILD_MODNAME=KBUILD_STR(ieee80211_module)" net/ieee80211/ieee80211_module.c | ./scripts/genksyms/genksyms
__crc_alloc_ieee80211 = 0x1757d1f7 ;
__crc_free_ieee80211 = 0xa27818cd ;
__crc_escape_essid = 0xa9fb135f ;
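Those three lines are plain "name = value ;" assignments, which is why the build can hand them to the linker with -T as a linker script. As a small illustration, here is a Python sketch that reads such output back into a table. The regular expression and function name are my own and assume the output looks exactly like the lines above; this is not part of any kernel tool.

```python
import re

# Matches one genksyms output line such as "__crc_alloc_ieee80211 = 0x1757d1f7 ;"
CRC_LINE = re.compile(r"^__crc_(\w+)\s*=\s*(0x[0-9a-fA-F]+)\s*;$")


def parse_genksyms_output(text):
    """Turn '__crc_<symbol> = 0x<crc> ;' lines into {symbol: crc}."""
    table = {}
    for line in text.splitlines():
        match = CRC_LINE.match(line.strip())
        if match:
            table[match.group(1)] = int(match.group(2), 16)
    return table


# The genksyms output shown above.
GENKSYMS_OUTPUT = """\
__crc_alloc_ieee80211 = 0x1757d1f7 ;
__crc_free_ieee80211 = 0xa27818cd ;
__crc_escape_essid = 0xa9fb135f ;
"""

if __name__ == "__main__":
    for symbol, crc in parse_genksyms_output(GENKSYMS_OUTPUT).items():
        print(f"0x{crc:08x} {symbol}")
```

Note that the value for alloc_ieee80211 is the same checksum the ipw3945 module asked for in the modprobe listing earlier, which is the whole point.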

We instruct GCC to run as a preprocessor, set a few quadrillion command line flags (because it’s always fun to play with GCC flags you don’t get to use often enough) and shove the output from the pre-processor into genksyms. The data input into genksyms is effectively a complete tree of dependencies and definitions for a given symbol. We can ask genksyms to give us more useful output, tracking a given symbol’s ABI dependencies. To do this, add a -D to the genksyms invocation at the end of the above command pipe (not to GCC, which already has plenty of -D flags of its own), and you’ll get the following wonderful output:

Export alloc_ieee80211 == <struct net_device { char name [ 16 ] ; struct hlist_node { struct hlist_node * next , * * pprev ; } name_hlist ; unsigned long mem_end ; unsigned long mem_start ; unsigned long base_addr ; unsigned int irq ; unsigned char if_port ; unsigned char dma ; unsigned long state ; struct net_device * next ; int ( * init ) ( struct net_device * ) ; <etc.>

The output is long, very long. And soft. It’s soft, strong, and very very long. Like Andrex. But you can see the structs, prototypes and other randomness that forms a given checksum, in this case the checksum for alloc_ieee80211. Determining ABI breaks and fixing them can involve a wondrous iterative process of running genksyms in debug mode, looking at the output, running a mental diff, and looking for which dependent structs or function prototypes broke. I’d make it easier, but I like overly complex horrible crap. And coffee. I like horribly complex crap, horrible amounts of coffee, and ludicrous quantities of Californian blueberries too.
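For those who would rather not run the diff mentally, here is a minimal Python sketch of the same idea: split two dumps into tokens and report only the regions that differ. Everything in it is an assumption for illustration. The function names are mine, the "new" dump is a hypothetical change I made up, and the checksum uses zlib's CRC-32 purely as a stand-in, so it is not expected to reproduce the values genksyms prints.

```python
import difflib
import zlib


def tokens(dump_line):
    """The dump is already space-separated, so split() is enough here."""
    return dump_line.split()


def changed_tokens(old_dump, new_dump):
    """Return (tag, old_fragment, new_fragment) for every region that differs."""
    old, new = tokens(old_dump), tokens(new_dump)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, " ".join(old[i1:i2]), " ".join(new[j1:j2])))
    return changes


def stand_in_checksum(dump_line):
    """STAND-IN checksum over the token stream (zlib CRC-32, not genksyms' own)."""
    return zlib.crc32(" ".join(tokens(dump_line)).encode("ascii")) & 0xFFFFFFFF


# Start of the dump shown above (it is truncated there, so it is truncated here).
OLD = ("Export alloc_ieee80211 == <struct net_device { char name [ 16 ] ; "
       "struct hlist_node { struct hlist_node * next , * * pprev ; } name_hlist ; "
       "unsigned long mem_end ;")

# HYPOTHETICAL change: someone widens the name field.
NEW = OLD.replace("[ 16 ]", "[ 32 ]")

if __name__ == "__main__":
    for tag, before, after in changed_tokens(OLD, NEW):
        print(f"{tag}: '{before}' -> '{after}'")
    print(f"old 0x{stand_in_checksum(OLD):08x}, new 0x{stand_in_checksum(NEW):08x}")
```

One changed token anywhere in that very long definition is enough to change the checksum, which is why a tweak to a struct you never touch directly can still break your module.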

Jon.
