Thursday, May 11, 2006

AIX Kernel Debug - II

After a not-so-brief hiatus, I am back. Exams ended recently, and I now have a bit of free time to update the kernel debugging notes. So this one is about viewing kernel data using kdb.

The AIX kdb pages [link] list three different ways in which you can view kernel data from userspace. If it's global data, and has been duly exported with a .exp file at link time, then it's as simple as using the variable name. However, since global data is not usually exported this way, and we do not want to recompile a module just to debug it, we will look at the other options.

Both of the other methods require a mapfile. A mapfile is a list of the symbols in the kernel module and their addresses (relative only), and can be generated while linking the module using the -bloadmap:{mapfile} option.

Once that is done, this is what a mapfile looks like:

--- -------- ------ -- -- -- ----- ------------------------- -------------------------------------------------
I ER S1 getgidx /lib/syscalls.exp{/unix}
I ER S2 getuidx /lib/syscalls.exp{/unix}
I ER S3 _system_configuration /lib/syscalls.exp{/unix}
000007A4 PR LD S58 <.qlog_write>
000007E4 PR LD S59 .qlog_disable_console_logging
00000824 PR LD S60 .qlog_enable_console_logging
00000864 PR LD S61 .qlog_get_log_types
...
000033DC 000004 2 TC SD S506 <>
000033E0 0001EC 3 RW CM S507 g_qlog_cntl qlog.c(qlib.lib)
000035CC 000004 2 RW CM S508 kern_config kernel_config.c(kconf.lib)
000035D0 00002C 3 RW CM S509 <_$static_bss> sc_dnlc.c(dnlc.lib)
000035FC 000004 2 RW CM S510 workflow_id kernel_config.c(kconf.lib)

Understandably, there are several classes of symbols in this file, indicating program code, glue code, read-only data and read-write data. Entries marked PR LD or GL LD are program code or glue code, and belong to the text segment. The offsets that appear here are relative to the offset of the program entry point, which you specified as -binit:{ entrypoint } while linking.
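As a rough illustration, the text-segment entries can be pulled out of a mapfile with a few lines of Python. This is only a sketch: the regular expression assumes the column layout of the excerpt above (address, optional length and alignment, class, type, symbol number, name), and parse_mapfile and SAMPLE are names made up for this example. A real mapfile may carry more columns than the excerpt shows.

```python
import re

# Layout assumed from the excerpt above: address, optional length and
# alignment, storage class, symbol type, symbol number, name.
ENTRY = re.compile(
    r"^(?P<addr>[0-9A-F]{8})\s+"
    r"(?:(?P<length>[0-9A-F]{6})\s+(?P<align>\d+)\s+)?"
    r"(?P<cls>[A-Z]{2})\s+(?P<typ>[A-Z]{2})\s+S\d+\s+"
    r"(?P<name>\S+)"
)

# A few lines copied from the mapfile excerpt.
SAMPLE = """\
I ER S1 getgidx /lib/syscalls.exp{/unix}
000007A4 PR LD S58 <.qlog_write>
000007E4 PR LD S59 .qlog_disable_console_logging
000033E0 0001EC 3 RW CM S507 g_qlog_cntl qlog.c(qlib.lib)
"""


def parse_mapfile(text, classes=("PR", "GL")):
    """Return {symbol: relative offset} for entries in the given classes."""
    symbols = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match is None or match.group("cls") not in classes:
            continue  # imports, headers and other classes are skipped
        name = match.group("name").strip("<>")
        if name:  # unnamed entries such as "<>" are ignored
            symbols[name] = int(match.group("addr"), 16)
    return symbols


if __name__ == "__main__":
    for name, offset in parse_mapfile(SAMPLE).items():
        print(f"{offset:08X} {name}")
```

Passing classes=("RW",) instead picks up the read-write data entries, such as g_qlog_cntl.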

So if you know the load address of the entry point, you can calculate the address of each function. The lke command in kdb lets you find it:

(0)> lke

1 024F8000 02A1C000 0000F6E0 00000242 /usr/local/solidcore/s3/modules/scdrv
2 024F8F00 029D6000 00009348 00000252 random32/usr/lib/drivers/random
3 0219DF00 029B6000 000009AC 00100248 /unix
4 024F8E00 0286A000 001229D0 00000252 nfs.ext32/usr/lib/drivers/nfs.ext

So to find out where scdrv loads, use the number that appears in the first column:

(0)> lke 1

1 024F8000 02A1C000 0000F6E0 00000242 /usr/local/solidcore/s3/modules/scdrv
le_flags....... TEXT DATA DATAEXISTS
le_next........ 024F8F00 le_lex......... 00000000
le_fp.......... 00000000 le_fh.......... 00000000
le_file........ 02A1C000 le_filesize.... 0000F6E0
le_data........ 02A280D8 le_datasize.... 00003608
le_ldr......... 02A2C000 le_ldr_size.... 0x1141 (4417)
le_exports..... 00000000 le_entrypoint.. 02A2B268
le_usecount.... 1 le_loadcount... 1
le_ndepend..... 1 le_maxdepend... 1
le_filename.... 024F8060 le_depend.... @ 024F805C
TOC@........... 02A2B34C
process trace backs
.scdrv 02A1C160 .scdrv_fini 02A1C2A0

This address, when added to the relative addresses in the map file, gives us the load location of the functions.
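The arithmetic is simple enough to sketch in Python. The base value below is an assumption for illustration (the le_file value from the lke 1 output above); substitute whichever load address applies to your module. The offsets are copied from the mapfile excerpt, and resolve is a made-up helper name.

```python
def resolve(base, offsets):
    """Add a load address to each relative mapfile offset."""
    return {name: base + offset for name, offset in offsets.items()}


# Assumed base: the le_file value reported by "lke 1" above.
BASE = 0x02A1C000

# Relative offsets copied from the mapfile excerpt.
OFFSETS = {
    ".qlog_write": 0x000007A4,
    ".qlog_get_log_types": 0x00000864,
}

if __name__ == "__main__":
    for name, address in resolve(BASE, OFFSETS).items():
        print(f"{address:08X} {name}")
```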

Monday, March 20, 2006

Relinking binaries

AIX never ceases to amaze. While brushing up for a presentation on AIX, I found this in the ld man page:

The ld command can relink a program without requiring that you list all input object files again. For example, if one object file from a large program has changed, you can relink the program by listing the new object file and the old program on the command line, along with any shared libraries required by the program.

AIX is THE only Unix that can do that. Doesn't that mean limitless possibilities? It turns the whole idea of binary editing upside down. As an example, let us say we have a "test" program consisting of two functions, main and testfn, contained in main.c and testfn.c.

I want to override testfn. I can do that easily by writing a new testfn in a new file, say newfn.c, and relinking the binary.

$ xlc -o test newfn.c test

Can we use this to extract code from binaries? I don't know; let's find out.

Wednesday, January 18, 2006

AIX Crash dumps - II

Once you have a crash dump, there are several things you might like to do. If you are fiddling with filesystems, for example, you would like to be able to print vnodes and gnodes.

In kdb, you first need to tell kdb which header defines your structure. In the case of struct vnode, it's sys/vnode.h. So we invoke kdb as

# kdb -i /usr/include/sys/vnode.h

and issue the print command, where address is the address of the vnode you want to see

0>print vnode address

Which should print something like

struct vnode {
ushort v_flag = 0x0000;
ushort v_flag2 = 0x0000;
ulong32int64_t v_count = 0x00000000;
int v_vfsgen = 0x00000000;
union Simple_lock {
simple_lock_data _slock = 0x00000000;
struct lock_data_instrumented *_slockp = 0x00000000;
} v_lock;
struct vfs *v_vfsp = 0x31349808;
struct vfs *v_mvfsp = 0x00000000;
struct gnode *v_gnode = 0x13C823E0;
struct vnode *v_next = 0x00000000;
struct vnode *v_vfsnext = 0x13987F38;
struct vnode *v_vfsprev = 0x13D1AAE8;
union v_data {
void *_v_socket = 0x00000000;
struct vnode *_v_pfsvnode = 0x00000000;
} _v_data;
char *v_audit = 0x00000000;
} foo[0];

AIX crash dumps

These are my working notes on crash dump analysis on AIX.

Step 1:
OK, so when the system panics, you will hear the periodic beeps typical of the RS/6000 (if you are sitting close by). The beeps go on as the machine dumps core. If you do not want a core and have a slow machine, you are better off restarting by pressing the system reset button on the machine.

[ There is a Step 0: where you make the system panic, but the details are left as an exercise to the reader :). See sysdumpstart(1) for more details. ]

Step 2:
When the system boots again, it may prompt you about saving the core dump before going into the normal boot process. If you have a place to save it (a spare partition), save the dump here; otherwise press 99 to continue.

[ All of this depends on your configuration. Running sysdumpdev will allow you to examine and alter your dump settings. The Primary dump device is /dev/hd6 by default, which is also the AIX paging volume. Other settings include compression and ... (well, RTFM) ]

Step 3: (Only if you pressed 99 in the last step; otherwise skip)
Once the system is up, use savecore -f /directory_path to save the core in a specific place. This leaves you with a compressed .Z file and a kernel image.

Step 4:
Uncompress the core using uncompress (gunzip also handles .Z files).

Step 5:
Run kdb as

# kdb vmcore.1 vmunix.1

And you would get something like this from kdb in return

19) mirdd [5 entries]
20) kbddd [2 entries]
21) mousedd [2 entries]
Component Dump Table has 913 entries
0000000000001000 0000000002147040 start+000FD8
000000002FF3B400 000000002FF80A98 __ublock+000000
000000002FF22FF4 000000002FF22FF8 environ+000000
000000002FF22FF8 000000002FF22FFC errno+000000
00000000E0000000 00000000F0000000 sys_resource+000000
raddr.....0000000000C00000 eaddr.....0000000000C00000
size..............00000000 align.............00000000
valid..1 ros....0 fixlmb.1 seg....1 wimg...2

raddr.....0000000001000000 eaddr.....0000000001000000
size..............00000000 align.............00000000
valid..1 ros....0 fixlmb.1 seg....1 wimg...2
Dump analysis on POWER_PC POWER_604 machine with 1 available CPU(s) (32-bit registers)
Processing symbol table...

Step 6:
Now let us see what really caused the panic. Run stat, and you will get some information on the machine and the core:

(0)> stat
POWER_PC POWER_604 machine with 1 available CPU(s) (32-bit registers)

sysname... AIX
nodename.. fundu
release... 3
version... 5
build date Apr 10 2005
build time 21:52:04
label..... yes
machine... 00081AAA4C00
nid....... 081AAA4C
time of crash: Wed Jan 18 04:29:05 2006
age of system: 15 hr., 52 min., 38 sec.
xmalloc debug: disabled

And the real thing, a backtrace of the function calls:

CPU 0 CSA 2FF3B400 at time of crash, error code for LEDs: 30000000
pvthread+004D80 STACK:
[0000A5E0].test_and_set+000020 ()
[00215020]slock_ppc+000320 (??, ??)
[00009554].simple_lock+000054 ()
[001F52A4]j2_rename+000140 (??, ??, ??, ??, ??, ??, ??)
[029CC17C]sc_vop_rename+000088 (??, ??, ??, ??, ??, ??, ??)
[002F8D54]vnop_rename+0000DC (??, ??, ??, ??, ??, ??, ??)
[0033C4C4]rename+00035C (2FF22DD3, 2FF22DE2)
[00003AD8].sys_call+000000 ()
[kdb_get_memory] no real storage @ 2FF226C8
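Each frame in the trace has the form [address]symbol+offset, so the start of the function is the address minus the offset. The Python sketch below pulls those fields out of a saved trace. parse_frames and TRACE are illustrative names, and the pattern assumes the frame layout shown above.

```python
import re

# Frame layout assumed from the trace above: [address]symbol+offset (args)
FRAME = re.compile(
    r"^\[(?P<addr>[0-9A-F]{8})\](?P<sym>[^\s+]+)\+(?P<off>[0-9A-F]+)"
)

# A few lines copied from the backtrace.
TRACE = """\
[0000A5E0].test_and_set+000020 ()
[001F52A4]j2_rename+000140 (??, ??, ??, ??, ??, ??, ??)
[029CC17C]sc_vop_rename+000088 (??, ??, ??, ??, ??, ??, ??)
[kdb_get_memory] no real storage @ 2FF226C8
"""


def parse_frames(text):
    """Return a list of (symbol, frame address, function start) tuples."""
    frames = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match is None:
            continue  # not a stack frame, e.g. the kdb_get_memory message
        address = int(match.group("addr"), 16)
        offset = int(match.group("off"), 16)
        frames.append((match.group("sym"), address, address - offset))
    return frames


if __name__ == "__main__":
    for symbol, address, start in parse_frames(TRACE):
        print(f"{address:08X} {symbol} starts at {start:08X}")
```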

There are more steps, which will be added after I am done debugging the core at hand.

[ As a side note: the stat command is also the AIX equivalent of dmesg, which you would certainly miss if you have used other Unixes such as Linux and HP-UX. It shows the last few kernel printfs still in memory, including messages generated using bsdlog. ]