One of the most useful capabilities when debugging a Linux system is being able to investigate the target at the point of failure.
For userspace issues this does not usually pose much of a challenge, since the application can often be restarted and the failing use case re-run to obtain more information. The same, however, cannot be said for kernel problems. A kernel “oops” will typically take the target down with it, giving back only terse but useful information in the form of a backtrace.
While this may be sufficient to begin debugging the problem, it falls short of describing the state of the system, e.g. memory usage, data structures, memory contents, open file descriptors, etc.
For userspace applications we can typically obtain a core dump, which we can parse using GDB to fully inspect the memory contents of the offending application. This is wonderfully useful, as the state of the application helps pinpoint the exact problem; but how can we do this for the kernel?
In this series of articles, I will focus on how to obtain and parse a kernel crashdump. We will begin by discussing the mechanism by which we obtain the crashdump.
Kexec is a kernel mechanism which can be used as an “in kernel” bootloader. Essentially, we can use kexec to load and boot another kernel “on top” of the running kernel without going back through the BIOS or a bootloader.
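To make this a little more concrete, here is a minimal sketch of how one might check from userspace whether a kernel has been staged with kexec. It assumes the kernel exposes the sysfs files kexec_loaded, kexec_crash_loaded and kexec_crash_size under /sys/kernel, which depends on how the kernel was configured; the function name read_kexec_state is just an illustrative helper, not part of any tool.

```python
from pathlib import Path


def read_kexec_state(sysfs_dir="/sys/kernel"):
    """Return the kexec-related sysfs values found under sysfs_dir.

    Assumption: the files below exist only when the running kernel was
    built with kexec (and crash dump) support. Missing or unreadable
    files are reported as None rather than treated as an error.
    """
    state = {}
    for name in ("kexec_loaded", "kexec_crash_loaded", "kexec_crash_size"):
        path = Path(sysfs_dir) / name
        try:
            state[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            state[name] = None
    return state


if __name__ == "__main__":
    # kexec_loaded:       1 if a kernel is staged for a regular kexec reboot
    # kexec_crash_loaded: 1 if a crash kernel is staged for use on panic
    # kexec_crash_size:   bytes of RAM reserved for the crash kernel
    for key, value in read_kexec_state().items():
        print(f"{key}: {value}")
```

On a system where nothing has been loaded yet, the two "loaded" flags would be expected to read 0; a value of None simply means the file was not there to read.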
Building on kexec, the Kdump mechanism lets us obtain a crashdump. In a nutshell, the flow can be described as below:
3. Detailed Flow
1. Boot the main kernel and reserve some portion of RAM for an additional “crash” kernel to be loaded at runtime
2. Load the crash kernel into the reserved portion of memory via the kexec system call
3. Upon a panic, stall, or sysrq trigger, kexec will boot the crash kernel
4. The memory map for the crash kernel will be *only* the reserved memory to avoid overwriting the main kernel’s memory
5. Once the crash kernel has booted, the interface /proc/vmcore will be exposed and can be used to collect the dump of the main kernel’s memory
6. Once the vmcore file is obtained, we can pair it with the vmlinux and System.map files of the main kernel to parse it using GDB or the crash tool
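Steps 5 and 6 work because /proc/vmcore presents the old kernel's memory as an ELF core file, which is why tools like GDB can open it. As a rough illustration, the sketch below reads the ELF header and program headers of such a file and counts the memory (PT_LOAD) and note (PT_NOTE) segments. It assumes a 64-bit little-endian ELF image; the function name summarize_core is hypothetical, and a real dump should be analysed with GDB or crash rather than with a script like this.

```python
import struct

# ELF64 layouts as defined by the ELF specification (little-endian assumed).
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # 64 bytes
ELF64_PHDR = struct.Struct("<IIQQQQQQ")            # 56 bytes
ET_CORE = 4   # e_type value for core files
PT_LOAD = 1   # segment holding dumped memory
PT_NOTE = 4   # segment holding notes (e.g. register state)


def summarize_core(stream):
    """Summarize the program headers of an ELF64 core file.

    `stream` is any seekable binary file object, for example
    open("/proc/vmcore", "rb") inside the crash kernel, or a saved copy.
    """
    raw = stream.read(ELF64_HEADER.size)
    if len(raw) < ELF64_HEADER.size:
        raise ValueError("file too short to be an ELF image")
    (ident, e_type, _machine, _version, _entry, phoff, _shoff, _flags,
     _ehsize, phentsize, phnum, _shentsize, _shnum, _shstrndx) = \
        ELF64_HEADER.unpack(raw)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch only handles ELF64 little-endian")

    loads = notes = load_bytes = 0
    for index in range(phnum):
        stream.seek(phoff + index * phentsize)
        entry = stream.read(ELF64_PHDR.size)
        if len(entry) < ELF64_PHDR.size:
            raise ValueError("truncated program header table")
        p_type, _pflags, _offset, _vaddr, _paddr, filesz, _memsz, _align = \
            ELF64_PHDR.unpack(entry)
        if p_type == PT_LOAD:
            loads += 1
            load_bytes += filesz
        elif p_type == PT_NOTE:
            notes += 1

    return {
        "is_core": e_type == ET_CORE,
        "load_segments": loads,
        "note_segments": notes,
        "load_bytes": load_bytes,
    }
```

Calling summarize_core on a saved vmcore gives a quick sanity check that the file is a core image and roughly how much memory it covers before handing it to a proper debugger.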
In part 2 of this series, we will discuss how to enable this mechanism to obtain a crashdump.
Additional Resources and Attributions:
Kdump information: https://www.kernel.org/doc/Documentation/kdump/kdump.txt
Above Image: By V4711 (Own work, vector graphics created with Adobe Illustrator) [CC BY-SA 4.0], via Wikimedia Commons