Quick and Dirty Guide to Building a Kernel Development Environment
Note:
- This article assumes that the reader is trying to build and run a linux kernel with qemu, on linux, in order to develop kernel modules.
- The author is not an experienced kernel developer. These are just notes to document their learning, and to keep for future reference.
- If you are only interested in setting up a minimal environment, feel free to skip to the Building the kernel section.
Background Knowledge
Setting up a development environment in which the developer is able to freely and safely abuse the kernel load and unload kernel modules is the first step in kernel hacking. There are other approaches to setting up an environment, such as installing the kernel on actual hardware and using a debug interface, or setting up a virtual machine, etc. In this article we explore how to run a locally built linux kernel using qemu
.
In order to achieve this, there are two main components we must be concerned with:
- The linux kenrel
- The initramfs (
busybox
+ config files + α)
Before diving into building the environment, I would like to dedicate a few paragraphs to exploring the aforementioned components. If you are not interested, please feel free to skip forward to the relevant section.
If we take a look at the contents of /boot
, you will see something like the following (the names and content will differ depending on the distro and architecture):
$ cd /boot/
$ ls
amd-ucode.img EFI grub initramfs-linux-fallback.img initramfs-linux.img vmlinuz-linux
$ file vmlinuz-linux
vmlinuz-linux: Linux kernel x86 boot executable bzImage, version 6.11.8-artix1-2 (linux@artixlinux) #1 SMP PREEMPT_DYNAMIC Fri, 15 Nov 2024 18:25:29 +0000, RO-rootFS, swap_dev 0XC, Normal VGA
$ file initramfs-linux.img
initramfs-linux.img: ASCII cpio archive (SVR4 with no CRC)
What we are concerned with is vmlinuz-linux
and initramfs-linux.img
(again, the name will probably differ depending on your environment). These are the compressed linux kernel and the initramfs. A compressed kernel? I initially assumed this was due to historic reasons, when disk space was quite constrained (still constrained in embedded systems, but I’m not sure if they use kernel compression). However, there seems to be additional reasons besides that 1. During the boot procedure, the kernel is decompressed into memory by the bootloader. The bootloader then hands control over to the loaded kernel.
We can try running this kernel with qemu
, just to see what happens.
$ qemu-system-x86_64 -kernel /boot/vmlinuz-linux -m 512 -enable-kvm -append "console=ttyS0" -nographic
[ 0.324458]
[ 0.324922] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 0.325795] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.8-artix1-2 #1 1400000003000000474e5500ee8cc4000e512
[ 0.327027] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[ 0.328049] Call Trace:
[ 0.328320] <TASK>
[ 0.328546] dump_stack_lvl+0x5d/0x80
[ 0.328958] panic+0x118/0x2ca
[ 0.329296] mount_root_generic+0x1ce/0x270
[ 0.329754] prepare_namespace+0x1e7/0x230
[ 0.330225] kernel_init_freeable+0x26f/0x2a0
[ 0.330697] ? __pfx_kernel_init+0x10/0x10
[ 0.331149] kernel_init+0x1a/0x150
[ 0.331511] ret_from_fork+0x34/0x50
[ 0.331888] ? __pfx_kernel_init+0x10/0x10
[ 0.332335] ret_from_fork_asm+0x1a/0x30
[ 0.332749] </TASK>
[ 0.333020] Kernel Offset: 0xf600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbffffff)
[ 0.334181] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) ]---
# Ctrl-a-x to exit
Wonderful.
As the console log suggests, the kernel panicked because it was not able to find a root fs to mount. That is, the file system that is mounted on /
. This is quite natural since we didn’t specify a disk to qemu
. qemu
had loaded and ran the kernel, the kernel did some initialization, and attempted to mount the root fs, but didn’t find anything and crashed. This is where the initramfs enters the scene.
$ reset # run reset if the output has become corrupted
$ qemu-system-x86_64 -kernel /boot/vmlinuz-linux -initrd /boot/initramfs-linux.img -m 512 -enable-kvm -append "console=ttyS0" -nographic
...
Starting udevd version 256.1-1-artix-g34ba18b^
:: running hook [udev]
:: Triggering uevents...
:: running hook [keymap]
:: Loading keymap...kbd_mode: KDSKBMODE: Inappropriate ioctl for device
done.
:: running hook [encrypt]
ERROR: device '' not found. Skipping fsck.
:: mounting '' on real root
mount: /new_root: fsconfig system call failed: fuseblk: Bad value for 'source'.
dmesg(1) may have more information after failed mount system call.
ERROR: Failed to mount '' on real root
You are now being dropped into an emergency shell.
sh: can't access tty; job control turned off
[rootfs ~]# ls
VERSION etc keymap.utf8 run
bin hooks lib sbin
buildconfig init lib64 sys
config init_functions new_root tmp
dev kernel proc usr
early_cpio keymap.bin root var
[rootfs ~]# QEMU: Terminated
So, who is this initramfs person, and what role do they play in the boot process?
The initramfs is a compressed file system (a cpio
file that is gzipped), containing various tools, kernel modules, configuration files, and other files required to complete the boot procedure. As its name suggests, the initramfs is a ram based file system 2. That is, the contents of the initramfs is extracted out into memory at boot-time, and is mounted on /
as a temporary file system. Later on, the actual root file system will be mounted over it.
This naturally causes one to beg the question why the initramfs is necessary. At first glance, it seems to make sense just to mount the root file system directly instead of having an intermediary. Indeed, apparently there are setups which do not require an initramfs. The short and admittedly vague answer is that there are some operations that must be performed in order to mount the root file system in the first place (the root file system cannot be accessed at this point in the boot process).
An example of this is would be full disk encryption. Assuming the setup was done with LVM on LUKS 3, we may have something like the following:
# Taken from the arch wiki
+-----------------------------------------------------------------------+ +----------------+
| Logical volume 1 | Logical volume 2 | Logical volume 3 | | Boot partition |
| | | | | |
| [SWAP] | / | /home | | /boot |
| | | | | |
| /dev/MyVolGroup/swap | /dev/MyVolGroup/root | /dev/MyVolGroup/home | | |
|_ _ _ _ _ _ _ _ _ _ _ _|_ _ _ _ _ _ _ _ _ _ _ _|_ _ _ _ _ _ _ _ _ _ _ _| | (may be on |
| | | other device) |
| LUKS encrypted partition | | |
| /dev/sda1 | | /dev/sdb1 |
+-----------------------------------------------------------------------+ +----------------+
In this scenario, /dev/sda1
must be decrypted, and the logical volumes must be reconstructed. After this is done, Logical volume 2 is mounted on /
, Logical volume 3 on /home
, and so on and so forth…except the related tools and configuration files live in /
, which lives in a logical volume, in the encrypted partition… Even to make things worse, the kernel code to handle the various file systems are usually a kernel module – for instance, on my laptop which uses ext4, it is dependent on ext4.ko
– which also lives on the root file system.
The initramfs packages all of these into a compressed file system. One of the files it contains – or rather, must contain – is init
, which kicks off this stage of the boot process. After decompressing the initramfs, the kernel executes init
, which is responsible for performing the various steps to enable the root partition to be mounted.
At this point you may be wondering if this is really necessary for booting the kernel with qemu
. While we are certainly not doing anything complicated with our disk setup – in fact, we don’t even have one – it’s simply part of the boot process. As demonstrated earlier, if an initramfs is not given to qemu
, the kernel will panic.
From this point onward, we will be building the development environment. Specifically, we will:
- Obtain the source and build the linux kernel.
- Create an initramfs based on
busybox
.
Building the Kernel
The kernel can be obtained either from https://www.kernel.org/ or https://github.com/torvalds/linux. The version of the kernel to obtain depends on the programmer’s needs. If their project requires a specific version of the kernel, they will simply choose that. For the purpose of this article, I will be using version 6.11.8.
$ curl --tlsv1.2 --proto https -L https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.11.8.tar.xz | tar xvJ
Before building the kernel a configuration has to be generated. This can be done through running make menuconfig
, which will present the user with a ncurses
based menu to configure the build process (note, ncurses
may not be installed by default). However, there are also a few default configurations we can choose from. These configurations have the format of <target-hardware>_defconfig
.
$ find -name "*_defconfig"
./arch/arm/configs/mvebu_v5_defconfig
./arch/arm/configs/aspeed_g4_defconfig
./arch/arm/configs/sama5_defconfig
./arch/arm/configs/mmp2_defconfig
./arch/arm/configs/multi_v5_defconfig
./arch/arm/configs/versatile_defconfig
./arch/arm/configs/nhk8815_defconfig
./arch/arm/configs/omap2plus_defconfig
./arch/arm/configs/clps711x_defconfig
./arch/arm/configs/davinci_all_defconfig
....
./arch/x86/configs/i386_defconfig
./arch/x86/configs/x86_64_defconfig
./arch/um/configs/i386_defconfig
./arch/um/configs/x86_64_defconfig
We will be using the x86_64_defconfig, and building the kernel.
$ make x86_64_defconfig
HOSTCC scripts/kconfig/conf.o
HOSTLD scripts/kconfig/conf
#
# configuration written to .config
#
$ make -j $(nproc)
More than likely, there will be some tools required for the build process that will not be present. If the bulid process fails due to the lack of a tool, simply install it and restart the build process.
CHKSHA1 include/linux/atomic/atomic-arch-fallback.h
/bin/sh: line 1: bc: command not found <----
CHKSHA1 include/linux/atomic/atomic-instrumented.h
make[2]: *** [Kbuild:24: include/generated/timeconst.h] Error 127
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [/home/vilroi/linux-6.11.8/Makefile:1194: prepare0] Error 2
make[1]: *** Waiting for unfinished jobs....
LD /home/vilroi/linux-6.11.8/tools/objtool/arch/x86/objtool-in.o
LD /home/vilroi/linux-6.11.8/tools/objtool/objtool-in.o
LINK /home/vilroi/linux-6.11.8/tools/objtool/objtool
make: *** [Makefile:224: __sub-make] Error 2
If everything goes well, you should have a new vmlinux
in the project root, and a compressed kernel named arch/x86/boot/bzImage
.
AS arch/x86/boot/compressed/piggy.o
LD arch/x86/boot/compressed/vmlinux
OBJCOPY arch/x86/boot/vmlinux.bin
ZOFFSET arch/x86/boot/zoffset.h
AS arch/x86/boot/header.o
LD arch/x86/boot/setup.elf
OBJCOPY arch/x86/boot/setup.bin
BUILD arch/x86/boot/bzImage
Kernel: arch/x86/boot/bzImage is ready (#1)
Building busybox
Similar to the kernel, the version of busybox to use is case dependent. The author will be using version 1.37.0, as that is the latest version at the time of writing.
$ curl --tlsv1.2 --proto https -L https://busybox.net/downloads/busybox-1.37.0.tar.bz2 | tar xvj
We will be taking the default configuration, and will be creating a statically linked binary.
$ make defconfig
$ make menuconfig
Settings --->
--- Build Options
[*] Build static binary (no shared libs)
$ make -j $(nproc)
If everything goes well, there should be a copy of busybox in the project root directory.
$ ls
applets configs INSTALL make_single_applets.sh scripts
applets_sh console-tools klibc-utils miscutils selinux
arch coreutils libbb modutils shell
archival debianutils libpwdgrp networking size_single_applets.sh
AUTHORS docs LICENSE NOFORK_NOEXEC.lst sysklogd
busybox e2fsprogs loginutils NOFORK_NOEXEC.sh testsuite
busybox_ldscript.README.txt editors mailutils printutils TODO
busybox_unstripped examples Makefile procps TODO_unicode
busybox_unstripped.map findutils Makefile.custom qemu_multiarch_testing util-linux
busybox_unstripped.out include Makefile.flags README
Config.in init Makefile.help runit
$ file busybox
busybox: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, BuildID[sha1]=48fec1fa1fbb728549d04720023fa61c7ce79409, for GNU/Linux 4.4.0, stripped
At the time of writing, there are two known issues that hinders the build process. The reader may also encounter these issues, so I will list the workarounds below.
The first error occurs when trying to run make menuconfig
. One of the scripts which checks for the availability for ncurses
will claim that it isn’t installed, even after installing it.
$ make menuconfig
...
HOSTCC scripts/kconfig/lxdialog/yesno.o
HOSTLD scripts/kconfig/lxdialog/lxdialog
*** Unable to find the ncurses libraries or the
*** required header files.
*** 'make menuconfig' requires the ncurses libraries.
***
*** Install ncurses (ncurses-devel) and try again.
***
make[2]: *** [/home/vilroi/src/busybox-1.37.0/scripts/kconfig/lxdialog/Makefile:15: scripts/kconfig/lxdialog/dochecklxdialog] Error 1
make[1]: *** [/home/vilroi/src/busybox-1.37.0/scripts/kconfig/Makefile:14: menuconfig] Error 2
make: *** [Makefile:444: menuconfig] Error 2
The script in question seems to check for ncurses
by compiling a simple c program which links to it 4. A fixed version of this can be found here 5.
The other issue is with compiling tc
.
networking/tc.c: In function ‘cbq_print_opt’:
networking/tc.c:236:27: error: ‘TCA_CBQ_MAX’ undeclared (first use in this function); did you mean ‘TCA_CBS_MAX’?
236 | struct rtattr *tb[TCA_CBQ_MAX+1];
| ^~~~~~~~~~~
| TCA_CBS_MAX
networking/tc.c:236:27: note: each undeclared identifier is reported only once for each function it appears in
networking/tc.c:249:16: error: ‘TCA_CBQ_RATE’ undeclared (first use in this function); did you mean ‘TCA_TBF_RATE64’?
249 | if (tb[TCA_CBQ_RATE]) {
| ^~~~~~~~~~~~
| TCA_TBF_RATE64
networking/tc.c:255:16: error: ‘TCA_CBQ_LSSOPT’ undeclared (first use in this function)
255 | if (tb[TCA_CBQ_LSSOPT]) {
| ^~~~~~~~~~~~~~
networking/tc.c:256:61: error: invalid application of ‘sizeof’ to incomplete type ‘struct tc_cbq_lssopt’
256 | if (RTA_PAYLOAD(tb[TCA_CBQ_LSSOPT]) < sizeof(*lss))
| ^
networking/tc.c:261:16: error: ‘TCA_CBQ_WRROPT’ undeclared (first use in this function)
This seems to be a known issue, and one of the members in the mailing list suggests either disabling the build of tc
, or using newer headers 6. I will be opting for the former solution.
--- Applets
Networking Utilities --->
[ ] tc (8.3 kb) <-- Toggle off
For those who haven’t worked with busybox
before, busybox
has the interesting property of acting as a different command based on its name. For instance, we could make a copy of the built busybox
binary named ls
and date
, and then try to run it.
$ cp busybox ls
$ cp busybox date
$ ./ls
AUTHORS applets_sh e2fsprogs networking
Config.in arch editors printutils
INSTALL archival examples procps
LICENSE busybox findutils qemu_multiarch_testing
Makefile busybox.links include runit
Makefile.custom busybox_ldscript.README.txt init scripts
Makefile.flags busybox_unstripped klibc-utils selinux
Makefile.help busybox_unstripped.map libbb shell
NOFORK_NOEXEC.lst busybox_unstripped.out libpwdgrp size_single_applets.sh
NOFORK_NOEXEC.sh configs loginutils sysklogd
README console-tools ls testsuite
TODO coreutils mailutils tmp
TODO_unicode date make_single_applets.sh util-linux
_install debianutils miscutils
applets docs modutils
$ ./date
Tue Nov 19 15:28:21 PST 2024
$
Building an initramfs
First things first, we run make install
in the busybox
project root directory. This creates a new directory _install
which contains a simple file system.
$ make install
$ mkdir _install/{dev,proc,sys}
$ ls _install/
bin dev linuxrc proc sbin sys usr
Behind the scenes, make install
runs an install script which amongst other things creates a symbolic link to the busybox
binary built earlier, with the name of the various utilities.
$ cd _install
$ ls bin/
arch chown dnsdomainname fsync ipcalc lsattr mpstat printenv run-parts stty vi
ash conspy dumpkmap getopt kbd_mode lzop mt ps scriptreplay su watch
base32 cp echo grep kill makemime mv pwd sed sync zcat
base64 cpio ed gunzip link mkdir netstat reformime setarch tar
busybox cttyhack egrep gzip linux32 mknod nice resume setpriv touch
cat date false hostname linux64 mktemp pidof rev setserial true
chattr dd fatattr hush ln more ping rm sh umount
chgrp df fdflush ionice login mount ping6 rmdir sleep uname
chmod dmesg fgrep iostat ls mountpoint pipe_progress rpm stat usleep
$ file bin/*
bin/arch: symbolic link to busybox
bin/ash: symbolic link to busybox
bin/base32: symbolic link to busybox
bin/base64: symbolic link to busybox
bin/busybox: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, BuildID[sha1]=48fec1fa1f
bb728549d04720023fa61c7ce79409, for GNU/Linux 4.4.0, stripped
bin/cat: symbolic link to busybox
bin/chattr: symbolic link to busybox
There is one crucial program that is missing: init
. After the kernel decompresses the initramfs
, the kernel finds init
and runs it. If there is no init
, the boot process will fail.
Technically speaking, init
can be any program as long as it’s named init
. It could be a shell script, or it could also be a compiled binary. I assume it could even be written in python or javascript, but it will probably require the user to package the language runtime as well as its dependencies. To keep things simple, I believe either a shell script or a statically compiled binary would be preferred. Before presenting the actual init
script that will be used, I will write a fake one just to get the point across.
package main
import (
"fmt"
"time"
)
func main() {
fmt.Printf("Happy Kernel Hacking ❣🐧\n")
time.Sleep(1000000 * time.Second)
}
I will package the program above into an initramfs
…
$ go build
$ cp init ../busybox-vilroi/_install/ && cd ../busybox-vilroi/_install/
$ find . -print0 | cpio --null -ov --format=newc | gzip -9 > ../../initramfs.cpio.gz
. 15:17:57 [393/413]
./proc
./linuxrc
./usr
./usr/sbin
./usr/sbin/setfont
./usr/sbin/i2cget
./usr/sbin/nbd-client
...
… and run it.
$ qemu-system-x86_64 -kernel linux-6.11.8/arch/x86_64/boot/bzImage -initrd ./initramfs.cpio.gz --enable-kvm --append "console=ttyS0" -nographic -m 512
...
[ 0.899106] netconsole: network logging started
[ 0.899787] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[ 0.901277] modprobe (47) used greatest stack depth: 14248 bytes left
[ 0.902398] modprobe (48) used greatest stack depth: 14072 bytes left
[ 0.903489] Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[ 0.904380] Loaded X.509 cert 'wens: 61c038651aabdcf94bd0ac7ff06c7248db18c600'
[ 0.905379] platform regulatory.0: Direct firmware load for regulatory.db failed wi2
[ 0.906567] cfg80211: failed to load regulatory.db
[ 0.907201] ALSA device list:
[ 0.907629] No soundcards found.
[ 0.909783] Freeing unused kernel image (initmem) memory: 2680K
[ 0.910496] Write protecting the kernel read-only data: 26624k
[ 0.911471] Freeing unused kernel image (rodata/data gap) memory: 1204K
[ 0.952527] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[ 0.953190] Run /init as init process
Happy Kernel Hacking ❣🐧
The actual init
script is the following, a modified version of the one written by KernW0lf 7:
#!/bin/sh
mount -t devtmpfs none /dev
mount -t proc none /proc
mount -t sysfs none /sys
exec /bin/sh
Place the above along side the other files in _install
. If there are other files you would like to package, such as kernel moduels, you will place it in here as well.
In order to build the initramfs
, we run the following in _install/
.
$ find . -print0 | cpio --null -ov --format=newc | gzip -9 > ../../initramfs.cpio.gz
Running the Kernel
At this point, we have all of the components required to run the kernel. Pass the path to the kernel and initramfs
with the -kernel
and -initrd
options respectively.
qemu-system-x86_64 -kernel linux-6.11.8/arch/x86_64/boot/bzImage -initrd ./initramfs.cpio.gz --enable-kvm --append "console=ttyS0" -nographic -m 512
[ 0.906029] platform regulatory.0: Direct firmware load for regulatory.db failed wi2
[ 0.907180] cfg80211: failed to load regulatory.db
[ 0.907825] ALSA device list:
[ 0.908271] No soundcards found.
[ 0.909397] Freeing unused kernel image (initmem) memory: 2680K
[ 0.910050] Write protecting the kernel read-only data: 26624k
[ 0.910977] Freeing unused kernel image (rodata/data gap) memory: 1204K
[ 0.951878] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[ 0.952529] Run /init as init process
/bin/sh: can't access tty; job control turned off
~ # [ 1.296525] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/s3
~ # whoami
whoami: unknown uid 0
~ # uname -a
Linux (none) 6.11.8 #1 SMP PREEMPT_DYNAMIC Tue Nov 19 13:53:55 PST 2024 x86_64 GNU/Linx
~ #
Links and Resources
- https://www.kernel.org/
- https://codingpoeta.github.io/posts/linux-kernel-compile-and-debug/
- https://kernw0lf.github.io/posts/qemu_debug/
- https://www.kernel.org/doc/html/latest/filesystems/ramfs-rootfs-initramfs.html
- https://blog.packagecloud.io/how-to-extract-and-disassmble-a-linux-kernel-image-vmlinuz/
-
https://www.kernel.org/doc/html/latest/filesystems/ramfs-rootfs-initramfs.html ↩︎
-
https://wiki.archlinux.org/title/Dm-crypt/Encrypting_an_entire_system#LVM_on_LUKS ↩︎
-
https://aur.archlinux.org/cgit/aur.git/tree/esp8266-rtos-sdk-aur-ncurses-fix.patch?h=esp8266-rtos-sdk ↩︎
-
https://lists.busybox.net/pipermail/busybox/2024-July/090827.html ↩︎