SDB:Debugging boot hang
Situation
The system starts to boot but hangs before the boot completes.
Procedure
Initcall debug
In GRUB, edit the kernel command line: remove the quiet and splash=<value> options and add debug and initcall_debug in their place.
Then capture the output somehow:
- Use a serial console or netconsole (netconsole works only if the network is already up at the point of the hang)
- As a last resort, photograph the screen with a camera
If a crash is now visible, file a bug at [bugzilla.opensuse.org] and attach this log.
Otherwise, note the last few "calling ..." messages that have no matching "initcall ... returned" message, then continue with the next section. In other words, you are looking for module init functions that had not returned at the point of the hang.
Example
The GRUB entry shows:
kernel /boot/vmlinuz-2.6.150 root=/dev/rootdisk quiet other_option splash=silent
You change it to:
kernel /boot/vmlinuz-2.6.150 root=/dev/rootdisk other_option debug initcall_debug
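The same edit can be sketched as a small text transformation. This is only an illustration of the rule described above (drop quiet and splash=<value>, append debug and initcall_debug); the function name add_initcall_debug is hypothetical and is not part of GRUB or any openSUSE tool.

```python
def add_initcall_debug(cmdline: str) -> str:
    """Drop quiet and splash=<value>, then append debug and initcall_debug.

    Hypothetical helper for illustration only; it just rewrites a string.
    """
    kept = [
        opt
        for opt in cmdline.split()
        if opt != "quiet" and not opt.startswith("splash=")
    ]
    # Append each debugging option only if it is not already present.
    for opt in ("debug", "initcall_debug"):
        if opt not in kept:
            kept.append(opt)
    return " ".join(kept)


if __name__ == "__main__":
    before = (
        "kernel /boot/vmlinuz-2.6.150 root=/dev/rootdisk "
        "quiet other_option splash=silent"
    )
    print(add_initcall_debug(before))
```

Run on the GRUB line from this example, the sketch prints the changed line shown above. All other options are kept in their original order.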
After booting with these options, the log shows something like:
calling e1000_init_module+0x0/0x82 [e1000] @ 338
e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
e1000: Copyright (c) 1999-2006 Intel Corporation.
calling parport_pc_init+0x0/0xa5 [parport_pc] @ 330
calling ppdev_init+0x0/0xc8 [ppdev] @ 374
initcall parport_pc_init+0x0/0xa5 [parport_pc] returned 0 after 91246 usecs
ppdev: user-space parallel port driver
initcall ppdev_init+0x0/0xc8 [ppdev] returned 0 after 281605 usecs
e1000 0000:00:03.0: eth0: (PCI:33MHz:32-bit) 52:54:00:12:34:56
Here the kernel called e1000_init_module from the e1000 module, parport_pc_init from parport_pc, and ppdev_init from ppdev. The latter two returned zero (no error) after about 91 ms and 282 ms respectively, but e1000_init_module never returned.
So we note that e1000's module load function doesn't return and continue with the next section.
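With a long captured log, matching the messages by eye is error-prone. The following is a minimal sketch of the same bookkeeping. It assumes the log uses the "calling ..." and "initcall ... returned" line shapes seen in the example above; other kernel versions may format these messages differently. The name pending_initcalls is hypothetical and is not part of any kernel tooling.

```python
import re

# Matches both message kinds from the example log:
#   calling  <func>+0x<off>/0x<size> [<module>] @ <pid>
#   initcall <func>+0x<off>/0x<size> [<module>] returned <rc> after <n> usecs
# The [<module>] part is optional (assumed absent for built-in code).
EVENT_RE = re.compile(
    r"\b(calling|initcall)\s+(\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[([^\]]+)\])?"
)

# The example log from this page, used as demonstration input.
SAMPLE_LOG = """\
calling e1000_init_module+0x0/0x82 [e1000] @ 338
e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
e1000: Copyright (c) 1999-2006 Intel Corporation.
calling parport_pc_init+0x0/0xa5 [parport_pc] @ 330
calling ppdev_init+0x0/0xc8 [ppdev] @ 374
initcall parport_pc_init+0x0/0xa5 [parport_pc] returned 0 after 91246 usecs
ppdev: user-space parallel port driver
initcall ppdev_init+0x0/0xc8 [ppdev] returned 0 after 281605 usecs
e1000 0000:00:03.0: eth0: (PCI:33MHz:32-bit) 52:54:00:12:34:56
"""


def pending_initcalls(log_text):
    """Return (function, module) pairs that were called but never returned."""
    pending = []
    for match in EVENT_RE.finditer(log_text):
        kind, func, module = match.group(1), match.group(2), match.group(3) or ""
        key = (func, module)
        if kind == "calling":
            pending.append(key)
        elif key in pending:
            # An "initcall ... returned" message closes the matching call.
            pending.remove(key)
    return pending


if __name__ == "__main__":
    for func, module in pending_initcalls(SAMPLE_LOG):
        print(f"{func} [{module}]" if module else func)
```

On the example log this prints e1000_init_module [e1000], the same conclusion reached by hand. Because the scan works on the whole text rather than line by line, it also copes with a log whose line breaks were lost during capture.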
bash as init
As in the previous section, edit the GRUB entry, this time to boot with bash as init. It is enough to add init=/bin/bash to the kernel command line.
In many cases this helps to track the problem down, because only the core drivers (those loaded by the initrd) are present.
When you boot with bash as init, the system offers you a shell. There are several steps to check:
- Check whether the suspected module really hangs the system
For the previous example, run modprobe e1000 and see whether the system hangs.