PXE Diskless Node

From openSUSE


Diskless Node with PXE Boot


This howto describes how to create a diskless computing node that boots via PXE and mounts its root file system over NFS. This setup is useful for specific use cases such as thin clients (like LTSP), dumb terminals, and cluster nodes.

The test environment was an openSUSE 10.3 server; the client was a virtual machine in VMware, also running openSUSE 10.3. Let us call the desktop box "server" and the diskless VM "client" (doh).

The setup consists of a server that provides the boot services (tftp + dhcp), so the client can boot over the network using the PXE protocol. The kernel the diskless client boots is modified so that it can mount the root file system from NFS, so you need to compile a custom kernel with NFS root support and DHCP support. The diskless root file system is installed using "yast2 dirinstall" (need to try using kiwi for that) and exported via NFS to the client. Once it is booted, it is ready to do some hard work, like number crunching in a cluster.


Let us divide this howto into two parts:

  1. Setting up the server
  2. Configuration and caveats in the client.

You can find more detailed information at these links:

  • Explains diskless setup, tftp dir, etc

http://www.schnozzle.org/~coldwell/diskless/

  • Initial tftp/pxe config

http://en.opensuse.org/SDB:Network_Installation_of_SuSE_Linux_via_PXE_Boot

http://en.opensuse.org/SuSE_install_with_PXE_boot

  • Nfs mounting options, kernel compiling:

http://www.gentoo.org/doc/en/diskless-howto.xml

  • Alternative root partition setup with aufs (requires initrd patch)

http://jengelh.hopto.org/linux/adm_dxsuse.php

  • Kernel docs about nfsroot

/usr/src/linux/Documentation/nfsroot.txt


1) Setting up the server

Warning
This setup has an inherent problem: the NFS directory is shared among all clients. That is good enough for testing but not suitable for a real environment, because it inevitably leads to problems (programs reporting that they are already running, or no longer running, even when the opposite is true). Alternative file handling (e.g. aufs/unionfs) exists but is not discussed in this article; you can find more information in the Gentoo and hopto links at the beginning of the article.

We start by assuming you have a normal openSUSE 10.3 installation.

First, let's install the software we will need:

  • tftp, yast2-tftp-server
  • syslinux
  • dhcp-server
  • nfs-kernel-server, yast2-nfs-server
  • gcc, make, kernel-source (and qt3-devel if you want to use make xconfig)

There are other tftp servers, setups and boot methods (BOOTP, RARP), but they are all out of scope here. If you can follow this howto, you can adapt the procedures to your favourite software.

1.1) Configure dhcp server

This is very simple:

  • Set the interface in /etc/sysconfig/dhcpd:
 DHCPD_INTERFACE="eth0"
  • Edit /etc/dhcpd.conf. It is well commented, so you can quickly figure out what to edit. Add these sections:
 # internal subnet 192.168.1.0/24
 subnet 192.168.1.0 netmask 255.255.255.0 {
   range 192.168.1.200 192.168.1.220;
   option domain-name-servers 1.2.3.4;
   option domain-name "mydomain";
   option routers 192.168.1.1;
   option broadcast-address 192.168.1.255;
   default-lease-time 60000;
   max-lease-time 720000;
 }
 
 # dhcp address reservation based on MAC
 # pxelinux.0 is the bootloader (syslinux)
 # 192.168.1.101 is the ip of the master node (nfs, dhcp and tftp/pxe server)
 host diskless-node {
   hardware ethernet a0:b0:c0:d0:e0:00;
   fixed-address 192.168.1.201;
   server-name "diskless-node";
   next-server 192.168.1.101;
   filename "pxelinux.0";
 }
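
If you have more than one diskless client, each one needs its own host stanza. The following is a minimal sketch of a helper that prints such a stanza with the same fields as the one above. The function name make_host_entry and the example MAC and IP address are made up for illustration; the server address defaults to the 192.168.1.101 used in this howto.

```shell
#!/bin/sh
# Print a dhcpd.conf host stanza for one diskless client.
# Usage: make_host_entry NAME MAC IP [SERVER]
# (hypothetical helper; SERVER defaults to the master node used in this howto)
make_host_entry() {
    name=$1; mac=$2; ip=$3; server=${4:-192.168.1.101}
    cat <<EOF
host $name {
  hardware ethernet $mac;
  fixed-address $ip;
  server-name "$name";
  next-server $server;
  filename "pxelinux.0";
}
EOF
}

# Example: a second client with a made-up MAC and IP address
make_host_entry diskless-node2 a0:b0:c0:d0:e0:01 192.168.1.202
```

The output can be appended to /etc/dhcpd.conf, after which the dhcp service has to be restarted.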

Make sure the dhcp service is running: rcdhcpd status

1.2) Setting the tftp environment

We need to set up the tftp server and then put the files in their places.

  • First run "yast2 tftp-server", click on enable and finish. This enables the tftp service in xinetd, and the files will be served from /tftpboot
  • Next, copy the boot loader and create the configuration directory:
 cp /usr/share/syslinux/pxelinux.0 /tftpboot/
 mkdir /tftpboot/pxelinux.cfg
  • Create the default configuration file with "vi /tftpboot/pxelinux.cfg/default":
 # cat /tftpboot/pxelinux.cfg/default
 default linux
 prompt   1
 timeout  30
 
 # Boot the diskless client with an NFS root
 label linux
   kernel linux
   append ip=dhcp root=/dev/nfs nfsroot=192.168.1.101:/suse/103 splash=silent showopts 3
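
The file named "default" applies to every client. As far as I know, PXELINUX first looks in pxelinux.cfg for a file named after the client's MAC address, then for files named after its IP address in hexadecimal, and only then falls back to "default". The sketch below computes those two names, so you can give a single client its own configuration. The function names are made up for illustration.

```shell
#!/bin/sh
# Name PXELINUX tries for a given MAC address:
# "01-" (ARP type code for Ethernet) + MAC in lowercase with dashes.
pxe_mac_name() {
    printf '01-%s\n' "$(printf '%s' "$1" | tr 'A-F:' 'a-f-')"
}

# Name PXELINUX tries for a given IPv4 address:
# the four octets as eight uppercase hex digits.
pxe_ip_name() {
    old_ifs=$IFS; IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}

pxe_mac_name a0:b0:c0:d0:e0:00   # prints 01-a0-b0-c0-d0-e0-00
pxe_ip_name 192.168.1.201        # prints C0A801C9
```

For the client defined in the dhcp section, a file /tftpboot/pxelinux.cfg/01-a0-b0-c0-d0-e0-00 would therefore take precedence over "default".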

All we are missing at this point is our custom kernel.

You can test right now whether the tftp server is working (make sure no firewall is blocking it for now, and that xinetd is running):

 $ tftp 127.0.0.1
 tftp> get somefile


1.3) Compiling the custom kernel

Warning
This section is bogus. A kernel compile is not needed! (Use mkinitrd instead.)

You can find better howtos on kernel compiling elsewhere, so let's keep it short here:

 $ cd /usr/src/linux
 $ cp /boot/config-2.6.22.13-0.3-default .config
 $ make xconfig

At this point we need to add kernel support for NFS root and for network configuration at boot time. Additionally, since we are not using an initrd, the driver for the Intel e1000 NIC is built in as well.

In the config screen, you will need to activate these options (tick Options > Show All Options and also choose the split view option in the menu):

  • Go to Networking and tick:
IP: kernel level autoconfiguration (IP_PNP)
IP: DHCP support (IP_PNP_DHCP)
  • Go to device-drivers -> Network Device Support -> Ethernet (1000 Mbit) and tick:
Intel(R) PRO/1000 Gigabit Ethernet support (E1000)
Intel(R) PRO/1000 PCI-Express Gigabit Ethernet support (E1000E)
  • Go to File systems -> Network File Systems and tick:
NFS file system support (NFS_FS)
Root file system on NFS (ROOT_NFS)

"Tick" here means marking the option to be compiled built-in, which shows as a check sign in the box, and not as the small dot, which means it is compiled as a module. You may add more options you need here, or create a custom initrd for the kernel (and if you do, please add a section about it here). Now save and close the xconfig screen.

You can check that you did it correctly (they should all be set to y):

 # grep  ROOT_NFS .config
 CONFIG_ROOT_NFS=y
 # grep  IP_PNP .config
 CONFIG_IP_PNP=y
 CONFIG_IP_PNP_DHCP=y
 CONFIG_IP_PNP_BOOTP=y
 CONFIG_IP_PNP_RARP=y
 # grep -i E1000  .config
 CONFIG_E1000=y
 CONFIG_E1000_NAPI=y
 # CONFIG_E1000_DISABLE_PACKET_SPLIT is not set
 CONFIG_E1000E=y

Now compile the kernel:

 # make bzImage

When it is done, copy the kernel to the tftp root:

 # cp /usr/src/linux/arch/i386/boot/bzImage /tftpboot/linux

At this point the PXE/tftp/dhcp environment is all set. You could fire up VMware, create a VM, tell it to use PXE boot in the BIOS, and you should see the kernel booting. It will probably stall trying to mount the root filesystem.

1.4) Create the client environment and export it via nfs

Now we need to create the root filesystem of the diskless client. To achieve this, I used the openSUSE 10.3 KDE CD, which I had at hand. I went to "yast repositories", disabled my sources, and added the ISO as the only active source.

The next step is to run "yast2 dirinstall". Set the target directory and the software selection, tick the option to run yast at first boot, leave "create image" unselected, and choose next. It will now install openSUSE 10.3 into the specified directory.

While it installs, go to "yast2 nfs_server" to set up the NFS exports. Tell the yast module to start the NFS server and add the directory in which you are installing the VM root to the exports. The options used were "fsid=0,rw,no_root_squash,sync,no_subtree_check". Click on finish.
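
The yast module stores the export in /etc/exports. If you prefer to edit that file by hand, the sketch below prints the corresponding line. It assumes the directory is /suse/103 (the path used in the nfsroot= option of the pxelinux configuration) and that the export is restricted to the 192.168.1.0/24 subnet; the restriction and the function name exports_line are my assumptions, not part of the original setup.

```shell
#!/bin/sh
# Print an /etc/exports line with the options used in this howto.
# Usage: exports_line DIRECTORY CLIENTS
exports_line() {
    printf '%s\t%s(%s)\n' "$1" "$2" \
        "fsid=0,rw,no_root_squash,sync,no_subtree_check"
}

# Assumed directory and client network (see the text above)
exports_line /suse/103 192.168.1.0/24
```

After changing /etc/exports by hand, "exportfs -ra" makes the NFS server re-read it.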

You can now test with "yast nfs" whether you can mount the share you are exporting.

This is the end of the server configuration. Now we need to go to the client.

2) Setting the client, configuration quirks

2.1) Create a VM in VMware

Create a Linux VM and add a 100 MB disk as a file (VMware does not let you create a diskless VM, but you can delete the disk later). Then edit the settings and remove the hard drive and the CD-ROM from the machine.

Now close VMware, find your VM definition file (VM.vmx) and edit it, adding the Ethernet0.virtualDev line, which defines the e1000 card, *after* the Ethernet0.present line:

 (...)
 Ethernet0.present = "TRUE"
 Ethernet0.virtualDev = "e1000"
 (...)


Now you are almost ready to boot the diskless VM via PXE. Don't boot it just yet; it needs some fixes first.

2.2) Fixing some quirks

At the end of step 2.1 I booted the VM and saw a couple of errors. Let's list some of them here:

  • Without /dev/console in the VM root, the NFS root gets mounted read-only and lots of things fail during boot. Go to the /dev directory of the yast2 dirinstall installation and create the console device:
 mknod -m 600 ./console c 5 1
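
Before booting the client you can verify on the server that the device node really exists in the exported tree. The sketch below only checks and prints the command to run; it does not create anything itself, since mknod needs root. The function name check_console is made up, and /suse/103 in the usage comment is the export path assumed from the pxelinux configuration.

```shell
#!/bin/sh
# Check that ROOT/dev/console exists and is a character device.
# Prints the mknod command to run as root when it is missing.
# Usage: check_console ROOT
check_console() {
    root=$1
    if [ -c "$root/dev/console" ]; then
        echo "ok: $root/dev/console exists"
    else
        echo "missing: run  mknod -m 600 $root/dev/console c 5 1"
        return 1
    fi
}

# Example on the server:
#   check_console /suse/103
```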

After that you can boot the VM and it should run yast for the first time, so you can configure root password, hardware, etc.

After yast finishes, you can log in and edit /etc/fstab:

 # cat /etc/fstab
 192.168.1.101:/suse/103  /       nfs     sync,hard,intr,rw,nolock,rsize=8192,wsize=8192  0 0
 proc    /proc   proc    defaults 0 0
 sysfs   /sys    sysfs   noauto 0 0
 debugfs /sys/kernel/debug       debugfs noauto 0 0
 usbfs   /proc/bus/usb   usbfs   noauto 0 0
 devpts  /dev/pts        devpts  mode=0620,gid=5 0 0
  • Modules.dep errors during boot

The custom kernel was compiled from the latest YOU (YaST Online Update) kernel source, while the installation was done from the openSUSE 10.3 KDE CD, so it contains the modules of the original kernel. If you download the kernel rpm from the update repository and install it in the VM with rpm -Uvh, the errors go away (and the modules become usable too).

  • boot.md throws some errors

This script is not needed, as we don't have hard disks, so I did:

 # cd /etc/init.d/boot.d
 # mv S05boot.md s05boot.md
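
The rename works because the boot scripts only run links whose names start with a capital "S". Here is a small sketch that applies the same trick to any service, for the case where other disk-related scripts also complain. The function name disable_start_link is made up. The distribution's own tools (chkconfig or insserv) are probably the cleaner way to do this, but they are not covered here.

```shell
#!/bin/sh
# Rename the start link SNNname to sNNname so it is skipped at boot.
# Usage: disable_start_link DIRECTORY SERVICE
disable_start_link() {
    dir=$1; svc=$2
    for link in "$dir"/S[0-9][0-9]"$svc"; do
        # Skip the unexpanded pattern when nothing matches
        [ -e "$link" ] || [ -L "$link" ] || continue
        base=${link##*/}
        mv "$link" "$dir/s${base#S}"
        echo "disabled $base"
    done
}

# On the client:
#   disable_start_link /etc/init.d/boot.d boot.md
```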

Now boot.md will no longer be executed at startup.


2.3) Known issues (aka Unfixed quirks)

  • Yast may crash at the network configuration screen (in which case you can try to reboot or press ctrl+c). It seems to have a problem with the network already being configured early in the boot process. Maybe a problem with udev.
  • Shutdown or halt doesn't complete; it ends with "INIT: no more processes left in this runlevel" (the halt procedure may need to be changed because of the NFS root)
  • dbus throws some errors (solution: disable dbus)

3) To-do

  • Use kiwi to create a trimmed-down bootstrap for the client
  • The VM has some quirks that need to be fixed, like the network script and udev issues