SDB:LXC


Updated LXC Info


November 2015
The main content of this document applies, for the most part, only to LXC on openSUSE releases up to 12.3.
For the new implementation and its integration with libvirt in LEAP and 13.2:
follow the install steps posted in the Technical Help Forums
[Install - Technical Help Forums Post]
and follow this documentation
[SLES 12 and LEAP LXC virtualization docs (online)]


July 2013
Updated LXC on openSUSE information is available in the openSUSE Technical Help Forums - Virtualization. You can search based on my username (TSU2) to view my posts,
and see specific articles in my wiki:
TSU2 wiki

IMO the information in this article should still work, but it is painful compared to today's tools.
Any new documentation will certainly cover the topics the original author of this page lists as needing to be covered.

Regardless of any re-work, I highly recommend preserving the content on this page for archival purposes.

PM (personal message) me if you would like fully re-worked, updated LXC documentation. It's on my back burner of things to do, but like everything in life it is assigned a priority.



Preamble

LXC is a form of operating-system-level virtualization. Being a sort of super duper chroot jail, it is limited to running Linux binaries, but it offers essentially native performance, as if those binaries were running as normal processes right in the host kernel. Which, in fact, they are.

LXC is interesting primarily in that:

  • It can be used to run a mere application, service, or a full operating system.
  • It offers essentially native performance. A binary running as an LXC guest is actually running as a normal process directly in the host OS kernel, just like any other process. In particular this means that CPU and I/O scheduling are a lot more fair and tunable, and you get native disk I/O performance, which you cannot have with real virtualization (even Xen, even in paravirt mode). This means you can containerize disk-I/O-heavy database apps. It also means that you can only run binaries that the host kernel can execute (i.e. you can run Linux binaries, not another OS like Solaris or Windows).

LXC has been a built-in Linux kernel feature since kernel 2.6.29 (openSUSE 11.2).
The userspace tools have been included in openSUSE since 11.2 in the "lxc" package in the main oss repo.
So, technically, openSUSE "supports" LXC as of openSUSE 11.2.

However in reality:

  • There is not yet any openSUSE-specific documentation other than this page.
  • There is no openSUSE version of the lxc-fedora or lxc-debian scripts which automate the process of constructing or "installing" a full system as an lxc guest.
  • There are no nice front-end utilities and system integration like we have for Xen.

Until a nice front-end util or YaST module is written, or the lxc-fedora script is ported to create lxc-suse, this HOWTO will describe just one of many ways to set up an LXC host server and virtual systems within it. The hardware server that will run LXC will be referred to as the host, and the virtual environments which will run on the host will be referred to as the container or containers. It is assumed that you have already installed openSUSE 11.2 or later on a physical machine, and know how to administer and use it. There are other HOWTOs on the net for using LXC on Debian-based systems. This page specifically documents how to do the same on openSUSE systems.

For the rest of this example the container name will be vps0.

The convention we will invent and then follow is: container config files go under /etc/lxc/<container-name> and container root filesystems go under /srv/lxc/<container-name>.

So in the following example, the two config files associated with this first container will be /etc/lxc/vps0/config and /etc/lxc/vps0/fstab, and the container's root filesystem will be under /srv/lxc/vps0

Networking for the container will be via bridging and veth. The host has a NIC named eth0 with address 192.168.0.100 on a 192.168.0.0/24 network. The default route and nameserver are both 192.168.0.1. The container will get a virtual eth0 with an IP of 192.168.0.50 and will be connected to the same LAN as the host via a bridge device in the host (like a virtual hub or switch; no routing or NAT).

The container will also have a single console tty that can be used to access the container with "screen" even if ssh or other network access doesn't work.

Ok let's go...

Host Setup

This is stuff you do one time on the "real" system that will host one or more containers.

Install lxc and bridge-utils on the host

zypper in lxc bridge-utils iputils screen inotify-tools kernel-default

You need to be running the "-default" kernel NOT the "-desktop" one.

Unfortunately the "-desktop" kernel is the one that's installed by default these days.

After installing kernel-default above, reboot and select the "-default" kernel from the grub menu. Then uninstall the "-desktop" kernel.

zypper rm kernel-desktop
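
Once you are running the "-default" kernel, you can optionally sanity-check that it has the namespace and cgroup features LXC needs. Newer lxc packages include a small script for this; if yours has it, run:

lxc-checkconfig

Everything it reports should show up as enabled, apart from features you do not plan to use.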

Setup the control group virtual filesystem on the host

# add a cgroup entry to /etc/fstab if there isn't one already
awk 'BEGIN{E=1}($3=="cgroup"){E=0;exit}END{exit E}' /etc/fstab || echo "cgroup /cgroup cgroup defaults 0 0" >>/etc/fstab
# find the cgroup mount point named in /etc/fstab and create the directory if needed
M=`awk '($3=="cgroup"){print $2}' /etc/fstab`
[ -d $M ] || mkdir -p $M
# mount it now if it isn't already mounted
mount -t cgroup |grep -qwm1 $M || mount $M

Setup the network bridge device in the host

You should do this at the system console or via serial console, because you will break network connectivity briefly along the way.

On the host system, yast, network devices:

  • add a new interface, type: Bridge, name: br0
  • add eth0 to this bridge
  • set the IP for br0 as if it were eth0 on the host. eth0 becomes "unconfigured" and br0 takes the place of eth0 for the host to reach the LAN or the Internet.
  • set the hostname, default gateway, and nameservers as normal

Example of the bridge definition screen in YaST:


YaST2 - lan @ lxchost

 Network Card Setup
 β”ŒGeneral──Address──────────────────────────────────────────────────────────┐
 β”‚ Device Type                Configuration Name                            β”‚
 β”‚ Bridge▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒↓  br0β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’  β”‚
 β”‚( ) Dynamic Address  DHCP▒▒▒▒▒▒▒▒▒▒↓  DHCP both version 4 and 6▒↓         β”‚
 β”‚(x) Statically assigned IP Address                                        β”‚
 β”‚IP Address           Subnet Mask           Hostname                       β”‚
 β”‚192.168.0.100β–’β–’β–’β–’β–’β–’β–’ /24β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’ β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’β–’             β”‚
 β”‚β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”β”‚
 β”‚β”‚β”ŒBridged Devices───────────────────────────────────────────────────────┐││
 β”‚β”‚β”‚[x] eth0 - 80003ES2LAN Gigabit Ethernet Controller (Copper)           β”‚β”‚β”‚
 β”‚β”‚β”‚[ ] eth1 - 80003ES2LAN Gigabit Ethernet Controller (Copper)           β”‚β”‚β”‚
 β”‚β”‚β”‚                                                                      β”‚β”‚β”‚
 β”‚β”‚β”‚                                                                      β”‚β”‚β”‚
 β”‚β”‚β”‚                                                                      β”‚β”‚β”‚
 β”‚β”‚β”‚                                                                      β”‚β”‚β”‚
 β”‚β”‚β”‚                                                                      β”‚β”‚β”‚
 β”‚β”‚β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜β”‚β”‚
 β”‚β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜β”‚
 β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
[Help]                 [Back]                 [Cancel]                 [Next]

F1 Help  F9 Cancel  F10 Next
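
If you prefer editing config files over YaST, the sysconfig files that YaST writes for this setup should look roughly like the sketch below (exact keys vary a little between openSUSE releases; restart networking afterwards with "rcnetwork restart"):

# /etc/sysconfig/network/ifcfg-br0
STARTMODE='auto'
BOOTPROTO='static'
IPADDR='192.168.0.100/24'
BRIDGE='yes'
BRIDGE_PORTS='eth0'
BRIDGE_STP='off'
BRIDGE_FORWARDDELAY='0'

# /etc/sysconfig/network/ifcfg-eth0
STARTMODE='auto'
BOOTPROTO='none'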

Afterwards, networking on the host should outwardly work the same as with a typical physical NIC.
The host networking should look like this:

# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
    inet 127.0.0.2/8 brd 127.255.255.255 scope host secondary lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:30:48:7d:50:ac brd ff:ff:ff:ff:ff:ff
    inet6 fe80::230:48ff:fe7d:50ac/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:30:48:7d:50:ad brd ff:ff:ff:ff:ff:ff
    inet6 fe80::230:48ff:fe7d:50ad/64 scope link
       valid_lft forever preferred_lft forever
4: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 00:30:48:7d:50:ac brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.100/24 brd 192.168.0.255 scope global br0
    inet6 fe80::230:48ff:fe7d:50ac/64 scope link
       valid_lft forever preferred_lft forever

# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.0030487d50ac       no              eth0

# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.0.0     0.0.0.0         255.255.255.0   U         0 0          0 br0
127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
0.0.0.0         192.168.0.1     0.0.0.0         UG        0 0          0 br0

# cat /etc/resolv.conf
search mydomain.com
nameserver 192.168.0.1

The points to note are that eth0 exists, but has no IP address or routes.

The host now uses br0 as its NIC instead of eth0.

br0 is a bridge device which can include several real and virtual devices; currently it includes just the eth0 physical device, so all normal networking on the host should function the same as before. Later, the container will get a virtual NIC which will also be added to br0, and that is how the container will reach the outside world. br0 can be thought of as a virtual hub or switch.
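
Once a container is up and running (later in this HOWTO), its virtual NIC will appear in the bridge as an additional veth interface; the output will look roughly like this (the veth name is generated, so yours will differ):

# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.0030487d50ac       no              eth0
                                                        vethXXXXXX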

Container Setup

This is stuff you do one time (per container) to create the container.

[update: the latest lxc package includes an lxc-opensuse template script that automates, improves upon, and essentially obsoletes most of this. This should be replaced with documentation based on using that now.]

Write the lxc config files for the container

CN=vps0           # container name
CR=/srv/lxc/$CN       # /path/to/container_root_fs
CF=/etc/lxc/$CN   # /path/to/container_config_files
IP=192.168.0.50   # container nic IP
PL=24             # container nic ipv4 prefix length

# generate a MAC from the IP
HA=`printf "63:6c:%x:%x:%x:%x" ${IP//./ }`
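# (for the example IP 192.168.0.50 this yields HA=63:6c:c0:a8:0:32)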

# create the config directory
mkdir -p $CF

# lxc config file for container.
cat >${CF}/config <<%%
lxc.utsname = $CN
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.hwaddr = $HA
lxc.network.ipv4 = ${IP}/$PL
lxc.network.name = eth0
lxc.mount = ${CF}/fstab
lxc.rootfs = $CR
%%

# fstab for container
# it's outside of the container filesystem so that the container can not change it
cat >${CF}/fstab <<%%
none ${CR}/dev/pts devpts defaults 0 0
none ${CR}/proc    proc   defaults 0 0
none ${CR}/sys     sysfs  defaults 0 0
none ${CR}/dev/shm tmpfs  defaults 0 0
%%
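
With the example values above, the generated /etc/lxc/vps0/config should come out looking like this (shown only as a sanity check; the MAC follows from the IP via the printf above):

lxc.utsname = vps0
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.hwaddr = 63:6c:c0:a8:0:32
lxc.network.ipv4 = 192.168.0.50/24
lxc.network.name = eth0
lxc.mount = /etc/lxc/vps0/fstab
lxc.rootfs = /srv/lxc/vps0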

Populate the container filesystem

There are a few different ways to accomplish this. This will use zypper to install a base system into a directory from an on-line repo, and then modify the result a little so that it works from within the context of a container.

CN=vps0           # container name
CR=/srv/lxc/$CN       # /path/to/container_root_fs
CF=/etc/lxc/$CN   # /path/to/container_config_files
IP=192.168.0.50   # container nic IP address
PL=24             # container nic ipv4 prefix length
GW=192.168.0.1    # container gateway (aka default route)
NS=$GW            # container nameserver

# create root filesystem directory
mkdir -p $CR

# container /dev
mkdir -p ${CR}/dev
cd ${CR}/dev
mknod -m 666 null c 1 3
mknod -m 666 zero c 1 5
mknod -m 666 random c 1 8
mknod -m 666 urandom c 1 9
mkdir -m 755 pts
mkdir -m 1777 shm
mknod -m 666 tty c 5 0
mknod -m 600 console c 5 1
mknod -m 666 tty0 c 4 0
ln -s null tty10
mknod -m 666 full c 1 7
mknod -m 600 initctl p
mknod -m 666 ptmx c 5 2
ln -s /proc/self/fd fd
ln -s /proc/kcore core
mkdir -m 755 mapper
mknod -m 600 mapper/control c 10 60
mkdir -m 755 net
mknod -m 666 net/tun c 10 200

# Use zypper to install a base system into a subdirectory.
# It's almost but not fully non-interactive.
# There are a few prompts you have to respond to manually.
zypper -R $CR ar -f http://download.opensuse.org/distribution/11.4/repo/oss/ oss
zypper -R $CR in -lt pattern base
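# (Aside: adding zypper's global -n/--non-interactive option should reduce the
#  remaining prompts, e.g. "zypper -R $CR -n in -lt pattern base"; whether every
#  prompt disappears depends on the zypper version.)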

# Remove some container-unfriendly packages.
zypper -R $CR rm udev hal grub gfxboot

# Put a copy of the fstab inside the container
cp -f ${CF}/fstab ${CR}/etc/fstab

# replacement /etc/init.d/boot
cp ${CR}/etc/init.d/boot ${CR}/etc/init.d/boot.orig
cat >${CR}/etc/init.d/boot <<%%
#! /bin/bash
rm -f /etc/mtab /var/run/*.{pid,lock}
touch /etc/mtab /fastboot
route add default gw $GW
exit 0
%%

# cut down inittab
cat >${CR}/etc/inittab <<%%
id:3:initdefault:
si::bootwait:/etc/init.d/boot
l0:0:wait:/etc/init.d/rc 0
l1:1:wait:/etc/init.d/rc 1
l2:2:wait:/etc/init.d/rc 2
l3:3:wait:/etc/init.d/rc 3
l6:6:wait:/etc/init.d/rc 6
ls:S:wait:/etc/init.d/rc S
~~:S:respawn:/sbin/sulogin
p6::ctrlaltdel:/sbin/init 6
p0::powerfail:/sbin/init 0
cons:2345:respawn:/sbin/mingetty --noclear console screen
%%

# disable yast->bootloader in container
cat >${CR}/etc/sysconfig/bootloader <<%%
LOADER_TYPE=none
LOADER_LOCATION=none
%%

# nic in container
cat >${CR}/etc/sysconfig/network/ifcfg-eth0 <<%%
STARTMODE='auto'
BOOTPROTO='static'
IPADDR='${IP}/${PL}'
%%

# default route in container
echo "default $GW - -" >${CR}/etc/sysconfig/network/routes

# nameserver in container
echo "nameserver $NS" >${CR}/etc/resolv.conf

Use a shell inside the container for final tweaks

Run /bin/bash chrooted into the container's root fs in order to do some final prep, like adding a login user and setting root's password; otherwise you will not actually be able to use the new container even though it may be running fine. You might as well enable ssh while you're at it, though you could also do that later by logging in at the container's "console".

lxchost:~# chroot ${CR}
/# passwd root
Changing password for root.
New Password:
Reenter New Password:
Password changed.
/# useradd -m operator
/# passwd operator
Changing password for operator.
New Password:
Reenter New Password:
Password changed.
/# exit
exit
lxchost:~#
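
Since we mentioned enabling ssh: assuming the base pattern pulled in openssh and chkconfig (it normally does), marking sshd to start at boot can be done with the same chroot trick; a sketch:

lxchost:~# chroot ${CR}
/# chkconfig sshd on
/# exit
exit
lxchost:~#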

Run

This is stuff you do periodically, whenever you need to, for the rest of the life of the container and/or host systems. Startup, Use, Shutdown.

Start the container and login

Use GNU screen to start the full container (i.e. run /sbin/init in it). This lets you see the "console" messages output by the rc scripts in the container, detach that console to a background process, and reconnect to it later at any time. In this example we log in as the previously added user operator and then use su to get a root shell.
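
Note: depending on the lxc version, lxc-start may need to be told explicitly where the container's config file lives. If the command below complains that it cannot find the container, either register it once with lxc-create or pass the config file directly; a sketch using the paths from this HOWTO:

lxc-create -n vps0 -f /etc/lxc/vps0/config
# or, for a one-off start:
screen -S vps0 lxc-start -n vps0 -f /etc/lxc/vps0/config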

lxchost:~ # screen -S vps0 lxc-start -n vps0
[...]
Starting INET services. (xinetd)                                      done
Starting CRON daemon                                                  done
Master Resource Control: runlevel 3 has been                          reached
Skipped services in runlevel 3:                                       smartd

Welcome to openSUSE 11.2 "Emerald" - Kernel 2.6.31.8-0.1-default (console).


vps0 login: operator
Password: 

/~ su -
Password:
/#

Press "ctrl-a d" to disconnect your terminal from the screen session.

The container will continue running in the background and you can re-connect to the "console" (screen session) at any time.

Access the running container

You can access the running container in two ways at this point.
There is a single console login tty that does not require working networking in the container to access.
To get at this console, just reconnect to the screen session:

lxchost:~ # screen -r vps0

(Press "ctrl-a d" to disconnect from the screen session)

Or, you can ssh to the container from the host or from anywhere else on the network:

ssh operator@192.168.0.50

Stop the container

Issue "shutdown" inside the container

First, log in to the container via its "console" (the screen session that started it) and shut it down with "shutdown -h now". Wait for init to finish.

lxchost:~ # screen -r vps0

Welcome to openSUSE 11.2 "Emerald" - Kernel 2.6.31.8-0.1-default (tty1).

vps0 login: operator
Password:
Last login: Fri Jan 29 17:46:40 UTC 2010 from lxchost on pts/21
Have a lot of fun...
operator@vps0:~> su -
Password: 
vps0:~ # shutdown -h now

Broadcast message from root (console) (Tue Feb  9 20:01:50 2010):

The system is going down for system halt NOW!
vps0:~ #
[...]
Shutting down service (localfs) network  .  .  .  .  .  .  .  .  .   done
Shutting down D-Bus daemon                                           done
Running /etc/init.d/halt.local                                       done
pidofproc: pidofproc: cannot stat /sbin/splash: No such file or directory

INIT: no more processes left in this runlevel

This is what we are waiting for, so now press "ctrl-a d" to disconnect from the screen session.

Stop & destroy the container

NOW, and ONLY now, is it ok to stop the container.

lxchost:~ # lxc-stop -n vps0
lxchost:~ # lxc-destroy -n vps0

Only AFTER doing the above for all containers on the host is it OK to shut down the host. Do NOT shut down the host without first shutting down all containers. It will NOT happen gracefully by itself! The host will shut itself down gracefully and will be fine, but to all of the container systems it will be as if the power cord had been pulled with no warning.

Notes

Init Script

The home:aljex repo linked below has a package called rclxc that provides /etc/init.d/lxc and a symlink /usr/sbin/rclxc that will start up and shut down all configured containers gracefully.

With this package you can skip the manual start/stop steps and cgroup fstab setup above.

You can shut down the host any time, and the lxc init script will shut down all the containers gracefully before allowing the host to shut down.

  • enable startup of all containers at boot
chkconfig lxc on
  • start/stop all containers manually
rclxc start
rclxc stop
  • start/stop a single container named "vps002" manually
rclxc start vps002
rclxc stop vps002
  • status of lxc

"status" in this case only says if the overall lxc "service" is up or down. There is no actual lxc service, but if any container is running, then the status of "lxc" is considered up. Only when no container is running is lxc considered down. This prevents the host from shutting down while any virtual server is still running.

rclxc status
  • list

You can use "list" or "show" to see the states of one or all individual containers.

rclxc list
rclxc show vps002

The init script starts each container in a screen session, so that the virtual server's console is a screen session named after the container.

To list all container consoles

screen -ls

To connect to a container's console

screen -r vps002

To detach from a container's console, press Ctrl-a d

vsftpd

As of this writing, LXC uses cgroups in a way that causes vsftpd to conflict with LXC, because vsftpd uses cgroups internally as well. LXC will be switching to a different scheme that will avoid this problem in the future, but for now it is necessary to add the following two options to /etc/vsftpd.conf inside any container.

isolate=NO
isolate_network=NO

Remember to restart vsftpd after the change.

rcvsftpd restart

Reference: http://www.mail-archive.com/lxc-users@lists.sourceforge.net/msg01111.html

What's still missing

  • smarter/better/simpler network setup directions
  • smarter/better/simpler ways to adjust the container OS files for running in the context of a container.
  • tty10 in /etc/rsyslog.early.conf, /etc/rsyslog.conf, /etc/init.d/boot.klog
    • could be handled by adding "lxc.tty = 12" to the container config file, adding 12 mknod commands to create /dev/tty1 - tty12, and removing the symlink of tty10 to null (a rough sketch follows this list). Then you could see the tty10 output with lxc-console -n vps0 -t 10
  • openSUSE 11.2 ships with lxc 0.6.3 with no init scripts for starting/stopping the containers when the host reboots. The lxc-0.7.2 package in the link below has such an init script.
  • improve init scripts to handle more types and configurations of containers
  • libvirt. This document describes using the standard lxc-tools package (just called "lxc") to manipulate the containers. There is an entirely different way to do the same thing using libvirt, the virsh command, and libvirt XML files instead of lxc-start/lxc-stop and lxc config files. Following all available documentation and examples from the libvirt website, using LXC via libvirt is broken as of libvirt 0.8.1.
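
A rough sketch of the tty change described in the list above, using the example container from this HOWTO and the standard Linux tty character devices (major number 4):

# add to /etc/lxc/vps0/config:
lxc.tty = 12

# then, in the container's /dev (run on the host):
cd /srv/lxc/vps0/dev
rm -f tty10      # it was a symlink to null
for n in 1 2 3 4 5 6 7 8 9 10 11 12; do mknod -m 666 tty$n c 4 $n; done

After restarting the container you should be able to watch tty10 with "lxc-console -n vps0 -t 10".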

Internal Links

External Links