SDB:Disaster Recovery
Goal: Recreate a destroyed system
You want to be prepared so that if your system gets destroyed you can recreate it as closely as possible to what it was before, regardless of what exactly was destroyed: from messed-up software (like deleted essential files, corrupted filesystems, or destroyed disk partitioning) up to broken hardware (like defective harddisks or a completely destroyed computer).
From that goal "recreate a system" it follows:
Disaster recovery means installation (reinstalling from scratch)
The core of disaster recovery is an installer that reinstalls the system from scratch.
The fundamental steps of system installation are:
- Boot an installation system on the "bare metal", where "bare metal" could also be a bare virtual machine.
- In the installation system run an installer that performs the following fundamental steps:
- Prepare persistent storage (disk partitioning with filesystems and mount points).
- Store the payload in the persistent storage (install files).
- Install a boot loader.
- Finally reboot so that the installation system is shut down and the installed system is booted.
In case of an initial system installation "store the payload" usually means installing RPM software packages.
In case of disaster recovery "store the payload" means restoring the files from a backup.
The only real difference between a usual system installation and disaster recovery is how the payload is stored.
System configuration (e.g. network, users, services, language, keyboard, and so on) is a separate task that is not mixed up with what is meant by "installation" here. (For example also in YaST the actual installation is kept separate from the configuration - at least to a certain extent.)
System configuration is another difference:
An initial usual system installation requires an additional subsequent system configuration step because the installed RPM software packages result in a pristine "raw" system with the default configuration files from the software packages, which must be configured as needed.
In contrast disaster recovery results in a configured system because the configuration files are restored from the backup.
Image installation:
Another way of system installation (with or without configuration) is the so-called "image installation". In case of image installation "store the payload" usually means storing an image of the files (whatever "image" exactly means) in the persistent storage, which can result in a "raw" system with default configuration files or in a configured system with adapted configuration files as needed.
For example KIWI is an image build system that can be used for an image installation (see https://en.wikipedia.org/wiki/KIWI_%28computing%29).
Basics
This article is only about how to recover from a disaster by recreating the destroyed system. Other ways to cope with a disaster (like high availability solutions via redundantly replicated systems) are not described here.
In this particular case "disaster recovery" means to recreate the basic operating system (i.e. what you had initially installed from an openSUSE or SUSE Linux Enterprise install medium).
In particular special third party applications (e.g. a third party database system which often requires special actions to get it installed and set up) must usually be recreated in an additional separate step.
The basic operating system can be recreated on the same hardware or on fully compatible replacement hardware so that "bare metal recovery" is possible.
Fully compatible replacement hardware is needed
"Fully compatible replacement hardware" basically means that the replacement hardware works with the same kernel modules that are used on the original hardware and that it boots the same way. For example one cannot have a replacement network interface card that needs a different kernel module (possibly with different additional firmware) and one cannot switch from BIOS to UEFI or switch to a different architecture (e.g. from 32-bit to 64-bit) and things like that.
In general "hardware" could be also virtual hardware like a virtual machine with virtual harddisks (cf. the section about "Virtual machines" below). But "fully compatible replacement hardware" means that one cannot switch from real hardware to virtual hardware or from one kind of virtualization environment to a different kind of virtualization environment (e.g. from KVM/QEMU to XEN).
Furthermore "fully compatible replacement hardware" means that one cannot switch from one kind of storage structure to a different kind of storage structure (e.g. from a single disk to two disks or vice versa). Also one cannot switch from one kind of storage hardware to a different kind of storage hardware (e.g. from a SATA disk to a NVMe disk or from local disks to remote storage like SAN). Ideally the disks on replacement hardware should have exactly same size as on the original system (usually a bit bigger disk size should not cause big issues but smaller disk sizes are often rather problematic).
Provisions while your system is up and running
- Create a backup of all your files
- Create a bootable recovery medium that contains a recovery installation system plus a recovery installer for your system
- Have replacement hardware available
- Verify that it works to recreate your system on your replacement hardware
After your system was destroyed
- If needed: Replace broken hardware with your replacement hardware
- Recreate your system with your recovery medium plus your backup
Inappropriate expectations
Words like "just", "simple", "easy" are inappropriate for disaster recovery.
- Disaster recovery is not "easy".
- Disaster recovery is not at all "simple".
- There is no such thing as a disaster recovery solution that "just works".
There is an exception where such words are appropriate:
The simpler the system, the simpler and easier the recovery.
Disaster recovery does not just work
Even if you created the recovery medium without an error or warning, there is no guarantee that it will work in your particular case to recreate your system with your recovery medium.
The basic reason why there is no disaster recovery solution that "just works" is that it is practically impossible to reliably autodetect all the information that is needed to recreate a particular system:
- Information regarding hardware like required kernel modules, kernel parameters,...
- Information regarding storage like partitioning, filesystems, mount points,...
- Information regarding bootloader
- Information regarding network
For example there is the general problem that it is impossible to determine in a reliable way how a running system was actually booted. Imagine during the initial system installation GRUB was installed in the boot sector of the active partition like /dev/sda1 and afterwards LILO was installed manually in the master boot record of the /dev/sda harddisk. Then actually LILO is used to boot the system but the GRUB installation is still there. Also in case of UEFI things like "BootCurrent" (e.g. in the 'efibootmgr -v' output) are not reliable to find out how a system was booted (cf. https://github.com/rear/rear/issues/2276#issuecomment-559865674). Or the bootloader installation on the harddisk may not work at all and the system was actually booted from a removable medium (like a CD or USB stick/disk), or the currently running system was not even normally booted but directly launched from another running system via 'kexec' (cf. the below section "Launching the ReaR recovery system via kexec").
In "sufficiently simple" cases disaster recovery might even "just work".
When it does not work, you might perhaps change your system configuration to be simpler, or you have to manually adapt and enhance the disaster recovery framework to make it work for your particular case.
No disaster recovery without testing and continuous validation
You must test in advance that it works in your particular case to recreate your particular system with your particular recovery medium and that the recreated system can boot on its own and that the recreated system with all its system services still works as you need it in your particular case.
You must have replacement hardware available on which your system can be recreated and you must try out if it works to recreate your system with your recovery medium on your replacement hardware.
You must continuously validate that the recovery still works on the replacement hardware in particular after each change of the basic system.
Recommendations
Prepare for disaster recovery from the very beginning
Prepare your disaster recovery procedure that fits your particular needs at the same time when you prepare your initial system installation.
See also "Help and Support - Feasible in advance - Hopeless in retrospect" below.
In the end, when it comes to the crunch, your elaborate and sophisticated system becomes useless if you cannot recreate it with your disaster recovery procedure.
The simpler your system, the simpler and easier your recovery (cf. above).
Bottom line what matters in the end:
Regardless of how a system was installed and regardless of what is used for disaster recovery, eventually a disaster recovery installation will be the final system installation.
Cf. the "Essentials about disaster recovery with Relax-and-Recover presentation PDF" link below.
Deployment via recovery installation
After the initial installation (plus configuration) from an openSUSE or SUSE Linux Enterprise install medium set up your system recovery and then reinstall it via your system recovery procedure for the actual productive deployment.
This way you know that your system recovery works at least on the exact hardware which you use for your production system.
Furthermore deployment via recovery installation ensures that in case of a disaster your particular disaster recovery reinstallation does not recreate your system with (possibly subtle but severe) differences (cf. below "The limitation is what the special ReaR recovery system can do"), because this way you use one and the same installer (the recovery installer) both for deployment and for disaster recovery (cf. "Disaster recovery means installation" above).
When you use AutoYaST for deployment, have a look at "Native disaster recovery with AutoYaST" below.
When you use your own homemade way for deployment, the below section about "Generic usage of the plain SUSE installation system for backup and recovery" might be of interest for you.
Prepare replacement hardware for disaster recovery
You must have fully compatible replacement hardware available that is ready to use for disaster recovery.
See above what "fully compatible replacement hardware" means.
"Replacement hardware that is ready to use for disaster recovery" means in particular that its storage devices (harddisks or virtual disks) are fully clean (i.e. your replacement storage must behave same as pristine new storage).
When your replacement storage is not pristine new storage (i.e. when it had been ever used before), you must completely zero out your replacement storage. Otherwise when you try to reinstall your system (cf. "Disaster recovery means ... reinstalling" above) on a disk that had been used before, various kind of unexpected weird issues could get in your way because of whatever kind of remainder data on an already used disk (for example remainders of RAID or partition-table signatures and other kind of "magic strings" like LVM metadata and whatever else).
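For example, one way to completely zero out a replacement disk is with 'dd' (the device name /dev/sdX is only a placeholder for your actual replacement disk; double-check it, because this irrevocably destroys all data on that disk):
dd if=/dev/zero of=/dev/sdX bs=1M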
On non PC compatible architectures there could be issues to boot the Relax-and-Recover (ReaR) recovery system, cf. the below section "Recovery medium compatibility". You may need to prepare your replacement hardware with a small and simple system that supports kexec so that you can launch the ReaR recovery system via kexec, see the below section "Launching the ReaR recovery system via kexec".
Because zeroing out storage space may take a long time on big disks and preparing a system that supports kexec also needs time and testing, you should prepare your replacement hardware in advance when you have time to do it in a 'relaxed' way (cf. "Notes on the meaning of 'Relax' in 'Relax-and-Recover'" below).
Be prepared for the worst case
Be prepared that your system recovery fails to recreate your system. Be prepared for a manual recreation from scratch. Always have all information available that you need to recreate your particular system manually. Manually recreate your system on your replacement hardware as an exercise.
Let's face it: Deployment via the recovery installer is a must
The above "recommendations" are actually no nice to have recommendations but mandatory requirements.
As long as you install your system with a different toolset (installation system plus installer) than what you intend to use later in case of emergency and time pressure for disaster recovery, you do not have a real disaster recovery procedure.
It cannot really work when you install your system with a different toolset than what you use to re-install it.
For a real disaster recovery procedure you should use one and the same installation system and one and the same installer for all kinds of your installations.
At least you must use the same installation system and the same installer for your productive deployment installation and for your disaster recovery re-installation.
And even then still know your system well so that you are always prepared for a manual recreation from scratch.
Maintain a real disaster recovery procedure for your mission critical systems
When you have mission critical systems but you do not have a real disaster recovery procedure for them as described above (i.e. deployment via recovery installation plus continuous validation that your disaster recovery procedure actually works on your replacement hardware), your so-called "mission critical systems" cannot really be mission critical for you, because in fact you do not sufficiently care about your systems. Usually it is a dead end if your disaster recovery procedure fails to recreate your system on your replacement hardware after your original system was destroyed, see also the section about "Help and Support" below.
RPM packages for disaster recovery
There are two kinds of RPM packages which provide frameworks to set up your disaster recovery procedure so that you can recreate your basic operating system: rear and rear-SUSE.
The software in those packages is intended for experienced users and system admins. There is no easy user frontend (in particular there is no GUI) and in general software for disaster recovery is not really foolproof (it runs as 'root' and you need to know what it does).
Since SUSE Linux Enterprise 12 native disaster recovery with AutoYaST together with the new SUSE Installer is possible without the need for special packages like rear or rear-SUSE but that functionality is meanwhile deprecated (see below).
Generic usage of the plain SUSE installation system for backup and disaster recovery does not need any special software, neither rear or rear-SUSE nor AutoYaST and the SUSE Installer.
rear / rear116 / rear1172a / rear118a / rear23a
SUSE Linux Enterprise "rear..." RPM packages for disaster recovery with Relax-and-Recover (ReaR) via the SUSE Linux Enterprise High Availability Extension (SLE-HA):
- "rear" in SUSE Linux Enterprise 11 contains ReaR upstream version 1.10.0
- "rear116" contains ReaR upstream version 1.16 plus special adaptions for btrfs in SUSE Linux Enterprise 12 GA
- "rear1172a" contains ReaR upstream version 1.17.2 plus special adaptions for btrfs in SUSE Linux Enterprise 12 SP1 (rear116 does not work with the default btrfs setup in SUSE Linux Enterprise 12 SP1 because there are substantial changes compared to the default btrfs setup in SUSE Linux Enterprise 12 GA)
- "rear118a" contains ReaR upstream version 1.18 plus lots of ReaR upstream commits towards version 1.19 (basically rear118a is almost ReaR upstream version 1.19) in particluar it contains a SLE12-SP2-btrfs-example.conf file to support btrfs quota setup for snapper that is used since SUSE Linux Enterprise 12 SP2 and another main feature is UEFI support together with the new package ebiso that is used since ReaR version 1.18 for making a UEFI bootable ReaR recovery system ISO image on SUSE Linux Enterprise systems (all ReaR versions we provide up to SUSE Linux Enterprise 12 SP1 only support traditional BIOS, see the "Recovery medium compatibility" section below)
- "rear23a" contains ReaR upstream version 2.4 plus some later ReaR upstream commits. The rear23a RPM package originated from ReaR version 2.3 (therefore its RPM package name) and was initially provided only in SUSE Linux Enterprise 15 GA. Meanwhile a major rear23a RPM package update was done which contains now ReaR upstream version 2.4 plus some later ReaR upstream commits (up to the ReaR upstream git commit cc9e76872fb7de5ddd6be72d4008a3753046a528 cf. the rear23a RPM changelog). The RPM package name rear23a and version 2.3.a are kept so that an installed rear23a RPM package can be updated. In practice ReaR version 2.4 is basically required on POWER architectures (i.e. on 64-bit PPC64 and PPC64LE but not on the old 32-bit PPC which is not supported). ReaR version 1.18/1.19 should somewhat work on POWER but compared to what was enhanced and fixed at ReaR upstream since that time (March 2016) the rear118a RPM package behaves poor on POWER compared to the current (September 2018) rear23a RPM package.
See the ReaR upstream release notes at http://relax-and-recover.org/documentation/ for new features, bigger enhancements, and possibly backward incompatible changes in the various ReaR versions. The ReaR upstream release notes for ReaR version 2.4 http://relax-and-recover.org/documentation/release-notes-2-4 also contain the changes of former ReaR versions.
For one product like SUSE Linux Enterprise 12 we provide several ReaR versions in parallel so that users for whom version N does not support their particular needs can upgrade to version M, while on the other hand users who have a working disaster recovery procedure with version N do not need to upgrade. Therefore the package name contains the version and all packages conflict with each other to avoid that an installed version gets accidentally replaced with another version. See also the "Version upgrades for Relax-and-Recover" section below.
What "rear..." RPM packages are provided for what SUSE Linux Enterprise version:
- SUSE Linux Enterprise 11: rear, rear116 (since SLE11-SP3), rear23a (as additional package via maintenance update for SLE11-SP4)
- SUSE Linux Enterprise 12: rear116, rear1172a (since SLE12-SP1), rear118a (since SLE12-SP2), rear23a (as additional package via maintenance update for SLE12-SP2)
- SUSE Linux Enterprise 15: rear23a (plus maintenance update to its latest content)
openSUSE provides ReaR in a single RPM package named "rear".
What ReaR version is provided in what openSUSE version:
- openSUSE 11.4, 12.1, 12.2, 12.3: rear-1.10.0
- openSUSE 13.1: rear-1.14
- openSUSE 13.2: rear-1.16
- openSUSE Leap 42.1, 42.2, 42.3: rear-1.17.2
- openSUSE Leap 15.0, 15.1: rear-2.3
The current openSUSE "rear" RPM package is provided in the openSUSE Build Service project "Archiving", see https://build.opensuse.org/package/show/Archiving/rear and see also the "Version upgrades for Relax-and-Recover" section below.
rear-SUSE (outdated - was only available for SUSE Linux Enterprise 11)
rear-SUSE for disaster recovery with AutoYaST for SUSE Linux Enterprise 11.
Disaster recovery with Relax-and-Recover (ReaR)
Relax-and-Recover is a disaster recovery framework.
Relax-and-Recover is the de facto standard disaster recovery framework on Linux in particular for enterprise Linux distributions like Red Hat Enterprise Linux (RHEL) and SUSE Linux Enterprise Server (SLES) via the SUSE Linux Enterprise High Availability Extension (SLE-HA).
Relax-and-Recover is abbreviated ReaR (often misspelled 'Rear' or 'rear'). Regarding Relax-and-Recover the word 'rear' is mainly used when the program /usr/sbin/rear is meant (e.g. program calls like "rear mkbackup" and "rear recover") and in ReaR file and directory names (e.g. /etc/rear/local.conf). Also RPM package names are usually lowercase so that ReaR RPM packages are named 'rear'.
Relax-and-Recover is written entirely in the native language for system administration: as shell (bash) scripts.
Experienced users and system admins can adapt or extend the ReaR scripts if needed to make things work for their specific cases and - if the worst comes to the worst - even temporary quick and dirty workarounds are relatively easily possible - provided you know ReaR and your system well so that you are prepared for appropriate manual ad hoc intervention.
Professional services and support from Relax-and-Recover upstream are available (see http://relax-and-recover.org/support/).
How disaster recovery with ReaR basically works
Specify the ReaR configuration in /etc/rear/local.conf (cf. "man rear" and /usr/share/rear/conf/examples/ and /usr/share/rear/conf/default.conf) and then run "rear mkbackup" to create a backup.tar.gz on a NFS server and a bootable ReaR recovery ISO image for your system.
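For example, a minimal /etc/rear/local.conf for this kind of setup might look like the following (the NFS server name and the exported directory are placeholders, and depending on your particular system additional settings may be required, see the example files and default.conf mentioned above):
OUTPUT=ISO
BACKUP=NETFS
BACKUP_URL=nfs://your-NFS-server/nfs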
The ReaR recovery ISO image contains the ReaR recovery installation system with the ReaR installer that is specific for the system where it was made.
A recovery medium which is made from the ISO image is used to boot the ReaR recovery system on your replacement hardware (the ReaR recovery system runs in main memory via a ramdisk).
In the ReaR recovery system log in as root and run the ReaR installer via "rear recover" which does the following:
- Recreate the basic system storage, in particular disk partitioning with filesystems and mount points.
- Restore the backup from the NFS server.
- Install the boot loader.
Finally remove the ReaR recovery medium and reboot the recreated system.
In "sufficiently simple" cases it "just works" (provided you specified the right configuration in /etc/rear/local.conf for your particular case). But remember: There is no such thing as a disaster recovery solution that "just works". Therefore: When it does not work, you might perhaps change your system configuration to be more simple or you have to manually adapt and enhance the various bash scripts of ReaR to make it work for your particular case.
Notes on the meaning of 'Relax' in 'Relax-and-Recover'
Because there is no such thing as a disaster recovery solution that "just works" and because there is no disaster recovery without testing on actually available replacement hardware, the meaning of 'Relax' is not that one could easily configure /etc/rear/local.conf, just run "rear mkbackup", and simply relax.
The meaning of 'Relax' is: After an experienced admin has set it up (possibly with some needed adaptions) and after it was thoroughly tested, and as long as it is continuously validated that the recovery actually works on the replacement hardware (in particular after each change of the basic system), then one can relax.
Additionally the meaning of 'Relax' is that you can spend your time to carefully set up, test, and continuously validate your particular disaster recovery procedure in advance before a disaster happens when you have time for it and when you can do it step by step in a 'relaxed' way.
When later a real disaster happens, even a relatively inexperienced person can do the recovery on the replacement hardware (boot the ReaR recovery system, log in as root, run "rear recover", and finally reboot).
Furthermore the meaning of 'Relax' is that you need to be 'relaxed' and take your time to carefully set up (with some trial and error legwork) and properly validate your particular disaster recovery procedure, because eventually your particular disaster recovery installation will become your final system installation, cf. the "Essentials about disaster recovery with Relax-and-Recover presentation PDF" link below.
The limitation is what the special ReaR recovery system can do
The ReaR recovery system with the ReaR installer is totally different compared to the installation system on an openSUSE or SUSE Linux Enterprise install medium with the YaST installer and AutoYaST. This means when ReaR is used to recover your system, a totally different installer recreates your system. Therefore when the initial installation of the basic operating system from an openSUSE or SUSE Linux Enterprise install medium had worked, the special ReaR recovery system may not work in your particular case or it may work but recreate your system with some (possibly subtle but severe) differences.
For example:
The following is only an example of the general kind of issue. That particular issue no longer happens with current ReaR versions - nowadays other issues of that general kind appear.
In current SUSE systems disks are referenced by persistent storage device names like /dev/disk/by-id/ata-ACME1234_567-part1 instead of traditional device nodes like /dev/sda1 (see /etc/fstab /boot/grub/menu.lst /boot/grub/device.map).
If "rear recover" is run on a system with a new harddisk (e.g. after the disk had failed and was replaced) the reboot may fail because the persistent storage device names are different.
In this case ReaR shows a warning like "Your system contains a reference to a disk by UUID, which does not work".
The fix in the running ReaR recovery system is to switch to the recovered system via "chroot /mnt/local" and therein check in particular the files /etc/fstab, /boot/grub/menu.lst and /boot/grub/device.map and adapt their content (e.g. by replacing names like /dev/disk/by-id/ata-ACME1234_567-part1 with the matching device node like /dev/sda1). After changes in /boot/grub/menu.lst and /boot/grub/device.map the Grub boot loader should be re-installed via "/usr/sbin/grub-install".
See https://github.com/rear/rear/issues/22
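A rough sketch of that manual fix inside the running ReaR recovery system could look like the following (the device node /dev/sda is only an example and which files need to be adapted depends on your particular system):
chroot /mnt/local
vi /etc/fstab /boot/grub/menu.lst /boot/grub/device.map
/usr/sbin/grub-install /dev/sda
exit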
Alternatively: If your harddisk layout is sufficiently simple so that you do not need disks referenced by persistent storage device names, you could change your system configuration to be simpler by using traditional device nodes (in particular in /etc/fstab, /boot/grub/menu.lst and /boot/grub/device.map).
The same kind of issue (with different symptoms) can also happen with rear-SUSE.
Relax-and-Recover versus backup and restore
Relax-and-Recover (ReaR) complements backup and restore of files but ReaR is neither a backup software nor a backup management software and it is not meant to be one.
In general backup and restore of the files is external functionality for ReaR.
I.e. neither backup nor restore functionality is actually implemented in ReaR.
ReaR only calls an external tool that does the backup of the files during "rear mkbackup" and its counterpart to do the restore of the files during "rear recover" (by default that tool is 'tar').
It is your task to verify your backup is sufficiently complete to restore your system as you need it.
It is your task to verify your backup can be read completely to restore all your files.
It is your task to ensure your backup is consistent.
First and foremost the files in your backup must be consistent with the data that is stored in your ReaR recovery system (in particular what is stored to recreate the storage in files like var/lib/rear/layout/disklayout.conf). ReaR recreates the storage (disk partitions with filesystems and mount points) from the data that is stored in the ReaR recovery system and then restores the files from the backup into the recreated storage, which means in particular that restored config files must match the actually recreated system (e.g. the contents of the restored etc/fstab must match the actually recreated disk layout). Therefore after each change of the basic system (in particular after a change of the disk layout) "rear mkbackup" needs to be run to create a new ReaR recovery system together with a matching new backup of the files. See also below "What 'consistent backup' means".
Furthermore you may have to stop certain running programs (e.g. whatever services or things like that) that could change files which get included in your backup before your backup tool is run.
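As a hypothetical example (the service name is only a placeholder), a database service could be stopped while the backup is made so that its data files in the backup are consistent, and started again afterwards:
systemctl stop mariadb
rear -d -D mkbackup
systemctl start mariadb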
It is your task to ensure your backups are kept safe at a sufficiently secure place. In particular the place where ReaR writes a new backup (e.g. a NFS share or a USB disk) is not a really safe place to also keep old backups (arbitrary things might go wrong when writing there).
Regarding "counterpart to do the restore": To be able to call the restore tool during "rear recover" the restore tool and all what it needs to run (libraries, config files, whatever else) must be included in the ReaR recovery system where "rear recover" is run. For several backup and restore tools/solutions ReaR has already built-in functionality to get the restore tool and all what it needs to run in the ReaR recovery system.
Usually only basic support for the various backup tools is implemented in ReaR (e.g. plainly making a backup during "rear mkbackup" and plainly restoring it during "rear recover"). To what extent support for each individual backup tool is implemented in ReaR differs a lot, because support for each individual backup tool is implemented separately from the others. Therefore for some particular backup tools the current support in ReaR could be only "very basic" (cf. the below sections "How to adapt and enhance Relax-and-Recover" and "How to contribute to Relax-and-Recover"). In particular for most third party backup tools there is only support for plain backup restore during "rear recover" but no support at all to make a backup during "rear mkbackup", so "rear mkbackup" is useless and only "rear mkrescue" is useful for most third party backup tools ("rear mkbackup" creates the ReaR recovery system and makes a backup while "rear mkrescue" only creates the ReaR recovery system), see "man rear" for more information. To ensure your backup is consistent with the data that is stored in your ReaR recovery system you should make a new backup each time you create a new ReaR recovery system and also the other way round: Each time you make a new backup you should also create a new ReaR recovery system. You must create a new ReaR recovery system when the basic system changed (in particular after a change of the disk layout).
There is basically nothing in ReaR that deals in any further way with what to do with the backup except small things like NETFS_KEEP_OLD_BACKUP_COPY, see below.
Regarding NETFS_KEEP_OLD_BACKUP_COPY:
With the NETFS backup method ReaR writes its files (in particular the backup.tar.gz and the ReaR recovery system ISO image) into a mounted directory that belongs to a network file system (usually NFS).
With empty NETFS_KEEP_OLD_BACKUP_COPY="" a second "rear mkbackup" run overwrites the files in the NFS directory from a previous "rear mkbackup" run. This means when after a successful "rear mkbackup" run one does not save the files in the NFS directory to a permanently safe place, one has no backup if a second "rear mkbackup" run fails. In particular one has no backup while a second "rear mkbackup" run is overwriting the old backup.
With non-empty NETFS_KEEP_OLD_BACKUP_COPY="yes" a second "rear mkbackup" run will not overwrite the files in the NFS directory from a previous "rear mkbackup" run. Instead the second "rear mkbackup" run renames an existing NFS directory into *.old before it writes its files.
Note that "KEEP_OLD_BACKUP_COPY" functionality is not generally available for the various backup and restore tools/solutions where ReaR has built-in support to call their backup and restore functionality.
This means in general:
After a "rear mkbackup" run the user has to do on his own whatever is appropriate in his particular environment how to further deal with the backup and the ReaR recovery system ISO image and the ReaR log file and so on.
Version upgrades with Relax-and-Recover
When you have a working disaster recovery procedure, do not upgrade ReaR and do not change the basic software that is used by ReaR (like partitioning tools, filesystem tools, bootloader tools, ISO image creating tools, and so on).
For each ReaR version upgrade and for each change of a software that is used by ReaR you must carefully and completely re-validate that your particular disaster recovery procedure still works for you.
In contrast when a particular ReaR version does not work for you, try a newer version.
See below the section "First steps with Relax-and-Recover" wherefrom you could get newest ReaR versions or see Downloads at Relax-and-Recover upstream.
Because ReaR is only bash scripts (plus documentation) it means that, in the end, it does not really matter which version of those bash scripts you use. What matters is that the particular subset of ReaR's bash scripts that are actually run for your particular disaster recovery procedure work for you or can be adapted or extended to make it work with as little effort as possible.
When it does not work with an up-to-date ReaR release, try to change your basic system configuration to be more traditional (if possible try to avoid using newest features for your basic system) or you have to manually adapt and enhance ReaR to make it work for your particular case.
For any kind of innovation that belongs to the basic system (e.g. kernel, storage, bootloader, init) the new kind (e.g. udev, btrfs, Grub2 / UEFI / secure boot, systemd) will be there first and afterwards ReaR can adapt step by step to support it.
On the other hand this means: When you have a working disaster recovery procedure running and you upgrade software that is related to the basic system or you do other changes in your basic system, you must also carefully and completely re-validate that your particular disaster recovery procedure still works for you.
Testing current ReaR upstream GitHub master code
It is possible to have several ReaR versions in parallel, each one in its own separate directory, without conflicts between each other and without conflicts with a normally installed ReaR version via RPM package.
Accordingly you could try out the current ReaR upstream GitHub master code from within a separate directory as a test to find out if things work better with current ReaR upstream master code compared to your installed ReaR version.
Basically "git clone" the current ReaR upstream GitHub master code into a separate directory and then configure and run ReaR from within that directory like:
git clone https://github.com/rear/rear.git
mv rear rear.github.master
cd rear.github.master
vi etc/rear/local.conf
usr/sbin/rear -D mkbackup
Note the relative paths "etc/rear/" and "usr/sbin/".
In case of issues with ReaR it is recommended to try out the current ReaR upstream GitHub master code because that is the only place where ReaR upstream fixes bugs, i.e. bugs in released ReaR versions are not fixed by ReaR upstream, cf. the sections "Debugging issues with Relax-and-Recover" and "How to adapt and enhance Relax-and-Recover" below.
First steps with Relax-and-Recover
To get a sufficient understanding of how ReaR works you need to use it yourself, play around with it yourself, and do some trial and error experiments with it yourself, to get used to ReaR. This is a mandatory precondition to be able to do a real disaster recovery in a 'relaxed' way even in case of emergency and time pressure, cf. the section "Notes on the meaning of 'Relax' in 'Relax-and-Recover'" above.
For documentation about ReaR see "man rear" and /usr/share/rear/conf/default.conf and the files in /usr/share/doc/packages/rear*/ when you have a rear* RPM package installed or the files in doc/user-guide/ and "man -l doc/rear.8" when you use the current ReaR upstream GitHub master code (cf. above) or online for example:
- http://relax-and-recover.org/documentation/
- https://github.com/rear/rear/tree/master/doc/
- https://github.com/rear/rear/blob/master/doc/rear.8.adoc
- https://github.com/rear/rear/blob/master/doc/user-guide/relax-and-recover-user-guide.adoc
- https://raw.githubusercontent.com/rear/rear/master/usr/share/rear/conf/default.conf
- https://raw.githubusercontent.com/rear/rear/master/doc/rear-release-notes.txt
It is recommended to do your first steps with ReaR as follows:
1.
You need an NFS server machine where ReaR can store its backup and ISO image. For example create a /nfs/ directory and export it via NFS. To set up an NFS server you may use the YaST "NFS Server" module (provided by the "yast2-nfs-server" RPM package). You need to adapt the defaults in /etc/exports so that ReaR, which runs as root, can write its backup and ISO image there, like the following example. In particular export it as "rw":
/nfs *(rw,root_squash,sync,no_subtree_check)
Usually it should work with "root_squash" (so that on the NFS server a non-root user and group like "nobody:nogroup" is used) but perhaps in some cases you may even need "no_root_squash" (so that on the NFS server the user root can do anything with unlimited permissions). In case of "root_squash" the exported '/nfs' directory on the NFS server must have sufficient permissions so that "nobody:nogroup" can create/write files and sub-directories there and access files in those sub-directories, for example you may allow any users and groups to do that with those permissions:
drwxrwxrwx ... /nfs
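Without YaST, a possible command sequence on the NFS server for such a setup might look like the following (the service name for restarting the NFS server can differ between openSUSE and SUSE Linux Enterprise versions):
mkdir -p /nfs
chmod 0777 /nfs
echo '/nfs *(rw,root_squash,sync,no_subtree_check)' >>/etc/exports
exportfs -ra
systemctl restart nfs-server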
2.
On a sufficiently powerful host (minimum is a dual core CPU and 4GB main memory) create two identical virtual KVM/QEMU machines with full hardware virtualization (except CPU virtualization because that would make it far too slow) as follows:
- a single CPU
- 1-2 GB memory (RAM)
- a single 10-20 GB virtual harddisk
- a single virtual network interface card
- standard PC compatible architecture with traditional BIOS
Use standard PC compatible architecture (cf. the section about "Non PC compatible architectures" below) - in particular do not use UEFI (unless you want to see how your first steps with ReaR will fail). The meaning of "identical virtual KVM/QEMU machines" is as in the "fully compatible replacement hardware is needed" section above. In particular have the same virtual harddisk size and the same type of virtual network interface card on both machines.
If you use "virt-manager" to create the virtual KVM/QEMU machines, set the default "OS type" explicitly to "Generic" to get full hardware virtualization so that the virtual harddisk appears like a real harddisk as /dev/sda (if you use some kind of paravirtualization the harddisk appears as /dev/vda or similar).
It depends on the virtual network settings for KVM/QEMU on the host to what extent you can access remote machines from the virtual machines. At least the NFS server where ReaR will store its backup and ISO image and from where ReaR will restore its backup must be accessible from the virtual machines. The NFS server must run on the host if the virtual network is in so-called "isolated mode" (using private IP addresses of the form 192.168.nnn.mmm) where virtual machines can only communicate with each other and with the host but are cut off from the outer/physical network. Actually no physical network is needed for a virtual network in "isolated mode", which means you can do your first steps with ReaR on a single isolated computer that acts as host for the virtual machines and as NFS server. In such an isolated private network the IP address of the host is usually something like 192.168.100.1 or 192.168.122.1 and virtual machines usually get their IP addresses automatically assigned via a special DNS+DHCP server "dnsmasq" that is automatically configured and started as needed and only used for such virtual networks.
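If you prefer the command line over "virt-manager", a sketch of creating such a test machine with "virt-install" could look like the following (option names and defaults can differ between virt-install versions, and the machine name and the path to the install medium ISO are placeholders):
virt-install --name rear-test-1 \
 --memory 2048 --vcpus 1 \
 --disk size=15 \
 --network network=default \
 --os-variant generic \
 --cdrom /path/to/your/install-medium.iso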
3.
On one virtual machine install at least SLES11-SP4 (SLES10 is not supported) or SLES12-SP2 or SLES15 into a single ext3 or ext4 filesystem. For your first steps with ReaR keep it simple and do not use several partitions (like one for the system and an additional "/home" partition). In particular do not use the complicated btrfs default structure in SLES12 and SLES15 - unless you prefer to deal with complicated issues during your first steps with ReaR.
4.
For your first steps with ReaR use a test system that is small (for small and fast backup and restore) but still normally usable with the X Window System. For example install only these software patterns:
- Base System
- Minimal System (Appliances)
- X Window System
Additionally install the package "MozillaFirefox" as an example application to check that the system is normally usable before and after recovery. In particular have a package installed that provides the 'dhclient' program that is needed for DHCP in the ReaR recovery system. In recent openSUSE and SLES versions 'dhclient' is provided by the "dhcp-client" RPM package. Furthermore install the package "lsb-release" which is used by ReaR. It is recommended to also install a package that provides the traditional (meanwhile deprecated) networking tools like 'ifconfig / netstat / route' and so on. In recent openSUSE and SLES versions those tools are still provided by the "net-tools-deprecated" RPM package. Finally the plain text editor 'vi' is needed by ReaR which is provided by the "vim" RPM package.
5.
For your first steps with ReaR keep it simple and use only a single network interface "eth0" with DHCP.
6.
Install an up-to-date ReaR version (at least ReaR version 2.4 preferably via the "rear23a" RPM package in SLES11-SP4 or SLES12-SP2 or SLES15). Current ReaR versions are available via the openSUSE build service projects "Archiving" and "Archiving:Backup:Rear" for direct RPM download from
- http://download.opensuse.org/repositories/Archiving/
- http://download.opensuse.org/repositories/Archiving:/Backup:/Rear/
Alternatively you may use the current ReaR upstream GitHub master code as described in the above section "Testing current ReaR upstream GitHub master code". Of course the current ReaR upstream GitHub master code is under continuous development so sometimes it may not work.
7.
Set up /etc/rear/local.conf by using /usr/share/rear/conf/examples/SLE11-ext3-example.conf as template (copy it onto /etc/rear/local.conf) and adapt it as you need (read the comments in that file and see "man rear" and /usr/share/rear/conf/default.conf). If you do not have a /usr/share/rear/conf/examples/SLE11-ext3-example.conf file, install a more up-to-date ReaR version (at least ReaR version 2.4).
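For example:
cp /usr/share/rear/conf/examples/SLE11-ext3-example.conf /etc/rear/local.conf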
8.
Check that MozillaFirefox (or whatever example application you use) is normally usable on the system and do some user-specific settings (e.g. save some bookmarks in MozillaFirefox). It depends on the networking settings for KVM/QEMU on the host of the virtual machine whether or not you can access outer networks (like https://www.suse.com on the Internet) from within the virtual machine or if you can only access files locally on the virtual machine (like /usr/share/pixmaps).
9.
Now run
rear -d -D mkbackup
On your NFS server machine you should get a 'backup.tar.gz' and a 'rear-hostname.iso' in a 'hostname' sub-directory of its exported '/nfs' directory (cf. above about the NFS server setup).
10.
Shut down the system where "rear -d -D mkbackup" was run to simulate that this system got destroyed.
11.
Boot the other virtual machine with that rear-hostname.iso and select on ReaR's boot screen "recover hostname"
(i.e. use the manual recovery - not the automated recovery) and log in as root (no password).
12.
Now the ReaR recovery system runs on the other virtual machine. Therein run the ReaR recovery installer with
rear -d -D recover
You should get the system recreated on the other virtual machine.
13.
Shut down the ReaR recovery system and reboot the recreated system.
14.
Check that MozillaFirefox (or whatever example application you use) is still normally usable in the recreated system and check that your user-specific settings (e.g. the bookmarks in MozillaFirefox) still exist.
You can run "rear recover" from remote via ssh as follows:
In /etc/rear/local.conf set
USE_DHCLIENT="yes"
and something like
SSH_ROOT_PASSWORD="rear"
Never use your original root password here.
On your first virtual machine run
rear -d -D mkbackup
Boot the other virtual machine with the rear-hostname.iso and select on ReaR's boot screen "recover hostname" (i.e. use the manual recovery - not the automated recovery) and log in as root (no password).
Type "ifconfig" or "ip addr" to see the IP in the ReaR recovery system and log in from remote via ssh by using the SSH_ROOT_PASSWORD value and then run
rear -d -D recover
Debugging issues with Relax-and-Recover
Because ReaR is written entirely as bash scripts, debugging ReaR is usual bash debugging.
To debug ReaR run it both with the '-d' option (log debug messages) and with the '-D' option (debugscript mode) to log commands and their arguments as they are executed (via 'set -x'), e.g.: "rear -d -D mkbackup" and "rear -d -D recover".
Afterwards inspect the ReaR log file for further analysis.
The ReaR log files get stored in the /var/log/rear/ directory.
When "rear -d -D recover" finishes the log file is copied into the recovered system there either into the /root/ directory or for newer ReaR versions into the separated /var/log/rear/recover/ directory to keep the log file from the last recovery safe and separated from other ReaR log files so that the log file from the last recovery can be analyzed at any later time if needed.
When "rear -d -D recover" fails, you need to save the log file out of the ReaR recovery system (where "rear -d -D recover" was run and where it had failed) before you shut down the ReaR recovery system - otherwise the log file would be lost (because the ReaR recovery system runs in a ramdisk). Additionally the files in the /var/lib/rear/ directory and in its sub-directories in the ReaR recovery system (in particular /var/lib/rear/layout/disklayout.conf and /var/lib/rear/layout/diskrestore.sh) are needed to analyze a "rear -d -D recover" failure. See the "First steps with Relax-and-Recover" section above how to access the ReaR recovery system from remote via ssh so that you can use 'scp' to get files out of the ReaR recovery system.
To analyze and debug a "rear recover" failure the following information is mandatory:
- ReaR version ("/usr/sbin/rear -V")
- Operating system version ("cat /etc/os-release" or "lsb_release -a" or "cat /etc/rear/os.conf")
- ReaR configuration files ("cat /etc/rear/local.conf" and/or "cat /etc/rear/site.conf")
- Hardware (PC or PowerNV BareMetal or ARM) or virtual machine (KVM guest or PowerVM LPAR) of the original system and if different also what is used as replacement hardware (or virtual machine)
- System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device) of the original system (must be same for what is used as replacement system)
- Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot) on the original system (must be same for what is used on the replacement hardware)
- Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe) on the original system and if somewhat different also what is used on the replacement hardware (it should be as similar as possible to the original storage)
- The output of "lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,LABEL,SIZE,MOUNTPOINT" on the original system
- The output of "findmnt -a -o SOURCE,TARGET,FSTYPE -t btrfs,ext2,ext3,ext4,xfs,reiserfs,vfat" on the original system
- Debug log file of "rear -d -D mkbackup" or "rear -d -D mkrescue" (in /var/log/rear/) that matches the "rear recover" failure (i.e. debug log from the original system of the "rear -d -D mkbackup" or "rear -d -D mkrescue" command that created the ReaR recovery system where "rear recover" failed)
- Debug log file of "rear -d -D recover" (in /var/log/rear/) from the ReaR recovery system where "rear recover" failed (see above how to save the log file out of the ReaR recovery system)
- Contents of the /var/lib/rear/ directory and in its sub-directories from the ReaR recovery system where "rear recover" failed (see above how to save files out of the ReaR recovery system)
To only show what scripts would be run (i.e. what scripts would be "sourced" by the ReaR main script /usr/sbin/rear) for a particular rear command (without actually executing them), use the '-s' option (simulation mode), e.g.: "rear -s mkbackup" and "rear -s recover".
How to adapt and enhance Relax-and-Recover
Because ReaR is written entirely as bash scripts, adapting and enhancing ReaR is basically "usual bash scripting".
Often bash scripting is used primarily as some kind of workaround to get something done in a quick and dirty way.
Do not adapt and enhance ReaR in this way - unless you know for sure that your particular adaptions and enhancements could never ever be useful for anybody else, because you know for sure that nobody else could also be hit by the particular issue for which you made your particular adaptions and enhancements.
You got ReaR as free software and you benefit from it.
If you have an issue with ReaR, adapt and enhance it so that also others could benefit from your adaptions and enhancements.
This means you should contribute to ReaR upstream (see How to contribute to Relax-and-Recover) as follows:
How to contribute to Relax-and-Recover
- report your issue at ReaR upstream so that also others know about it: https://github.com/rear/rear/issues
- implement your adaptions and enhancements in a backward compatible way so that your changes do not cause regressions for others
- provide comments in the source code of your adaptions and enhancements that explain what you did and why you did it so that others can easily understand the reasons behind your changes (even if all is totally obvious for you, others who do not know about your particular use case or do not have your particular environment may understand nothing at all about your changes)
- follow the ReaR coding style guide: https://github.com/rear/rear/wiki/Coding-Style
- submit your adaptions and enhancements to ReaR upstream so that others benefit from your work: http://relax-and-recover.org/development/
When you submit your adaptions and enhancements so that ReaR upstream can accept them you will benefit even more from ReaR because this is the only possible way that your adaptions and enhancements will be available also in further ReaR releases and that others who also benefit from your adaptions and enhancements could keep your adaptions and enhancements up to date for future ReaR releases.
In contrast when you do your adaptions and enhancements only on your own, you are left on your own.
If you have an issue with ReaR, but you are unable to adapt and enhance it yourself, you may let others do it for you: http://relax-and-recover.org/support/sponsors
Disaster recovery with rear-SUSE / RecoveryImage (outdated)
For SUSE Linux Enterprise 11 the rear-SUSE package provides the bash script RecoveryImage which creates a bootable ISO image to recover your system.
Experienced users and system admins can adapt or extend the RecoveryImage script to match even special needs.
To create the bootable ISO image RecoveryImage usually does the following:
- Run "rear mkbackuponly" to store a backup.tar.gz on a NFS server.
- Run AutoYaST clone_system.ycp to make an autoinst.xml file.
- Make a bootable system recovery ISO image which is based on an install medium, for example a SUSE Linux Enterprise install DVD plus autoinst.xml so that AutoYaST can recover this particular system. In particular a so called 'chroot script' is added to autoinst.xml which is run by AutoYaST to restore the backup from the NFS server.
RecoveryImage has several command line options to specify various kinds of alternative behaviour to match various kinds of different disaster recovery procedures (see "man RecoveryImage").
A recovery medium which is made from the ISO image would run AutoYaST with autoinst.xml to recreate the basic system, in particular the partitioning with filesystems and mount points.
Then AutoYaST runs the 'chroot script' to fill in the backup data into the recreated basic system.
After the backup was restored, AutoYaST installs the boot loader.
Then the recreated system boots for its very first time and AutoYaST does the system configuration, in particular the network configuration. Finally the configured system moves forward to its final runlevel so that all system services should then be up and running again.
rear-SUSE uses the backup method of ReaR (via "rear mkbackuponly") but the recovery image is made in a totally different way.
I.e. same backup but totally different way of system recovery.
With rear-SUSE the recovery of the basic system (i.e. partitioning, filesystems, mount points, boot loader, network configuration,...) is delegated to AutoYaST and AutoYaST delegates the particular tasks to the matching YaST modules.
The crucial point is that autoinst.xml controls what AutoYaST does so that via autoinst.xml experienced users and system admins can control how their particular systems are recreated.
In "sufficiently simple" cases it "just works", but remember: There is no such thing as a disaster recovery solution that "just works". Therefore: When it does not work, you might perhaps change your system configuration to be more simple or you have to manually adapt and enhance your autoinst.xml to make it work for your particular case.
Basic reasoning behind RecoveryImage
- The recovery medium is based on a pristine openSUSE or SUSE Linux Enterprise install medium. Therefore when the initial installation of the basic operating system from an openSUSE or SUSE Linux Enterprise install medium had worked in your particular case, it should also be possible to recreate your particular basic operating system from the recovery medium.
- AutoYaST can be used for an automated installation of various different kind of systems. Therefore with an appropriate autoinst.xml it should also be possible to recreate (almost) any system, in particular your basic operating system.
The limitation is what AutoYaST via the matching YaST modules can do
For example:
It depends on the particular openSUSE or SUSE Linux Enterprise product which filesystems are supported by YaST.
If an unsupported filesystem is used on your system, AutoYaST (via the matching YaST module) cannot recreate this filesystem.
The same kind of issue also happens with ReaR. As of this writing (Feb. 2016) ReaR supports those filesystems: ext2, ext3, ext4, vfat, xfs, reiserfs, btrfs.
Some examples of filesystems which are not supported by YaST or ReaR (to only name some more known ones): AFS, GFS, GPFS, JFFS, Lustre, OCFS2, StegFS, TrueCrypt, UnionFS,...
In many cases this means removing sections regarding unsupported filesystems from autoinst.xml, so that whatever in your system is related to such filesystems cannot be recreated. In this case you need to manually recreate such filesystems and what depends on them (at least all files on unsupported filesystems).
Native disaster recovery with AutoYaST (deprecated)
AutoYaST together with the new SUSE Installer in SUSE Linux Enterprise 12 provide first basic features so that experienced users and system admins can implement native disaster recovery with AutoYaST without the need for additional special stuff like the rear or rear-SUSE RPM packages.
Native disaster recovery with AutoYaST basically works when you know how to do it, but it has some rough edges which are described in more detail below.
A precondition before you set up native disaster recovery with AutoYaST is that you are a somewhat experienced AutoYaST user. It is recommended that you have already installed some systems with AutoYaST before you do the next step and set up native disaster recovery with AutoYaST.
It is a precondition for native disaster recovery with AutoYaST that you can install the system with AutoYaST. If AutoYaST cannot install a system, it also cannot be used for disaster recovery (cf. "Basic reasoning behind" and "The limitation is what the special ReaR recovery system can do" above). Native disaster recovery with AutoYaST is the direct successor of the ideas behind disaster recovery with rear-SUSE/RecoveryImage.
What is this feature about?
The idea is that a system administrator wants to recover/recreate a system from a backup which is a tarball of the whole system. The recreated system uses the same (or fully compatible) hardware. AutoYaST is used only for partitioning with filesystems and mount points and for bootloader installation. In this scenario the administrator does not want AutoYaST to touch the files at all after they have been restored from the tarball. For example, he does not want AutoYaST to reconfigure network, users, services, language, keyboard, or anything else.
How is this feature used?
To tell AutoYaST that you want this to happen, you have to take the autoinst.xml that was created when the system was initially installed and modify this file in some aspects (see below). Furthermore you must make the backup tarball available via network and also have to prepare a shell script that does the actual work (in particular the actual backup restore).
How does this work step by step?
1.
Install a SUSE Linux Enterprise 12 system and keep a copy of the original /root/autoinst.xml (which is created automatically by default at the end of the installation) for later use.
2.
Configure the system (e.g. network, users, services, language, keyboard, and so on) and verify that the system works as you need it (also after a reboot).
3.
Make a backup of the whole system and store the backup on another host in your internal network, e.g. on an NFS share, using a command like "tar -czvf /path/to/nfs/share/backup.tar.gz --one-file-system /" (for btrfs see below).
4.
Create a backup restore script that does the actual work (see below), store it on another host in your internal network, and make it available for an AutoYaST installation (e.g. via HTTP, see below).
5.
Modify your copy of the original autoinst.xml (see below), store it on another host in your internal network, and make it available for an AutoYaST installation (e.g. via HTTP, see below).
6.
Reinstall your system with AutoYaST. Provide the "autoyast" parameter for the installation to specify the location of your modified autoinst.xml (e.g. "autoyast=http://my_internal_http_server/autoinst.xml", see below).
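For example the boot-screen commandline for step 6 could look like the following (a sketch - the server name is the placeholder used throughout this article and "netsetup=dhcp" matches the linuxrc examples further below):
netsetup=dhcp autoyast=http://my_internal_http_server/autoinst.xml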
What is special about backing up files on btrfs?
When doing "tar -czvf /path/to/nfs/share/backup.tar.gz --one-file-system /" only files that are on the same filesystem as "/" will be archived.
Because btrfs subvolumes appear as filesystem boundaries, files in btrfs subvolumes are excluded by "tar --one-file-system", so such files must be explicitly listed to be included in the backup, like:
tar -czvf ... --one-file-system / /path/to/subvolume1 /path/to/subvolume2 ...
For the SUSE Linux Enterprise 12 GA btrfs subvolumes the command may look like:
tar -czvf ... --one-file-system / /home /usr/local /opt /srv /tmp \
    /var/spool /var/opt /var/lib/pgsql /var/lib/mailman /var/lib/named \
    /var/tmp /var/log /var/crash /boot/grub2/x86_64-efi /boot/grub2/i386-pc
See the "btrfs" section below why btrfs snapshot subvolumes should not be included in a normal backup. Perhaps for each btrfs snapshot subvolume a separated backup may make sense.
What does a backup restore script look like?
Its mandatory functionality is to restore the backup. When the backup is on an NFS share, it mounts the NFS share and then untars the backup into "/mnt", the location where the filesystem(s) of the to-be-installed system are mounted during installation with AutoYaST when the script is run as a so-called "software image installation script" that replaces the usual software installation (see below).
In general a backup should be complete (i.e. it should contain all files). But some files must not be restored when a system is recreated, e.g. /etc/fstab, which is created correctly for the new system by the AutoYaST installer. Therefore such files are excluded from being restored.
All other functionality is optional or needed to work around the current rough edges in native disaster recovery with AutoYaST.
After the backup restore it does (in its current state) some hacks: it creates the initrd and installs the bootloader. This is hacky because these are tasks that AutoYaST wants to do and currently there is no way to tell it not to do them. The bootloader part of the script can be deactivated and things still work (in the tested scenarios), but for some reason the creation of the initrd by AutoYaST did not work properly, so it is needed in the script for now. AutoYaST will try to create the initrd anyway and will always show an error message which can be ignored if you let the script do it.
Here is a tentative example of such a shell script. It has worked in a few tested scenarios. It has some debugging triggers where it can stop until a particular file is manually deleted so that one can inspect the recreated system at those states.
#! /bin/bash
set -x
# adapt as you need:
backup_nfs_share="my_internal_nfs_server:/path/to/nfs/share"
backup_nfs_file="backup.tar.gz"
make_initrd="yes"
#bootloader_install_device="/dev/sda2"
#backup_wait_before_restore="/var/tmp/backup_wait_before_restore"
#backup_wait_after_restore="/var/tmp/backup_wait_after_restore"
backup_mountpoint="/var/tmp/backup_mountpoint"
backup_restore_exclude="/var/tmp/backup_restore_exclude"
# wait so that one can inspect the system before the actual work is done here:
test -n "$backup_wait_before_restore" && touch $backup_wait_before_restore
while test -e "$backup_wait_before_restore"
do  echo "waiting until $backup_wait_before_restore is removed (sleeping 10 seconds)"
    sleep 10
done
# start the actual work:
mkdir $backup_mountpoint || exit 1
# mount the NFS share where the backup_nfs_file is stored:
mount -o nolock -t nfs $backup_nfs_share $backup_mountpoint || exit 2
# change from the installation system to the new installed target system filesystem root:
cd /mnt || exit 3
# exclude old files from the backup to avoid damage of the new installed target system:
echo 'etc/fstab' >$backup_restore_exclude || exit 4
echo 'etc/mtab' >>$backup_restore_exclude || exit 5
# exclude var/adm/autoinstall to have the current logs from this run of this script:
echo 'var/adm/autoinstall' >>$backup_restore_exclude || exit 6
# exclude var/log/YaST2 to avoid a hang up in YaST at 'gzip /mnt/var/log/YaST2/y2log-1':
echo 'var/log/YaST2' >>$backup_restore_exclude || exit 7
# dump the files from the backup into the new installed target system filesystem root
# and have the modification time of the files as the time when the files were extracted
# so that one can compare what files might be overwritten later when AutoYaST proceeds:
tar -xmzvf $backup_mountpoint/$backup_nfs_file -X $backup_restore_exclude 2>&1 || exit 8
# clean up stuff from backup restore:
umount $backup_mountpoint
rmdir $backup_mountpoint
rm $backup_restore_exclude
# make initrd verbosely in the target system filesystem root:
if test -n "$make_initrd"
then chroot /mnt /bin/bash -c "dracut -f -v" || exit 9
fi
# install bootloader in the target system filesystem root:
if test -n "$bootloader_install_device"
then # make bootloader configuration in the target system filesystem root:
     chroot /mnt /bin/bash -c "grub2-mkconfig -o /boot/grub2/grub.cfg" || exit 10
     # install bootloader in the target system filesystem root:
     chroot /mnt /bin/bash -c "grub2-install --force $bootloader_install_device" || exit 11
fi
# wait so that one can inspect the result before AutoYaST proceeds:
test -n "$backup_wait_after_restore" && touch $backup_wait_after_restore
while test -e "$backup_wait_after_restore"
do  echo "waiting until $backup_wait_after_restore is removed (sleeping 10 seconds)"
    sleep 10
done
# exit:
exit 0
Currently AutoYaST does not provide a feature for a script to give feedback to AutoYaST while it runs, so the user cannot be informed by AutoYaST about the progress of a running script (e.g. via a progress bar or even via meaningful feedback messages directly from the script). Currently there is only an unfortunate non-progressing progress bar that shows a static "0%" next to a message like "Executing autoinstall scripts in the installation environment" (or similar), even though the script is busy restoring the backup, because the script cannot trigger the progress bar.
What does an autoinst.xml for native disaster recovery with AutoYaST look like?
Each machine needs its specific autoinst.xml file.
Create your autoinst.xml for native disaster recovery with AutoYaST from the original /root/autoinst.xml that was created by the installer of your system (see above).
If you changed your basic system (in particular partitioning, filesystems, mount points, bootloader) and you need a new autoinst.xml that matches your current system, first keep your /root/autoinst.xml that was created during the initial system installation via "mv /root/autoinst.xml /root/autoinst.initial-installation.xml" and then create an up-to-date /root/autoinst.xml via "yast2 --ncurses clone_system".
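For example (the two commands from the paragraph above as they would be run):
mv /root/autoinst.xml /root/autoinst.initial-installation.xml
yast2 --ncurses clone_system
Afterwards /root/autoinst.xml matches the current system and can be used as the starting point for the modifications described below.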
The basic idea when creating an autoinst.xml for native disaster recovery with AutoYaST is to remove everything that is not needed for native disaster recovery with AutoYaST and add special stuff that is needed for native disaster recovery with AutoYaST. In this case AutoYaST is used only for partitioning with filesystems and mount points and for bootloader installation (see above).
Here is an example of how an autoinst.xml for native disaster recovery with AutoYaST may look. It shows all top-level sections that are usually needed. All other top-level sections have been completely removed. In particular all top-level sections that configure the system, except partitioning and bootloader, have been completely removed (e.g. firewall, groups, kdump, keyboard, language, login_settings, networking, ntp-client, proxy, services-manager, timezone, user_defaults, users and so on).
Text in brackets '[...]' is only a comment here (i.e. it is not in the actual autoinst.xml) to point out where usually no changes are needed or where special stuff for native disaster recovery with AutoYaST is added.
<?xml version="1.0"?>
<!DOCTYPE profile>
<profile xmlns="http://www.suse.com/1.0/yast2ns" xmlns:config="http://www.suse.com/1.0/configns">
  <add-on>
    [usually no changes needed - leave the content as is]
  </add-on>
  <bootloader>
    [usually no changes needed - leave the content as is]
  </bootloader>
  <deploy_image>
    [usually no changes needed - leave the content as is]
  </deploy_image>
  <general>
    [add the following line]
    <minimal-configuration config:type="boolean">true</minimal-configuration>
    <ask-list config:type="list"/>
    <mode>
      <confirm config:type="boolean">false</confirm>
      [add the following line]
      <second_stage config:type="boolean">false</second_stage>
    </mode>
    [usually no further changes needed - leave the other content as is]
  </general>
  <partitioning config:type="list">
    [usually no changes needed - leave the content as is]
  </partitioning>
  <report>
    [usually no changes needed - leave the content as is]
  </report>
  <software>
    [replace the content with the following - adapt the values as you need]
    <image>
      <image_location>http://my_internal_http_server</image_location>
      <image_name>backup-restore-script</image_name>
      <script_location>http://my_internal_http_server/backup-restore.sh</script_location>
    </image>
  </software>
</profile>
Therein the added special stuff for native disaster recovery with AutoYaST is:
<minimal-configuration config:type="boolean">true</minimal-configuration>
<second_stage config:type="boolean">false</second_stage>
<image>
  <image_location>http://my_internal_http_server</image_location>
  <image_name>backup-restore-script</image_name>
  <script_location>http://my_internal_http_server/backup-restore.sh</script_location>
</image>
The first two (i.e. minimal-configuration=true and second_stage=false) are needed to prevent AutoYaST from touching the files after they have been restored from the tarball. Nevertheless during the actual AutoYaST installation several messages appear that files are copied into the installed system or modified therein. This is another rough edge of native disaster recovery with AutoYaST, which is currently under development.
The third one (i.e. the image script_location) replaces the usual software installation via RPM packages with a so-called "image installation" where a script is run that "does the right thing" to install the whole software as if it was an image. In this case the image is the backup.tar.gz file.
How can one make files available via HTTP? (e.g. for AutoYaST or for inst-sys)
On another host in your internal network install the Apache web server software (apache2 packages), store the files in the web server default directory /srv/www/htdocs/, and start the web server (e.g. via "rcapache2 start" or, for systemd, via the appropriate systemctl commands). Check on the system that should be recreated via AutoYaST that the files can be downloaded with commands like
wget http://my_internal_http_server/autoinst.xml
wget http://my_internal_http_server/backup-restore.sh
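The server-side part could look like the following (a sketch, assuming a SUSE system where the apache2 package provides the web server and /srv/www/htdocs/ is its default document root - adapt names and paths as you need):
zypper install apache2
cp autoinst.xml backup-restore.sh /srv/www/htdocs/
systemctl enable apache2
systemctl start apache2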
Other possible ways to do native disaster recovery with AutoYaST
The above is only one particular way to do native disaster recovery with AutoYaST. AutoYaST provides various ways to run scripts at various stages during installation. Therefore a backup restore script can run at various alternative stages during an AutoYaST installation.
For example one could leave the software section in autoinst.xml as is to let AutoYaST install all RPM software packages of the complete system and afterwards restore a backup that only contains those files that are added or changed compared to the original files in the RPM software packages. In this case the restore script could run as a so called "chroot script" (with "chrooted=false" or "chrooted=true" as needed in the particular case).
Alternatively, when the backup is complete and the restore script is not run as a software image installation script (as described above) but later (e.g. as "chroot script"), there is no need to let AutoYaST first install RPM software packages and later overwrite all of them when the backup is restored. In this case the software installation might be completely skipped in autoinst.xml like the following:
<software>
  <image>
    <script_location>file:///bin/true</script_location>
  </image>
</software>
This way only /bin/true is run as a dummy to "install" the software which lets AutoYaST effectively skip the whole software package installation successfully.
Furthermore one could run the restore script at a very late stage of the installation (as a so-called AutoYaST "post-install script" or "init script") to avoid as much as possible that AutoYaST changes any files after they have been restored from the tarball. Depending on the stage at which the restore script is run, everything the restore script needs may have to be installed first. In such cases it is not possible to completely skip the software installation, because at least a minimal system that provides everything the restore script needs must be installed via the software section in autoinst.xml (or perhaps via an additional preparatory restore/install script that is run at an early stage of the installation).
Venturous users needed for native disaster recovery with AutoYaST
When you are experienced with AutoYaST and in particular when you already use AutoYaST to install your systems, it looks reasonable to also do your disaster recovery with AutoYaST.
In the end when there are no users who really use and demand native disaster recovery with AutoYaST, it will die off.
Native disaster recovery with AutoYaST is deprecated
Because up to now (as of this writing, 5 Feb. 2021) there was basically zero feedback about native disaster recovery with AutoYaST, it is "deprecated", which means no further effort will be made to get it really working. In particular nothing was done to keep it working for SUSE Linux Enterprise 15.
Generic usage of the plain SUSE installation system for backup and recovery
When you use the above described ways of doing disaster recovery, you work on top of a monstrous mountain of sophisticated machinery (Relax-and-Recover or AutoYaST) that (hopefully) does what you want.
In contrast to taming complicated "monsters", generic usage of the plain SUSE installation system goes back to the roots, which means:
An experienced admin writes a script that directly calls the low-level commands that are appropriate for his particular case.
On the one hand this complies with the KISS principle "keep it simple and straightforward" (cf. http://en.wikipedia.org/wiki/KISS_principle) and it avoids what is best described in RFC 1925 (cf. http://tools.ietf.org/html/rfc1925) as "It is always possible to add another level of indirection", because the primary intent behind generic disaster recovery is simplicity and control (cf. https://hackweek.suse.com/12/projects/784).
But on the other hand there is a price to pay: It is for experts who know how to work with low-level commands (which again shows that there is no such thing as a disaster recovery solution that "just works").
The following description was done on SUSE Linux Enterprise 12 GA. Because things change all the time, there are likely differences in later SUSE Linux Enterprise versions, so adaptations are needed to make this work there.
Generic disaster recovery with the plain SUSE installation system
Basically the idea is to boot the plain SUSE installation system (inst-sys) but to not let it run YaST. Instead inst-sys runs a specific selfmade script that does the usual disaster recovery steps:
- create partitioning with filesystems and mount points
- restore the files backup from an NFS server
- install initrd and boot loader
- reboot
For example to recover a SUSE Linux Enterprise 12 GA system with the default btrfs subvolume structure on a single 12 GB harddisk such a script could be like the following (attention - very long lines could be shown wrapped in your browser):
#! /bin/bash # Print commands and their arguments as they are executed: set -x # Print shell input lines as they are read: set -v # Exit immediately if a command exits with a non-zero status: set -e # Pipeline returns last command to exit with non-zero status or zero otherwise: set -o pipefail # Disable file name generation (globbing): set -f # Treat unset variables as an error when substituting: set -u # Export all variables which are modified or created automatically to have them available in any sub-shell: set -a # Have a clean environment: export PATH="/sbin:/usr/sbin:/usr/bin:/bin" export LC_ALL="POSIX" export LANG="POSIX" umask 0022 # Make sure the effective user ID is 0 (i.e. this script must run as root): test "0" != "$( id -u )" && { echo "Error: Need 'root' privileges but the effective user ID is not '0'." ; false ; } # Start logging: my_name=${0##*/} starting_timestamp=$( date +%Y%m%d%H%M%S ) log_file=$my_name.$starting_timestamp.log # Have a new file descriptor 3 which is a copy of the stdout file descriptor: exec 3>&1 # Have a new file descriptor 4 which is a copy of the stderr file descriptor: exec 4>&2 # Have stdout on the terminal and also in the log file: exec 1> >( exec -a $my_name:tee tee -a $log_file ) logging_tee_pid=$! # Make stderr what stdout already is (i.e. terminal and also in the log file): exec 2>&1 # Adapt the following variables as you need. # Be careful and verify that your settings work with the code below. # This script is not at all foolproof. # It is for experts who know how to work with low-level commands. command_before_actual_work="" #command_before_actual_work="bash -c 'echo exit this sub-shell to start the recovery ; exec bash -i'" # Partitioning: harddisk_device="/dev/sda" harddisk_disklabel="msdos" swap_partition=${harddisk_device}1 swap_partition_begin_cylinder="0" swap_partition_end_cylinder="191" system_partition=${harddisk_device}2 system_partition_begin_cylinder="191" system_partition_end_cylinder="last_cylinder" # Filesystems: swap_partition_make_filesystem_command="mkswap -f" system_partition_filesystem="btrfs" system_partition_make_filesystem_command="mkfs.btrfs -f" # Mountpoint in the installation system of the target system filesystem root: target_system_filesystem_root="/mnt" # Btrfs subvolumes: btrfs_default_subvolume_parent_directory="" btrfs_default_subvolume="@" btrfs_subvolumes_parent_directories="@/boot/grub2 @/usr @/var/lib" btrfs_subvolumes="@/boot/grub2/i386-pc @/boot/grub2/x86_64-efi @/home @/opt @/srv @/tmp @/usr/local @/var/crash @/var/lib/mailman @/var/lib/named @/var/lib/pgsql @/var/log @/var/opt @/var/spool @/var/tmp @/.snapshots" # Backup restore: backup_nfs_share="my_internal_nfs_server:/path/to/nfs/share" backup_nfs_file="backup.tar.gz" command_before_backup_restore="" #command_before_backup_restore="bash -c 'echo exit this sub-shell to start the backup restore ; exec bash -i'" command_after_backup_restore="" #command_after_backup_restore="bash -c 'echo exit this sub-shell to continue after backup restore ; exec bash -i'" backup_mountpoint="/tmp/backup_mountpoint" backup_restore_exclude_file="/tmp/backup_restore_exclude" # Initrd: make_initrd="yes" # Bootloader: bootloader_install_device=$system_partition # Reboot: command_before_reboot="" #command_before_reboot="bash -c 'echo exit this sub-shell to reboot ; exec bash -i'" # Start the actual work: test -n "$command_before_actual_work" && eval $command_before_actual_work # To find out what YaST had done to set up partitions, filesystems, and btrfs subvolumes in the 
original system # inspect the YaST log files (/var/log/YaST2/y2log*) for the commands that have been executed by libstorage # e.g. by using a command like: grep -o 'libstorage.*Executing:".*' /var/log/YaST2/y2log* # Make partitions: # Wait until udev has finished and then verify that the harddisk device node exists: udevadm settle --timeout=20 test -b $harddisk_device # Create new disk label. The new disk label will have no partitions: parted -s $harddisk_device mklabel $harddisk_disklabel harddisk_last_cylinder=$( parted -s $harddisk_device unit cyl print | grep "^Disk $harddisk_device:" | cut -d ' ' -f 3 | tr -c -d '[:digit:]' ) # Make swap partition: test "last_cylinder" = "$swap_partition_end_cylinder" && swap_partition_end_cylinder=$harddisk_last_cylinder parted -s --align=optimal $harddisk_device unit cyl mkpart primary linux-swap $swap_partition_begin_cylinder $swap_partition_end_cylinder parted -s $harddisk_device set 1 type 0x82 # Make system partition: test "last_cylinder" = "$system_partition_end_cylinder" && system_partition_end_cylinder=$harddisk_last_cylinder # Use hardcoded parted fs-type "ext2" as dummy for now regardless what filesystem will be actually created there later: parted -s --align=optimal $harddisk_device unit cyl mkpart primary ext2 $system_partition_begin_cylinder $system_partition_end_cylinder parted -s $harddisk_device set 2 type 0x83 set 2 boot on # Wait until udev has finished and then verify that the harddisk partitions device nodes exist: udevadm settle --timeout=20 test -b $swap_partition test -b $system_partition # Make filesystems: # Erase filesystem, raid or partition-table signatures (magic strings) to clean up a used disk before making filesystems: wipefs -a $swap_partition wipefs -a $system_partition $swap_partition_make_filesystem_command $swap_partition $system_partition_make_filesystem_command $system_partition # Use the swap partition: swapon --fixpgsz $swap_partition # Make btrfs subvolumes: if test -n "$btrfs_default_subvolume" -o -n "$btrfs_subvolumes" then mount -t btrfs -o subvolid=0 $system_partition $target_system_filesystem_root if test -n "$btrfs_default_subvolume_parent_directory" then mkdir -p $target_system_filesystem_root/$btrfs_default_subvolume_parent_directory fi if test -n "$btrfs_default_subvolume" then btrfs subvolume create $target_system_filesystem_root/$btrfs_default_subvolume btrfs_default_subvolume_ID=$( btrfs subvolume list $target_system_filesystem_root | cut -d ' ' -f 2 ) btrfs subvolume set-default $btrfs_default_subvolume_ID $target_system_filesystem_root fi if test -n "$btrfs_subvolumes_parent_directories" then for btrfs_subvolume_parent_directory in $btrfs_subvolumes_parent_directories do mkdir -p $target_system_filesystem_root/$btrfs_subvolume_parent_directory done fi if test -n "$btrfs_subvolumes" then for btrfs_subvolume in $btrfs_subvolumes do btrfs subvolume create $target_system_filesystem_root/$btrfs_subvolume done fi umount $target_system_filesystem_root fi # To be on the safe side wait until udev has finished: udevadm settle --timeout=20 # Mount system partition at the mountpoint in the installation system of the target system filesystem root: mount $system_partition $target_system_filesystem_root # Create etc/fstab in the target system: mkdir $target_system_filesystem_root/etc pushd /dev/disk/by-uuid/ set +f swap_partition_uuid=$( for uuid in * ; do readlink -e $uuid | grep -q $swap_partition && echo $uuid || true ; done ) system_partition_uuid=$( for uuid in * ; do readlink -e $uuid | grep -q 
$system_partition && echo $uuid || true ; done ) set -f popd ( echo "UUID=$swap_partition_uuid swap swap defaults 0 0" echo "UUID=$system_partition_uuid / $system_partition_filesystem defaults 0 0" if test -n "$btrfs_subvolumes" then for btrfs_subvolume in $btrfs_subvolumes do btrfs_subvolume_mountpoint=${btrfs_subvolume#@} echo "UUID=$system_partition_uuid $btrfs_subvolume_mountpoint $system_partition_filesystem subvol=$btrfs_subvolume 0 0" done fi ) > $target_system_filesystem_root/etc/fstab # Backup restore: mkdir $backup_mountpoint # Mount the NFS share where the backup_nfs_file is stored: mount -o nolock -t nfs $backup_nfs_share $backup_mountpoint # Exclude old files from the backup where restore would cause damage of the new installed target system: echo 'etc/fstab' >$backup_restore_exclude_file # Wait so that one can inspect the system before the backup is restored: test -n "$command_before_backup_restore" && eval $command_before_backup_restore # Change from the installation system to the new installed target system filesystem root: pushd $target_system_filesystem_root # Dump the files from the backup into the new installed target system filesystem root: tar -xzvf $backup_mountpoint/$backup_nfs_file -X $backup_restore_exclude_file # Change back from the target system filesystem root to the installation system: popd # Wait so that one can inspect the system after the backup was restored: test -n "$command_after_backup_restore" && eval $command_after_backup_restore # Make initrd and install bootloader: if test -n "$make_initrd" -o -n "$bootloader_install_device" then # Make /proc /sys /dev from the installation system available in the target system # which are needed to make initrd and to install bootloader in the target system: mount -t proc none $target_system_filesystem_root/proc mount -t sysfs sys $target_system_filesystem_root/sys mount -o bind /dev $target_system_filesystem_root/dev # Make initrd verbosely in the target system: if test -n "$make_initrd" then chroot $target_system_filesystem_root /usr/bin/dracut -f -v fi # Install bootloader in the target system: if test -n "$bootloader_install_device" then # Make bootloader configuration in the target system: chroot $target_system_filesystem_root /usr/sbin/grub2-mkconfig -o /boot/grub2/grub.cfg # Install bootloader in the target system: chroot $target_system_filesystem_root /usr/sbin/grub2-install --force $bootloader_install_device fi fi # Prepare for reboot: echo "Preparing for reboot..." # Stop logging: # Have stdout and stderr on the terminal but no longer in the log file that is in use by the my_name:tee process # which was forked at "Start logging" via: exec 1> >( exec -a $my_name:tee tee -a $log_file ) # Close stdout and stderr to finish the my_name:tee logging process: exec 1>&- exec 2>&- # Reopen stdout as what was saved in file descriptor 3: exec 1>&3 # Reopen stderr as what was saved in file descriptor 4: exec 2>&4 # Wait one second to be on the safe side that the my_name:tee logging process has finished: sleep 1 if ps $logging_tee_pid 1>/dev/null then echo "$my_name:tee process (PID $logging_tee_pid) still running (writes to $log_file)." echo "Waiting 60 seconds to give the $my_name:tee logging process more time to finish." for i in $( seq 60 ) do echo -n "." 
sleep 1 done echo "" fi # Copy the log file into the target system and also where the backup is: cp $log_file $target_system_filesystem_root/var/log/$log_file cp $log_file $backup_mountpoint/$log_file # Umount the NFS share where the backup is: umount $backup_mountpoint rmdir $backup_mountpoint # Umount proc sys dev: if test -n "$make_initrd" -o -n "$bootloader_install_device" then umount $target_system_filesystem_root/proc umount $target_system_filesystem_root/sys umount $target_system_filesystem_root/dev fi # Umount system partition: umount $target_system_filesystem_root # Do no longer use the swap partition: swapoff $swap_partition # Reboot: test -n "$command_before_reboot" && eval $command_before_reboot echo "Rebooting now..." sync sleep 10 reboot -f
This script is listed here primarily as documentation how it could be done. Do not blindly copy and use this script. You need to carefully examine it, understand what each command does, and thoroughly adapt and enhance it as you need it for your particular case.
Assume this script is stored as "recover.sh".
To make a script available to inst-sys and to let inst-sys automatically run it, do the following:
- package the script as cpio archive "recover.cpio"
- make the cpio archive available in your internal network (e.g. via HTTP, see above).
- let inst-sys download the cpio archive from your internal network server
- tell inst-sys to run that script (instead of YaST)
To package the script as cpio archive "recover.cpio" run
chmod u+rx recover.sh
echo recover.sh | cpio -o >recover.cpio
and make the cpio archive available in your internal network (e.g. via HTTP).
Assume the cpio archive is available via HTTP as http://my_internal_http_server/recover.cpio
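Before booting the machine that should be recovered you may want to verify that the archive can be downloaded and actually contains the script (a sketch):
wget http://my_internal_http_server/recover.cpio
cpio -it <recover.cpio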
To let inst-sys download the cpio archive and run it (instead of YaST) provide something like the following linuxrc parameters on the boot-screen commandline (cf. SDB:Linuxrc):
netsetup=dhcp setupcmd="setctsid $(showconsole) inst_setup /recover.sh" insecure=1 dud=http://my_internal_http_server/recover.cpio
Because the whole recovery runs fully automated (including the reboot), even a relatively inexperienced person can do the recovery when the boot-screen commandline is not so complicated. To achieve this, save the boot-screen commandline parameters in a file "recover.info"
netsetup=dhcp setupcmd="setctsid $(showconsole) inst_setup /recover.sh" insecure=1 dud=http://my_internal_http_server/recover.cpio
and make this file available in your internal network, for example via HTTP as http://my_internal_http_server/recover.info
To recover a SUSE Linux Enterprise 12 system, boot the machine from a SUSE Linux Enterprise 12 installation medium, select "Installation" in the boot menu, and specify on the boot-screen commandline:
info=http://my_internal_http_server/recover.info
Generic files backup with the plain SUSE installation system
To be really safe that the content of a files backup is consistent, the backup must be done when the system is not in use, for example in the single-user runlevel, or when the system is not running.
What "consistent backup" means
Consistent backup means that all files in the backup are consistent for the user. For example, assume the user is running an application program that changes several files simultaneously (in the simplest case, think of the user manually changing several files simultaneously; e.g. 'root' or a setup tool may change several config files). When that program is running during the files backup, the backup may contain old data in some files and new data in other files, which could be an inconsistent state. Such inconsistencies could lead to errors in that application after such an inconsistent backup was restored, or, probably even worse, the application blindly continues to operate on inconsistent data, which could make the overall state more and more inconsistent over time until at the very end everything might be completely messed up.
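When a backup has to be made on a running system, one can at least reduce the risk of inconsistencies by switching to a minimal mode where most services are stopped before the backup runs (a sketch, assuming systemd):
systemctl isolate rescue.target
But only a backup of a system that is not running (as described below) is really safe in this regard.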
To backup the files of a system that is not running, the basic idea is to boot the plain SUSE installation system (inst-sys) but to not let it run YaST. Instead inst-sys runs a specific selfmade script that does the files backup.
For example to create a files backup of a SUSE Linux Enterprise 12 GA system with the default btrfs subvolume structure on a single 12 GB harddisk such a script could be like the following (attention - very long lines could be shown wrapped in your browser):
#! /bin/bash # Print commands and their arguments as they are executed: set -x # Print shell input lines as they are read: set -v # Exit immediately if a command exits with a non-zero status: set -e # Pipeline returns last command to exit with non-zero status or zero otherwise: set -o pipefail # Disable file name generation (globbing): set -f # Treat unset variables as an error when substituting: set -u # Export all variables which are modified or created automatically to have them available in any sub-shell: set -a # Have a clean environment: export PATH="/sbin:/usr/sbin:/usr/bin:/bin" export LC_ALL="POSIX" export LANG="POSIX" umask 0022 # Make sure the effective user ID is 0 (i.e. this script must run as root): test "0" != "$( id -u )" && { echo "Error: Need 'root' privileges but the effective user ID is not '0'." ; false ; } # Start logging: my_name=${0##*/} starting_timestamp=$( date +%Y%m%d%H%M%S ) log_file=$my_name.$starting_timestamp.log # Have a new file descriptor 3 which is a copy of the stdout file descriptor: exec 3>&1 # Have a new file descriptor 4 which is a copy of the stderr file descriptor: exec 4>&2 # Have stdout on the terminal and also in the log file: exec 1> >( exec -a $my_name:tee tee -a $log_file ) logging_tee_pid=$! # Make stderr what stdout already is (i.e. terminal and also in the log file): exec 2>&1 # Adapt the following variables as you need. # Be careful and verify that your settings work with the code below. # This script is not at all foolproof. # It is for experts who know how to work with low-level commands. command_before_actual_work="" #command_before_actual_work="bash -c 'echo exit this sub-shell to start the backup process ; exec bash -i'" # Partitioning: harddisk_device="/dev/sda" swap_partition=${harddisk_device}1 system_partition=${harddisk_device}2 # Filesystems: system_partition_filesystem="btrfs" # Mountpoint in the installation system of the target system filesystem root: target_system_filesystem_root="/mnt" # Backup: backup_nfs_share="my_internal_nfs_server:/path/to/nfs/share" backup_nfs_file="backup.tar.gz" # Backup all normal btrfs subvolumes (i.e. 
all except the btrfs snapshot subvolumes in "/.snapshots"): backup_btrfs_subvolumes="@/boot/grub2/i386-pc @/boot/grub2/x86_64-efi @/home @/opt @/srv @/tmp @/usr/local @/var/crash @/var/lib/mailman @/var/lib/named @/var/lib/pgsql @/var/log @/var/opt @/var/spool @/var/tmp" command_before_backup_create="" #command_before_backup_create="bash -c 'echo exit this sub-shell to start the backup ; exec bash -i'" command_after_backup_create="" #command_after_backup_create="bash -c 'echo exit this sub-shell to continue after backup ; exec bash -i'" backup_mountpoint="/tmp/backup_mountpoint" # Reboot: command_before_reboot="" #command_before_reboot="bash -c 'echo exit this sub-shell to reboot ; exec bash -i'" # Start the actual work: test -n "$command_before_actual_work" && eval $command_before_actual_work # Use the swap partition: swapon --fixpgsz $swap_partition # Mount system partition at the mountpoint in the installation system of the target system filesystem root: mount $system_partition $target_system_filesystem_root # Mount btrfs subvolumes at their mountpoints in the target system filesystem root: backup_btrfs_subvolumes_mountpoints="" if test -n "$backup_btrfs_subvolumes" then for btrfs_subvolume in $backup_btrfs_subvolumes do btrfs_subvolume_mountpoint=${btrfs_subvolume#@} backup_btrfs_subvolumes_mountpoints="$backup_btrfs_subvolumes_mountpoints $btrfs_subvolume_mountpoint" mount -t btrfs -o subvol=$btrfs_subvolume $system_partition $target_system_filesystem_root/$btrfs_subvolume_mountpoint done fi # Backup: mkdir -p $target_system_filesystem_root/$backup_mountpoint # Mount the NFS share where the backup_nfs_file is to be stored in the target system filesystem root: mount -o nolock -t nfs $backup_nfs_share $target_system_filesystem_root/$backup_mountpoint # Wait so that one can inspect the system before the backup is made: test -n "$command_before_backup_create" && eval $command_before_backup_create # From within the target system filesystem root dump its files into the backup_nfs_file: chroot $target_system_filesystem_root /bin/tar -czvf $backup_mountpoint/$backup_nfs_file --one-file-system / $backup_btrfs_subvolumes_mountpoints # Wait so that one can inspect the system after the backup was made: test -n "$command_after_backup_create" && eval $command_after_backup_create # Prepare for reboot: echo "Preparing for reboot..." # Stop logging: # Have stdout and stderr on the terminal but no longer in the log file that is in use by the my_name:tee process # which was forked at "Start logging" via: exec 1> >( exec -a $my_name:tee tee -a $log_file ) # Close stdout and stderr to finish the my_name:tee logging process: exec 1>&- exec 2>&- # Reopen stdout as what was saved in file descriptor 3: exec 1>&3 # Reopen stderr as what was saved in file descriptor 4: exec 2>&4 # Wait one second to be on the safe side that the my_name:tee logging process has finished: sleep 1 if ps $logging_tee_pid 1>/dev/null then echo "$my_name:tee process (PID $logging_tee_pid) still running (writes to $log_file)." echo "Waiting 60 seconds to give the $my_name:tee logging process more time to finish." 
for i in $( seq 59 -1 0 ) do echo "$i" sleep 1 done fi # Copy the log file into the target system filesystem root and also where the backup is: cp $log_file $target_system_filesystem_root/var/log/$log_file cp $log_file $target_system_filesystem_root/$backup_mountpoint/$log_file # Umount the NFS share where the backup is: umount $target_system_filesystem_root/$backup_mountpoint rmdir $target_system_filesystem_root/$backup_mountpoint # Umount btrfs subvolumes: if test -n "$backup_btrfs_subvolumes" then for btrfs_subvolume in $backup_btrfs_subvolumes do btrfs_subvolume_mountpoint=${btrfs_subvolume#@} umount $target_system_filesystem_root/$btrfs_subvolume_mountpoint done fi # Umount system partition: umount $target_system_filesystem_root # Do no longer use the swap partition: swapoff $swap_partition # Reboot: test -n "$command_before_reboot" && eval $command_before_reboot echo "Rebooting now..." sync sleep 10 reboot -f
This script is listed here primarily as documentation how it could be done. Do not blindly copy and use this script. You need to carefully examine it, understand what each command does, and thoroughly adapt and enhance it as you need it for your particular case. In particular verify that the backup tool (like 'tar') supports your particular needs (e.g. access control lists or extended attributes and whatever else).
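For example if access control lists and extended attributes must be preserved, the tar calls could be extended like the following (a sketch - whether these options are available depends on the tar version in inst-sys and on your system):
tar --acls --xattrs -czvf ...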
Assume this script is stored as "backup.sh".
To make a script available to inst-sys and to let inst-sys automatically run it, do the following:
- package the script as cpio archive "backup.cpio"
- make the cpio archive available in your internal network (e.g. via HTTP, see above).
- let inst-sys download the cpio archive from your internal network server
- tell inst-sys to run that script (instead of YaST)
To package the script as cpio archive "backup.cpio" run
chmod u+rx backup.sh
echo backup.sh | cpio -o >backup.cpio
and make the cpio archive available in your internal network (e.g. via HTTP).
Assume the cpio archive is available via HTTP as http://my_internal_http_server/backup.cpio
Provide the matching boot-screen commandline parameters in a file "backup.info"
netsetup=dhcp setupcmd="setctsid $(showconsole) inst_setup /backup.sh" insecure=1 dud=http://my_internal_http_server/backup.cpio
and make this file available in your internal network, for example via HTTP as http://my_internal_http_server/backup.info
To back up the files of a SUSE Linux Enterprise 12 system, reboot the machine from a SUSE Linux Enterprise 12 installation medium, select "Installation" in the boot menu, and specify on the boot-screen commandline:
info=http://my_internal_http_server/backup.info
Generic system installation with the plain SUSE installation system
To install a system only with the plain SUSE installation system (without YaST), the basic idea is to boot the plain SUSE installation system (inst-sys) but to not let it run YaST. Instead inst-sys runs a specific selfmade script that does the system installation.
Currently generic installation with plain inst-sys is in an experimental state. It emerged as a spin-off from generic recovery with plain inst-sys. The basic idea behind it is to replace the restore of the files backup with an installation of software packages. In contrast to restoring a files backup, which results in a configured system (because configuration files are restored from the backup), installing software packages results in a pristine "raw" system (with the default configuration files from the software packages) that must be configured as needed.
For example to install a "raw" SUSE Linux Enterprise 12 GA base system with the default btrfs subvolume structure on a single 12 GB harddisk where the only end-user application is the Mozilla Firefox web browser such a script could be like the following (attention - very long lines could be shown wrapped in your browser):
#! /bin/bash # Print commands and their arguments as they are executed: set -x # Print shell input lines as they are read: set -v # Exit immediately if a command exits with a non-zero status: set -e # Pipeline returns last command to exit with non-zero status or zero otherwise: set -o pipefail # Disable file name generation (globbing): set -f # Treat unset variables as an error when substituting: set -u # Export all variables which are modified or created automatically to have them available in any sub-shell: set -a # Have a clean environment: export PATH="/sbin:/usr/sbin:/usr/bin:/bin" export LC_ALL="POSIX" export LANG="POSIX" umask 0022 # Make sure the effective user ID is 0 (i.e. this script must run as root): test "0" != "$( id -u )" && { echo "Error: Need 'root' privileges but the effective user ID is not '0'." ; false ; } # Start logging: my_name=${0##*/} starting_timestamp=$( date +%Y%m%d%H%M%S ) log_file=$my_name.$starting_timestamp.log # Have a new file descriptor 3 which is a copy of the stdout file descriptor: exec 3>&1 # Have a new file descriptor 4 which is a copy of the stderr file descriptor: exec 4>&2 # Have stdout on the terminal and also in the log file: exec 1> >( exec -a $my_name:tee tee -a $log_file ) logging_tee_pid=$! # Make stderr what stdout already is (i.e. terminal and also in the log file): exec 2>&1 # Adapt the following variables as you need. # Be careful and verify that your settings work with the code below. # This script is not at all foolproof. # It is for experts who know how to set up a system with low-level commands. command_before_actual_work="" #command_before_actual_work="bash -c 'echo exit this sub-shell to start the installation ; exec bash -i'" # Partitioning: harddisk_device="/dev/sda" harddisk_disklabel="msdos" swap_partition_number="1" swap_partition=$harddisk_device$swap_partition_number swap_partition_begin_percentage="0" swap_partition_end_percentage="17" system_partition_number="2" system_partition=$harddisk_device$system_partition_number system_partition_begin_percentage="17" system_partition_end_percentage="100" # Filesystems: swap_partition_make_filesystem_command="mkswap -f" system_partition_filesystem="btrfs" system_partition_make_filesystem_command="mkfs.btrfs -f" # Mountpoint in the installation system of the target system filesystem root: target_system_filesystem_root="/mnt" # Btrfs subvolumes: btrfs_default_subvolume_parent_directory="" btrfs_default_subvolume="@" btrfs_subvolumes_parent_directories="@/boot/grub2 @/usr @/var/lib" btrfs_subvolumes="@/boot/grub2/i386-pc @/boot/grub2/x86_64-efi @/home @/opt @/srv @/tmp @/usr/local @/var/crash @/var/lib/mailman @/var/lib/named @/var/lib/pgsql @/var/log @/var/opt @/var/spool @/var/tmp" # Software installation: command_before_software_installation="" #command_before_software_installation="bash -c 'echo exit this sub-shell to start the software installation ; exec bash -i'" zypper_baseproduct_file="/etc/products.d/SLES.prod" zypper_software_repositories="/var/adm/mount/suse" zypper_install_items="patterns-sles-base kernel-default patterns-sles-x11 MozillaFirefox" command_after_software_installation="" #command_after_software_installation="bash -c 'echo exit this sub-shell to continue after software installation ; exec bash -i'" # Initrd: make_initrd="yes" # Bootloader: bootloader_install_device=$system_partition # Root password: root_password="install" # Snapper: make_snapper_root_config="yes" # Reboot: command_before_reboot="" #command_before_reboot="bash -c 'echo exit this sub-shell to 
reboot ; exec bash -i'" # Start the actual work: test -n "$command_before_actual_work" && eval $command_before_actual_work # Make partitions: # Wait until udev has finished and then verify that the harddisk device node exists: udevadm settle --timeout=20 test -b $harddisk_device # Create new disk label. The new disk label will have no partitions: parted -s $harddisk_device mklabel $harddisk_disklabel # Make swap partition: parted -s --align=optimal $harddisk_device unit % mkpart primary linux-swap $swap_partition_begin_percentage $swap_partition_end_percentage parted -s $harddisk_device set $swap_partition_number type 0x82 # Make system partition: # Use hardcoded parted fs-type "ext2" as dummy for now regardless what filesystem will be actually created there later: parted -s --align=optimal $harddisk_device unit % mkpart primary ext2 $system_partition_begin_percentage $system_partition_end_percentage parted -s $harddisk_device set $system_partition_number type 0x83 set $system_partition_number boot on # Report what is actually set up by parted: parted -s $harddisk_device unit GiB print # Wait until udev has finished and then verify that the harddisk partitions device nodes exist: udevadm settle --timeout=20 test -b $swap_partition test -b $system_partition # Make filesystems: # Erase filesystem, raid or partition-table signatures (magic strings) to clean up a used disk before making filesystems: wipefs -a $swap_partition wipefs -a $system_partition $swap_partition_make_filesystem_command $swap_partition $system_partition_make_filesystem_command $system_partition # Use the swap partition: swapon --fixpgsz $swap_partition # Make btrfs subvolumes: if test -n "$btrfs_default_subvolume" -o -n "$btrfs_subvolumes" then mount -t btrfs -o subvolid=0 $system_partition $target_system_filesystem_root if test -n "$btrfs_default_subvolume_parent_directory" then mkdir -p $target_system_filesystem_root/$btrfs_default_subvolume_parent_directory fi if test -n "$btrfs_default_subvolume" then btrfs subvolume create $target_system_filesystem_root/$btrfs_default_subvolume btrfs_default_subvolume_ID=$( btrfs subvolume list $target_system_filesystem_root | cut -d ' ' -f 2 ) btrfs subvolume set-default $btrfs_default_subvolume_ID $target_system_filesystem_root fi if test -n "$btrfs_subvolumes_parent_directories" then for btrfs_subvolume_parent_directory in $btrfs_subvolumes_parent_directories do mkdir -p $target_system_filesystem_root/$btrfs_subvolume_parent_directory done fi if test -n "$btrfs_subvolumes" then for btrfs_subvolume in $btrfs_subvolumes do btrfs subvolume create $target_system_filesystem_root/$btrfs_subvolume done fi umount $target_system_filesystem_root fi # To be on the safe side wait until udev has finished: udevadm settle --timeout=20 # Mount system partition at the mountpoint in the installation system of the target system filesystem root: mount $system_partition $target_system_filesystem_root # Create etc/fstab in the target system: mkdir $target_system_filesystem_root/etc pushd /dev/disk/by-uuid/ set +f swap_partition_uuid=$( for uuid in * ; do readlink -e $uuid | grep -q $swap_partition && echo $uuid || true ; done ) system_partition_uuid=$( for uuid in * ; do readlink -e $uuid | grep -q $system_partition && echo $uuid || true ; done ) set -f popd ( echo "UUID=$swap_partition_uuid swap swap defaults 0 0" echo "UUID=$system_partition_uuid / $system_partition_filesystem defaults 0 0" if test -n "$btrfs_subvolumes" then for btrfs_subvolume in $btrfs_subvolumes do 
btrfs_subvolume_mountpoint=${btrfs_subvolume#@} echo "UUID=$system_partition_uuid $btrfs_subvolume_mountpoint $system_partition_filesystem subvol=$btrfs_subvolume 0 0" done fi ) > $target_system_filesystem_root/etc/fstab # Make /proc /sys /dev from the installation system available in the target system: for mountpoint_directory in proc sys dev do mkdir $target_system_filesystem_root/$mountpoint_directory done mount -t proc none $target_system_filesystem_root/proc mount -t sysfs sys $target_system_filesystem_root/sys mount -o bind /dev $target_system_filesystem_root/dev # Software installation: # Wait so that one can inspect the system before the software is installed: test -n "$command_before_software_installation" && eval $command_before_software_installation # Avoid the zypper warning that "The /etc/products.d/baseproduct symlink is dangling or missing": mkdir -p $target_system_filesystem_root/etc/products.d ln -s $zypper_baseproduct_file $target_system_filesystem_root/etc/products.d/baseproduct # Add software repositories: zypper_repository_number=0; for zypper_software_repository in $zypper_software_repositories do zypper_repository_number=$(( zypper_repository_number + 1 )) zypper -v -R $target_system_filesystem_root addrepo $zypper_software_repository repository$zypper_repository_number done # First and foremost install the very basic stuff: zypper -v -R $target_system_filesystem_root -n install aaa_base # aaa_base requires filesystem so that zypper installs filesystem before aaa_base # but for a clean filesystem installation RPM needs users and gropus # as shown by RPM as warnings like (excerpt): # warning: user news does not exist - using root # warning: group news does not exist - using root # warning: group dialout does not exist - using root # warning: user uucp does not exist - using root # Because those users and gropus are created by aaa_base scriptlets and # also RPM installation of permissions pam libutempter0 shadow util-linux # (that get also installed before aaa_base by zypper installation of aaa_base) # needs users and gropus that are created by aaa_base scriptlets so that # those packages are enforced installed a second time after aaa_base was installed: for package in filesystem permissions pam libutempter0 shadow util-linux do zypper -v -R $target_system_filesystem_root -n install -f $package done # The actual software installation: for zypper_install_item in $zypper_install_items do zypper -v -R $target_system_filesystem_root -n install $zypper_install_item done # To be on the safe side verify dependencies of installed packages and in case of issues let zypper fix them: zypper -v -R $target_system_filesystem_root -n verify --details # As summary extract each package install with its "Additional rpm output" (except 'NOKEY' warnings) from the log file: sed -n -e '/ Installing: /,/^$/p' $log_file | grep -E -v '^$|^Additional rpm output:$|^warning: .* NOKEY$' >$log_file.rpm_summary # Have the package install summary in the log file: cat $log_file.rpm_summary # Report the differences of what is in the RPM packages compared to the actually installed files in the target system: chroot $target_system_filesystem_root /bin/bash -c "rpm -Va || true" # Wait so that one can inspect the system after the software was installed: test -n "$command_after_software_installation" && eval $command_after_software_installation # Remove the software repositores: for i in $( seq $zypper_repository_number ) do zypper -v -R $target_system_filesystem_root removerepo 1 done # Make initrd 
verbosely in the target system: if test -n "$make_initrd" then chroot $target_system_filesystem_root /usr/bin/dracut -f -v fi # Install bootloader in the target system: if test -n "$bootloader_install_device" then # Make bootloader configuration in the target system: sed -i -e 's/^GRUB_DISTRIBUTOR=.*/GRUB_DISTRIBUTOR="GenericInstall"/' $target_system_filesystem_root/etc/default/grub chroot $target_system_filesystem_root /usr/sbin/grub2-mkconfig -o /boot/grub2/grub.cfg # Install bootloader in the target system: chroot $target_system_filesystem_root /usr/sbin/grub2-install --force $bootloader_install_device fi # Set root password in the target system: if test -n "$root_password" then echo -e "$root_password\n$root_password" | passwd -R $target_system_filesystem_root root fi # Make snapper root configuration in the target system: if test -n "$make_snapper_root_config" then chroot $target_system_filesystem_root /usr/bin/snapper --no-dbus create-config --fstype=$system_partition_filesystem --add-fstab / chroot $target_system_filesystem_root /usr/bin/snapper --no-dbus set-config NUMBER_CLEANUP=yes NUMBER_LIMIT=10 NUMBER_LIMIT_IMPORTANT=10 TIMELINE_CREATE=no chroot $target_system_filesystem_root /usr/bin/sed -i -e 's/^USE_SNAPPER=".*"/USE_SNAPPER="yes"/' /etc/sysconfig/yast2 fi # Prepare for reboot: echo "Preparing for reboot..." # Stop logging: # Have stdout and stderr on the terminal but no longer in the log file that is in use by the my_name:tee process # which was forked at "Start logging" via: exec 1> >( exec -a $my_name:tee tee -a $log_file ) # Close stdout and stderr to finish the my_name:tee logging process: exec 1>&- exec 2>&- # Reopen stdout as what was saved in file descriptor 3: exec 1>&3 # Reopen stderr as what was saved in file descriptor 4: exec 2>&4 # Wait one second to be on the safe side that the my_name:tee logging process has finished: sleep 1 if ps $logging_tee_pid 1>/dev/null then echo "$my_name:tee process (PID $logging_tee_pid) still running (writes to $log_file)." echo "Waiting 10 seconds to give the $my_name:tee logging process more time to finish." sleep 10 fi # Copy the log file into the target system: cp $log_file $target_system_filesystem_root/var/log/$log_file # Umount proc sys dev: umount $target_system_filesystem_root/proc umount $target_system_filesystem_root/sys umount $target_system_filesystem_root/dev # Umount system partition: umount $target_system_filesystem_root # Do no longer use the swap partition: swapoff $swap_partition # Reboot: test -n "$command_before_reboot" && eval $command_before_reboot echo "Rebooting now..." sync sleep 10 reboot -f
Currently this script implements neither a fully clean system installation nor a complete system installation. There is a lot of logging functionality in this script to debug issues. There are still some error or failure messages reported in the log file and the installed system is not configured at all. In particular there is no network configured in the installed system. One can log in as root and has to do the needed configuration, for example as sketched below.
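For example a minimal network setup in the installed system could look like the following (a sketch, assuming a single network interface eth0 that should use DHCP - the interface name and settings are assumptions that must be adapted to your particular case):
echo 'BOOTPROTO="dhcp"' > /etc/sysconfig/network/ifcfg-eth0
echo 'STARTMODE="auto"' >> /etc/sysconfig/network/ifcfg-eth0
systemctl restart network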
This script is listed here primarily as documentation how it could be done. Do not blindly copy and use this script. You need to carefully examine it, understand what each command does, and thoroughly adapt and enhance it as you need it for your particular case.
Assume this script is stored as "install.sh".
To make a script available to inst-sys and to let inst-sys automatically run it, do the following:
- package the script as cpio archive "install.cpio"
- make the cpio archive available in your internal network (e.g. via HTTP, see above).
- let inst-sys download the cpio archive from your internal network server
- tell inst-sys to run that script (instead of YaST)
To package the script as cpio archive "install.cpio" run
chmod u+rx install.sh
echo install.sh | cpio -o >install.cpio
and make the cpio archive available in your internal network (e.g. via HTTP).
Assume the cpio archive is available via HTTP as http://my_internal_http_server/install.cpio
Provide the matching boot-screen commandline parameters in a file "install.info"
netsetup=dhcp setupcmd="setctsid $(showconsole) inst_setup /install.sh" dud=disk:/suse/x86_64/libaugeas0-1.2.0-1.5.x86_64.rpm dud=disk:/suse/x86_64/zypper-1.11.14-1.2.x86_64.rpm insecure=1 dud=http://my_internal_http_server/install.cpio
and make this file available in your internal network, for example via HTTP as http://my_internal_http_server/install.info
With "dud=disk:/suse/x86_64/libaugeas0-1.2.0-1.5.x86_64.rpm" and "dud=disk:/suse/x86_64/zypper-1.11.14-1.2.x86_64.rpm" the files that are contained in those RPMs from a SUSE Linux Enterprise 12 installation medium will get installed into inst-sys to make zypper usable in inst-sys because the script calls zypper to install the software RPM packages. The exact RPM package names (in particular the exact RPM package version numbers) depend on what exact SUSE installation medium is used. Generic working with the plain SUSE installation system (see below) helps a lot to accomplish such tasks. For example to find out what exact RPM packages need to be additionally installed into inst-sys to get this or that additional tool usable in inst-sys boot inst-sys where only a plain bash runs (see below how to do that). In inst-sys use the "mount" command to find out where the SUSE installation medium (e.g. a CD/DVD or ISO image that also contains the RPM packages) is mounted (usually at /var/adm/mount). For example to install zypper in inst-sys type a command like "find /var/adm/mount/suse | grep zypper" to get the exact file name of the zypper-<version>-<release>.<architecture>.rpm package and then let rpm try to install that with a command like "rpm -i /var/adm/mount/suse/<architecture>/zypper-<version>-<release>.<architecture>.rpm" which will fail with a lot of failed dependencies even for standard libraries like libc.so and libstdc++.so. The reason is that there is no usable RPM database in inst-sys (in particular "rpm -qa" shows nothing in inst-sys so that one cannot create or rebuild a RPM database in inst-sys). Therefore one must manually verify for each reported required dependency whether or not it is already fulfilled in inst-sys or actually missing. When a required dependency is basically a file name one could use a command like "find / | grep <file_name>" to check if the matching dependency is already fulfilled in inst-sys. For example to check if libfoo.so.1.2.3 and /bin/bar exist in inst-sys one could use "find / | grep libfoo.so" and "find / | grep /bin/bar". For each actually missing dependency the matching RPM package that provides it must be also installed into inst-sys. To find out which RPM package provides what particular missing dependency one should check that on a regular running system (e.g. a real SUSE Linux Enterprise 12 system) via "rpm -q --whatprovides <excat_dependency>" e.g. "rpm -q --whatprovides 'libaugeas.so.0()(64bit)'". Actually installing RPM packages into inst-sys via command like "rpm -i --nodeps /path/to/package.rpm" is usually not possible because there is no usable space for additional files on the various specially mounted filesystems in inst-sys. Therefore "dud=..." must be used to extend inst-sys. When additional RPM packages get installed into inst-sys via "dud=disk:/suse/<architecture>/<package>-<version>-<release>.<architecture>.rpm" only the plain files get extracted from the packages and copied into inst-sys but no RPM scriptlets are run. If running RPM scriptlets of those packages is needed, the actually needed commands from the RPM scriptlets must be manually run in inst-sys. To find out which commands are run by RPM scriptlets check that on a regular running system via "rpm -q --scripts <package>".
To install a SUSE Linux Enterprise 12 system this way, boot the machine from a SUSE Linux Enterprise 12 installation medium, select "Installation" in the boot menu, and specify on the boot-screen commandline:
info=http://my_internal_http_server/install.info
Generic working with the plain SUSE installation system
To work with the plain SUSE installation system, the basic idea is to boot the plain SUSE installation system (inst-sys) but to not let it run YaST. Instead inst-sys only launches a bash.
Provide the matching boot-screen commandline parameters in a file "bash.info" like the following
netsetup=dhcp setupcmd="setctsid $(showconsole) inst_setup bash" sshd=1 sshpassword=something_temporary plymouth=0 splash=verbose keytable=de-latin1
and make this file available in your internal network, for example via HTTP as http://my_internal_http_server/bash.info
With "sshd=1" and "sshpassword=something_temporary" inst-sys will be also accessible via ssh and the root password in inst-sys for ssh access will be "something_temporary" (never use a real root password for one of your machines here).
Optionally specify what meets your particular needs like "plymouth=0", "splash=verbose", "keytable=de-latin1", and so on (see SDB:Linuxrc).
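For example, once inst-sys is up you could log into it from another machine in your internal network via ssh - the IP address that inst-sys got via DHCP is only a placeholder here:
# Log into inst-sys (the password is the one set via sshpassword, here "something_temporary"):
ssh root@<IP_of_inst-sys>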
To work only with plain inst-sys of SUSE Linux Enterprise 12, boot the machine from a SUSE Linux Enterprise 12 installation medium, select "Installation" in the boot menu, and specify on the boot-screen commandline:
info=http://my_internal_http_server/bash.info
When the "info" parameter value is of the form "http://..." linuxrc does automatically a network setup with DHCP so that the "netsetup=dhcp" entry in the linuxrc parameters file is superfluous. After the network setup with DHCP linuxrc downloads the parameters file.
If downloading the parameters file fails, linucrc emits a message but (intentionally) ignores that error and starts the YaST installation by default (which is usually a reasonable fallback behaviour).
To see linuxrc messages while inst-sys boots you may have to press [Esc] to disable the default splash screen graphics that obscures plain text messages, or use [Ctrl]+[Alt]+[F3] to switch to /dev/tty3 where linuxrc messages are shown by default (cf. "linuxrc.log" at SDB:Linuxrc), or specify linuxrc parameters like "plymouth=0" and/or "splash=verbose" directly on the boot-screen commandline, so that the boot-screen commandline may look as follows:
splash=verbose info=http://...
For example if you mistyped the "info" parameter value, you may notice an "error 22 ..." message from linuxrc. In this case the '22' is a curl error code because linuxrc uses curl for downloading (CURLE_HTTP_RETURNED_ERROR 22 is returned if the HTTP server returns an error code that is >= 400).
Another reason why downloading the parameters file may fail could be that the network is not yet ready to use. In this case you may notice a plain "error 7" message from linuxrc where '7' is the curl error code CURLE_COULDNT_CONNECT "Failed to connect to host". In this case it may help to add the linuxrc parameter "netwait=10" on the boot-screen commandline to wait e.g. 10 seconds after activating the network interface (see SDB:Linuxrc) so that the whole boot-screen commandline may look as follows:
netwait=10 splash=verbose info=http://...
When one of the above scripts exits with a non-zero exit code, linuxrc in inst-sys shows an error popup that reads "An error occurred during the installation". To inspect the log file of the script for the actual error, acknowledge the error popup with [OK], which leads to the linuxrc "Main Menu". Therein select "Expert" and [OK], which leads to the "Expert" menu. Therein select "Start shell" and [OK], which leads to a bash prompt. The log file of the scripts above is stored in the root directory of inst-sys as '/<script_name>.*'.
One cannot leave inst-sys the way one leaves a usual system. The only way to leave inst-sys is to manually shut it down (in particular umount mounted harddisk partitions) and finally call "reboot -f" or "halt -f".
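A minimal sketch of such a manual shutdown, assuming a harddisk partition is still mounted at /mnt/local (the mount point is only an example):
# Unmount whatever harddisk partitions are still mounted:
umount /mnt/local
# Write out pending data to be on the safe side:
sync
# Finally leave inst-sys:
reboot -f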
Venturous users needed for generic usage of the plain SUSE installation system
Venturous users are needed who like to try it out and test it, provide meaningful issue reports, answer questions, and so on.
If you are an experienced user and you prefer working directly with low-level commands over working on top of monstrous mountains of various kinds of sophisticated machinery, it is worth trying out whether generic usage of the plain SUSE installation system is of interest for you.
If you are a venturous user who likes to try out generic usage of the plain SUSE installation system, you may add comments to SDB Talk:Disaster Recovery or contact me (User:Jsmeix) via "jsmeix@suse.com" directly.
Using ReaR as generic installer in any installation system
The core of ReaR is an installer that reinstalls the system from scratch.
Accordingly it makes sense to use the ReaR installer in any installation system - provided the installation system is sufficiently powerful to run the various low-level tools that are called by the ReaR installer bash scripts.
Obviously the ReaR recovery system is a sufficiently powerful installation system (otherwise "rear recover" would not work).
The SUSE installation system is a sufficiently powerful installation system ("rear install" works, see below).
The installation systems of other Linux distributions should also be sufficiently powerful (otherwise they could not be used by the installers of those Linux distributions).
Currently using ReaR as generic installer is in an experimental proof-of-concept and work-in-progress state.
It emerged as natural successor of the generic system installation with the plain SUSE installation system (see above).
The basic idea behind it is to add a "rear install" workflow where the restore of the files backup is replaced with an installation of software packages.
See the SUSE Hack Week 13 project https://hackweek.suse.com/13/projects/1190 and the ReaR upstream GitHub issue https://github.com/rear/rear/issues/732
The ultimate goal is to be able to use one and the same installation system and one and the same installer for all kinds of installations (see "Let's face it" above):
- Using "rear install" for initial system installation.
- Using "rear install" or "rear recover" for productive deployment.
- Using "rear recover" for disaster recovery re-installation.
Using ReaR as generic installer in the ReaR recovery system
Using ReaR as generic installer in the ReaR recovery system is in an experimental proof-of-concept and work-in-progress state.
Currently it works as proof-of-concept for SLES12-SP1 by using the rear20160121install RPM package from the openSUSE Build Service project "home:jsmeix" - but this is not yet in a state where it can be used under normal circumstances. One must be at least somewhat experienced in generic system installation with the plain SUSE installation system to use rear20160121install in its current state.
Using ReaR as generic installer in the plain SUSE installation system
Using ReaR as generic installer in the plain SUSE installation system is in an experimental proof-of-concept and work-in-progress state.
Currently it works as proof-of-concept for SLES12-SP1 by using the rear20160121install RPM package from the openSUSE Build Service project "home:jsmeix" - but this is not yet in a state where it can be used under normal circumstances. One must be at least somewhat experienced in generic system installation with the plain SUSE installation system to use rear20160121install in its current state.
A sketchy description how to use ReaR as generic installer in the plain SUSE installation system for people who are experienced in generic system installation with the plain SUSE installation system:
1.
Create a virtual machine with full hardware virtualization with a single 20GB virtual harddisk and BIOS (do not use UEFI).
2.
On that virtual machine install a simple SLES12-SP1 system (you can get a SLES12-SP1 install medium from http://download.suse.com) with a single system partition that uses the ext4 filesystem (do not use the complicated btrfs default structure for your tests - unless you prefer to deal with complicated issues). To get a simple system with X11 for example install only the patterns "minimal", "base", and "X11" plus the packages MozillaFirefox and lsb-release (the latter is required by ReaR).
3.
Get the rearinstall RPM package from the openSUSE Build Service project "home:jsmeix" and install it on the SLES12-SP1 system. For SLES12-SP1 get the rearinstall RPM from
http://download.opensuse.org/repositories/home:/jsmeix/SLE_12/x86_64/
4.
On the SLES12-SP1 system use /usr/share/rear/conf/examples/SLE12-SP1-ext4-install-example.conf as template for your /etc/rear/local.conf - here is an example of how your /etc/rear/local.conf might look (long lines may show wrapped in your browser):
OUTPUT=ISO
BACKUP=NETFS
BACKUP_OPTIONS="nfsvers=3,nolock"
BACKUP_URL=nfs://NFS.server.IP.address/NFS_directory
NETFS_KEEP_OLD_BACKUP_COPY=yes
SSH_ROOT_PASSWORD="rear"
USE_DHCLIENT="yes"
readonly INSTALL_PAYLOAD_SUSE_ZYPPER_BASEPRODUCT_FILE="/etc/products.d/SLES.prod"
readonly INSTALL_PAYLOAD_SUSE_ZYPPER_SOFTWARE_REPOSITORIES="http://local.HTTP.server.IP/SLES12SP1RPMs"
readonly INSTALL_PAYLOAD_SUSE_ZYPPER_INSTALL_ITEMS="patterns-sles-base kernel-default patterns-sles-x11 MozillaFirefox lsb-release"
readonly INSTALL_CONFIGURE_FSTAB_SWAP_PARTITION="/dev/sda1"
readonly INSTALL_CONFIGURE_FSTAB_ROOT_PARTITION="/dev/sda2"
readonly INSTALL_CONFIGURE_FSTAB_ROOT_FS="ext4"
readonly INSTALL_CONFIGURE_SUSE_BOOTLOADER_DEVICE="$INSTALL_CONFIGURE_FSTAB_ROOT_PARTITION"
readonly INSTALL_CONFIGURE_SUSE_GRUB_DISTRIBUTOR="rearInstall"
readonly INSTALL_CONFIGURE_ROOT_PASSWORD="install"
readonly INSTALL_CONFIGURE_SUSE_DHCP_CLIENT_NETWORK_INTERFACES="eth0"
readonly INSTALL_CONFIGURE_SYSTEMD_UNIT_FILES="sshd.service"
readonly INSTALL_FINAL_EXIT_TASK="LogPrint 'Install workflow finished' ; sleep 10"
REQUIRED_PROGS=( "${REQUIRED_PROGS[@]}" rpm zypper zypp-CheckAccessDeleted zypp-NameReqPrv zypp-refresh appdata2solv comps2solv deltainfoxml2solv dumpsolv installcheck mergesolv repo2solv.sh repomdxml2solv rpmdb2solv rpmmd2solv rpms2solv susetags2solv testsolv updateinfoxml2solv )
COPY_AS_IS=( "${COPY_AS_IS[@]}" /etc/zypp/ /usr/lib/zypp/ /usr/lib/rpm/ $INSTALL_PAYLOAD_SUSE_ZYPPER_BASEPRODUCT_FILE )
test "$INSTALL_CONFIGURE_ROOT_PASSWORD" && REQUIRED_PROGS=( "${REQUIRED_PROGS[@]}" passwd )
See the comments in /usr/share/rear/conf/examples/SLE12-SP1-ext4-install-example.conf to better understand each entry. If you wonder about the strange "...CONFIGURE_FSTAB...", "...BOOTLOADER_DEVICE..." and similar values: that is a current quick-and-dirty hack that needs to be replaced by an automated fstab generator and by using the ReaR bootloader installation. For now one must maintain those values explicitly in /etc/rear/local.conf, and in its current proof-of-concept state only one SWAP_PARTITION and one ROOT_PARTITION are supported; the values in /etc/rear/local.conf must match the entries in disklayout.conf (see the next step). But - as I wrote - that is only a quick-and-dirty hack that will be replaced. The REQUIRED_PROGS and COPY_AS_IS values are not needed for using ReaR as generic installer in the plain SUSE installation system but for using ReaR as generic installer in the ReaR recovery system, because this way zypper and what it needs get added to the ReaR recovery system so that "rear install" in the ReaR recovery system can then run zypper to install RPMs.
5.
In the above /etc/rear/local.conf there is INSTALL_PAYLOAD_SUSE_ZYPPER_SOFTWARE_REPOSITORIES="http://local.HTTP.server.IP/SLES12SP1RPMs"
which means that the RPMs that should be installed by zypper during "rear install" must be provided in a repository on your local HTTP server. To do that, copy the SLES12-SP1 RPMs that are located under /suse/ on the SLES12-SP1 install medium onto your local HTTP server into /srv/www/htdocs/SLES12SP1RPMs/ so that you have the RPMs in the /srv/www/htdocs/SLES12SP1RPMs/noarch/ and /srv/www/htdocs/SLES12SP1RPMs/x86_64/ sub-directories. Afterwards go to /srv/www/htdocs/SLES12SP1RPMs/ and run
createrepo -v .
(you need the createrepo RPM installed on your local HTTP server). This creates /srv/www/htdocs/SLES12SP1RPMs/repodata/* files that are required by zypper when it should use http://local.HTTP.server.IP/SLES12SP1RPMs as RPM package repository during "rear install".
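The whole repository preparation on the local HTTP server could look like the following sketch, where the ISO file path and the mount point are assumptions that you must adapt to your environment:
# Make the SLES12-SP1 install medium available (the ISO file path is an assumption):
mount -o loop /path/to/SLES12-SP1-DVD-x86_64.iso /mnt
# Copy the RPMs into the document root of the HTTP server:
mkdir -p /srv/www/htdocs/SLES12SP1RPMs
cp -a /mnt/suse/noarch /mnt/suse/x86_64 /srv/www/htdocs/SLES12SP1RPMs/
umount /mnt
# Create the repository metadata (requires the createrepo RPM):
cd /srv/www/htdocs/SLES12SP1RPMs
createrepo -v .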
6.
On the SLES12-SP1 system run
rear -d -D mkbackup
The primary intent of this step is to get /var/lib/rear/layout/disklayout.conf created, which "rear install" needs later to set up partitioning and filesystems. Alternatively you might create an appropriate disklayout.conf file manually.
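To cross-check that the manually maintained partition values in /etc/rear/local.conf (cf. the previous step) match the generated disklayout.conf, a quick sketch like the following can be used on the SLES12-SP1 system:
# Show the disk layout entries that "rear install" will use:
grep -E '^(disk|part|fs|swap) ' /var/lib/rear/layout/disklayout.conf
# Show the partition values that were set manually in local.conf:
grep -E 'FSTAB_(SWAP|ROOT)_PARTITION|FSTAB_ROOT_FS' /etc/rear/local.conf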
7.
Get /var/lib/rear/layout/disklayout.conf from the SLES12-SP1 system and store it on your local HTTP server as /srv/www/htdocs/rearconf/var/lib/rear/layout/disklayout.conf and also store /etc/rear/local.conf from the SLES12-SP1 system on your local HTTP server as /srv/www/htdocs/rearconf/etc/rear/local.conf
8.
On your local HTTP server go to /srv/www/htdocs/rearconf and run "tar -czvf ../rearconf.tar.gz *" to create a /srv/www/htdocs/rearconf.tar.gz which is the tarball that contains all information that "rear install" later needs to install the system.
9.
Provide the rearinstall RPM package on your local HTTP server as /srv/www/htdocs/rearinstall-1.17.20160121.i-9.1.x86_64.rpm (if the exact RPM package file name is different, you need to adapt the rear_rpm_package_name value in /srv/www/htdocs/rearinstall.sh in the next step).
10.
On your local HTTP server create a /srv/www/htdocs/rearinstall.sh that primarily has to download and unpack rearconf.tar.gz to make the ReaR configuration available in the installation system and then run "rear install". Alternatively you may boot the SUSE installation system with a plain bash (see "Generic working with the plain SUSE installation system"), download and unpack rearconf.tar.gz manually, and then run "rear -d -D install". The rearinstall.sh script below additionally provides some sophisticated logging and also installs the rearinstall RPM package. Such a script could be as follows (long lines may show wrapped in your browser):
#! /bin/bash
# Have a clean environment:
export PATH="/sbin:/usr/sbin:/usr/bin:/bin"
export LC_ALL="POSIX"
export LANG="POSIX"
umask 0022
# Make sure the effective user ID is 0 (i.e. this script must run as root):
test "0" != "$( id -u )" && { echo "Error: Need 'root' privileges but the effective user ID is not '0'." ; false ; }
# Start logging:
my_name=${0##*/}
starting_timestamp=$( date +%Y%m%d%H%M%S )
log_file=$my_name.$starting_timestamp.log
# Have a new file descriptor 3 which is a copy of the stdout file descriptor:
exec 3>&1
# Have a new file descriptor 4 which is a copy of the stderr file descriptor:
exec 4>&2
# Have stdout on the terminal and also in the log file:
exec 1> >( exec -a $my_name:tee tee -a $log_file )
logging_tee_pid=$!
# Make stderr what stdout already is (i.e. terminal and also in the log file):
exec 2>&1
# Adapt the following variables as you need.
# Be careful and verify that your settings work with the code below.
# This script is not at all foolproof.
# It is for experts who know how to set up a system with low-level commands.
command_before_actual_work=""
#command_before_actual_work="bash -c 'echo exit this sub-shell to start the installation ; exec bash -i'"
# ReaR config archive name:
rear_config_archive_name="rearconf.tar.gz"
# ReaR config URL:
rear_config_URL="http://local.HTTP.server.IP/$rear_config_archive_name"
# Mountpoint in the installation system of the target system filesystem root:
target_system_filesystem_root="/mnt/local"
# ReaR RPM package name:
rear_rpm_package_name="rearinstall-1.17.20160121.i-9.1.x86_64.rpm"
# ReaR RPM package URL:
rear_rpm_package_URL="http://local.HTTP.server.IP/$rear_rpm_package_name"
# Reboot:
command_before_reboot=""
#command_before_reboot="bash -c 'echo exit this sub-shell to reboot ; exec bash -i'"
# Show IP4 address and if exists wait 10 seconds so that one can notice it:
ip address show | grep 'inet ' && sleep 10
# Get the ReaR config:
wget $rear_config_URL
# Install the ReaR config:
tar -xf $rear_config_archive_name
# Start the actual work:
test -n "$command_before_actual_work" && eval $command_before_actual_work
# Run the ReaR installation:
rear -v -d -D install
# Get the currently used ReaR RPM package (cf. rearinstall.info):
wget $rear_rpm_package_URL
# Install the ReaR RPM package in the target system:
rpm --root $target_system_filesystem_root -i $rear_rpm_package_name
# Copy current ReaR config file into the target system:
cp /etc/rear/local.conf $target_system_filesystem_root/etc/rear/local.conf
# Prepare for reboot:
echo "Preparing for reboot..."
# Stop logging:
# Have stdout and stderr on the terminal but no longer in the log file that is in use by the my_name:tee process
# which was forked at "Start logging" via: exec 1> >( exec -a $my_name:tee tee -a $log_file )
# Close stdout and stderr to finish the my_name:tee logging process:
exec 1>&-
exec 2>&-
# Reopen stdout as what was saved in file descriptor 3:
exec 1>&3
# Reopen stderr as what was saved in file descriptor 4:
exec 2>&4
# Wait one second to be on the safe side that the my_name:tee logging process has finished:
sleep 1
if ps $logging_tee_pid 1>/dev/null
then echo "$my_name:tee process (PID $logging_tee_pid) still running (writes to $log_file)."
     echo "Waiting 10 seconds to give the $my_name:tee logging process more time to finish."
     for i in $( seq 9 -1 0 )
     do echo -n "$i "
        sleep 1
     done
     echo ""
fi
# Copy the log file into the target system:
cp $log_file $target_system_filesystem_root/var/log/$log_file
# Reboot:
test -n "$command_before_reboot" && eval $command_before_reboot
echo "Rebooting now..."
# Umount the target system (if it fails, try to remount read-only)
# and if that fails umount all what can be unmounted:
umount -r $target_system_filesystem_root || umount -a
# To be on the safe side call sync and sleep 10 seconds in any case:
sync
sleep 10
reboot -f
You must at least adapt local.HTTP.server.IP and perhaps the rearinstall RPM package file name.
11.
On your local HTTP server provide /srv/www/htdocs/rearinstall.sh as cpio archive. In /srv/www/htdocs/ do:
chmod u+rx rearinstall.sh
echo rearinstall.sh | cpio -o >rearinstall.cpio
Cf. "Generic system installation with the plain SUSE installation system" above.
12.
On your local HTTP server create a /srv/www/htdocs/rearinstall.info that contains the parameters for linuxrc when it boots the SUSE installation system. Such an info file could be as follows (long lines may show wrapped in your browser):
netsetup=dhcp plymouth=0 splash=verbose sshd=1 sshpassword=install setupcmd="setctsid $(showconsole) inst_setup /rearinstall.sh" dud=disk:/suse/x86_64/libaugeas0-1.2.0-3.1.x86_64.rpm dud=disk:/suse/x86_64/zypper-1.12.23-1.3.x86_64.rpm dud=disk:/suse/noarch/lsb-release-2.0-26.1.noarch.rpm insecure=1 dud=http://local.HTTP.server.IP/rearinstall-1.17.20160121.i-9.1.x86_64.rpm dud=http://local.HTTP.server.IP/rearinstall.cpio
You must at least adapt local.HTTP.server.IP and perhaps the rearinstall RPM package file name. Regarding the other RPM package file names: If you need to adapt them see "Generic system installation with the plain SUSE installation system" above.
13.
Create a second virtual machine with full hardware virtualization with a single 20GB virtual harddisk and BIOS (do not use UEFI).
14.
Boot the second virtual machine with the original SLES12-SP1 boot medium and enter at the initial boot screen
splash=verbose info=http://local.HTTP.server.IP/rearinstall.info
to let the second virtual machine be installed fully automated by "rear install" with the same partitioning, filesystems, and software as the first virtual machine. By changing the content of rearconf.tar.gz you can install the second virtual machine as you like.
Using ReaR as generic installer in other Linux distributions installation systems
Hopefully for the future - provided other Linux distributions are interested.
Virtual machines
Usually the virtualization host software provides a snapshot functionality so that a whole virtual machine (guest) can be saved and restored. Using the snapshot functionality results in the virtual machine being saved in files which are specific to the virtualization host software in use, and those files are usually stored on the virtualization host. Therefore those files must be saved in an additional step (usually the complete virtualization host must be saved) to make the virtual machine safe against a failure of the virtualization host.
In contrast, when using ReaR the virtual machine is saved as a backup and an ISO image, which are independent of the virtualization host.
Full/hardware virtualization
With ReaR it is possible to save a fully virtualized machine which runs in a particular full/hardware virtualization software environment on one physical machine and restore it in the same full/hardware virtualization software environment on another physical machine. This way it should be possible to restore a fully virtualized machine on different replacement hardware, which mitigates the requirement to have same or compatible replacement hardware available. Nevertheless you must test whether this works in your particular case with your particular replacement hardware.
Usually it is not possible to save a fully virtualized machine which runs in one full/hardware virtualization software environment and restore it in a different one, because different full/hardware virtualization software environments emulate different machines which are usually not compatible.
Paravirtualization
Paravirtualized virtual machines are a special case, in particular paravirtualized XEN guests.
A paravirtualized XEN guest needs a special XEN kernel (vmlinux-xen) and also a special XEN initrd (initrd-xen). The XEN host software which launches a paravirtualized XEN guest expects the XEN kernel and the XEN initrd in specific file names "/boot/ARCH/vmlinuz-xen" and "/boot/ARCH/initrd-xen" where ARCH is usually i386 or i586 or x86_64.
Furthermore a paravirtualized XEN guest needs in particular the special kernel modules xennet and xenblk to be loaded. This can be specified in /etc/rear/local.conf with a line "MODULES_LOAD=( xennet xenblk )" which lets the ReaR recovery system autoload these modules in the given order (see /usr/share/rear/conf/default.conf).
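For example, the corresponding excerpt in /etc/rear/local.conf would simply be:
# Autoload the XEN paravirtualization kernel modules in the ReaR recovery system:
MODULES_LOAD=( xennet xenblk )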
ReaR does not provide functionality to create a special "medium" that can be used directly to launch a paravirtualized XEN guest. ReaR creates a usual bootable ISO image which boots on usual PC hardware, with kernel and initrd located in the root directory of the ISO image.
To use ReaR to recreate a paravirtualized XEN guest, the configuration of the XEN host must be adapted so that it can launch the ReaR recovery system on a paravirtualized XEN guest. Basically this means to launch a paravirtualized XEN guest from a usual bootable ISO image.
Remember: There is no such thing as a disaster recovery solution that "just works". Therefore: When it does not work, you might perhaps change your system to be more simple (e.g. use full/hardware virtualization instead of paravirtualization) or you have to manually adapt and enhance the disaster recovery framework to make it work for your particular case.
Non PC compatible architectures
Non PC compatible architectures are those that are neither x86/i386/i486/i586 (32-bit) nor x86_64 (64-bit) like ppc, ppc64, ia64, s390, s390x.
Recovery medium compatibility
ReaR (up to version 1.17.2) creates a traditional El Torito bootable ISO image which boots on PC hardware in the traditional way (i.e. it boots in BIOS mode - not in UEFI mode). For a UEFI bootable ISO image one needs at least ReaR version 1.18 plus the ebiso package (cf. "rear118a" in the "rear / rear116 / rear1172a / rear118a / rear23a" section above).
ReaR does not provide special functionality to create whatever kind of special bootable "medium" that can be used to boot on non PC compatible architectures.
Therefore ReaR cannot be used without appropriate enhancements and/or modifications on hardware architectures that cannot boot an El Torito bootable medium.
Launching the ReaR recovery system via kexec
It should always be possible to launch the ReaR recovery system via kexec from an arbitrary already running system on your replacement hardware (provided that already running system supports kexec).
On your replacement hardware have an already running small and simple system which can be any system that supports kexec and which can be installed as you like (e.g. from an openSUSE or SUSE Linux Enterprise install medium), cf. the above section "Prepare replacement hardware for disaster recovery".
That already running small and simple system does not need to be compatible with the system that you intend to recreate. For example when you intend to recreate a SUSE Linux Enterprise 12 system with a possibly complicated structure of various filesystems, the already running system could be a minimal openSUSE Tumbleweed system with only one single ext4 root filesystem.
It is recommended that the already running system is up-to-date, because a more up-to-date kernel should be better prepared against possible kexec failures caused by hardware operations of the old kernel that are still ongoing after kexec (e.g. old DMA operations that still write somewhere in memory) while the new kernel is already running.
On your original system that you intend to recreate, use 'OUTPUT=RAMDISK' to let ReaR copy its plain initrd (which contains the ReaR recovery system) plus the original kernel of that system to ReaR's output location.
On ReaR versions before ReaR 2.6 'OUTPUT=RAMDISK' may not yet work well, cf. https://github.com/rear/rear/pull/2149 and https://github.com/rear/rear/issues/2148 so you should use ReaR version 2.6 or alternatively you may try out current ReaR upstream GitHub master code, cf. the section "Testing current ReaR upstream GitHub master code" above.
In the already running system on your replacement hardware kexec that original kernel of the system that you intend to recreate plus the matching ReaR recovery system initrd to launch the ReaR recovery system.
An example /etc/rear/local.conf could look like this (excerpt):
OUTPUT=RAMDISK
BACKUP=NETFS
BACKUP_URL=nfs://your.NFS.server.IP/path/to/your/rear/backup
Via OUTPUT=RAMDISK "rear mkrescue/mkbackup" copies the kernel (by default named kernel-HOSTNAME) plus the ReaR recovery system initrd (by default named initramfs-HOSTNAME.img) to the output location at the same place where the backup gets stored (i.e. what is specified by BACKUP_URL).
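For example, fetching those two files on the already running system on your replacement hardware (this corresponds to step 1.) below) could look like the following minimal sketch, where the NFS path matches the BACKUP_URL above and the HOSTNAME subdirectory and target directory are assumptions that you must adapt:
# Mount ReaR's output location (the same NFS share as in BACKUP_URL):
mount -t nfs your.NFS.server.IP:/path/to/your/rear/backup /mnt
# Copy kernel and ReaR recovery system initrd (assumed to be in a subdirectory named after the original host):
cp /mnt/HOSTNAME/kernel-HOSTNAME /mnt/HOSTNAME/initramfs-HOSTNAME.img /root/
umount /mnt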
To recreate that system on your replacement hardware boot the already installed system on your replacement hardware and therein do those steps:
1.)
Copy the kernel and the ReaR recovery system initrd from ReaR's output location into your already running system on your replacement hardware.
2.)
Load the kernel and ReaR's initrd and provide an appropriate kernel command line to run the ReaR recovery system in a ramdisk and use standard VGA 80x25 text mode (you may have to add special hardware dependent parameters from the original kernel command line in /proc/cmdline of your original system that you intend to recreate), for example like:
kexec -l kernel-HOSTNAME --initrd=initramfs-HOSTNAME.img --command-line='root=/dev/ram0 vga=normal rw'
3.)
Kexec the loaded kernel (plus ReaR's initrd) which will instantly reboot into the loaded kernel without a clean shutdown of the currently running system:
kexec -e
4.)
In the booted ReaR recovery system log in as 'root' (this may only work directly on the system console) and recover your system, which will completely replace the system from before, so it does not matter that it was not cleanly shut down:
rear -D recover
See also the issue "Alternatively do kexec instead of regular boot to speed up booting" https://github.com/rear/rear/issues/2186 at ReaR upstream.
Bootloader compatibility
Basically GRUB as used on usual PC hardware is the only supported bootloader.
There might be some kind of limited support for special bootloader configurations but one cannot rely on it.
Therefore it is recommended to use GRUB with a standard configuration.
If GRUB with a standard configuration cannot be used on non PC compatible architectures, appropriate enhancements are needed to add support for special bootloader configurations.
It is crucial to check in advance whether or not it is possible to recreate your particular non PC compatible systems with ReaR or RecoveryImage/AutoYaST or any other disaster recovery procedure that you use.
All filesystems are equal, but some are more equal than others
ext2 ext3 ext4
When the standard Linux filesystems ext2, ext3, ext4 are used with standard settings, there should be no issues, but special filesystem tuning settings may need manual adaptions to make disaster recovery work for your particular special case.
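For example, you could compare non-default ext4 tuning settings of your original system with what ReaR recorded in its disk layout file (the device name is only an example):
# Show the current ext4 settings of the original root filesystem:
tune2fs -l /dev/sda2
# Compare with the filesystem entries that ReaR recorded:
grep '^fs ' /var/lib/rear/layout/disklayout.conf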
btrfs
First and foremost: As of this writing (October 2013) the btrfs code base is still under heavy development, see btrfs.wiki.kernel.org which means: When you use btrfs do not expect that any kind of disaster recovery framework "just works" for btrfs.
Since ReaR version 1.12 only basic btrfs filesystem backup and restore may work (but no subvolumes). When btrfs is used with subvolumes that contain normal data (no snapshots), at least ReaR version 1.15 is required that provides some first basic support to recreate some kind of btrfs subvolume structure so that backup and restore of the data could work. Since ReaR version 1.17 there is generic basic support for btrfs with normal subvolumes (but no snapshot subvolumes). Note the "basic support". In particular there is no support for "interwoven btrfs subvolume mounts" which means when subvolumes of one btrfs filesystem are mounted at mount points that belong to another btrfs filesystem and vice versa (cf. the Relax-and-Recover upstream issue 497 and the openSUSE Bugzilla bug 908854).
When btrfs is used with snapshots (i.e. with subvolumes that contain btrfs snapshots), usual backup and restore cannot work. The main reason is: when there are btrfs snapshot subvolumes and you run usual file-based backup software (e.g. 'tar') to back up the whole data of the btrfs filesystem, then everything in the snapshot subvolumes gets backed up as complete files. For example assume there is an 8GB disk with a btrfs filesystem where 5GB disk space is used and there is a recent snapshot. A recent snapshot needs almost no disk space because of btrfs' copy-on-write functionality. But 'tar' would create a backup that has an uncompressed size of about 10GB because the files appear twice: under their regular path and additionally under the snapshot subvolume's path. It would be impossible to restore that tarball on the disk. This means btrfs snapshot subvolumes cannot be backed up and restored as usual with file-based backup software.
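To see which subvolumes (including snapshot subvolumes) exist, and as one possible mitigation to exclude them from the file-based backup, something like the following could be used - the /.snapshots path is the SUSE snapper default and only an assumption for your particular setup:
# List all subvolumes of the mounted btrfs root filesystem (snapshots show up as subvolumes):
btrfs subvolume list -a /
# One possible mitigation in /etc/rear/local.conf: exclude the snapshot subvolumes from the backup:
BACKUP_PROG_EXCLUDE=( "${BACKUP_PROG_EXCLUDE[@]}" '/.snapshots/*' )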
The same kind of issue can happen with all filesystems that implement copy on write functionality (e.g. OCFS2). For background information you may read about "reflink" (versus "hardlink").
For disaster recovery with Relax-and-Recover on SUSE Linux Enterprise 12 GA the special RPM package rear116 is required. It provides ReaR version 1.16 with special adaptions and enhancements for the default btrfs subvolume structure in SUSE Linux Enterprise 12 GA, but restore of btrfs snapshot subvolumes cannot be supported (see above and see the "Fundamentals about Relax-and-Recover presentation PDF" below). There are changes in the default btrfs subvolume structure in SUSE Linux Enterprise 12 SP1 that require further adaptions and enhancements in ReaR (see the Relax-and-Recover upstream issue 556), so that for disaster recovery with Relax-and-Recover on SUSE Linux Enterprise 12 SP1 the special RPM package rear1172a is required, and for SUSE Linux Enterprise 12 SP2 the RPM package rear118a should be used (cf. above "rear / rear116 / rear1172a / rear118a / rear23a").
Help and Support
Feasible in advance
Help and support is feasible only "in advance", while your system is still up and running, when something does not work while you are testing on your replacement hardware whether you can recreate your system with your recovery medium, whether the recreated system can boot on its own, and whether all its system services still work as you need them in your particular case.
Hopeless in retrospect
Help and support is usually hopeless "in retrospect" when it fails to recreate your system on replacement hardware after your system was destroyed.
The special ReaR recovery system provides a minimal set of tools which could help in some cases to fix issues while it recreates your system, see the "Relax-and-Recover" section above. A precondition is that the ReaR recovery system at least boots correctly on your replacement hardware. If the ReaR recovery system fails to boot, it is usually a dead end.
Be prepared for manual recreation
If recreating your system finally fails, you have to manually recreate your basic system, in particular your partitioning with filesystems and mount points, and afterwards you have to manually restore your backup, e.g. from your NFS server or from wherever else your backup is. Therefore you must have at least your partitioning, filesystem, mount point, and networking information available so that you can manually recreate your system. It is recommended to manually recreate your system on your replacement hardware as an exercise, to be prepared.
It is crucial to have replacement hardware available in advance to verify that you can recreate your system because there is no such thing as a disaster recovery solution that "just works".
SUSE support for Relax-and-Recover
SUSE provides Relax-and-Recover (ReaR) via the SUSE Linux Enterprise High Availability Extension (SLE-HA) which is the only SUSE product/extension where SUSE officially supports ReaR.
For other SUSE products (like plain SLES without the HA extension) SUSE supports on a voluntary basis only the newest ReaR upstream version or preferably the ReaR upstream GitHub master code directly at ReaR upstream https://github.com/rear/rear - see the above sections "Version upgrades with Relax-and-Recover", "Testing current ReaR upstream GitHub master code", "Debugging issues with Relax-and-Recover", and "How to contribute to Relax-and-Recover".
ReaR neither is nor is meant to be something like a "ready to use solution for disaster recovery", see the above section "Inappropriate expectations". Instead ReaR is only meant to be a framework that can be used by experienced users and system admins to build up their particular disaster recovery procedure. Therefore ReaR is written entirely in the native language for system administration: shell (bash) scripts. The intent behind this is that ReaR users can and should adapt or extend the ReaR scripts if needed to make things work for their specific case, see the above section "Disaster recovery with Relax-and-Recover (ReaR)".
So in general there is no such thing as a "ready to use solution for disaster recovery" that is provided and/or supported by SUSE. In particular SUSE does not provide RPM package updates with adaptions and/or enhancements for particular use cases, because in general it is almost impossible to foresee whether regressions could happen for other use cases. Instead SUSE provides from time to time ReaR version upgrades as separate RPM packages (cf. the above section "RPM packages for disaster recovery").
On the other hand "SUSE supports ReaR" in the sense that a solution could be that SUSE offers help so that users can adapt or extend their ReaR scripts to make things work for their particular case (for reasonable use cases, as far as possible with reasonable effort).
Therefore all ReaR scripts are marked as "config(noreplace)" in SUSE's rear* RPM packages since ReaR version 1.17.1 (cf. the above section "RPM packages for disaster recovery"), so that ReaR scripts that were changed by the user are not overwritten by a RPM package update. Scripts in a RPM package update that differ from scripts that were changed by the user get installed as '.rpmnew' files. After a RPM package update the user must check the differences between the '.rpmnew' scripts and his changed scripts to find out whether his changed scripts still work in his particular use case or whether further adaptions are needed, plus do a careful and complete re-validation that his particular disaster recovery procedure still works, cf. the above section "Version upgrades with Relax-and-Recover".
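A hedged example of how such '.rpmnew' files could be found and reviewed after a rear RPM package update (the script path in the diff command is only a placeholder):
# Find ReaR scripts and config files that were installed as '.rpmnew' because the installed ones were locally changed:
find /usr/share/rear /etc/rear -name '*.rpmnew'
# Review the differences for each of them:
diff -u /usr/share/rear/path/to/script.sh /usr/share/rear/path/to/script.sh.rpmnew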
See also
- Essentials about disaster recovery with Relax-and-Recover presentation PDF at File:Essentials about disaster recovery with rear jsmeix presentation v1.pdf
- First steps with Relax-and-Recover workshop PDF at File:Rear workshop fosdem 2017.pdf
- Fundamentals about Relax-and-Recover presentation PDF at File:Relax-and-Recover jsmeix presentation.pdf
- openSUSE Hack Week 12 project "Generic disaster recovery"
- openSUSE Hack Week 13 project "Use Relax-and-Recover as generic installer in the plain SUSE installation system"
- openSUSE Hack Week 14 project "Relax-and-Recover recovery system: Download ReaR configuration files"
- openSUSE Hack Week 15 project "Relax-and-Recover: New kind of backup method: BACKUP=ZYPPER"
Related articles
- SUSE Documentation for SUSE Linux Enterprise High Availability Extension 11 SP4: High Availability Guide: Disaster Recovery with Rear
- SUSE Documentation for SUSE Linux Enterprise High Availability Extension 12 SP5: Administration Guide: Disaster Recovery with Rear (Relax-and-Recover)
- SUSE Documentation for SUSE Linux Enterprise High Availability Extension 15 SP2: Administration Guide: Disaster Recovery with Rear (Relax-and-Recover)
External links
- Relax-and-Recover home page at http://relax-and-recover.org/
- Relax-and-Recover upstream development at https://github.com/rear/
- rear packages at the openSUSE build service project Archiving:Backup:Rear
- rear-SUSE package at the openSUSE build service project home:jsmeix
- AutoYaST Documentation at http://www.suse.de/~ug/
- AutoYaST Documentation at http://doc.opensuse.org/projects/autoyast/