Monday, November 21, 2016

Create & Restore Linux System Data Using Dump Command


I was searching for a simple backup solution for Linux that could handle block-level backups as well as disaster recovery, and that was cost effective. Nowadays there are plenty of options, but I wanted to try a simple yet native Linux solution, which is how I came across the "dump & restore" utilities. This may not be an ideal solution for a larger network, but one could leverage it on a smaller network where there is no tight recovery schedule. So, I thought of creating a step-by-step document on it.

A complete system can be backed up and restored using the native Linux commands "dump" and "restore", which facilitate both backup and disaster recovery.
What are “dump” & “restore” commands?

The "dump" command captures file-system data, which can later be restored using the "restore" command.

As per the man page of dump command:

Dump examines files on an ext2/3/4 file-system and determines which files need to be backed up. These files are copied to the given disk, tape or other storage medium for safe keeping.

The restore command restores the files copied via the dump command.
Some points: Since this only records and restores data, one must take care of the disk layout and the corresponding UUIDs or labels, and re-create them manually. This is shown later in this document.
Build Environment:-

I’m using a VM (virtual machine) for this test and documentation. The VM is installed with RHEL 6.8, running in VMware Workstation 11.
This VM has two hard drives (sda & sdb). The root file system is installed on /dev/sda2, boot on /dev/sda1, and /dev/sda3 is a swap partition, as shown below:


The other partition “/dev/sdb1” is being used to store disk dumps (backup data).
Some system details of the VM:-

“/etc/fstab” details:-


Plan:-

Here, I’ll take disk-level backups of the /dev/sda{1..3} partitions (on which my root file system resides) using the dump command, and store the dumps on the /dev/sdb1 partition. Then I’ll destroy the partition table of /dev/sda. At that stage, the system will fail to boot since there is no partition table (no MBR data). Finally, I’ll restore the dump data after re-creating the proper partitions on /dev/sda.

Points To Consider Before Proceeding:-

- Back up the partition details, /etc/fstab, the blkid output and any other details required before destroying the partition table, which makes the system unbootable.
- It is good to run “sosreport” (on RHEL variants) and save the complete system configuration on an external device.
- We’d need the partition-wise layout details for re-creating the partition table while restoring.
- Also, make a note of the file system UUIDs or labels, whichever is relevant.
- In the case of LVM-based file systems, back up /etc/lvm/archive, which can be used to re-create the required PVs, VGs and LVs.
A snapshot of “fdisk -cul /dev/sda” is shown below:-

A list of the associated file system UUIDs:-
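The screenshots above capture this information; a minimal sketch of saving it off the disk (paths assume /dev/sdb1 is mounted at /backup-data, as in this setup):

```shell
# Save the layout details needed to rebuild /dev/sda later.
# /backup-data is assumed to be the mounted /dev/sdb1 partition.
fdisk -cul /dev/sda > /backup-data/sda-layout.txt   # partition table (sectors)
blkid > /backup-data/blkid.txt                      # filesystem UUIDs/labels
cp /etc/fstab /backup-data/fstab.bak                # mount configuration
```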

That being said, let’s start executing the plan.


Step 1: Dump partition wise data

Now, let’s back up each partition’s data using the “dump” command. In our setup, we only need to back up data from /dev/sda1 (boot file system) and /dev/sda2 (root file system); /dev/sda3 is a swap partition, hence it can be skipped.
- Boot the system into single user mode.

- Use the “dump” command to back up data from the /dev/sda1 & /dev/sda2 partitions as shown in the screen image below (the dumps are stored on the block device /dev/sdb1):-
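The commands in the screenshot follow this general shape (a level-0, i.e. full, dump; the dump file names are the ones used later in this document):

```shell
# Level-0 (full) dumps; -u updates /etc/dumpdates, -f names the output file.
# /backup-data is /dev/sdb1 mounted read-write.
dump -0u -f /backup-data/sda1.dump /dev/sda1   # boot filesystem
dump -0u -f /backup-data/sda2.dump /dev/sda2   # root filesystem
```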


The time taken to dump depends on the amount of data actually stored on the block device. In the above (test-setup) example, it took almost 5 minutes to back up /dev/sda2 (the root file system).

The details of the backup are shown here:-


Notes:-
  • If the system has been running for a long time, it is advisable to run e2fsck on the partitions before backup.
  • “dump” should not be used on a heavily loaded, mounted filesystem, as it could back up corrupted versions of files. This problem is mentioned on dump.sourceforge.net and in the Red Hat manual.
If required, the backup can be stored on a remote system as shown below. In general, backup files should of course be stored in a different location.
# dump -0u -f - /dev/sda1 | ssh root@remoteserver dd of=/tmp/sda1.dump
- Now, let’s delete the partition table of the /dev/sda device, which makes the system non-bootable:-

- After a reboot, the system failed to boot and reported the error “Operating system not found”.


Step 2: Restore Process

Since the system failed to boot, I booted it from a rescue image; when the rescue environment tried to identify Linux partitions, it failed to detect any (as shown below):-


Deleting the partition details nullified the MBR (Master Boot Record), hence the above error.

The first step in restoring the system is to create the necessary disk layout. So, let’s re-create the partition layout on hard drive “sda” in rescue mode, exactly as it was before.

- Creating /dev/sda1 partition:-

We need to create the first partition, which holds the boot file system; this requires the starting and ending sectors as shown (creating the exact layout as before):-

In the same way, I created the /dev/sda2 partition (root file system) as well, and changed the partition ID of “/dev/sda3” to swap.

Create file systems on the newly created partitions. Since /dev/sda3 is used for swap, we need to run “mkswap” with the same UUID as before, as shown below:-

Likewise, we need to change the UUIDs of /dev/sda1 and /dev/sda2 back to the original UUIDs by referring to the backup data taken earlier. The “tune2fs” command can be used for this.
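As a sketch, with <uuid> placeholders standing in for the values recorded before the wipe:

```shell
# Recreate the swap signature with its original UUID (placeholder value).
mkswap -U <original-swap-uuid> /dev/sda3

# Restore the original UUIDs on the newly created ext4 filesystems.
tune2fs -U <original-boot-uuid> /dev/sda1
tune2fs -U <original-root-uuid> /dev/sda2
```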
- Next step is to toggle the boot flag and mark /dev/sda1 as boot partition.

NOTE:- If the data is not corrupt, there is a high possibility of recovering without running mkfs.ext4 on each partition (boot & root). One could run the "setup" command (as explained below) on the boot device/partition and try booting the system to check if it comes up. If the damage was only to the partition table, the system would boot; if not, the file systems need to be created and then restored from backup on each block device.
- Now, let’s mount /dev/sda1 on /temp1, /dev/sda2 on /temp2 & /dev/sdb1 on /backup-data. Check whether there is any data in /temp1 and /temp2; there should not be, since these are new file systems.
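A sketch of the mounts, assuming the mount points do not yet exist in the rescue environment:

```shell
mkdir -p /temp1 /temp2 /backup-data
mount /dev/sda1 /temp1        # new (empty) boot filesystem
mount /dev/sda2 /temp2        # new (empty) root filesystem
mount /dev/sdb1 /backup-data  # holds sda1.dump and sda2.dump
```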

The files “sda1.dump” & “sda2.dump” under /backup-data are the backups of the /dev/sda1 and /dev/sda2 file systems, stored on device /dev/sdb1.
- Let’s restore the data from the dumps using the “restore” command in the rescue environment.
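Since "restore -r" replays a full dump into the current directory, each filesystem is restored from inside its own mount point (a sketch; file names as above):

```shell
cd /temp1 && restore -rf /backup-data/sda1.dump   # restore the boot filesystem
cd /temp2 && restore -rf /backup-data/sda2.dump   # restore the root filesystem
```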

- After this, the system still failed to boot and threw the same error message as before, “Operating system not found”. This happens because the MBR data is missing on the primary hard drive.
- Mount the boot partition on a temporary mount point and check the device.map file under the grub directory.

Run the “grub” command, sourcing the “device.map” file as shown below, which drops into the grub prompt. Identify the root file system disk and run the “setup” command, which installs the missing MBR and the corresponding stage files:
Command: “grub --device-map /temp1/grub/device.map”
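Inside the grub prompt, the session looks roughly like this (the disk/partition names assume the first disk and its first partition, as in this setup):

```shell
grub --device-map /temp1/grub/device.map
# At the grub> prompt:
#   root (hd0,0)    # partition holding /boot (here /dev/sda1)
#   setup (hd0)     # write the MBR and stage files to the first disk
#   quit
```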

After this, exit the shell and reboot the system. The system should boot up fine. That’s all!!!
NOTE:- If the underlying block devices are logical volumes (LVs), then physical volumes need to be created with their original UUIDs by referring to the archives stored under /etc/lvm/archive; likewise, the same volume group (VG) and LVs need to be re-created. Hence the need to back up the /etc/lvm contents.
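For that LVM case (not the layout used in this test), a hedged sketch; the PV UUID and archive file name are placeholders, the real names live under /etc/lvm/archive:

```shell
# Recreate the PV with its original UUID from the archived metadata,
# then restore the VG definition and activate its LVs.
pvcreate --uuid "<pv-uuid>" \
         --restorefile /etc/lvm/archive/<vgname>_<nnnnn>.vg /dev/sda2
vgcfgrestore -f /etc/lvm/archive/<vgname>_<nnnnn>.vg <vgname>
vgchange -ay <vgname>   # activate the restored logical volumes
```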

Thursday, November 3, 2016

Rear- Linux Disaster Recovery Solution

 
  

“Rear” (Relax-and-Recover) fits perfectly for implementing a bare-metal disaster recovery solution, as well as image migration. Rear is the leading open-source disaster recovery solution. It is a modular framework with many ready-to-go workflows for common situations.


Most backup software solutions are very good at restoring data but do not support recreating the system layout. Rear, on the other hand, is very good at recreating the system layout, and works best when used together with supported backup software.


In this combination, Relax-and-Recover recreates the system layout and calls the backup software to restore the actual data. Thus there is no unnecessary duplicate data storage, and the Relax-and-Recover rescue media can be very small. There are already plenty of articles and documents available on "Rear"; hence, this document was created for my own reference, but it is publicly available, so if anyone finds it helpful, I'd say "I'm glad".


As per Red Hat, it is officially supported from RHEL 7.2 onwards; however, the relax-and-recover web site has rear packages for RHEL 5/6 as well. I was able to find the “rear” package in the ISO image of RHEL 6.8.


With this solution, sysadmins should be able to recover a failed system with minimal downtime. That said, “Rear” is not a backup solution but a recovery solution. Hence, external precautions and procedures should be in place for data backup.


There are two main steps involved in this process:-     

  • Backup system
  • Recover system


Let’s see how we can ‘backup’ and ‘recover’ a system using rear now.

Scope: This document is mainly for Linux systems, and the intention is recovering a failed system; again, this is not a backup solution.
Testing Environment: This solution is tested on a RHEL 6.7 virtual system running in VMware Workstation 11. The backup of the system image is stored on a RHEL 6.8 virtual system on the same network.


Step 1: Backup System

In this step, I’m going to take a system backup using the “rear mkbackup” command. This is a test VM (virtual machine) which I’ll destroy later and then try to recover using “rear”, as explained in step 2.
- A snapshot of this system:



- Install “rear” package.

This package is not available in the RHEL 6.7 ISO image, hence I had to download it from http://download.opensuse.org/repositories/Archiving:/Backup:/Rear/RedHat_RHEL-6/; it is also available in EPEL and other repositories.
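Installation is then a plain package install once the RPM or repository is available (a sketch; exact package file names may vary):

```shell
# From an RPM downloaded from the URL above:
rpm -ivh rear-*.rpm
# Or, if the EPEL repository is already configured:
yum install rear
```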


- Once installed, edit the /etc/rear/local.conf file and make the necessary changes to tell the system about the backup method being used, the backup server, etc. I’ve made the changes to this file as shown below:-


Here, the backup creates a bootable ISO image file named "rear-$(hostname).iso", using the default “NETFS” backup method (the internal archive method used by rear); the backup files, including the ISO file, are stored on the remote system specified in the “BACKUP_URL” parameter via the NFS protocol. This option is mandatory when the backup method is “NETFS”.
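A minimal local.conf matching this description might look like the following (the NFS server and export path are placeholders for this lab):

```shell
# /etc/rear/local.conf (sketch; adjust the server and path to your environment)
OUTPUT=ISO                                  # produce rear-$(hostname).iso
BACKUP=NETFS                                # rear's internal archive method
BACKUP_URL=nfs://<backup-server>/<export>   # mandatory when BACKUP=NETFS
```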

Refer to this link for more details on the other options:-


- Let’s start making the backup of the system now. At this stage, we are ready to run “rear -v mkbackup”, as shown below:-



Depending on the amount of data to be backed up and the network speed, it takes some time to complete. When I checked the server where the backup is stored, the total size of the backup was 1.1 GB, as shown:-



- Now, I’ll delete the boot files from the /boot partition of this system “test” to make it non-bootable.


- After the reboot, the system dropped into the grub shell and was unable to boot. Since I had removed all the boot files, it would be difficult to get those files back manually; hence, I’d need to restore using “rear” now.


Step 2: Recover Failed System

Now, at this stage, I need to boot the failed system using the bootable ISO file that was created in the earlier stage. So, let’s attach this ISO image and boot the system. On my system this file was 73 MB in size, as shown here:-



Boot the system with this ISO file. Once the system comes up, the screen should look similar to this:



- So, go ahead and select “Recover test.example.com” and hit Enter to start the recovery process. This checks and builds the necessary layout for the system and drops into a rescue shell.


At the shell, run the command “rear recover”. After a few minutes, the recovery failed with the error message below:-



On checking, I found it was because a dynamic IP had been assigned to the system in rescue mode; I changed it to a static IP, after which I was able to recover successfully.
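In the rescue shell, the fix amounted to replacing the DHCP lease with a static address; a sketch with placeholder values (the interface name and addresses are assumptions for this lab):

```shell
# Drop the dynamic address and configure a static one before re-running recovery.
ip addr flush dev eth0
ip addr add <static-ip>/24 dev eth0
ip route add default via <gateway-ip>
```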




Starting recovery…….



End of recovery…..



At the prompt, type “reboot” and the system should reboot without any errors. A file system re-labeling will kick off, since SELinux is in enforcing mode.


Once the system is up, the files that I had stored under /data were there as before:-




That's all :)