1. Introduction
2. Requirements
3. Installation
4. Using self compiled kernel
5. Tips and tricks
6. Trouble shooting
7. Testing
8. About
This howto briefly describes how to install Debian Sarge on LVM2 on software RAID 1, directly from the official Sarge DVD (without first installing it on a classical partition and then moving it to LVM on RAID).
A few words before starting.
This task is not for beginners but for advanced users.
Don't forget to make a backup of your data (possibly on an external HD that will be unplugged before starting the installation, or on a CD/DVD).
1.1. Restrictions
1.1.1. Restriction one
There is a limitation when installing the system on LVs: the "vgscan" and "vgchange" commands must be available to the system in order to mount LVs.
There are two ways to accomplish this:
1) Use an initrd image.
2) Don't install the whole system on LVs.
With the first solution, the whole system can be placed on LVs (either in a single LV or in several LVs). I'm not a fan of initrd images, but this is the easiest way.
With the second solution, /sbin, which contains the LVM tools, must be placed outside the LVs, in a non-LVM partition.
Some other directories must also be outside the LVs (e.g.: /, /etc, /dev, ...). The great advantage is that only the necessary LVs are mounted read-write; all the rest are mounted read-only!
The second solution is the right way to configure a server, but it is a little too much effort for a desktop PC, so I only cover the first one in this document.
1.1.2. Restriction two
This restriction comes from booting a system installed on LVM.
/boot cannot be in LVM because the bootloader cannot read it!
1.2. Layout
Say we have 2 IDE harddisks: /dev/hda and /dev/hdb. Each one will be the mirror of the other.
Since /boot has to be in a non-LVM partition, 2 RAID devices are needed: /dev/md0 and /dev/md1.
The first is for /boot, the second for LVM (/, /home, ...).
Therefore, this is the situation:
/dev/hda1 /dev/hdb1 used for /dev/md0
/dev/hda2 /dev/hdb2 used for /dev/md1
/dev/md0 used for /boot
/dev/md1 used for LVM
For the volume group name I have chosen "raid", but you can use whatever you prefer.
For the logical volume name of / I have chosen "sarge", but you can use whatever you prefer.
/dev/raid/sarge used for /
2.1. Hardware
The minimal requirements are:
- 2 harddisks (it doesn't make sense to have raid 1 on the same harddisk!)
2.2. Software
The minimal requirements are:
- kernel 2.6 (it may also work with 2.4, but this document is based on 2.6)
- lvm2
- mdadm
3.1. Booting the DVD
Power on with the Sarge DVD1 in the DVD drive.
Once the Debian logo appears, type
expert26
This selects a 2.6 kernel.
3.2. Installing the system
Select the following items and configure them as usual:
"Choose language"
"Choose country or region"
"Select a keyboard layout"
"Detect and mount CD-ROM"
Now comes the first important step for our purpose. Select
"Load installer components from CD"
and then mark the following items:
lvmcfg
mdcfg
This will load full (kernel + applications) LVM and RAID support.
Proceed as usual with:
"Detect network hardware"
"Configure the network"
"Detect hardware"
3.2.1. Creating physical partitions
Now comes the second important step for our purpose. Select
"Partition disks"
then
"Manually edit partition table"
With the following steps, the content of the harddisks will be permanently erased. Have you made the backup? If not, run, Forrest!
Create /dev/hda1 and /dev/hdb1 with a size of 64M, type 0xFD (Linux RAID autodetect) and the bootable flag on.
Create /dev/hda2 and /dev/hdb2 with the remaining size and type 0xFD (Linux RAID autodetect).
These partitions will be used for the 2 RAID1 devices.
3.2.2. Creating RAID devices
Select
"Configure software RAID"
You will be asked to confirm writing to the storage devices. Answer yes.
Create /dev/md0 using /dev/hda1 and /dev/hdb1.
This device will be used for /boot.
Create /dev/md1 using /dev/hda2 and /dev/hdb2.
This device will be used for LVM.
Now the RAID arrays have been created and are synchronizing themselves. You can verify this by switching to the second console (ALT + F2) and repeatedly calling
cat /proc/mdstat
It would be nicer to simply call
watch -n 1 cat /proc/mdstat
but this command is not included in the Debian installer.
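Since watch is missing, a small shell loop can do a similar job. The following is only a sketch: it assumes that grep (with -E and -q) and sleep are available in the installer shell, which I have not verified for every installer image, and the function names md_is_syncing and wait_for_md_sync are made up for this example.

```shell
# Succeeds (exit status 0) while the given mdstat file reports a resync or recovery.
# $1 = file to inspect, defaults to /proc/mdstat
md_is_syncing() {
    grep -Eq 'resync|recovery' "${1:-/proc/mdstat}"
}

# Print the RAID status once per second until no synchronization is reported.
wait_for_md_sync() {
    while md_is_syncing "${1:-/proc/mdstat}"; do
        cat "${1:-/proc/mdstat}"
        sleep 1
    done
    echo "no resync or recovery in progress"
}
```

Calling wait_for_md_sync without arguments reads /proc/mdstat. The optional file argument is only there so the functions can be tried on a saved copy of the status.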
3.2.3. Creating LVM devices
In the "Partition disks" window, select the newly created RAID1 device of xxM (RAID1 device #1 / #1) and then:
Use as: physical volume for LVM
Select
"Configure the Logical Volume Manager"
You will be asked to confirm writing changes to disks. Answer yes and ignore any warning about the kernel.
Create volume group "raid" on /dev/md1.
Create logical volume "sarge" with a size of 5.5G in group "raid". This will be used for /.
Depending on your configuration, create additional volumes in the group "raid" for /home, /var/www, ...
If you prefer, you can do this later, but the process will be a little more complicated.
3.2.4. Creating /
In the "Partition disks" window, select the newly created LV device of 5.5G (LVM VG raid, LV sarge / #1) and then:
Use as: ReiserFS journaling file system
Mount point: /
Mount options: notail
Label: sarge
3.2.5. Creating /boot
In the "Partition disks" window, select the RAID1 device of 64M (RAID1 device #0 / #1) and then:
Use as: Ext3 journaling file system
Mount point: /boot
Mount options: defaults
Label: boot
Reserved blocks: 5%
Typical usage: standard
3.2.6. Creating additional LVs before rebooting
For each additional logical volume you created earlier, repeat the same process as for the / partition.
Here is an example for /home:
In the "Partition disks" window, select the LV device created for /home and then:
Use as: ReiserFS journaling file system
Mount point: /home
Mount options: notail
Label: home
3.2.7. Finish partitioning
Write all changes to disk and leave this step.
3.2.8. Installing the base system
You are now ready to proceed with the system installation.
Install it exactly as you would without LVM and RAID. The only difference is that /dev/raid/sarge is mounted on /target and /dev/md0 is mounted on /target/boot. You can verify this by switching to the second console (ALT + F2) and typing
mount
When you are asked for the kernel to install, choose the proposed one:
kernel-image-2.6.8-686-2
3.2.9. Installing the boot loader on the first harddisk
I prefer GRUB to LILO, because it is newer, more flexible and has some nice features, like updating itself when a kernel is installed or removed.
If you want, you can try LILO, but at your own risk.
When prompted whether to install GRUB in the MBR, answer yes.
Note that the boot loader has only been installed on the first harddisk. If this harddisk fails, you will not be able to boot the system. Therefore, the boot loader must also be installed on the second harddisk. Since this step cannot be done here, you have to do it later.
Detailed documentation about GRUB can be found here: http://www.gnu.org/software/grub/manual/html_node/index.html.
3.2.10. Rebooting
I am not sure this is really needed, but to be safe, wait until the RAID has finished resyncing by repeatedly calling
cat /proc/mdstat
on the second console (ALT + F2).
When done, switch back to the first console (ALT + F1) and select "Finish the installation" to reboot.
If your system does not boot correctly, you have probably done something wrong in one of the steps above.
In this case, refer to chapter "Trouble shooting".
3.2.11. Installing the boot loader on the second harddisk
Since the bootloader has been installed only on the first harddisk, you have to manually install it on the second one.
Therefore, once the system is up:
grub
device (hd0) /dev/hdb
root (hd0,0)
setup (hd0)
Naturally, replace the device path with that of your second harddisk!
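If you prefer not to type these commands interactively, they can be fed to the GRUB shell in batch mode. This is a sketch under assumptions: the helper name grub_setup_commands is invented for this example, and it assumes the same disk layout as above (/boot on the first partition).

```shell
# Print the GRUB shell commands that install the boot loader on one disk.
# $1 = device path of the disk, e.g. /dev/hdb
grub_setup_commands() {
    printf 'device (hd0) %s\n' "$1"
    printf 'root (hd0,0)\n'
    printf 'setup (hd0)\n'
    printf 'quit\n'
}

# Usage (as root, not run here):
#   grub_setup_commands /dev/hdb | grub --batch
```

Printing the commands first lets you check them before piping them into grub.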
3.3. Installing and configuring additional packages
You can now proceed with installing whatever packages you need and, naturally, configuring them.
3.4. Creating additional LVs after rebooting
If you already did this step earlier, just skip to the next chapter.
Don't forget to create all the LVs you need (e.g.: /home, /var/www, ...) and to update the /etc/fstab file.
Naturally, move the existing content to the LV before mounting it at the correct mount point.
Here is a complete example for /home.
3.4.1. Creating /home
Create an LV of 2G (the -A option expects y or n as its argument):
lvcreate -A y -n home -L 2G raid
Create filesystem:
mkreiserfs --label home /dev/raid/home
Temporarily mount /dev/raid/home and transfer the content of /home:
mount /dev/raid/home /mnt
mv /home/* /mnt
umount /mnt
3.4.2. Updating /etc/fstab
Add an entry for /home:
/dev/raid/home /home reiserfs noatime,notail
3.4.3. Mounting /home
mount /home
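When repeating this for several LVs, it is easy to add the same fstab line twice. The following sketch appends a line only if no entry for that mount point exists yet. The function name add_fstab_entry is made up for this example, and it assumes awk is installed on the system.

```shell
# Append an fstab line unless an entry for the same mount point already exists.
# $1 = fstab file, $2 = mount point, $3 = complete fstab line to add
add_fstab_entry() {
    # Look for a non-comment line whose second field is the mount point.
    if awk -v mp="$2" '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$1"; then
        echo "entry for $2 already present in $1" >&2
        return 0
    fi
    printf '%s\n' "$3" >> "$1"
}

# Usage (as root, not run here):
#   add_fstab_entry /etc/fstab /home '/dev/raid/home /home reiserfs noatime,notail'
```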
4.1. Introduction
A self compiled kernel is not a must, but it may be necessary for performance optimization.
In my case, for example, I wanted hyperthreading enabled for my Pentium IV, but there is no official kernel with it.
In such a case, it is important to create the initrd image, otherwise the system will not boot.
4.2. Compilation
In order to have an initrd image, it is enough to specify the parameter
--initrd
Therefore:
fakeroot make-kpkg --append_to_version -yourHost --initrd --revision=yourRevisionNumber kernel_image modules_image
Don't panic! This does not create a package containing the initrd image, but after the installation the initrd will be in /boot.
I think it is created during the installation itself.
4.3. Installation
Just install your self compiled kernel:
dpkg -i /usr/src/kernel-image-2.6.x-yourHost_yourRevisionNumber.deb
If you followed my suggestion and installed GRUB, you are ready to reboot and test your self compiled kernel.
5.1. Making backup of RAID configuration
In case the system becomes unbootable, you need the RAID configuration to be sure you can start the RAID under any condition.
Therefore, once the system is working, type:
cd /etc/mdadm
echo 'DEVICE /dev/hd*[0-9] /dev/sd*[0-9]' > mdadm.conf
mdadm --detail --scan >> mdadm.conf
Now, make a backup of this file outside of your RAID system! It is very important if you ever have to fix a problem in your system using a rescue CD/DVD.
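As an illustration of such a backup, the sketch below copies a file into another directory and appends the current date to its name. The function name backup_config and the mount point /mnt/usbstick are hypothetical; use whatever medium you have outside the RAID.

```shell
# Copy a file into a backup directory, adding the current date to its name.
# $1 = file to back up, $2 = existing destination directory
backup_config() {
    cp -p "$1" "$2/$(basename "$1").$(date +%Y%m%d)"
}

# Usage (not run here; /mnt/usbstick is a hypothetical mount point):
#   backup_config /etc/mdadm/mdadm.conf /mnt/usbstick
```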
5.2. How to prevent the system from becoming unusable
This system is safe enough against failure, but it is not safe enough against you! ;-)
It is very easy to break the system while trying something new, for example by installing a kernel without the correct support (LVM, RAID, ...) that has the same name as the working kernel. In this way, the working kernel is removed and the broken kernel installed.
To avoid being locked out of your system if it becomes unbootable, get yourself a rescue system.
5.2.1. Rescue CD/DVD
Get a rescue CD/DVD with kernel 2.6, LVM2 and RAID support.
You can download my rescue CD (Emi's rescue CD) here: http://emidio.planamente.ch/rescuecd.
If you prefer, you can use the Debian DVD itself, but it takes much longer until you have access to your damaged system.
5.2.2. Rescue system
An alternative to the rescue CD/DVD is to install a rescue system on another HD (maybe an external one).
Be sure to install it in a way that you can always boot it, for example on a small partition on the first harddisk.
Also be sure to have installed everything that is needed (kernel + tools).
5.3. Backup
Don't forget: data and system backups are always important, even if you use a RAID system. A deleted file is deleted on the whole array, and this is irreversible!
Take a look at http://www.planamente.ch/emidio/pages/linux_howto_backup.php.
6.1. Duplicate PV
When creating LVM on a RAID 1 device, it can happen that PVs are not created on the RAID device but on the physical partitions (e.g.: /dev/sda1 and /dev/sdb1 if they are part of /dev/md0).
This results in an error during LVM scanning (pvscan, vgscan, ...) like
Found duplicate PV 9w3TIxKZ6lFRqWUmQm9tlV5nsdUkTi4i: using /dev/sda1 not /dev/sdb1
and makes it impossible to create a new volume group on /dev/md0.
In this case, you have to create a filter for LVM.
In the /etc/lvm/lvm.conf file, you have to add a line like this:
filter = [ "r|/dev/cdrom|","r|hd[ab]|" ]
Naturally, replace "hd" with "sd" for SCSI devices and "ab" with the correct letters.
This prevents such devices from being scanned or used for LVM.
Normally you would reload LVM with
/etc/init.d/lvm force-reload
but since your system is on LVM, this is not possible. Therefore you have to reboot.
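After the reboot you can check whether the warning is gone. The sketch below extracts the device names from warnings of the form shown above. The function name duplicate_pv_devices is made up for this example, and the pattern is based only on the single message quoted in this section, so other LVM versions may word it differently.

```shell
# Read LVM scan output on stdin and print the two devices named in each
# "Found duplicate PV" warning, one pair per line.
duplicate_pv_devices() {
    sed -n 's/.*Found duplicate PV [^:]*: using \([^ ]*\) not \([^ ]*\).*/\1 \2/p'
}

# Usage (as root, not run here); no output means no duplicates were reported:
#   pvscan 2>&1 | duplicate_pv_devices
```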
6.2. System hangs up
This description uses the Sarge DVD1 as rescue DVD.
Boot with Sarge DVD1.
6.2.1. Starting RAID
Once the rescue system has booted, the RAID devices are not started yet, because the md driver (RAID driver) is compiled as a module and not built in.
I have taken a look at the source code and it seems RAID autodetection is explicitly disabled if the driver is compiled as a module. Don't ask me why.
Therefore, you have to know your exact RAID configuration, either from the mdadm.conf file or from your head!
Load drivers (one module per call, since modprobe treats further arguments as module parameters):
modprobe md
modprobe raid1
Assemble devices, without configuration file:
mdadm --assemble /dev/md/0 /dev/scsi/host0/bus0/target0/lun0/part1 /dev/scsi/host0/bus0/target1/lun0/part1
mdadm --assemble /dev/md/1 /dev/scsi/host0/bus0/target0/lun0/part2 /dev/scsi/host0/bus0/target1/lun0/part2
Assemble devices, with configuration file:
mdadm --assemble --scan --config=myConfigFile
Logically, in both cases, replace the partition paths with yours.
If the arrays are not degraded, they should be started automatically. If not, start them:
mdadm --run /dev/md/0
mdadm --run /dev/md/1
Verify that they have been started:
cat /proc/mdstat
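In the output of /proc/mdstat, a healthy two-disk RAID 1 shows [2/2] [UU], while a degraded one shows something like [2/1] [U_]. The following sketch lists the degraded arrays. The function name md_degraded is invented for this example, and it assumes awk is available in the rescue environment.

```shell
# Print the name of every md array whose status field contains a missing
# member (an underscore, as in [U_]).
# $1 = file to inspect, defaults to /proc/mdstat
md_degraded() {
    awk '/^md[0-9]+ :/ { name = $1 }
         /blocks/ && /\[[U_]*_[U_]*\]/ { print name }' "${1:-/proc/mdstat}"
}

# Usage: md_degraded        (no output means no degraded array was found)
```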
6.2.2. Starting LVM
The logical volumes are easier to start.
Load device mapper driver:
modprobe dm-mod
If you forget to load this driver, you will get a terrible error like:
/proc/misc: No entry for device-mapper found
Is device-mapper driver missing from kernel?
Failure to communicate with kernel device-mapper driver.
Incompatible libdevmapper 1.01.00-ioctl (2005-01-17)(compat) md kernel driver
Don't panic, just load dm-mod!
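The error message above mentions /proc/misc, so that file can be used to check in advance whether the device mapper is loaded. This is only a sketch: the function name dm_available is made up for this example.

```shell
# Succeeds (exit status 0) if the device-mapper driver is registered.
# $1 = file to inspect, defaults to /proc/misc
dm_available() {
    grep -q 'device-mapper' "${1:-/proc/misc}"
}

# Usage (as root, not run here): load the module only when it is missing.
#   dm_available || modprobe dm-mod
```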
Search and activate all volume groups:
vgscan
vgchange -a y
Verify that all LVs are active with:
lvscan
6.2.3. Mounting broken system
Make a mount point
mkdir /target
and mount / of the broken system:
mount /dev/raid/sarge /target
6.2.4. Changing root
This is one of the most interesting commands of GNU/Linux: chroot.
This command changes the root in the shell where it is invoked. Therefore, call
chroot /target
and you will be transferred into the broken system. If you type
ls -l /
you will see the content of /target, but everything outside it is not visible at all.
6.2.5. Mounting /proc
In order to repair your broken system, you have to mount the /proc directory, otherwise programs running in the chrooted system won't have access to kernel information.
Therefore, just type:
mount /proc
6.2.6. Mounting /boot
Since /boot is not in the same partition, it has to be mounted. Therefore, call
mount /boot
6.2.7. Solving the problem
Here I can't help you a lot. There could be a million problems and it is your responsibility to solve them. I have warned you. Root on LVM on RAID is not for everyone! ;-)
If you didn't follow my instructions and installed LILO instead of GRUB, don't forget to call
lilo
when you are done.
6.2.8. Unmounting /proc
In theory, you are ready to reboot, but you have to exit from the chrooted environment first.
Normally it is important to unmount /proc before exiting, so we do it here (just for teaching purposes), even though in this case it would not be necessary, because you want to reboot and not continue working on the rescue system.
Anyway, type:
umount /proc
6.2.9. Exiting
Now we are ready to leave the chrooted environment.
Just type:
exit
6.2.10. Rebooting
Finally, we can reboot and hope the problem is solved.
7.1. Simulating disk failure
7.1.1. Second disk fails
I made this test by physically removing the second HD from my system (PC was turned off!) and booting.
The system booted without any problem and, once logged in, I received a mail informing me there was a degraded array.
Taking a look at
cat /proc/mdstat
I had confirmation.
Interestingly, after putting the second drive back into the system, the system still said the array was degraded.
I had to manually re-add the second harddisk by calling:
mdadm --add /dev/md0 /dev/hdb1
mdadm --add /dev/md1 /dev/hdb2
and wait until the resync was done to restore the original situation.
In case of a real failure, the MBR has to be reinstalled after replacing the broken disk:
grub
device (hd0) /dev/hdb
root (hd0,0)
setup (hd0)
7.1.2. First disk fails
I made this test by physically removing the first HD from my system (PC was turned off!) and booting.
The system could not boot, not because the RAID was not working, but for the following reason.
I have both SCSI and IDE harddisks and the system is installed on the SCSI ones.
The BIOS can map the order of all the harddisks. By default, it puts IDE before SCSI, but I had changed this order.
If the BIOS detects a change in the harddisk configuration, it reassigns the order of the devices, putting IDE before SCSI.
In this case, the system becomes unbootable. I just had to restore the correct order and the system booted correctly.
Also in this case, after a few minutes, I received a mail from the system, informing me that the array was degraded.
Taking a look at
cat /proc/mdstat
I had confirmation.
Note: In this case, the second harddisk is called /dev/hda and not /dev/hdb, because the first one is missing!
After shutting down, putting the first harddisk back and booting, I could re-add it to the array with:
mdadm --add /dev/md0 /dev/hda1
mdadm --add /dev/md1 /dev/hda2
Note: Now the second harddisk is again called /dev/hdb, because the first one has been re-added!
After a few minutes, the RAID was restored.
In case of a real failure, the MBR has to be reinstalled after replacing the broken disk:
grub
device (hd0) /dev/hda
root (hd0,0)
setup (hd0)
7.2. Partition fails
One after the other, I set every single partition belonging to the arrays to faulty.
Each time, with only one partition set to faulty, I verified that root received a warning email, and I rebooted the system to verify that it could still come up.
To set a partition to faulty:
mdadm --fail /dev/md1 /dev/hda2
To add the faulty partition back to the array:
mdadm --add /dev/md1 /dev/hda2
7.2.1. Data corruption
RAID is not designed to protect against data corruption. If you try to simulate data corruption, you will have data corruption.
Therefore, don't try such a test unless you really want to destroy your data!
8.1. Author
Emidio Planamente <eplanamente@gmx.ch>
8.2. Feedback
Please let me know if you could successfully install your system on LVM2 on RAID1 by following this document.
Any other feedback is also welcome.
8.3. History
Version 2.2 / 2006-02-06
Changed "Creating RAID devices"
Changed "Installing the boot loader on the first harddisk"
Changed "Installing the boot loader on the second harddisk"
Version 2.1 / 2006-01-25
Changed "Restriction one"
Changed "Creating LVM devices"
Version 2.0 / 2006-01-22
Changed "Rescue CD/DVD"
Version 1.9
Changed "Creating RAID devices"
Changed "Rebooting"
Version 1.8
Fixed "Creating physical partitions"
Fixed "Creating RAID devices"
Version 1.7
Fixed "System hangs up"
Version 1.6
Changed "Partition fails"
Version 1.5
Added "Using self compiled kernel"
Version 1.4
Changed "First disk fails"
Changed "Second disk fails"
Version 1.3
Changed "First disk fails"
Version 1.2
Changed "Starting RAID" "without config file"
Version 1.1
Changed "5.2 System hangs up"
Changed "Installing the boot loader"
Added "6. Testing"
Version 1
First public release