Replacing a failing rootvg disk on AIX
By Nixman on Sunday 4 May 2008, 20:51 - AIX
Works on : AIX
Let's suppose you're getting permanent hardware errors on hdisk0 when
running the errpt -a command on an IBM AIX server.
In order to check that both disks are really assigned to the volume group,
you should start with:
lsvg -p rootvg
You should see both hdisk0 and hdisk1 under the PV name.
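The check above can also be scripted. What follows is only a minimal sketch: check_pvs is a hypothetical helper (not an AIX command), and it assumes the usual lsvg -p layout, where two header lines are followed by one line per disk with the disk name in the first column and its state (for example active or missing) in the second.

```shell
#!/bin/sh
# check_pvs: hypothetical helper, reads `lsvg -p <vg>` output on stdin.
# Arguments: the disk names expected to be active in the volume group.
# Prints one line per problem disk and returns 1 if any disk is absent
# or not in the "active" state.
check_pvs() {
    awk -v want="$*" '
        NR > 2 { state[$1] = $2 }      # skip the "rootvg:" line and the column header
        END {
            rc = 0
            n = split(want, names, " ")
            for (i = 1; i <= n; i++) {
                d = names[i]
                if (!(d in state))             { print d ": not in volume group"; rc = 1 }
                else if (state[d] != "active") { print d ": " state[d]; rc = 1 }
            }
            exit rc
        }'
}

# Usage on the AIX host:
#   lsvg -p rootvg | check_pvs hdisk0 hdisk1
```

Reading from standard input keeps the helper independent of the volume group name, so the same function can be reused after the disk has been replaced.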
Then run:
lsvg -l rootvg
Check that there is a 1:2 relationship between LPs and PPs, and that PVs is equal to 2. Otherwise, you should check that the logical volume that's not mirrored doesn't reside on the failing disk with:
lslv -l LV_NAME
Once you've done these preliminary checks, you can start detaching hdisk0 from the volume group:
unmirrorvg rootvg hdisk0
After running the command, I've sometimes had these messages,
which are mostly informational:
0516-1246 rmlvcopy: If hd5 is the boot logical volume, please run 'chpv -c <diskname>'
as root user to clear the boot record and avoid a potential boot
off an old boot image that may reside on the disk from which this
logical volume is moved/removed.
0301-108 mkboot: Unable to read file blocks. Return code: -1
0516-1132 unmirrorvg: Quorum requirement turned on, reboot system for this
to take effect for rootvg.
0516-1144 unmirrorvg: rootvg successfully unmirrored, user should perform
bosboot of system to reinitialize boot records. Then, user must modify
bootlist to just include: hdisk0.
Then we reduce the volume group:
reducevg rootvg hdisk0
And remove the device from the configuration:
rmdev -dl hdisk0
Then, we will have to power down the machine, as we're dealing with a rootvg disk. However, before doing so, it's preferable to check whether we will boot from the right drive:
bootinfo -b
will tell you which drive the system last booted from. If it's the failed drive (hdisk0 in our case), we should change it to the drive still usable (hdisk1 in our case) by creating the boot image on hdisk1 and recreating the fixed ipldevice link, which was deleted by the previous rmdev command:
bosboot -ad /dev/hdisk1
ln /dev/rhdisk1 /dev/ipldevice
Then, we can check the bootlist:
bootlist -m normal -o
... And now, we can finally power down our server, replace the failed drive, and power it back on...
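The pre-shutdown decision described above can be written down as a small dry-run helper. This is a sketch under stated assumptions: boot_fix_plan is a hypothetical function name, and it only prints the commands from this post instead of running them, so the plan can be reviewed before anything touches the boot records.

```shell
#!/bin/sh
# boot_fix_plan: hypothetical dry-run helper, prints commands without running them.
#   $1 = disk reported by `bootinfo -b` (last boot device)
#   $2 = failed disk that is about to be replaced
#   $3 = surviving disk that still holds rootvg
boot_fix_plan() {
    last_boot=$1
    failed=$2
    survivor=$3
    if [ "$last_boot" = "$failed" ]; then
        # The system last booted from the failed disk: rebuild the boot image
        # on the survivor and recreate the ipldevice link removed by rmdev.
        echo "bosboot -ad /dev/$survivor"
        echo "ln /dev/r$survivor /dev/ipldevice"
    fi
    # In every case, review the boot list before powering down.
    echo "bootlist -m normal -o"
}

# Usage on the AIX host:
#   boot_fix_plan "$(bootinfo -b)" hdisk0 hdisk1
```

Printing rather than executing is a deliberate choice here, since a wrong boot image on a rootvg disk is hard to recover from once the machine is powered off.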
Once the server has booted up, we should run:
cfgmgr
so that the OS will recognize the new disk.
To check that AIX really has done its job, run:
lsdev -Cc disk
which should list both disks, hdisk0 and hdisk1.
Now, we can assign the new disk to the rootvg volume group:
extendvg rootvg hdisk0
Then we mirror the volume group:
mirrorvg rootvg
Wait for hdisk1 to finish copying to hdisk0 (it can take some time, as you can imagine). You can check activity with
iostat
You should check that both disks are really assigned to rootvg by typing:
lsvg -p rootvg
Then
lsvg -l rootvg
will show you whether mirroring has worked OK.
You should once again have a 1:2 relationship between LPs and PPs.
Then, create the boot image on the new disk:
bosboot -ad /dev/hdisk0
Finally, modify the bootlist to take into account both disks:
bootlist -m normal hdisk0 hdisk1
Check with:
bootlist -m normal -o
And you're finally done!
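As a last sanity check, the 1:2 LP-to-PP rule used twice in this procedure can be automated. The sketch below is not part of AIX: unmirrored_lvs is a hypothetical helper, and it assumes the usual lsvg -l column order (LV NAME, TYPE, LPs, PPs, PVs, LV STATE, MOUNT POINT) with two header lines.

```shell
#!/bin/sh
# unmirrored_lvs: hypothetical helper, reads `lsvg -l <vg>` output on stdin.
# Prints the name of every logical volume that does not have exactly two
# PPs per LP, or that is not spread over exactly 2 PVs.
unmirrored_lvs() {
    # $3 = LPs, $4 = PPs, $5 = PVs; the first two lines are headers
    awk 'NR > 2 && ($4 != 2 * $3 || $5 != 2) { print $1 }'
}

# Usage on the AIX host:
#   lsvg -l rootvg | unmirrored_lvs
# No output means every logical volume shows the 1:2 relationship.
```

Any name it prints is a candidate for the lslv -l LV_NAME check, to see which disk that logical volume actually lives on.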
Happy computing.
Drop me a comment if this post has been useful to you, or if you think anything should be added or changed.
Nixman