Using AIX VG mirroring in combination with hardware snapshots

One of the great things about Logical Volume Managers is how you can use them for all manner of clever solutions.   I recently explored how to use a combination of hardware snapshots and LVM to create rapid backups without using backup software (or as a source for a data protection product).

To do this we need to do the following:

  1. We present a staging disk to the host, large enough to hold the data we are trying to protect.   In this example that data is a volume group (VG) holding DB2 data.  This disk could come from a different primary storage device (e.g. an XIV or a Storwize V7000) or could be an Actifio-presented disk.   You need to check whether your multi-pathing software will work with that disk.
  2. We mirror our data VG onto the new staging disk using AIX VG mirroring.
  3. We take a hardware snapshot of that disk.
  4. We now allow the VG mirror to become stale to remove disk load from the host.
  5. Prior to taking the next snapshot, we get the mirrors back in sync again.

Whether this process suits you depends on whether you prefer to keep the two copies in sync or let them go stale.   The advantage of letting them go stale is that you avoid the disk I/O workload needed to keep them in sync.  You will need to catch up later, but the total effort to do so may well be significantly less than the continual effort of mirroring.
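In outline, once the mirror has been created, each backup cycle looks something like this (a sketch only; every step is shown in detail below):

joinvg db2vg                  # re-attach the stale copy so it catches up
# wait until lsvg -L db2vg reports 0 STALE PPs
# quiesce the application (for DB2, set write suspend)
# take the hardware snapshot of the staging disk
# resume the application (set write resume)
splitvg -c2 db2vg             # split the staging copy off again so it can go stale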

Example configuration

We have a VG (called db2vg) with one copy.  We know only one copy exists because, for every logical volume in the volume group, the PP count equals the LP count and each LV sits on a single PV.

[AIX_LPAR_5:root] / > lsvg -l db2vg
db2vg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
jfsdb2log1          jfs2log    1       1       1    open/syncd    N/A
jfsdb2log2          jfs2log    1       1       1    open/syncd    N/A
jfsdb2log3          jfs2log    1       1       1    open/syncd    N/A
db2binlv            jfs2       14      14      1    open/syncd    /db2
db2loglv            jfs2       10      10      1    open/syncd    /db2log
db2datalv           jfs2       40      40      1    open/syncd    /db2data
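You can also confirm this per logical volume; lslv reports the copy count directly (a quick check using one of the LVs above):

lslv db2datalv | grep -i copies    # an unmirrored LV should report COPIES: 1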

If I display the detailed view of the VG, I can see it is currently in a good state:

[AIX_LPAR_5:root] / > lsvg -L db2vg
VOLUME GROUP:       db2vg                    VG IDENTIFIER: 00f771ac00004c0000000144bf115a1e
VG STATE:           active                   PP SIZE:        512 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      71 (36352 megabytes)
MAX LVs:            512                      FREE PPs:       4 (2048 megabytes)
LVs:                6                        USED PPs:       67 (34304 megabytes)
OPEN LVs:           6                        QUORUM:         2 (Enabled)
TOTAL PVs:          1                        VG DESCRIPTORS: 2
STALE PVs:          0                       STALE PPs:      0
ACTIVE PVs:         1                        AUTO ON:        yes
MAX PPs per VG:     130048
MAX PPs per PV:     1016                     MAX PVs:        128
LTG size (Dynamic): 512 kilobyte(s)          AUTO SYNC:      no
HOT SPARE:          no                       BB POLICY:      relocatable
PV RESTRICTION:     none                     INFINITE RETRY: no

We have added one new disk to the server.  We know it is not in use because it has no VG (the VG column says None).

[AIX_LPAR_5:root] / > lspv
hdisk0          00f771acd7988621                    None
hdisk5          00f771acbf1159f6                    db2vg           active
hdisk6          00f771ac41353d73                    rootvg          active
[AIX_LPAR_5:root] / > lsdev -Cc disk
hdisk0 Available C9-T1-01 MPIO IBM 2076 FC Disk
hdisk5 Available C9-T1-01 MPIO IBM 2076 FC Disk
hdisk6 Available C9-T1-01 MPIO IBM 2076 FC Disk
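If the host has many disks, a simple filter on the lspv output above lists the candidates that are not yet assigned to any VG:

lspv | awk '$3 == "None" {print $1}'    # disks with no volume group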

We extend the VG onto the new staging disk and then mirror it. We specify the VG name (db2vg) and the name of the unused or free disk (hdisk0).

Mirroring takes a while, so we run the mirrorvg command as a background task with &:

[AIX_LPAR_5:root] / > extendvg db2vg hdisk0
[AIX_LPAR_5:root] / > mirrorvg db2vg hdisk0 &
0516-1804 chvg: The quorum change takes effect immediately.
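As an alternative to backgrounding the shell job, mirrorvg should also accept a background-sync flag, which returns immediately and lets the sync run in the background (check that the flag is supported on your AIX level first):

mirrorvg -S db2vg hdisk0    # -S: return immediately and sync the new copy in the background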

We monitor the mirroring with a script.  I did not write this script but did modify it.   The original author (W.M. Duszyk) should thus be acknowledged!   Also thanks to Chris Gibson for help with this.

#!/usr/bin/ksh93
### W.M. Duszyk, 3/2/12
### AVandewerdt 01/05/14
### show percentage of re-mirrored PPs in a volume group
[[ $# -lt 1 ]] && { print "Usage: $0 vg_name"; exit 1; }
vg=$1
printf "Volume Group $vg has ";lsvg -L $vg | grep 'ACTIVE PVs:' | awk '{printf $3}';printf " copies "
# field 6 of the 'STALE PVs: ... STALE PPs: ...' line is the stale PP count
Stale=`lsvg -L $vg | grep 'STALE PPs:' | awk '{print $6}'`
[[ $Stale = 0 ]] && { print "and is fully mirrored."; exit 2; }
# field 6 of the 'TOTAL PPs:' line is the total PP count for the VG
Total=`lsvg -L $vg | grep 'TOTAL PPs:' | awk '{print $6}'`
PercDone=$(( 100 - $(( $(( Stale * 50.0 )) / $Total )) ))
echo "and is mirrored $PercDone%."
exit 0

We can use this script to check if the VG is in sync.   You run the script and specify the name of the VG:

[AIX_LPAR_5:root] / > ./checkvg.sh db2vg
Volume Group db2vg has 2 copies and is mirrored 85%.

We wait for it to reach 100%.

[AIX_LPAR_5:root] / > ./checkvg.sh db2vg
Volume Group db2vg has 2 copies and is fully mirrored.

To see the exact state of the VG, let's look at the volume group details.   Note how each LV now has two PPs for every LP and spans two PVs, and the LV state is open/syncd.  An LV state of closed/syncd is not an issue if the LV is raw (rather than holding a file system) and is not in use by the application.

[AIX_LPAR_5:root] / > lsvg -l db2vg
db2vg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
jfsdb2log1          jfs2log    1       2       2    open/syncd    N/A
jfsdb2log2          jfs2log    1       2       2    open/syncd    N/A
jfsdb2log3          jfs2log    1       2       2    open/syncd    N/A
db2binlv            jfs2       14      28      2    open/syncd    /db2
db2loglv            jfs2       10      20      2    open/syncd    /db2log
db2datalv           jfs2       40      80      2    open/syncd    /db2data

Now display one of the LVs (db2binlv).   We can see that hdisk0 holds copy 2 (PV2) of every logical partition, which is what we want.

[AIX_LPAR_5:root] / > lslv -m db2binlv
db2binlv:/db2
LP    PP1  PV1               PP2  PV2               PP3  PV3
0001  0002 hdisk5            0002 hdisk0
0002  0003 hdisk5            0003 hdisk0
0003  0004 hdisk5            0004 hdisk0
0004  0005 hdisk5            0005 hdisk0
0005  0006 hdisk5            0006 hdisk0
0006  0007 hdisk5            0007 hdisk0
0007  0008 hdisk5            0008 hdisk0
0008  0009 hdisk5            0009 hdisk0
0009  0010 hdisk5            0010 hdisk0
0010  0011 hdisk5            0011 hdisk0
0011  0012 hdisk5            0012 hdisk0
0012  0013 hdisk5            0013 hdisk0
0013  0014 hdisk5            0014 hdisk0
0014  0015 hdisk5            0015 hdisk0

We are now ready to snapshot the staging disk to preserve it in this synced state.  Once the snapshot is created, we can let the mirror go stale so that there is no disk load from keeping the staging disk in sync.  You should coordinate the snapshot with the application writing to the disk.   With Actifio we do this with the Actifio Connector software.
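For DB2, for example, the same write suspend and resume commands used in the connector script later in this post can bracket the snapshot (a minimal sketch, run as the instance owner; demodb is the example database name used later):

db2 connect to demodb
db2 set write suspend for database
# take the hardware snapshot of the staging disk (hdisk0) here
db2 set write resume for database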

Once the snapshot is taken we can split the VG to stop the mirroring workload.   We are going to split off copy 2, which is the copy on our staging disk (hdisk0):

splitvg -c2 db2vg

The split-off copy becomes a new VG, which AIX names vg00 by default.  You can force AIX to use a different name (see the example after the output below).

[AIX_LPAR_5:root] / > splitvg -c2 db2vg
[AIX_LPAR_5:root] / > lsvg
db2vg
rootvg
vg00
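If you would rather not get the default vg00 name, splitvg should let you choose the name of the split-off VG with the -y flag (an illustrative example; db2snapvg is just a made-up name):

splitvg -y db2snapvg -c2 db2vg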

If we check db2vg, we can see each LV still shows two PPs per LP, but we are no longer keeping the second copy (on hdisk0) in sync.

[AIX_LPAR_5:root] / > lsvg -l db2vg
db2vg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
jfsdb2log1          jfs2log    1       2       2    open/syncd    N/A
jfsdb2log2          jfs2log    1       2       2    open/syncd    N/A
jfsdb2log3          jfs2log    1       2       2    open/syncd    N/A
db2binlv            jfs2       14      28      2    open/syncd    /db2
db2loglv            jfs2       10      20      2    open/syncd    /db2log
db2datalv           jfs2       40      80      2    open/syncd    /db2data

When we look at the newly created VG (vg00), it has only one copy of each LV.  Note that the LVs and their mount points have been prefixed with fs so they do not clash with the originals.

[AIX_LPAR_5:root] / > lsvg -l vg00
vg00:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
fsjfsdb2log1        jfs2log    1       1       1    closed/syncd  N/A
fsjfsdb2log2        jfs2log    1       1       1    closed/syncd  N/A
fsjfsdb2log3        jfs2log    1       1       1    closed/syncd  N/A
fsdb2binlv          jfs2       14      14      1    closed/syncd  /fs/db2
fsdb2loglv          jfs2       10      10      1    closed/syncd  /fs/db2log
fsdb2datalv         jfs2       40      40      1    closed/syncd  /fs/db2data

Curiously, although everything still reports syncd, the db2vg copy is already stale by 3 PPs:

[AIX_LPAR_5:root] / > chmod 755 checkvg.sh;./checkvg.sh db2vg
Volume Group db2vg has 1 copies and is mirrored 99%.

I generate some changes by copying files to /db2data to increase this difference.   Of course, if DB2 is actually running, changes will start accruing straight away.

[AIX_LPAR_5:root] / > ./checkvg.sh db2vg
Volume Group db2vg has 1 copies and is mirrored 97%.

If we check the state of the LVs we can see that this file I/O has created stale partitions. This is not a problem.   The rate at which partitions go stale depends on the PP size and on the address-range locality of the I/O generated between snapshots.

[AIX_LPAR_5:root] / > lsvg -l db2vg
db2vg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
jfsdb2log1          jfs2log    1       2       2    open/stale    N/A
jfsdb2log2          jfs2log    1       2       2    open/stale    N/A
jfsdb2log3          jfs2log    1       2       2    open/stale    N/A
db2binlv            jfs2       14      28      2    open/stale    /db2
db2loglv            jfs2       10      20      2    open/stale    /db2log
db2datalv           jfs2       40      80      2    open/stale    /db2data

When we are ready to take the next snapshot, we need to get the two copies back in sync.   To do this we rejoin them with this command:

joinvg db2vg
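If you are scripting this step, you can simply poll the check script from earlier until it reports that the VG is fully mirrored (a simple sketch; paths and sleep interval are arbitrary):

until ./checkvg.sh db2vg | grep -q 'fully mirrored'
do
  sleep 30
done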

We can see the two copies coming back into sync:

[AIX_LPAR_5:root] / > ./checkvg.sh db2vg
Volume Group db2vg has 2 copies and is mirrored 98%.

As the copies get back into sync, the LV state returns to syncd rather than stale (db2datalv below is still catching up):

[AIX_LPAR_5:root] / > lsvg -l db2vg
db2vg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
jfsdb2log1          jfs2log    1       2       2    open/syncd    N/A
jfsdb2log2          jfs2log    1       2       2    open/syncd    N/A
jfsdb2log3          jfs2log    1       2       2    open/syncd    N/A
db2binlv            jfs2       14      28      2    open/syncd    /db2
db2loglv            jfs2       10      20      2    open/syncd    /db2log
db2datalv           jfs2       40      80      2    open/stale    /db2data

If the resync does not occur, we can force it with the syncvg command:

syncvg -v db2vg

Once we are in sync, we can do another snapshot of the staging disk.

Issues with scripting this

One thing you may want to do is allow a non-root user to run these commands.  For instance, to allow the DB2 user (in this example db2inst2) to execute the splitvg and joinvg commands, we can use sudo:

  1. Download and install sudo on the AIX host
  2. Issue this command to edit the sudo config file:   visudo
  3. Add this line:
    db2inst2 ALL = NOPASSWD: /usr/sbin/joinvg,/usr/sbin/splitvg

Log on as the DB2 user and check that it worked:

[AIX_LPAR_5:db2inst2] /home/db2inst2 > sudo -l
User db2inst2 may run the following commands on this host:
(root) NOPASSWD: /usr/sbin/joinvg
(root) NOPASSWD: /usr/sbin/splitvg

Using the snapshot with a backup host

One strategy that can be used in combination with this method is to present the snapshot to a server running backup software.  The advantage of doing this is that the backup can effectively be done off-host.   The disadvantage is that each backup will be a full backup unless the backup software can scan the disk for changed files or blocks.

Import the VG

To use the snapshot, connect to the management interface of the storage device that created the snapshot and map the snapshot to your backup host.   Then log on to the backup host and discover the disks:

cfgmgr

Learn the name of the new hdisk:

lspv
lsdev -Cc disk

Then import the volume group.  You need to use -f to force an import with only half the VG members present (since you are importing a snapshot of one half of a mirrored pair).  In this example we have discovered hdisk1 and are using it to import the VG db2vg.

importvg -f -y db2vg hdisk1
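Once the import completes, the file systems from the snapshot can be checked and mounted on the backup host (a sketch; whether you need the varyonvg depends on your importvg behaviour, and the mount points assume importvg picked up the original names):

varyonvg db2vg              # only if importvg did not vary the VG on automatically
fsck -y /dev/db2datalv      # the snapshot is crash consistent, so check before mounting
mount /db2data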

Recreate the VG

If you are presenting the snapshot back to the same host that has the original VG, then we have to do two extra steps.   Because the snapshot has the same PVID as the staging disk, you need to clear the PVID and use the recreatevg command rather than importvg.

In this example I have two VGs and two disks.

[aix_lpar_4:root] / > lspv
hdisk0          00f771acc8dfb10a                    actvg           active  
hdisk2          00f771accdcbafa8                    rootvg          active

I map the snapshot I created and run cfgmgr.   If you are sharp-eyed you will spot that I don't have any PVID clashes; in fact this host doesn't even have the original DB2 VG, but the method is still completely valid.

[aix_lpar_4:root] / > cfgmgr
[aix_lpar_4:root] / > lspv
hdisk0          00f771acc8dfb10a                    actvg           active  
hdisk2          00f771accdcbafa8                    rootvg          active  
hdisk4          00f771acd7988621                    None

We need to bring the VG online, so first we clear the PVID:

[aix_lpar_4:root] / > chdev -l hdisk4 -a pv=clear
hdisk4 changed
[aix_lpar_4:root] / > lspv
hdisk0          00f771acc8dfb10a                    actvg           active  
hdisk2          00f771accdcbafa8                    rootvg          active  
hdisk4          none                                None

We now build a new VG using the VG name db2restorevg on hdisk4.

[aix_lpar_4:root] / > recreatevg -f -y db2restorevg hdisk4
db2restorevg
[aix_lpar_4:root] / > lsvg -l db2restorevg
db2restorevg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
fsjfsdb2log1        jfs2log    1       1       1    closed/syncd  N/A
fsjfsdb2log2        jfs2log    1       1       1    closed/syncd  N/A
fsjfsdb2log3        jfs2log    1       1       1    closed/syncd  N/A
fsdb2binlv          jfs2       14      14      1    closed/syncd  /fs/db2
fsdb2loglv          jfs2       10      10      1    closed/syncd  /fs/db2log
fsdb2datalv         jfs2       40      40      1    closed/syncd  /fs/db2data

Again, if you are sharp-eyed you will spot that in the output above every LV has fs added to its name.  In other words, db2binlv, which was mounted on /db2, is recreated as fsdb2binlv mounted on /fs/db2.   This is done because the recreatevg command assumes you are creating this VG on a host that already has the original VG, so it renames the LVs and mount points to prevent name clashes.   If for some reason you don't want this renaming to occur, you can avoid it by running recreatevg like this, where -L / and -Y NA tell the command not to rename the labels or the LVs.   Use this with care:

recreatevg -f -L / -Y NA -y db2restorevg hdisk4 
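After recreatevg, the recreated file systems can be mounted; read-only is a sensible choice if the copy is only being read for backup (the mount points below assume the default fs prefix shown earlier):

mount -o ro /fs/db2
mount -o ro /fs/db2log
mount -o ro /fs/db2data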

Backups without backup software or file system scans

If the staging disk is presented by Actifio, then Actifio will track every changed block and will only need to read the changed blocks to create a new backup image of the snapshot.   The VG PP size will play a role in determining the quantity of changed blocks.  This effectively allows backups without backup software since the Actifio Dedup engine can read blocks straight from snapshots created by Actifio.   This is a very neat trick.    Also since we presented the staging disk from the Actifio snapshot pool, we now also have a copy that we can present at will for instant test and dev or analytics purposes.

Scripting for Application Consistency

When creating the snapshot, you ideally want the whole process to be orchestrated, with a regular update job running on a schedule.  The process should get the VG mirror back into sync, get the application into a consistent state (such as hot backup mode), create the snapshot and then let the VG mirror go stale again.

The Actifio Connector can be used to coordinate application consistency.  Clearly, if your staging disk is coming from a different storage product, then you will need to use that vendor's method.   Every time Actifio starts a snapshot job (which can be automated by the Actifio SLA scheduling engine) it can call the Actifio Connector installed on the host to help orchestrate the snapshot.  It does so in phases: init, freeze, thaw, fini and, if necessary, abort.   We set the database name, path and VG name at the start of the script.   The init phase re-syncs the VG; the freeze phase puts DB2 into write-suspend (hot backup) mode; the thaw phase resumes writes; the fini phase splits the VG again.

#!/bin/sh
# Phases called by the Actifio Connector: init, freeze, thaw, fini, abort
DBPATH=/home/db2inst2/sqllib/bin
DBNAME=demodb
VGNAME=db2vg

if [ "$1" = "freeze" ]; then
  # suspend DB2 writes so the snapshot is application consistent
  $DBPATH/db2 connect to $DBNAME
  $DBPATH/db2 set write suspend for database
  exit 0
fi
if [ "$1" = "thaw" ]; then
  # resume DB2 writes once the snapshot has been taken
  $DBPATH/db2 connect to $DBNAME
  $DBPATH/db2 set write resume for database
  exit 0
fi
if [ "$1" = "init" ]; then
  # rejoin the stale copy and wait until the VG is fully mirrored
  sudo joinvg $VGNAME
  while true
  do
    synccheck=$(/act/scripts/checkvg.sh $VGNAME)
    if [ "$synccheck" != "Volume Group $VGNAME has 2 copies and is fully mirrored." ]
    then
      echo $synccheck
      sleep 30
    else
      break
    fi
  done
  exit 0
fi
if [ "$1" = "fini" ]; then
  # split off copy 2 again so the staging disk can go stale
  echo "Splitting $VGNAME"
  sudo splitvg -c2 $VGNAME
  exit 0
fi
if [ "$1" = "abort" ]; then
  # make sure DB2 is not left in write-suspend if the job aborts
  $DBPATH/db2 connect to $DBNAME
  $DBPATH/db2 set write resume for database
  exit 0
fi
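Before handing the script to the scheduler, you can exercise each phase by hand in the same order the Connector would call them (the script name here is just a placeholder for wherever you saved it):

./db2snap.sh init      # rejoin the mirror and wait for full sync
./db2snap.sh freeze    # suspend DB2 writes
# the hardware snapshot of the staging disk is taken at this point
./db2snap.sh thaw      # resume DB2 writes
./db2snap.sh fini      # split the mirror so the copy can go stale again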

Hopefully this whole process is helpful whether you use Actifio or not.  Here is a small set of references which helped me with this:

Waldemar Mark Duszyk Blog

Chris Gibson's Blog

IBM Technote


4 Responses to Using AIX VG mirroring in combination with hardware snapshots

  1. Shen says:

    Hello Anthony,
    what the reason to snapshot mirrored disk(hdisk0), but not original one(hdisk5)? Snapshot is performed by storage anyway, why not just snapshot hdisk5 without any mirrorvg?

  2. Peter says:

    Hi Anthony,
    Thank you for this BTW. This helped me immensely in creating a new backup schedule and solution for an AIX system running UniData.

    Have you happen to run into issues mounting the /fs/ after the mirror is split? We’re trying to rsync the changes in the “snapshot” over to another host which runs our offsite backup client since we can’t have it crawl the FS during operation and we’re running into issues mounting the fs after changing /fs/ to /SNAP/. I couldn’t see any reason for it but I figured I’d ask in case you’d run into this during testing.

    Thanks again, you seriously were instrumental in helping us get a new minimal downtime solution.
    Cheers
    Peter

  3. SUAVE says:

    Clever explanation, example, and tools. Thanks.
