ZFS does an outstanding job of protecting the integrity of your data, with hierarchical checksumming of all data and metadata and with redundancy at different RAID levels. Combined with features like ZFS snapshots and snapshot cloning, it offers solutions that are hard to beat with other backup tools.

Beware: while ZFS snapshots are a more than viable solution for a wide variety of recovery scenarios, keep in mind that ZFS snapshots by themselves are not a backup solution. Make sure you know the limitations.

For example, one of the things you had better know up-front is that when you roll back to a certain snapshot, you lose all the snapshots made in between. You are unlikely to discover this by accident, though, as ZFS warns you about it and requires the -r option to force the rollback.
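To make the cost of a rollback concrete, here is a small shell sketch (the dataset and snapshot names are hypothetical; with a real dataset you would feed it the output of `zfs list -H -o name -t snapshot -s creation`):

```shell
# List the snapshots a `zfs rollback -r` to $1 would destroy.
# Arguments after $1 are the dataset's snapshots, oldest first.
snapshots_lost_by_rollback() {
  local target=$1; shift
  local seen=0 snap
  for snap in "$@"; do
    # everything created after the target is destroyed by the rollback
    [ "$seen" -eq 1 ] && echo "$snap"
    [ "$snap" = "$target" ] && seen=1
  done
}

# Rolling back to @monday would destroy @tuesday and @wednesday:
snapshots_lost_by_rollback tank/data@monday \
  tank/data@monday tank/data@tuesday tank/data@wednesday
```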

Another argument for that statement: when you lose the file system because of issues with the underlying hardware, you lose the snapshots as well (hardware redundancy is required to reduce this risk).

Mirrored ZFS volumes (or hardware RAID-1) are not a backup solution either, as file corruption or a virus is instantly replicated to the paired volume.

Think of a backup as an off-line copy on external media. Obviously, that will never be the case for an online snapshot.

Despite its limitations, ZFS snapshots can be used as building blocks for a backup/recovery solution. That's the subject of this post.

I was actually writing some scripts to automate the creation of snapshots, with a retention schema to delete older snapshots and some extra functionality to save certain snapshots to files on off-line storage.
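The core of such a retention schema boils down to very little code. Here is a sketch (the function name and snapshot naming are my own; in real use the snapshot list would come from `zfs list -H -o name -t snapshot -s creation -r <dataset>`):

```shell
# Sketch of a retention schema: keep the N most recent snapshots of a
# dataset and print the rest so they can be destroyed.
prune_candidates() {
  local keep=$1; shift       # number of most recent snapshots to keep
  local total=$#
  local i=0 snap
  for snap in "$@"; do       # arguments are oldest-first
    i=$((i + 1))
    # print only snapshots with at least $keep newer ones after them
    if [ $((total - i)) -ge "$keep" ]; then
      echo "$snap"
    fi
  done
}

# Keep the 2 newest of 4 daily snapshots; prints the 2 oldest:
prune_candidates 2 \
  zones/uuid@2018-04-28 zones/uuid@2018-04-29 \
  zones/uuid@2018-04-30 zones/uuid@2018-05-01
# Each printed snapshot would then be removed with: zfs destroy <snapshot>
```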

After some backup/recovery tests I was convinced that I had all the pieces at hand to finish the job in no time. Along the way I learned more about ZFS on SmartOS, and I believe those lessons are the most valuable parts to share here.

Approach

So far, this approach has been tested on native SmartOS (Solaris-derived) zones and LX branded zones only. In the near future, I might consider extending and testing this procedure on KVM as well.

The challenge is not just a matter of copying snapshots to files: it's about saving and restoring zones (containers) with all their related datasets, ensuring you can do a full recovery from just a set of files stored off-line. Based on this, you might even consider migrating a container from one global zone to another, just by sending the files over to another SmartOS host.

We will take snapshots of all datasets of the zone, and send certain snapshots to files via zfs send. Combined with the configuration of the zone we could re-create the zone from zero and then use zfs receive to restore the snapshots back from file(s).
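As a sketch, the whole approach fits in two shell functions. This is my own simplification: it assumes a zone with a single dataset, omits all error handling, and the file locations are arbitrary.

```shell
# Sketch of the backup/restore flow for a zone with a single dataset.
# Assumes the image's @final snapshot still exists on the receiving host.
BACKUP_DIR=/var/tmp   # hypothetical destination for the backup files

# Usage: backup_zone <zone-uuid> <image-uuid> <snapshot-name>
backup_zone() {
  local uuid=$1 image_uuid=$2 snap=$3
  # 1. Save the zone configuration for later re-creation
  vmadm get "$uuid" > "$BACKUP_DIR/vmadm.get-$uuid.t"
  # 2. Snapshot the zone's dataset
  zfs snapshot "zones/$uuid@$snap"
  # 3. Send everything since the image's @final snapshot as one incremental stream
  zfs send -I "zones/$image_uuid@final" "zones/$uuid@$snap" \
    > "$BACKUP_DIR/snap_to_file-$uuid.t"
}

# Usage: restore_zone <zone-uuid>
restore_zone() {
  local uuid=$1
  # 1. Re-create the zone from the saved configuration (re-creates the clone too)
  vmadm create -f "$BACKUP_DIR/vmadm.get-$uuid.t"
  vmadm stop "$uuid"
  # 2. Replace the fresh clone with the saved incremental stream
  zfs destroy -fr "zones/$uuid"
  zfs receive -F "zones/$uuid" < "$BACKUP_DIR/snap_to_file-$uuid.t"
  vmadm start "$uuid"
}
```

The rest of this post walks through these steps by hand, so you can see what each command does to the datasets.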

It might sound a little over-simplified here, but when the details are fully understood it's not that difficult at all.

SmartOS provides the tools vmadm send and vmadm receive, which do exactly that for native zones, but they have been in an experimental phase for several years and come with published limitations. From what I read on the web, I'm not convinced that vmadm send/receive is solid enough to base a backup/recovery strategy on (it might be, but I just didn't find any assurance).

vmadm and imgadm are both written in Node.js; you might be interested in reading the code to learn more about them.

Closer look

If we take a closer look at how zones are built, we see that an image is imported with imgadm and that a ZFS clone of that image is made when a new zone is created with vmadm create.

To illustrate this, we start on a freshly installed SmartOS global zone. Initially the list of images is empty, as executing imgadm list shows.

Now let's import a base-64-lts image (native zone) and show it in the list:

[root@demo01 ~]# imgadm import 390639d4-f146-11e7-9280-37ae5c6d53d4
Importing 390639d4-f146-11e7-9280-37ae5c6d53d4 (base-64-lts@17.4.0) from "https://images.joyent.com"
Gather image 390639d4-f146-11e7-9280-37ae5c6d53d4 ancestry
Must download and install 1 image (167.4 MiB)
Download 1 image          [=========================================>] 100% 167.48MB 922.19KB/s  3m 5s
Downloaded image 390639d4-f146-11e7-9280-37ae5c6d53d4 (167.4 MiB)
...11e7-9280-37ae5c6d53d4 [=========================================>] 100% 167.48MB  19.38MB/s     8s
Imported image 390639d4-f146-11e7-9280-37ae5c6d53d4 (base-64-lts@17.4.0)

[root@demo01 ~]# imgadm list
UUID                                  NAME         VERSION  OS       TYPE          PUB
390639d4-f146-11e7-9280-37ae5c6d53d4  base-64-lts  17.4.0   smartos  zone-dataset  2018-01-04
[root@demo01 ~]#

Notice also that at the same time a dataset was created with the same UUID as the image, and that it was mounted on /zones/390639d4-f146-11e7-9280-37ae5c6d53d4:

[root@demo01 ~]# zfs list
NAME                                         USED  AVAIL  REFER  MOUNTPOINT
zones                                       5.68G  13.6G   528K  /zones
zones/390639d4-f146-11e7-9280-37ae5c6d53d4   564M  13.6G   564M  /zones/390639d4-f146-11e7-9280-37ae5c6d53d4
zones/archive                                 23K  13.6G    23K  /zones/archive
zones/config                                  30K  13.6G    30K  legacy
zones/cores                                   46K  13.6G    23K  none
zones/cores/global                            23K  10.0G    23K  /zones/global/cores
zones/dump                                  1.00G  13.6G  1.00G  -
zones/opt                                     23K  13.6G    23K  legacy
zones/swap                                  4.13G  17.7G   813K  -
zones/usbkey                                31.5K  13.6G  31.5K  legacy
zones/var                                    563K  13.6G   563K  legacy
[root@demo01 ~]#

What's more, a first snapshot was also created behind the scenes, with @final at the end of its name:

[root@demo01 ~]# zfs list -t snapshot
NAME                                               USED  AVAIL  REFER  MOUNTPOINT
zones/390639d4-f146-11e7-9280-37ae5c6d53d4@final   127K      -   564M  -
[root@demo01 ~]# 

Now, why is this important? Well, as we will see, this new dataset is cloned when we create a new zone, and we will need to take that into account to successfully back up and restore the zone to the exact same state.

The details, step by step

Let's move on by creating a new zone with vmadm create reading from stdin, and then list the VMs with vmadm list.

[root@demo01 ~]# vmadm create <<EOM
> {
>   "autoboot": true,
>   "brand": "joyent",
>   "image_uuid": "390639d4-f146-11e7-9280-37ae5c6d53d4",
>   "alias": "dummy",
>   "hostname": "dummy",
>   "dns_domain": "local",
>   "resolvers": [
>     "8.8.8.8",
>     "8.8.4.4"
>   ],
>   "max_physical_memory": 2048,
>   "nics": [
>     {
>       "interface": "net0",
>       "nic_tag": "admin",
>       "ip": "dhcp"
>     }
>   ]
> }
> EOM
Successfully created VM 85dbcafe-889a-e402-cf36-c7ffdd15e568

[root@demo01 ~]# vmadm list
UUID                                  TYPE  RAM      STATE             ALIAS
85dbcafe-889a-e402-cf36-c7ffdd15e568  OS    2048     running           dummy
[root@demo01 ~]#
[root@demo01 ~]# zfs list -o name,origin
NAME                                              ORIGIN
zones                                             -
zones/390639d4-f146-11e7-9280-37ae5c6d53d4        -
zones/85dbcafe-889a-e402-cf36-c7ffdd15e568        zones/390639d4-f146-11e7-9280-37ae5c6d53d4@final
zones/archive                                     -
zones/config                                      -
zones/cores                                       -
zones/cores/85dbcafe-889a-e402-cf36-c7ffdd15e568  -
zones/cores/global                                -
zones/dump                                        -
zones/opt                                         -
zones/swap                                        -
zones/usbkey                                      -
zones/var                                         -
[root@demo01 ~]#

For shorter output, we can select only the zone's dataset (and its descendants) with the -r option:

[root@demo01 ~]# zfs list -o name,origin -r zones/85dbcafe-889a-e402-cf36-c7ffdd15e568
NAME                                        ORIGIN
zones/85dbcafe-889a-e402-cf36-c7ffdd15e568  zones/390639d4-f146-11e7-9280-37ae5c6d53d4@final
[root@demo01 ~]#

Now, here we see that the zone's only dataset zones/85dbcafe-889a-e402-cf36-c7ffdd15e568 is actually a ZFS clone based on the snapshot zones/390639d4-f146-11e7-9280-37ae5c6d53d4@final.

You probably also noticed the zones/cores/85dbcafe-889a-e402-cf36-c7ffdd15e568 dataset, which provides an empty file system to hold core dumps of processes running in the zone. Don't worry, this doesn't take any extra space as long as it is empty: a ZFS file system only takes the space from the zpool that it actually needs to store its data.

Now let's confirm all this by having a look at the zpool history:

[root@demo01 ~]# zpool history | cat -n
     1  History for 'zones':
     2  2018-05-01.18:37:16 zpool create -f zones c2t0d0
     3  2018-05-01.18:37:18 zfs set atime=off zones
     4  2018-05-01.18:37:21 zfs create -V 1024mb -o checksum=noparity zones/dump
     5  2018-05-01.18:37:21 zfs create zones/config
     6  2018-05-01.18:37:21 zfs set mountpoint=legacy zones/config
     7  2018-05-01.18:37:22 zfs create -o mountpoint=legacy zones/usbkey
     8  2018-05-01.18:37:22 zfs create -o quota=10g -o mountpoint=/zones/global/cores -o compression=gzip zones/cores
     9  2018-05-01.18:37:22 zfs create -o mountpoint=legacy zones/opt
    10  2018-05-01.18:37:22 zfs create zones/var
    11  2018-05-01.18:37:22 zfs set mountpoint=legacy zones/var
    12  2018-05-01.18:37:27 zfs create -V 4095mb zones/swap
    13  2018-05-01.18:38:11 zpool import -f zones
    14  2018-05-01.18:38:11 zpool set feature@extensible_dataset=enabled zones
    15  2018-05-01.18:38:11 zfs set checksum=noparity zones/dump
    16  2018-05-01.18:38:11 zpool set feature@multi_vdev_crash_dump=enabled zones
    17  2018-05-01.18:38:11 zfs destroy -r zones/cores
    18  2018-05-01.18:38:11 zfs create -o compression=gzip -o mountpoint=none zones/cores
    19  2018-05-01.18:38:16 zfs create -o quota=10g -o mountpoint=/zones/global/cores zones/cores/global
    20  2018-05-01.18:38:33 zfs create -o compression=lzjb -o mountpoint=/zones/archive zones/archive
    21  2018-05-01.18:43:07 zfs receive zones/390639d4-f146-11e7-9280-37ae5c6d53d4-partial
    22  2018-05-01.18:43:07 zfs rename -r zones/390639d4-f146-11e7-9280-37ae5c6d53d4-partial@snap56945 zones/390639d4-f146-11e7-9280-37ae5c6d53d4-partial@final
    23  2018-05-01.18:43:12 zfs rename zones/390639d4-f146-11e7-9280-37ae5c6d53d4-partial zones/390639d4-f146-11e7-9280-37ae5c6d53d4
    24  2018-05-01.18:47:20 zfs clone -o devices=off -F -o quota=10g zones/390639d4-f146-11e7-9280-37ae5c6d53d4@final zones/85dbcafe-889a-e402-cf36-c7ffdd15e568
    25  2018-05-01.18:47:20 zfs create -o quota=102400m -o mountpoint=/zones/85dbcafe-889a-e402-cf36-c7ffdd15e568/cores zones/cores/85dbcafe-889a-e402-cf36-c7ffdd15e568
[root@demo01 ~]#

All ZFS actions since the creation of the zpool, i.e. since the moment the SmartOS host was installed, can be found here (lines 2-20). But for now we are most interested in the lines that come after line 20.

On line 21 we see that the image is imported by means of zfs receive; the dataset is created under a temporary name ending in "-partial", in case the transfer fails before completion. Lines 22 and 23 finish the import by renaming the snapshot and the dataset, respectively.

From line 24 onward we see the actions related to the vmadm create command: the image snapshot is cloned to a new dataset with the UUID of the new zone, followed by the creation of the cores file system for our new zone on line 25.

Now let's create a first snapshot and show it in the list with the zfs list command.

[root@demo01 ~]# zfs snapshot zones/85dbcafe-889a-e402-cf36-c7ffdd15e568@first_snap

[root@demo01 ~]# zfs list -t snapshot
NAME                                                    USED  AVAIL  REFER  MOUNTPOINT
zones/390639d4-f146-11e7-9280-37ae5c6d53d4@final        127K      -   564M  -
zones/85dbcafe-889a-e402-cf36-c7ffdd15e568@first_snap      0      -   573M  -
[root@demo01 ~]#

Next, let's make some changes to the system by updating the packages database via the pkgin command. From the global zone we log in to the zone with zlogin.

[root@demo01 ~]# zlogin 85dbcafe-889a-e402-cf36-c7ffdd15e568
[Connected to zone '85dbcafe-889a-e402-cf36-c7ffdd15e568' pts/2]
   __        .                   .
 _|  |_      | .-. .  . .-. :--. |-
|_    _|     ;|   ||  |(.-' |  | |
  |__|   `--'  `-' `;-| `-' '  ' `-'
                   /  ; Instance (base-64-lts 17.4.0)
                   `-'  https://docs.joyent.com/images/smartos/base

[root@dummy ~]# pkgin up
reading local summary...
processing local summary...
pkg_summary.xz                                             100% 2171KB 434.3KB/s 443.3KB/s   00:05

[root@dummy ~]# exit
logout

[Connection to zone '85dbcafe-889a-e402-cf36-c7ffdd15e568' pts/2 closed]
[root@demo01 ~]#

Back in the global zone, we can check what changes the pkgin up command made to the file system by comparing the contents of the snapshot with the current contents of the file system. We see that a number of files were modified (M), and a few were added (+).

[root@demo01 ~]# zfs diff zones/85dbcafe-889a-e402-cf36-c7ffdd15e568@first_snap
M       /zones/85dbcafe-889a-e402-cf36-c7ffdd15e568/root/root
M       /zones/85dbcafe-889a-e402-cf36-c7ffdd15e568/root/var/log/authlog
M       /zones/85dbcafe-889a-e402-cf36-c7ffdd15e568/root/var/db/pkgin
M       /zones/85dbcafe-889a-e402-cf36-c7ffdd15e568/root/var/adm/wtmpx
M       /zones/85dbcafe-889a-e402-cf36-c7ffdd15e568/root/var/adm/utmpx
M       /zones/85dbcafe-889a-e402-cf36-c7ffdd15e568/root/var/adm/lastlog
+       /zones/85dbcafe-889a-e402-cf36-c7ffdd15e568/root/var/db/pkgin/sql.log
+       /zones/85dbcafe-889a-e402-cf36-c7ffdd15e568/root/var/db/pkgin/pkgin.db
+       /zones/85dbcafe-889a-e402-cf36-c7ffdd15e568/root/root/.bash_history
[root@demo01 ~]#

We make a second snapshot:

[root@demo01 ~]# zfs snapshot zones/85dbcafe-889a-e402-cf36-c7ffdd15e568@second_snap

Now we will send the information in both snapshots as an incremental data stream to a file. So we don't copy the data already present in the image's @final snapshot here, only the increments up to snapshot second_snap.

[root@demo01 ~]# zfs send -I zones/390639d4-f146-11e7-9280-37ae5c6d53d4@final  zones/85dbcafe-889a-e402-cf36-c7ffdd15e568@second_snap > /var/tmp/snap_to_file.t

[root@demo01 ~]# ls -lh  /var/tmp/snap_to_file.t
-rw-r--r--   1 root     root         34M May  1 20:01 /var/tmp/snap_to_file.t
[root@demo01 ~]#

In order to recover the zone from zero, we need to be able to re-create it from the saved metadata, so we keep that in a file for later use. This can be done with vmadm get.

[root@demo01 ~]# vmadm get 85dbcafe-889a-e402-cf36-c7ffdd15e568 >/var/tmp/vmadm.get-85dbcafe-889a-e402-cf36-c7ffdd15e568.t

We are now ready to face a disaster that requires a complete recovery of the zone. To simulate one, we delete the zone (stopping it first is not strictly required).

[root@demo01 ~]# vmadm stop 85dbcafe-889a-e402-cf36-c7ffdd15e568
Successfully completed stop for VM 85dbcafe-889a-e402-cf36-c7ffdd15e568

[root@demo01 ~]# vmadm delete 85dbcafe-889a-e402-cf36-c7ffdd15e568
Successfully deleted VM 85dbcafe-889a-e402-cf36-c7ffdd15e568

And as we see here, there is nothing left on the global zone:

[root@demo01 ~]# vmadm list
UUID                                  TYPE  RAM      STATE             ALIAS
[root@demo01 ~]# vmadm list -t snapshot
UUID                                  TYPE  RAM      STATE             ALIAS

It is very easy to re-create the zone from the saved configuration:

[root@demo01 ~]# vmadm create -f /var/tmp/vmadm.get-85dbcafe-889a-e402-cf36-c7ffdd15e568.t
Successfully created VM 85dbcafe-889a-e402-cf36-c7ffdd15e568

Then we stop the zone, in case it was auto-started.

[root@demo01 ~]# vmadm stop 85dbcafe-889a-e402-cf36-c7ffdd15e568
Successfully completed stop for VM 85dbcafe-889a-e402-cf36-c7ffdd15e568

What we are going to do now might seem somewhat unconventional, but don't worry: we have everything needed to restore the situation. We delete the freshly created ZFS clone with the zfs destroy command. The -f option forces unmounts, and the -r option recurses into underlying datasets and snapshots (not applicable here, but harmless).

[root@demo01 ~]# zfs destroy -fr zones/85dbcafe-889a-e402-cf36-c7ffdd15e568

We immediately repair this by receiving the incremental data stream, which restores all the data in the snapshots:

[root@demo01 ~]# zfs receive  -F zones/85dbcafe-889a-e402-cf36-c7ffdd15e568 </var/tmp/snap_to_file.t

Beware: if you skipped the zfs destroy step, or omitted its -fr options so that it failed, the clone is still present and you will get the error message shown below when attempting to receive the incremental stream with zfs receive.

[root@demo01 ~]# zfs receive -F zones/85dbcafe-889a-e402-cf36-c7ffdd15e568 </var/tmp/snap_to_file.t
cannot receive new filesystem stream: destination 'zones/85dbcafe-889a-e402-cf36-c7ffdd15e568' is a clone
must destroy it to overwrite it

When we list all the datasets and snapshots related to this zone, we see they are all present again.

[root@demo01 ~]# zfs list -t all -r zones/85dbcafe-889a-e402-cf36-c7ffdd15e568
NAME                                                     USED  AVAIL  REFER  MOUNTPOINT
zones/85dbcafe-889a-e402-cf36-c7ffdd15e568              42.2M  13.5G   594M  /zones/85dbcafe-889a-e402-cf36-c7ffdd15e568
zones/85dbcafe-889a-e402-cf36-c7ffdd15e568@first_snap    132K      -   573M  -
zones/85dbcafe-889a-e402-cf36-c7ffdd15e568@second_snap   136K      -   593M  -
[root@demo01 ~]#

We can now start the zone again and connect via zlogin to confirm whether all went well!

[root@demo01 ~]# vmadm start 85dbcafe-889a-e402-cf36-c7ffdd15e568
Successfully started VM 85dbcafe-889a-e402-cf36-c7ffdd15e568

[root@demo01 ~]# zlogin 85dbcafe-889a-e402-cf36-c7ffdd15e568
[Connected to zone '85dbcafe-889a-e402-cf36-c7ffdd15e568' pts/2]
Last login: Tue May  1 20:00:13 on pts/2
   __        .                   .
 _|  |_      | .-. .  . .-. :--. |-
|_    _|     ;|   ||  |(.-' |  | |
  |__|   `--'  `-' `;-| `-' '  ' `-'
                   /  ; Instance (base-64-lts 17.4.0)
                   `-'  https://docs.joyent.com/images/smartos/base

[root@dummy ~]#

Some extra thoughts

Obviously, you could simply ignore the clone relationship and just back up and restore a full (non-incremental) snapshot stream, but the restored dataset will no longer have the image as its ZFS clone origin, which means you also lose the advantages of the cloned-dataset approach:

  • Better read performance from the RAM-based adaptive replacement cache (ARC). The amount of RAM, and hence ARC, available in a server is usually limited; because clones share blocks with their origin, those shared blocks are cached only once, making more efficient use of the ARC.
  • Less disk space usage: there really is no need to store multiple copies of the exact same file system.

There are some pitfalls to this backup/recovery approach:

  • differences in global zone configurations might require some tweaking, making recovery on another system more difficult, for example:
    • the global zone may be booted from a different platform image and might not support the same set of images
    • the global zone might have a different network configuration
    • beware of dependencies on external (lofs-mounted) file systems (e.g. /home) or delegated datasets.

Alternatively, you could save the zone's metadata as follows and put it back in place when recovering:

[root@demo01 ~]# cp /etc/zones/85dbcafe-889a-e402-cf36-c7ffdd15e568.xml /var/tmp

[root@demo01 ~]# grep 85dbcafe-889a-e402-cf36-c7ffdd15e568 /etc/zones/index  \
    |tee /var/tmp/85dbcafe-889a-e402-cf36-c7ffdd15e568.idx
85dbcafe-889a-e402-cf36-c7ffdd15e568:installed:/zones/85dbcafe-889a-e402-cf36-c7ffdd15e568:85dbcafe-889a-e402-cf36-c7ffdd15e568:joyent:ex:1

This also implies that you have to re-create the zone's zones/cores/<uuid> dataset manually.
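A sketch of that manual step, modelled on line 25 of the zpool history shown earlier (the quota value comes from that history and may differ on your system):

```shell
# Sketch: re-create the per-zone cores dataset by hand, mirroring the
# `zfs create` from the zpool history. The default quota is the value
# seen in that history, not a universal default.
recreate_cores() {
  local uuid=$1 quota=${2:-102400m}
  zfs create -o quota="$quota" \
    -o mountpoint="/zones/$uuid/cores" \
    "zones/cores/$uuid"
}
# Usage: recreate_cores 85dbcafe-889a-e402-cf36-c7ffdd15e568
```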

It is best to keep a copy of the image snapshot ...@final as well, as images might disappear from the imgadm repository over time. It's much more convenient to have a copy at hand than to try to track one down on an older system or online.
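A sketch of saving such a copy (the destination layout is my own choice; note that this saves only the ZFS dataset, not the image manifest that imgadm keeps):

```shell
# Sketch: keep an off-line copy of an imported image's @final snapshot.
save_image() {
  local image_uuid=$1 dest=$2
  zfs send "zones/$image_uuid@final" > "$dest/image-$image_uuid.zfs"
}
# Usage: save_image 390639d4-f146-11e7-9280-37ae5c6d53d4 /var/tmp
# Restore the dataset later with:
#   zfs receive zones/<image_uuid> < image-<image_uuid>.zfs
```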