Full system backups for FreeBSD systems using ZFS

Assuming you have created your ZFS FreeBSD system using the instructions on my site, here is how to do full system backups to an extra attached disk.

You can adjust these instructions if you need to store the backup remotely, but that is out of scope for this post.

First, in case you haven’t already, here is how to format /dev/da1 as a dedicated ZFS backup drive.  You can configure the backup drive however you want (it doesn’t even need to be ZFS-based), but you will then have to adjust the restore instructions accordingly.

gpart destroy -F da1
dd if=/dev/zero of=/dev/da1 bs=1m count=128
zpool create zbackup /dev/da1
zfs set mountpoint=/backup zbackup

The above will destroy any existing data on /dev/da1, and create a ZFS filesystem which is mounted at /backup.

Next, we recursively snapshot all filesystems under zroot and send them to a gzip’d file on the backup medium.  We then destroy the snapshot:

zfs snapshot -r zroot@backup
zfs send -Rv zroot@backup | gzip > /backup/full-system-backup.zfs.gz
zfs destroy -r zroot@backup

This is all you need to do on the live system.
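Before relying on the dump, it can be worth a quick sanity check – a sketch, using the path from the commands above; gzip -t reads the whole compressed stream and fails on any corruption:

```shell
#!/bin/sh
# Optional integrity check on the dump (path from the steps above).
# gzip -t verifies the entire compressed stream without extracting it.
BACKUP="/backup/full-system-backup.zfs.gz"
if gzip -t "${BACKUP}" 2>/dev/null; then
    STATUS="OK"
else
    STATUS="missing or damaged"
fi
echo "backup archive ${STATUS}"
```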

If the worst happens, and you need to restore to a new system (or a freshly formatted one)…

Firstly, follow the original instructions up to and including the line “zfs set checksum=fletcher4 zroot”.

Next, we import the backup ZFS drive and mount it – then we use zfs receive to restore the filesystem and all its descendants:

zpool import -f zbackup
zfs set mountpoint=/boot/zfs/backup zbackup
zfs mount zbackup
gunzip -c /boot/zfs/backup/full-system-backup.zfs.gz | zfs receive -vdF zroot

Now we need to unmount the backup drive, and mount the original root ZFS so we can re-create the cache file (the system will not boot without the correct cache file):

zpool export zbackup
zfs set mountpoint=/boot/zfs/zroot zroot
cd /boot/zfs
zpool export zroot && zpool import zroot
cp /boot/zfs/zpool.cache /boot/zfs/zroot/boot/zfs/zpool.cache
zfs unmount -a
zfs set mountpoint=legacy zroot
reboot

This will reboot the system in its original state.  If you want to re-mount your backup medium, it will need to be re-imported and mounted:

zpool import -f zbackup
zfs set mountpoint=/backup zbackup

That’s all there is to it.  A fully working disaster recovery solution.

 

49 thoughts on “Full system backups for FreeBSD systems using ZFS”

  1. BSDGuy

    Excellent guide Dan!

    I found a couple of useful variables to include in the file name to include dates:

    zfs send -Rv zroot@backup | gzip > /backups/zroot/`date +%d.%m.%Y`-zroot.zfs.gz

    Is it necessary to separately backup the bootdir pool? Or is this included in the zroot snapshot backup?

    The only thing I am unsure of in this guide is: what would happen if I had a copy of my zroot.zfs.gz (zfs send file) on a USB key/drive that *doesn’t* have ZFS on it… how would I then be able to recreate the cache file that you say is needed for the system to boot?

    Reply
  2. dan Post author

    You would need to run the command set once for each pool (so twice in the encryption case – once for zroot and once for bootdir). It is the pool that you are backing up, so you need to run it once for each pool you have.

    The destination for the backup file can be any medium. It can be a regular UFS mounted disk or even a FAT32 USB drive (though beware the max file size of 4GB). You could also use ssh to store the file externally via the network if required.
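    For the ssh case, the pipeline might look like this – a sketch only, with “backuphost” and the destination path as placeholders to substitute with your own:

```shell
#!/bin/sh
# Sketch of the remote variant: stream the dump over ssh instead of
# writing it locally.  REMOTE and DEST are placeholders.
REMOTE="root@backuphost"
STAMP=`date +%Y.%m.%d`
DEST="/backups/${STAMP}-zroot.zfs.gz"

# Uncomment on the live system:
# zfs snapshot -r zroot@backup
# zfs send -Rv zroot@backup | gzip | ssh ${REMOTE} "cat > ${DEST}"
# zfs destroy -r zroot@backup
echo "would write ${REMOTE}:${DEST}"
```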

    The cache file is only needed on a new system; it doesn’t need to be backed up. The export & import step of the installer creates the cache file for you, relating to the newly created zpool.

    Reply
  3. BSDGuy

    Got it, that makes sense re bootdir. Just wanted to clarify.

    Aaah, I think I read: zpool export zbackup and just assumed (incorrectly) that the cache file was for that pool but having read more carefully I see the cache file you are copying is related to the zroot pool. I can’t wait to test this.

    If you wanted your backups to run automatically on a schedule like you have mentioned above (snapshot, zfs send compressed, then delete the snapshot), is this easily accomplished?

    Reply
  4. dan Post author

    You can schedule your backups using a regular root cronjob… create the commands to snapshot, send, destroy in a .sh file somewhere and run it on schedule (either using the ‘crontab’ command or using the system crontab file /etc/crontab)
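    As a sketch, the system crontab entry could look like this (the script path and the 2am schedule are assumptions, not prescriptions):

```shell
# /etc/crontab -- fields are: minute hour mday month wday who command
# run the backup script at 02:00 every night as root
0   2   *   *   *   root   /bin/sh /root/zfsbackup.sh > /var/log/zfsbackup.log 2>&1
```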

    Reply
  5. BSDGuy

    Thanks very much! I have scheduled some cron jobs and we’ll see how it goes this week!

    Do you know how to delete files older than (say) 31 days in the .sh file? At the moment I am creating a new zfs send dumpfile every day but only want to keep 31 days’ worth. Is there a way to check the date of the file and, if it’s older than 31 days, delete it?

    Reply
  6. dan Post author

    find /backups -type f -mtime +31d -delete

    that will delete any file in /backups (including subdirectories) older than 31 days. be careful with it though, it can be dangerous!

    perhaps “cd /backups && find . -type f -mtime +31d -delete” would be better (it won’t delete anything unless it can cd to /backups).

    Reply
  7. BSDGuy

    Thanks Dan! I created a script file and ran a test with it and it worked fine last night. When it ran as a scheduled job for early this morning it caused the server to panic. Its my first FreeBSD panic so will have to investigate.

    Reply
  8. dan Post author

    If you have low memory on the server (under 4GB), then you can get panics if you don’t tweak kernel memory options in /boot/loader.conf. The panics will be things like ‘out of kernel memory’

    Reply
  9. BSDGuy

    Currently I have the following set:

    vfs.zfs.prefetch_disable="1"
    vfs.root.mountfrom="zfs:zroot"
    zfs_load="YES"
    vm.kmem_size="512M"
    vm.kmem_size_max="512M"

    Do I need to increase the values of the last two maybe? I only have 4GB RAM…well 3.5GB is useable.

    Reply
  10. dan Post author

    I would suggest adding the following entries in addition to your current settings:

    vfs.zfs.arc_max="80M"
    vfs.zfs.vdev.cache.size="5M"

    The “arc” is the main culprit for eating all your ram up.
    Also, don’t be fooled when it says “arc_max” as it can exceed the max setting sometimes. Think of it as a target maximum.

    You can tweak these settings too, on some servers I have them set to 160M/10M

    You should be able to safely increase your kmem sizes to 1024M too.
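    Pulled together, a low-memory /boot/loader.conf might read as follows – the values are the starting points from this thread, to be tuned per machine:

```shell
# /boot/loader.conf -- ZFS tuning for a ~4GB machine (from this thread)
zfs_load="YES"
vfs.root.mountfrom="zfs:zroot"
vfs.zfs.prefetch_disable="1"
vm.kmem_size="512M"
vm.kmem_size_max="512M"
vfs.zfs.arc_max="80M"
vfs.zfs.vdev.cache.size="5M"
```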

    Reply
  11. BSDGuy

    The other thing I have been wondering is: do I need to put pauses in between each command in the backup.sh file?

    What I find strange is that I can run the commands manually, but when they are scheduled the server crashes. I am still going to make the changes to the loader.conf file (as you recommended) but it doesn’t make sense to me why this is happening.

    I tried scheduling the backup.sh file using Webmin.

    Reply
  12. dan Post author

    There’s no need to delay between commands – each one only returns once the process is completed, so it’s perfectly safe to run them without any delay.

    If you’re running the backup overnight, it may coincide with the system’s daily/weekly checks which adds additional pressure on the filesystem (just a thought).

    The extreme load of reading the entire filesystem as quickly as possible pushes the memory caches up. If you’re running on low kernel memory, it doesn’t take long to push the cache above the kmem limit and this causes a panic. Setting the arc size and vdev cache sizes lower tries to prevent that from happening.

    Also remember that other parts of the kernel require kernel memory (such as network buffers).

    Reply
  13. BSDGuy

    Many thanks for the explanation Dan. I have disabled the daily/weekly/monthly periodic jobs (many months ago). The only “big” cron job I run every day (at about 4:30am) is to update my ports tree. I think the first thing to do is modify the loader.conf values as per your recommendation. Currently my backup file (zfsbackups.sh) looks as follows:

    #!/bin/sh
    zfs snapshot -r bootdir@`date +%d.%m.%Y`-bootdir
    zfs send -Rv bootdir@`date +%d.%m.%Y`-bootdir | gzip > /backups/bootdir/`date +%d.%m.%Y`-bootdir.zfs.gz
    zfs destroy -r bootdir@`date +%d.%m.%Y`-bootdir
    zfs snapshot -r zroot@`date +%d.%m.%Y`-zroot
    zfs send -Rv zroot@`date +%d.%m.%Y`-zroot | gzip > /backups/zroot/`date +%d.%m.%Y`-zroot.zfs.gz
    zfs destroy -r zroot@`date +%d.%m.%Y`-zroot
    cd /backups; find . -type f -mtime +60d -delete;

    Does this backup script look ok? I’ve put the zfsbackups.sh in the /root directory and scheduled this job to run at 2am using Webmin. Hope I’m doing this correctly!

    Reply
    1. Rod

      This script has a common problem that arises when `date xxx` is used in shell scripts. This one is not too bad, as the date format only changes on day boundaries; worse scripts have times in their `date` format strings, and those can fail frequently as the clock ticks along while the script runs.
      The assumption in this script is that the zfs snapshot and the zfs destroy occur on the same day. If the script is run near midnight, that may not be true. The correct approach is to store the date in a shell variable and use that variable any time you want to refer to the same object.
      This is also an efficiency improvement, as you call date(1) less often.
      I changed the format to year, month, day so that the directory listing sorts in chronological order as well:
      #!/bin/sh
      snap=`date +%Y.%m.%d`
      zfs snapshot -r bootdir@${snap}-bootdir
      zfs send -Rv bootdir@${snap}-bootdir | gzip > /backups/bootdir/${snap}-bootdir.zfs.gz
      zfs destroy -r bootdir@${snap}-bootdir
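      Extending that fix to both pools gives something like the following – a sketch, with RUN="echo" as a dry-run guard (clear it on the real system); the paths and pool names follow the thread:

```shell
#!/bin/sh
# Nightly backup sketch: one date stamp computed once, both pools,
# then pruning of old dumps.  RUN="echo" only prints the commands;
# set RUN="" to execute for real.
RUN="echo"
SNAP=`date +%Y.%m.%d`

for pool in bootdir zroot; do
    ${RUN} /sbin/zfs snapshot -r ${pool}@${SNAP}-${pool}
    ${RUN} sh -c "/sbin/zfs send -Rv ${pool}@${SNAP}-${pool} | gzip > /backups/${pool}/${SNAP}-${pool}.zfs.gz"
    ${RUN} /sbin/zfs destroy -r ${pool}@${SNAP}-${pool}
done

# prune dumps older than 60 days; && stops find running if /backups is gone
${RUN} sh -c "cd /backups && find . -type f -mtime +60d -delete"
```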

      Reply
  14. dan Post author

    Those look fine, but I would change the last line to “cd /backups && find . -type f -mtime +60d -delete” – that prevents the find/delete from running if it can’t change to the /backups folder… with your current script, if you rmdir /backups you could end up wiping chunks of your entire system.

    Reply
  15. BSDGuy

    Wooaaa, thanks for the tip! I have made the change to the backup script.

    Whats the best way to test if you have scheduled your backup script correctly in cron? I just want to make sure that when this runs as scheduled that it doesn’t crash my server again during the night 😉 I tried running it as a test via Webmin which ran fine but obviously this wasn’t good enough as the server crashed during the night. Is there a better way to do this?

    Reply
  16. BSDGuy

    I made those changes to the loader.conf file. It didn’t like the kmem size of 1024M (kernel panic) so I changed it back to 512M.

    I tried running the backup script again but when it runs it says:

    /root/zfsbackups.sh: zfs: not found
    /root/zfsbackups.sh: zfs: not found
    /root/zfsbackups.sh: zfs: not found

    Any idea why it would do this?

    Reply
  17. BSDGuy

    Think I got it working 😉 Putting /sbin/zfs for the command solved the above problem. After making the changes to the loader.conf file, I ran the following command:

    sysctl kstat.zfs.misc.arcstats.size

    and its now running around 83MB!

    Reply
  18. dan Post author

    It sounds like your path is broken – so you need to use the full path to the binary.
    With the arc size line I gave you, your arc size should be around 80MB (83 is fine) – it’s not a definitive limit, it’s a target limit. So long as it’s sufficiently low that you don’t exhaust your kernel ram 🙂

    If the zfs command was not found, you’re likely to have to specify a full path to gzip and date too.
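    Rather than hard-coding every binary path, the script can set its own PATH up front – a minimal sketch:

```shell
#!/bin/sh
# cron runs with a minimal PATH, which is why "zfs: not found" appears
# even though the binary exists.  Setting PATH at the top of the script
# is an alternative to writing /sbin/zfs, /usr/bin/gzip etc. everywhere.
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin
export PATH

# these now resolve without full paths
command -v gzip
command -v date
```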

    Reply
  19. BSDGuy

    Yeah, full path worked nicely 😉 Gzip and date seem to be running fine (from what I can tell). This is a really nice little backup solution. I’ll monitor it over the coming days to see how it gets on.

    You don’t perhaps know the command to monitor the size of your vdev cache?

    Reply
  20. BSDGuy

    Oh, ok. I don’t know why I thought there would be two separate commands!

    I plan to give your article a try this weekend to test DR 😉

    I’m really pleased with ZFS so far. Got loads to learn but it’ll be worth it. Thank you so much for all your amazing help and guidance.

    Reply
  21. dan Post author

    🙂 you’re welcome. The server running this blog is running ZFS root (non-encrypted) and has been for over a year.

    Reply
  22. BSDGuy

    That’s good to know! Do you run your server on amd64? That’s my next goal but that’s waaaay into the future.

    Reply
  23. dan Post author

    It’s FreeBSD 9-STABLE/amd64 on a HP ProLiant DL360G7 server, with 2x6core Intel Xeon CPUs, 144GB ram and hardware RAID6. I also use the failover network interfaces as described on my blog which connects to two separate switches to cater for switch and/or cable failure.
    I no longer have any i386 FreeBSD servers remaining – in this day and age, everything is amd64-based (if you want more than 3-3.5GB of RAM you have no choice).
    FreeBSD’s memory management on amd64 is far nicer than the i386 edition’s.

    Reply
  24. BSDGuy

    Drool 😉 That hardware must have cost a fair penny! I work with lots of HP ProLiant servers but only with (cough splutter) MS software. I am quite disgusted with the performance I experience on an HP ProLiant DL380G7 running Windows Server 2008 R2 with 18GB RAM and fast disks etc. It’s pathetic really. My server at home is an HP dc5800 second-hand desktop running FreeBSD 9.0-RELEASE i386 with 4GB RAM and I reckon it runs faster/better than the servers I install/configure/support at clients.

    I really like FreeBSD 😉

    Reply
  25. dan Post author

    I work with a mixture of FreeBSD and Windows 2008R2 server (occasionally linux), mostly on HP ProLiant G5-G7 servers of varying specifications. We use quite a few of them for VMware’s ESXi to virtualise servers.
    Disk I/O performance on HP servers is quite low unless you have a decent sized cache module *and* battery. Without the battery, the controllers turn off all write caching rendering most of the cache module useless. (in tests, without battery was 1/10th of the speed)

    Reply
  26. BSDGuy

    VMware’s an excellent product. If I had beefier hardware at home I’d run VMware ESXi to tinker with it. I’d *love* to see FreeBSD run on any recent ProLiant. It must fly. There’s a DL380G5 a client decommissioned, but I couldn’t afford the electricity to run it at home, nor could I stand the noise LOL. Would be great in winter though as it would heat my place up nicely 😉

    While I think of it: have you upgraded an encrypted ZFS root system to a newer version of FreeBSD using the source option (ie: upgrading from FreeBSD 8.2 to 9.0)? Are the steps much different to upgrading a UFS system (compared to an encrypted ZFS root)? Using snapshots must be handy here, so you can roll back if anything goes wrong with the upgrade?

    Reply
  27. dan Post author

    Power usage of a 380G5 is around 230-250 watts depending on configuration (each pair of memory modules in a G5 uses a LOT of power, so keeping them down is worth it)
    You certainly wouldn’t want it outside of a datacentre, they’re far too noisy.

    The encryption is done at a very high level, so upgrading your system is identical to a UFS-based unencrypted system. You can snapshot if required, of course.

    The only time ZFS systems get a little more confusing is if you wish to upgrade the ZFS filesystem to a newer revision, as this requires a bootcode update before upgrading the pool/filesystem version. The upgrade is never done automatically – it only happens as a result of the ‘zpool upgrade’ or ‘zfs upgrade’ commands – and the system will work perfectly if not upgraded (other than ‘zpool status’ reminding you that it’s an older version).
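    The ordering described above can be sketched as follows; the disk name (ada0) and partition index are placeholders that must match your own layout:

```shell
# 1. refresh the bootcode first, so the loader can read the newer pool
#    version (disk and partition index are placeholders):
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0

# 2. only then bump the pool and filesystem versions -- these steps are
#    explicit and never happen automatically:
zpool upgrade zroot
zfs upgrade -r zroot
```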

    Reply
  28. BSDGuy

    I’ve been carefully going through your guide and just realised something 😉 You didn’t mention the restoring of the bootdir pool?

    Also, I would imagine you have to copy your encryption.key into the /bootdir/boot and use it when attaching the drives if you want to use the existing key?

    Reply
  29. BSDGuy

    Hi Dan! I started to do my restore this evening but it didn’t go well ;-( I got up to doing these two steps:
    gunzip -c /media/bootdir.zfs.gz | zfs receive -vdF bootdir
    gunzip -c /media/zroot.zfs.gz | zfs receive -vdF zroot

    But I couldn’t get the pipe character to type out from my keyboard (how do you do the pipe character from a UK keyboard??) so I unzipped the files on another machine and then ran:
    zfs receive -vdF bootdir < /media/bootdir.zfs (/media is my USB key)
    zfs receive -vdF zroot < /media/zroot.zfs

    This then restored all the data…so far so good. This is where I think it went wrong for me (I have modified some commands):
    zfs set mountpoint=/boot/zfs/zroot zroot
    cd /boot/zfs
    zpool export zroot && zpool import zroot
    cp /boot/zfs/zpool.cache /boot/zfs/zroot/boot/zfs/zpool.cache
    zpool export zroot && zpool import bootdir
    cp /boot/zfs/zpool.cache /boot/zfs/bootdir/boot/zfs/zpool.cache – this was a problem as I couldn't do the copy as the destination folders didn't exist?
    At this stage I was stuck so I went ahead anyway with:
    zfs unmount -a
    zfs set mountpoint=legacy zroot
    reboot

    And as I thought it failed to mount from zfs:zroot (with error 2).

    Any ideas what I am doing wrong? I'm a bit confused about how bootdir fits into this and with the zpool.cache files for each pool.

    Thanks for any help! 😉

    Reply
  30. BSDGuy

    Oh I also copied my encryption key across as follows:

    cp /media/encryption.key /boot/zfs/bootdir/encryption.key

    Reply
  31. BSDGuy

    Hi Dan! I managed to get my system restored but in the end I lost track of *how* I got the final bit done. Would you mind having a look at the commands I used. I did a restore of an encrypted ZFS root (bootdir and zroot). I pretty much followed your instructions until the last section for doing the cache file:

    zfs set mountpoint=/boot/zfs/zroot zroot
    zfs set mountpoint=/boot/zfs/bootdir bootdir
    zfs mount bootdir (I think zroot was already mounted at this stage)
    cd /boot/zfs
    zpool export zroot && zpool import zroot
    cp /boot/zfs/zpool.cache /boot/zfs/bootdir/boot/zfs/zpool.cache
    zpool export bootdir && zpool import bootdir
    cp /boot/zfs/zpool.cache /boot/zfs/bootdir/boot/zfs/zpool.cache
    zfs unmount -a
    zfs set mountpoint=legacy zroot
    zfs set mountpoint=/bootdir bootdir
    reboot

    The server booted up fine but I really battled to find the destination directory I was copying the zpool.cache files into. Is a separate zpool.cache file needed for each pool (zroot and bootdir in this case)?

    Reply
  32. dan Post author

    The zpool.cache file is system wide, so you have to export & import each pool (the action of importing updates the zpool.cache file) – after you have done both pools, you copy the zpool.cache file to /boot/zfs/zpool.cache within the bootdir mount (in your case /boot/zfs/bootdir/boot/zfs/zpool.cache)
    When your system boots, your bootdir pool is the root (so /boot/zfs/zpool.cache is accessible) – after booting the kernel, it then mounts the encrypted zpool zroot as the root, and bootdir as /bootdir. Symlinks in the zroot pool point /boot to /bootdir/boot so that your /boot/zfs/zpool.cache file is still located on the bootdir pool (even though it is now mounted in a different place) – the beauty of symlinks.

    Reply
  33. BSDGuy

    Thanks for the explanation! So in this scenario there are two pools. If I had to summarise:

    zpool export zroot && zpool import zroot
    zpool export bootdir && zpool import bootdir

    Would add **both** pools entries to the zpool.cache file? This is what I think I have been misunderstanding. I thought there was one zpool.cache file per pool. Am I correct in this summary?

    So copying the zpool.cache file after running the export/import on both pools to the /boot/zfs directory is all that needs doing here?

    Hope I’m on the right track…thanks for your help!

    Reply
  34. dan Post author

    That’s correct – no need to copy the file until after re-importing both pools. The cache file holds information on all known pools on the system. Whenever you create or import a pool (even during normal server operation), the file is updated.

    When you export a pool, it removes it from the cache file (as it believes you are going to export it to another system). You can only import a pool that has been properly exported (otherwise a warning appears to say it is part of another system, although you can force it to import with -f)

    Reply
  35. BSDGuy

    Well thank you very much for explaining that to me! I wondered where I was going wrong. I kept thinking the bootdir pool cache file needed to be copied somewhere else. Now that you have confirmed what I was thinking it makes sense now! 😉

    Thank you Dan!

    Reply
  36. francis

    Hi all,
    I have this script doing hourly snapshots of the ZFS pool t24mcb10, but it cannot roll over from one day to the next: when it takes a snapshot at 2300hrs it fails at 00hrs because the dates differ and it names the snapshots by date. Anyone have ideas on how to incorporate the date change automatically?

    Please help

    #!/bin/sh

    pool="t24mcb10/beans"
    destination="t24mcb10"
    host="172.16.4.7"
    now=`date +"$type-%Y-%m-%d-%H"`

    earlier=`perl -e 'print scalar localtime ( time - 3600 ) . "\n";' | awk '{print $4}' | cut -d: -f1,1`
    nowe=`date +"$type-%Y-%m-%d"`
    name=$nowe-$earlier

    # create today snapshot
    snapshot_now="$pool@$now"
    # look for a snapshot with this name
    if zfs list -H -o name -t snapshot | sort | grep "$snapshot_now$" > /dev/null; then
        echo " snapshot, $snapshot_now, already exists"
        exit 1
    else
        echo " taking todays snapshot, $snapshot_now"
        zfs snapshot -r $snapshot_now
    fi

    # look for yesterday snapshot
    snapshot_name="$pool@$name"
    if zfs list -H -o name -t snapshot | sort | grep "$snapshot_name$" > /dev/null; then
        echo " earlier snapshot, $snapshot_name, exists lets proceed with backup"

        zfs send -R -i $snapshot_name $snapshot_now | ssh root@$host zfs receive -Fduv $destination
        #zfs send -i $snapshot_name $snapshot_now | zfs receive -F $destination
        echo " backup complete destroying earlier snapshot"
        zfs destroy -r $snapshot_name
        exit 0
    else
        echo " missing earlier snapshot aborting, $snapshot_name"
        exit 1
    fi

    Reply
  37. Brooklyner

    Dan,

    Following your guide, I was able to backup and all looks well. However, I must be missing something on the restore part of the process.

    When I attempt: “zfs set mountpoint=/boot/zfs/backup zbackup”

    I receive the error: “Cannot mount ‘/backup’: failed to create mountpoint”

    My install was based on the guide found here: https://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/9.0-RELEASE

    I have a hunch that I am following a slightly different process to get FreeBSD to boot from a zfs volume and that there is some incompatibility in the zfs datasets.

    Any ideas?

    Reply
    1. dan Post author

      There’s no real direct correlation between the command you typed and the error given.
      The only thing I can think of would be that it is trying to mount its lower datasets under zbackup with one of them having a specific mountpoint of /backups rather than an inherited mountpoint.

      Reply
  38. Bob

    Hi Dan

    I have done a restore of my system but when it boots up I get:

    Mounting from zfs:zroot failed with error 45

    Do you know what this means? I can’t seem to find much info on it!

    Reply
    1. dan Post author

      Not entirely sure, I’m afraid – but it might be that your zpool.cache file is stale (especially if you’ve imported it).

      Reply
  39. Bob

    When you say: your zpool.cache file is stale

    Can you elaborate?

    I only imported the bootdir pool as it is on a bootable USB key but the zroot pool was created from scratch and then a zfs send/receive was done to get the data from the old server to the new.

    I did run:

    zpool export zroot && zpool import zroot

    and

    cp /boot/zfs/zpool.cache /boot/zfs/zroot/bootdir/boot/zfs/zpool.cache

    so I’m not sure what I’m missing here!!

    Appreciate the reply.

    Reply
  40. dan Post author

    So long as you did the export/import and copied the zpool.cache file, it shouldn’t be stale.

    One quick thought…

    As you created the zroot pool fresh, you need to make sure the bootcode, boot loader and kernel can use it.
    e.g. if your bootpool has a FreeBSD 10.0 kernel etc., but you created zroot with FreeBSD 10.1, then you’ll have trouble booting (10.1 imported a newer version of ZFS which 10.0 will refuse to mount due to new feature flags).

    If the above is the case, you could copy the freebsd 10.1 kernel+boot loader on top of the files in the bootpool already (they’re in kernel.tbz on the FTP sites) which would give you a 10.1 kernel but 10.0 userland (you’d have to complete the upgrade later)

    Reply
  41. Bob

    Dan, I think you may have nailed it!

    One other thing:

    When I ran zpool status bootdir it did say I needed to upgrade the pool but I *didn’t*. So I guess if I upgrade the bootdir pool to the latest version that it will then be able to boot a zroot pool created with a 10.1 disk?

    So I’m guessing I need to upgrade the bootdir pool AND do what you said with updating the kernel?

    Thanks so much!

    Reply
  42. dan Post author

    ZFS is backwards compatible, so the bootpool will happily still boot without being upgraded.
    If you do upgrade your bootpool, you’ll also need to update the bootcode on your disks (re-run the ‘gpart bootcode’ commands with pmbr and gptzfsboot from 10.1-RELEASE), otherwise your machine won’t be able to boot from the bootpool at all.
    I’d recommend leaving the bootpool as it is, and just updating the kernel+bootloader.

    Reply
  43. Bob

    Dan, you are a genius 😉

    I updated the kernel directory on my USB bootdir drive and was able to boot up straightaway!

    Thank you so much for your help, it is much appreciated.

    Reply
