Unattended backups with ZFS, restic, Backblaze B2 and systemd

Regular, incremental and convenient backups (ish), without interference
Published on Tue February 15, 2022 with tags: restic, administration, linux, zfs.

In the past, whenever I had data loss due to hardware failure, I’d just take it on the chin and reassemble as much as possible. Due to crappy internet connections (low-upload DOCSIS or, even worse, ADSL) backing up was simply infeasible (for instance, one attempt took about two weeks of continuous uploading followed by day-long incremental updates).

That changed recently, since I moved and got fiber installed, allowing for an acceptable 100Mbps upload.

The other major reason why I didn’t do regular wide scope backups is convenience: I would’ve had to remount /home as read-only, which is lots of inconvenient downtime and either manual or highly intrusive work.

Requirements

Every Monday morning, at around 3AM, automatically run an incremental backup to Backblaze (since it’s pretty cheap). Schedule it at a very low priority, so that it doesn’t interfere with normal computer use, and report failures via mail[1].

[i] ~$ systemctl status restic-weekly.service
 restic-weekly.service - Weekly unattended /home backups
     Loaded: loaded (/etc/systemd/system/restic-weekly.service; static)
     Active: inactive (dead) since Mon 2022-02-14 03:24:45 CET; 2 days ago
TriggeredBy: ● restic-weekly.timer
    Process: 963988 ExecStart=/home/execute_backup.sh (code=exited, status=0/SUCCESS)
   Main PID: 963988 (code=exited, status=0/SUCCESS)
        CPU: 4min 33.140s

Feb 14 03:24:34 bstg execute_backup.sh[964064]: Added to the repo: 5.833 GiB
Feb 14 03:24:34 bstg execute_backup.sh[964064]: processed 1534343 files, 352.944 GiB in 24:27
Feb 14 03:24:34 bstg execute_backup.sh[964064]: snapshot 9fce7039 saved
Feb 14 03:24:35 bstg execute_backup.sh[963988]: + _clean
Feb 14 03:24:35 bstg execute_backup.sh[963988]: + cd /
Feb 14 03:24:35 bstg execute_backup.sh[963988]: + sleep 5
Feb 14 03:24:40 bstg execute_backup.sh[963988]: + zfs destroy zhome@restic2022_07_1
Feb 14 03:24:45 bstg systemd[1]: restic-weekly.service: Deactivated successfully.
Feb 14 03:24:45 bstg systemd[1]: Finished Weekly unattended /home backups.
Feb 14 03:24:45 bstg systemd[1]: restic-weekly.service: Consumed 4min 33.140s CPU time.

Snapshots as an alternative to downtime

Backing up a file system that is in use can lead to various kinds of data consistency problems and thus it is preferable to operate on snapshots instead.

Note

Snapshots alone do not solve all issues of concurrent writes, but I’ve decided that it’s good enough for my uses. Non-atomic operations (e.g. a single state being updated in two files) could still lead to inconsistent on-disk content, but snapshots reduce the time frame in which this may happen to milliseconds or less.

In anticipation of properly implementing backups, I began using ZFS for my /home and did some maintenance starting with the removal of old, unused files. I also made sure to add cache tags and exclude markers where appropriate.

ZFS exposes snapshots at $MOUNTPOINT/.zfs/snapshot/$LABEL, but these act like separate devices (they have a different device ID) and are on a different mountpoint. This will be important later.
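The device ID difference can be checked directly. The following is a minimal sketch, assuming GNU stat; the helper names device_id and same_device, and the snapshot label "example", are made up for illustration.

```shell
#!/bin/sh
# Print the device ID of the file system that contains a path (GNU stat).
device_id() {
        stat -c '%d' "$1"
}

# Succeed when both paths live on the same device.
same_device() {
        [ "$(device_id "$1")" = "$(device_id "$2")" ]
}

# Hypothetical usage, assuming a snapshot labelled "example" exists:
#   same_device /home /home/.zfs/snapshot/example || echo "different device IDs"
```

On a ZFS dataset the comparison should fail for a snapshot directory, which is exactly the mismatch the backup has to compensate for later.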

restic setup

restic is a relatively new backup tool that I picked because it seems fairly robust, easy to use and well implemented. I gave its design document a quick review and it seemed appropriate.

Snapshots restic produces reference the mount point and device IDs, along with inode numbers, to detect hardlinks, which is slightly problematic since both of those are different on snapshots. Thankfully, there are two outstanding PRs (#3200 and #3599) that fix this issue. One backport and git format-patch later, they’re ready to be dropped into /etc/portage/patches/app-backup/restic and forgotten about until they break an update.

restic backups operate on remote repositories, in my instance a bucket on Backblaze B2, an object storage service. The repository to operate on can be set via the RESTIC_REPOSITORY environment variable, and requires a password provided by RESTIC_PASSWORD. The B2 backend also requires an account ID and key, provided in B2_ACCOUNT_{ID,KEY}. I format these so that they can be eval’d by a shell and encrypt them with systemd-creds encrypt --name rcreds - /var/lib/backup_creds
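One way to produce an eval-safe credentials file is sketched below. The helper emit_export is hypothetical, every value is a placeholder, and the use of export lines is an assumption about the file format (the post only says the file can be eval’d by a shell).

```shell
#!/bin/sh
# Print one eval-safe "export NAME='value'" line. Embedded single quotes
# are rewritten as '\'' so the value survives a later eval unchanged.
emit_export() {
        printf "export %s='%s'\n" "$1" "$(printf '%s' "$2" | sed "s/'/'\\\\''/g")"
}

# Hypothetical usage; all values are placeholders:
#   {
#           emit_export B2_ACCOUNT_ID     'placeholder-id'
#           emit_export B2_ACCOUNT_KEY    'placeholder-key'
#           emit_export RESTIC_REPOSITORY 'b2:example-bucket:home'
#           emit_export RESTIC_PASSWORD   'placeholder-password'
#   } | systemd-creds encrypt --name rcreds - /var/lib/backup_creds
```

Piping straight into systemd-creds keeps the plaintext off the disk; exporting the variables matters because restic reads them from its environment.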

Repositories have to be initialized with restic init. This operation sets up the basic structure and puts keys in place.

At this point, it’d be wise to copy this file and store it somewhere safe (perhaps in a password manager).

systemd and timers

systemd provides .timer units with some nifty features. Most useful among these are Persistent=, which acts like Anacron (a run missed while the machine was off is caught up on the next boot), and WakeSystem=, which can resume the system from sleep.

[Unit]
Description=Weekly unattended /home backups (timer)

[Timer]
OnCalendar=Mon *-*-* 03:00:00
Persistent=true
WakeSystem=true
Unit=restic-weekly.service

[Install]
WantedBy=default.target

# vim: ft=systemd :

Note

This timer will wake your computer from sleep and won’t put it back to sleep after.
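The calendar expression in the timer can be checked before relying on it. This is a small sketch, assuming the timer is installed as /etc/systemd/system/restic-weekly.timer next to the service; the systemctl commands are left as comments since they need root on the target machine.

```shell
#!/bin/sh
# Calendar expression taken from the timer unit.
schedule='Mon *-*-* 03:00:00'

if command -v systemd-analyze >/dev/null 2>&1; then
        # Prints the normalized form of the expression and the next elapse time.
        systemd-analyze calendar "$schedule"
fi

# On the target machine, as root:
#   systemctl daemon-reload
#   systemctl enable --now restic-weekly.timer
#   systemctl list-timers restic-weekly.timer
```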

OnCalendar= defines when to run the event, in this case each Monday at three in the morning, while Unit= makes the timer start restic-weekly.service, which in turn runs the backup script:

[Unit]
Description=Weekly unattended /home backups
OnFailure=status-email-arsen@%n.service      # {1}

[Service]
Type=oneshot                                 # {2}
ExecStart=/home/execute_backup.sh
Nice=19                                      # {3}
IOSchedulingClass=idle                       # {3}

# vim: ft=systemd :

  1. When the backup fails, start the status-email-arsen@restic-weekly.service unit (the %n expands into the name of the current unit),
  2. Treat the service as a one-shot job: systemd waits for the script to exit, and the unit does not stay active afterwards,
  3. Run with the lowest CPU priority (nice 19) and the lowest I/O scheduling class (idle). This keeps the system virtually unaffected by the backup: CFS gives the restic threads very little CPU time while anything else wants to run, and their I/O is only serviced when there is no other work to do. As we are operating on a snapshot, the resulting longer run time is not an issue, and we can focus on not being intrusive to the user. See the sched(7) manual page for more info.
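For a manual test run outside of systemd, roughly the same priorities can be applied by hand. The wrapper name run_niced is made up for this sketch, and ionice (part of util-linux) is assumed to be installed.

```shell
#!/bin/sh
# Run a command at the lowest CPU priority, roughly what Nice=19 does in the unit.
run_niced() {
        nice -n 19 "$@"
}

# With no arguments, nice prints the niceness it is running at.
run_niced nice

# Manual run with the idle I/O class as well (class 3 is "idle"):
#   nice -n 19 ionice -c 3 /home/execute_backup.sh
```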

Last of these units is email failure reporting. To get emails to work, I installed OpenSMTPD to deliver mail to my local mailbox, and upon doing that I’ve found out that over the last five years I’ve accumulated around 350 thousand emails in my spool, sent in response to failures in cron jobs.

The status email unit is a short bit of boilerplate:

[Unit]
Description=status email for %i to user

[Service]
Type=oneshot
ExecStart=/usr/local/bin/systemd-email arsen@%H %i
User=nobody
Group=systemd-journal

… and the script it invokes:

#!/bin/sh
[ -n "$1" ] || exit 1
set -xeu

/usr/bin/sendmail -t <<ERRMAIL
To: $1
From: systemd <root@$(hostname)>
Subject: $2
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8

$(systemctl status --full "$2")
ERRMAIL

The user in this unit is set to nobody, which systemd will complain about. Ignore that warning: the advertised alternative (DynamicUser=) will not work for us, since it forcefully restricts SUID/SGID, which prevents sendmail from switching to the email group to submit to the spool.

Pulling it together

With that out of the way, the bulk of the work is handled by a single script:

#!/bin/sh
eval "$(systemd-creds decrypt --name rcreds /var/lib/backup_creds -)"
: "${B2_ACCOUNT_ID:?B2_ACCOUNT_ID unset}"
: "${B2_ACCOUNT_KEY:?B2_ACCOUNT_KEY unset}"
: "${RESTIC_PASSWORD:?RESTIC_PASSWORD unset}"
: "${RESTIC_REPOSITORY:?RESTIC_REPOSITORY unset}"

set -exu

export RESTIC_CACHE_DIR=/var/cache/restic
[ -t 0 ] || export RESTIC_PROGRESS_FPS=0.0016666

snapshot="$(date +restic%+4Y_%U_%u)"
zfs snap "zhome@$snapshot"

_clean() {
        cd /  # free up the dataset for destruction
        sleep 5 # ?????????????
        zfs destroy "zhome@$snapshot"
}
trap _clean EXIT

cd /home/.zfs/snapshot/"$snapshot"
restic backup \
        --exclude .cache \
        --exclude-caches \
        --exclude '*/dls/' \
        --exclude-if-present .resticexclude \
        --device-map "$(stat -c '%d' .):$(stat -c '%d' /home)" \
        --set-path /home \
        .

Going over the blocks one-by-one:

  1. We load the credentials from the previously encrypted file and check that we got all the parameters needed, then
  2. We tell restic to use /var/cache/restic as the cache directory, as it would otherwise default to $XDG_CACHE_HOME/restic (or ~/.cache/restic), then
  3. If not running in a tty, only update the progress of the backup once per ten minutes, so as not to spam the logs, then
  4. We take a snapshot, called restic%+4Y_%U_%u, in order to have a unique value and to be able to tell after the fact if a cleanup failed, then
  5. We use the EXIT trap to clean up after ourselves,
  6. We move into the newly taken snapshot, in order to help restic store the correct paths in the snapshot, which is later helped by --set-path, which changes the stored path of the backup from the snapshot directory into the home directory, effectively obscuring the fact we ever operated on a snapshot, then
  7. We initiate the backup, excluding all .cache directories, all directories tagged with CACHEDIR.TAG[2], all download directories directly inside /home/* directories, and all directories marked with a .resticexclude file; it then maps the new snapshot’s device ID to the normal home mount’s device ID, in order to preserve unchanged files’ status, and finally remaps . to /home.
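To see what the snapshot label from step 4 looks like for a given day, the date format can be tried on its own. This sketch assumes GNU date; the function name snapshot_label is made up for illustration.

```shell
#!/bin/sh
# Build the snapshot label for a given date (GNU date assumed):
#   %+4Y  year, padded to at least four digits
#   %U    week of the year, with weeks starting on Sunday (00-53)
#   %u    day of the week, Monday = 1 ... Sunday = 7
snapshot_label() {
        date -d "$1" '+restic%+4Y_%U_%u'
}

# The Monday from the systemctl status output earlier in the post.
snapshot_label 2022-02-14
```

For that date the label is restic2022_07_1, matching the zfs destroy line in the log. The label only changes once per day, which is enough here because the snapshot is destroyed at the end of each run.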

I’m unsure about why the delay after the cd is necessary; I’d have to recompile ZFS to dump all open files on a zfs destroy, or something of that nature, but I haven’t had an opportunity to do that yet.
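A possible alternative to the fixed sleep, not what the script above does, is to retry the destroy a few times. The retry helper below is hypothetical, and whether retrying actually avoids the failure is untested.

```shell
#!/bin/sh
# Run a command up to N times, pausing one second between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
        attempts=$1
        shift
        while ! "$@"; do
                attempts=$((attempts - 1))
                [ "$attempts" -gt 0 ] || return 1
                sleep 1
        done
}

# Hypothetical use inside _clean, replacing the sleep:
#   retry 10 zfs destroy "zhome@$snapshot"
```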

Afterword

This setup does not respect the 3-2-1 rule; Backblaze is, for my personal data, sufficiently robust, and most importantly, inexpensive.

Currently, as I mentioned, failure notification delivery is entirely local. While I do think that email is the easiest way to do this, making it reliable would require non-local delivery and additional monitoring (currently, power outages go undetected and nothing is delivered to a remote inbox). I am likely going to look into creating a VPN to connect all monitored machines together for notification delivery, and add additional monitoring for “high availability”[3] machines, though that is unlikely to happen any time soon.

Think carefully about what data you want to back up, and don’t shy away from dotting around exclude files: build artifacts are not worth backing up!

This post does not cover restic forget. I intend on using it when need be (= costs grow noticeably), rather than as a preventive measure, likely with --keep-last 4 --keep-yearly 4 or something of that nature.
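As a rough sketch of what that could look like: the retention flags are the ones mentioned above, the function name prune_old_snapshots and the RESTIC override are made up for illustration, and the same RESTIC_* and B2_* environment variables as in the backup script are assumed to be set.

```shell
#!/bin/sh
# RESTIC can be overridden (e.g. RESTIC=echo) to print the command
# instead of running it.
: "${RESTIC:=restic}"

# Apply the retention policy; extra arguments are passed through to restic.
prune_old_snapshots() {
        "$RESTIC" forget --keep-last 4 --keep-yearly 4 "$@"
}

# Preview which snapshots would be removed, without changing the repository:
#   prune_old_snapshots --dry-run
# Forget them and delete the data that is no longer referenced:
#   prune_old_snapshots --prune
```

Running with --dry-run first is cheap insurance, since forgotten snapshots cannot be recovered once pruned.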


  1. The system I have set up currently is fully local; I’d like to have null clients produce and email me results in the future.

  2. See this page for more info on CACHEDIR.TAG files. Not all programs use them, but many well behaved ones do. A notable exception is Chromium, sadly.

  3. High availability in this context being >99%, no more digits.

Want to discuss? Say hi by sending an email to ~arsen/public-inbox@lists.sr.ht [archives, etiquette].