Duplicacy on Linux with Backblaze B2
Apr 3, 2018
6 minutes read

This article documents the steps I took to set up a daily backup to my Backblaze B2 bucket using Duplicacy on Linux.

For the sake of this article, let's say that my Backblaze B2 bucket is called duplicacy (it isn't, since someone else took that name) and that the directory with all of the data I want to back up is /path/to/repo/.

Installation

Binaries are available from here: https://github.com/gilbertchen/duplicacy/releases

I downloaded the latest version (2.1.0 as of when I’m writing this) and moved it to /usr/local/bin/duplicacy.
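
Roughly, the install steps looked like the following; the exact release URL and filename will depend on the version and your architecture, so check the releases page rather than trusting this verbatim.

$ wget https://github.com/gilbertchen/duplicacy/releases/download/v2.1.0/duplicacy_linux_x64_2.1.0
$ chmod +x duplicacy_linux_x64_2.1.0
$ sudo mv duplicacy_linux_x64_2.1.0 /usr/local/bin/duplicacy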

Initialization

Repo/Storage Initialization

There are a variety of options to decide on before initializing your repository and storage; you can see them using duplicacy init help. The main ones are what you want to call your backup and whether or not you want your backup to be encrypted; the upside of encryption is that someone with access to your storage can't reconstruct the contents of your backup without your password.

To initialize a Duplicacy repo at /path/to/repo/ with a repository id of foobar, using encrypted storage in my duplicacy bucket on Backblaze B2:

$ cd /path/to/repo
$ duplicacy init foobar b2://duplicacy -e

Duplicacy will prompt you for your B2 account ID and application key, along with a password used to encrypt the key that in turn encrypts your files. That encryption key is stored in the config file in your storage bucket, so if/when you need to restore from your backup you won't necessarily need to keep a copy of the config file, but you will need the storage password you chose, the repository id (or access to the bucket to look it up), and the Backblaze B2 account ID and key.

If you don’t want encrypted storage, then don’t use the -e option.

Saving passwords

By default, you will be prompted for your B2 ID, key, and storage password every time you run your backup, which is obviously an impediment to automated backups.

If you feel comfortable with being able to protect the .duplicacy directory, then all of this information can be stored in plain text in a .duplicacy/preferences file using the following commands. Duplicacy also has the option to save these credentials in certain keychains if they’re present; more information on these options is on the project wiki here.

duplicacy set -storage b2://duplicacy -key b2_id -value accountidgoeshere
duplicacy set -storage b2://duplicacy -key b2_key -value keygoeshere
duplicacy set -storage b2://duplicacy -key password -value passwordgoeshere
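
Since these credentials end up in the preferences file in plain text, it's worth making sure the .duplicacy directory isn't readable by other users; something along these lines would do it:

$ chmod -R go-rwx /path/to/repo/.duplicacy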

Filtering

There are a lot of options for configuring exclusions; for detailed documentation on setting up exclusions that do what you're looking for, see the wiki page on filters. By default, Duplicacy will include everything in the local repository, which might be perfectly fine depending on your use case.

These filters are specified in a .duplicacy/filters file.
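
As a purely hypothetical example, a filters file that excludes a couple of directories (with everything not excluded still being backed up) might look like the following; check the wiki page for the exact pattern semantics before relying on it.

# hypothetical exclusions; anything not excluded is still included
-cache/
-tmp/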

Scripting

Duplicacy expects certain scripts to live in a scripts directory within the .duplicacy directory. To keep everything in one place, I store the backup script that cron ends up calling in there too.
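
If the scripts directory doesn't exist yet, create it first:

$ mkdir -p /path/to/repo/.duplicacy/scripts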

Backup

At least as of Duplicacy 2.1.0, it does not gracefully handle the situation where a backup starts running before the previous one completes. There are more details on this issue here.

Here’s what my backup script currently looks like. I should probably update it to use flock instead of the DIY lockfile technique from Stack Overflow, but I haven’t tested and implemented that yet; a rough sketch of the flock approach is included after the script below.

This script will run a backup with log-style output, 2 threads, and otherwise default settings. If you want a slower but more thorough backup that hashes every file to determine what needs to be backed up, rather than looking at file sizes and timestamps, add the -hash option to the duplicacy backup command. Note that -hash will generate a lot more output, which might be relevant if you're logging the output like I am.
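
For instance, the -hash variant of the backup command used in the script below would look like this:

/usr/local/bin/duplicacy -log backup -threads 2 -hash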

The paths will need to be updated for your environment.

/path/to/repo/.duplicacy/scripts/backup.sh

#!/bin/bash

# https://stackoverflow.com/a/185473/1388019
lockfile="/tmp/duplicacy.lock"

if [ -e ${lockfile} ] && kill -0 `cat ${lockfile}`; then
    echo "duplicacy already running"
    exit
fi

# make sure the lockfile is removed when we exit and then claim it
trap "rm -f ${lockfile}; exit" INT TERM EXIT
echo $$ > ${lockfile}

# run the backup with log-style output and 2 threads
cd /path/to/repo || exit 1
/usr/local/bin/duplicacy -log backup -threads 2

# clean up lockfile
rm -f ${lockfile}
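
For reference, here's a rough, untested sketch of what the flock version of the same script could look like:

#!/bin/bash

# untested sketch: let flock(1) manage the lock instead of a hand-rolled PID file
exec 200>/tmp/duplicacy.lock
flock -n 200 || { echo "duplicacy already running"; exit; }

# run the backup with log-style output and 2 threads
cd /path/to/repo || exit 1
/usr/local/bin/duplicacy -log backup -threads 2

# the lock is released automatically when the script exits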

Pruning

Duplicacy will execute a script that you create at .duplicacy/scripts/post-backup after each backup. The one I currently use, below, removes all snapshots older than 1 year and keeps 1 snapshot every 7 days for snapshots older than 60 days.

#!/bin/sh
# Purge old snapshots after backup

# keep none after a year
/usr/local/bin/duplicacy prune -keep 0:365

# keep a snapshot every 7 days after 60 days
/usr/local/bin/duplicacy prune -keep 7:60
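
If I understand the prune options correctly, the same two retention rules could also be combined into a single invocation, with the rules ordered from the oldest cutoff to the newest:

/usr/local/bin/duplicacy prune -keep 0:365 -keep 7:60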

Cron

To actually cause the backup script to run, I call it from cron with an entry in my /etc/crontab file. This will run the script at 2am every morning.

0 2 * * *       root    /path/to/repo/.duplicacy/scripts/backup.sh >> /var/log/duplicacy/backup.log 2>&1

Before this can work, you must first create the /var/log/duplicacy/ directory for the backup script output to be logged.
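
For example:

$ sudo mkdir -p /var/log/duplicacy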

When you run the backup script, here’s an example of what the output looks like when no changes have been made to the local repository since the last backup. If there have been changes, each file will be listed in the output.

2018-04-10 02:00:01.338 INFO STORAGE_SET Storage set to b2://duplicacy
2018-04-10 02:00:17.504 INFO BACKUP_START Last backup at revision 24 found
2018-04-10 02:00:17.504 INFO BACKUP_INDEXING Indexing /path/to/repo
2018-04-10 02:00:17.513 INFO SNAPSHOT_FILTER Loaded 20 include/exclude pattern(s)
2018-04-10 02:03:54.995 INFO BACKUP_THREADS Use 2 uploading threads
2018-04-10 02:04:46.302 INFO BACKUP_END Backup for /path/to/repo at revision 25 completed
2018-04-10 02:04:46.302 INFO SCRIPT_RUN Running script /path/to/repo/.duplicacy/scripts/post-backup
2018-04-10 02:05:17.876 INFO SCRIPT_OUTPUT Storage set to b2://duplicacy
2018-04-10 02:05:17.876 INFO SCRIPT_OUTPUT Keep no snapshots older than 365 days
2018-04-10 02:05:17.876 INFO SCRIPT_OUTPUT No snapshot to delete
2018-04-10 02:05:17.876 INFO SCRIPT_OUTPUT Storage set to b2://duplicacy
2018-04-10 02:05:17.876 INFO SCRIPT_OUTPUT Keep 1 snapshot every 7 day(s) if older than 60 day(s)
2018-04-10 02:05:17.876 INFO SCRIPT_OUTPUT No snapshot to delete

Log maintenance

The following section can be added to /etc/logrotate.conf to keep 1 year (12 rotations) of logs, rotated and compressed monthly.

/var/log/duplicacy/*.log {
    compress
    monthly
    dateext
    dateformat -%Y-%m-%d.log
    rotate 12
    copytruncate
}

Restoring

Should your computer die and you need to use your backup to restore from scratch, you can do so with:

* your repository id (e.g., foobar)
* the information needed to access your Backblaze B2 bucket
* your storage password (if you encrypted your backup)

In a different folder or on a new computer:

cd to the directory you want to restore to

$ cd /path/to/new/folder

Initialize the folder as you did the initial repository. The -e option is only needed if your backup is encrypted.

$ duplicacy init foobar b2://duplicacy -e

List available snapshots

$ duplicacy list

Then either restore everything using revision 20 (for example)

$ duplicacy restore -r 20

or if you only want to restore certain files, for example everything in a folder called Documents/Important/, you can do so like below.

$ duplicacy restore -r 20 +Documents/Important/* +Documents/

For more details, see the project wiki.
