Setting up Duply with Amazon S3

1.0 Introduction

I had previously been using a DigitalOcean server to do nightly backups. However, considering that the total size of what I wanted backed up was only around 10GB, $5/month for their service seemed a little expensive.
I therefore decided to use Amazon S3, for the following reasons:

  • It lets me learn what AWS is about, if only superficially
  • At the time of writing, a new account gets 12 months of free usage, as long as you stay under some specific bandwidth, storage and API limits
  • It’s very cheap for the 10GB I want backed up, costing me about 50 cents a month
  • In the future, should my backups bloat, I can move to their Glacier storage, which seems even cheaper

When I started setting this up, I had no real experience with AWS, though I had used Duply in the past. To estimate the cost of your backups, you can use the AWS calculator.

1.1 Duply

Duply is a wrapper around Duplicity which makes configuring Duplicity much easier. Essentially, all the configuration data needed to make Duplicity work can be kept in a config file, as part of a profile in your home directory.

1.2 Duplicity

Duplicity is an application which enables the user to make encrypted, incremental backups to a server. Duplicity can be a little cumbersome to use, which is why, for the purposes of this exercise, we will be employing Duply to make a lot of the configuration easier for us.

2.0 Setting up AWS

Setting up the AWS account was fairly straightforward. Head over to Amazon AWS, click Sign Up, follow the prompts, enter your credit card info and confirm your account over the phone.

2.1 Creating an IAM User

Past that, you will need to go to My Account->Security settings->Configure Security Credentials and save the credentials for the root user. We will be creating a new user for our Duply configuration rather than using the root user.

Go back to My Account->Security settings->Configure IAM Users and create a new user named duply. This is the user we will use for our backups.

Make sure you save the duply user's credentials to the machine which will initiate the backup.
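
One way to keep the access key pair on the backup machine is a file that only your own user can read. This is a minimal sketch: the file path in the usage comment and the variable names written into the file are illustrative choices, not something Duply or AWS requires, and the key values are placeholders.

```shell
# save_aws_creds FILE KEY_ID SECRET
# Writes the IAM user's access key pair to FILE, readable by the owner only.
# FILE and the variable names stored in it are illustrative choices.
save_aws_creds() {
    (
        umask 077    # files created in this subshell get mode 600
        printf 'AWS_ACCESS_KEY_ID=%s\nAWS_SECRET_ACCESS_KEY=%s\n' "$2" "$3" > "$1"
    )
    chmod 600 "$1"   # also tightens a file that already existed
}

# Example with placeholder values (not real credentials):
# save_aws_creds "$HOME/.duply_aws_credentials" 'AKIAEXAMPLEKEYID' 'exampleSecretKey'
```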

2.2 Setting AWS User Permissions

Following the creation of the user, we now need to give it permission to access an S3 bucket.
Head over to Groups->Create New Group.
Specify the name as my_s3 and, in the Policy Type box, type S3 and select AmazonS3FullAccess.
Click Next, then click Create Group.
Go back to the Users tab, select the duply user, click Add User to Groups, and add the user to the newly created my_s3 group.
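
The same user, group and policy can also be set up from a terminal. The sketch below assumes the AWS CLI is installed and configured with an identity that is allowed to manage IAM, which is not something the console steps require. The run helper and the DRY_RUN variable are hypothetical names introduced here so the commands can be inspected before anything is executed.

```shell
# run CMD...: print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_duply_iam [USER] [GROUP]
# Mirrors the console steps: create the user and group, attach the
# AmazonS3FullAccess managed policy, add the user to the group and
# create an access key pair for it.
setup_duply_iam() {
    user=${1:-duply}
    group=${2:-my_s3}
    run aws iam create-user --user-name "$user"
    run aws iam create-group --group-name "$group"
    run aws iam attach-group-policy --group-name "$group" \
        --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
    run aws iam add-user-to-group --user-name "$user" --group-name "$group"
    run aws iam create-access-key --user-name "$user"
}

# setup_duply_iam              # only prints the five commands
# DRY_RUN=0 setup_duply_iam    # actually runs them
```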

2.3 Creating the S3 bucket

The S3 bucket will be the destination of our backups. Click the Services tab at the top of the page and select S3. Click “Create a Bucket” and name the bucket duplybck, or any name whose format could be considered a valid domain name. That means no odd characters, spaces or upper case letters; numbers are OK.
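
The naming rule can be checked before you click through the console. The function below is a simplified sketch and its name is made up for this example. It accepts 3 to 63 characters of lowercase letters, digits, dots and hyphens, starting and ending with a letter or digit, and it does not try to catch every S3 rule (names shaped like IP addresses, for instance).

```shell
# is_valid_bucket_name NAME
# Exit status 0 when NAME looks like an acceptable bucket name, 1 otherwise.
is_valid_bucket_name() {
    # LC_ALL=C keeps [a-z] from matching upper case letters in some locales
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$'
}

# is_valid_bucket_name duplybck && echo "name looks fine"
```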

Enable logging, as this may come in useful later for debugging, and create the bucket.

3.0 Setting up Encryption and Signing Keys

Since we don’t really want our data to be tampered with, it is wise to create two keys, one for encryption and one for signing.
Luckily GnuPG allows us to do just that.
The following represents the process I went through:

gpg --gen-key
gpg (GnuPG) 1.4.16; Copyright (C) 2013 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please select what kind of key you want:
(1) RSA and RSA (default)
(2) DSA and Elgamal
(3) DSA (sign only)
(4) RSA (sign only)
Your selection? 1
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 4096
Requested keysize is 4096 bits
Please specify how long the key should be valid.
0 = key does not expire
<n> = key expires in n days
<n>w = key expires in n weeks
<n>m = key expires in n months
<n>y = key expires in n years
Key is valid for? (0) 0
Key does not expire at all
Is this correct? (y/N) y

You need a user ID to identify your key; the software constructs the user ID
from the Real Name, Comment and Email Address in this form:
"Heinrich Heine (Der Dichter) <heinrichh@duesseldorf.de>"

Real name: John doe
Email address: somewhere@somewhere.com
Comment: duplicity_test_key
You selected this USER-ID:
"John doe (duplicity_test_key) <somewhere@somewhere.com>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O
You need a Passphrase to protect your secret key.

gpg: gpg-agent is not available in this session
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
................................+++++
....+++++
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
..........................+++++
+++++
gpg: key B0F54D91 marked as ultimately trusted
public and secret key created and signed.

gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0 valid: 5 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 5u
gpg: next trustdb check due at 2020-04-15
pub 4096R/B0F54D91 2015-07-17
Key fingerprint = 69C3 8D3D 9519 CF32 74D6 69E0 6DEA B8FC B0F5 4D91
uid John doe (duplicity_test_key) <somewhere@somewhere.com>
sub 4096R/FDED2C85 2015-07-17

Note that generating a 4096-bit key requires a lot of random bytes, and the Linux kernel entropy pool might run dry while you wait.
If this happens, and you happen to be doing this on a headless server where connecting a keyboard and a mouse to the machine is not an option, a simple

ls -lR /

in a different shell may do the trick. This should generate enough disk activity to refill the entropy pool and allow the completion of the key creation process.
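
To see whether the pool really is the bottleneck, the kernel exposes its current estimate in /proc/sys/kernel/random/entropy_avail. A small sketch follows. The function name is made up and the 200-bit threshold is an arbitrary cut-off chosen for illustration, not a documented limit.

```shell
# entropy_low [THRESHOLD] [FILE]
# Exit status 0 when the kernel reports fewer than THRESHOLD bits of
# available entropy. FILE can be overridden, which is handy for testing.
entropy_low() {
    threshold=${1:-200}
    file=${2:-/proc/sys/kernel/random/entropy_avail}
    [ "$(cat "$file")" -lt "$threshold" ]
}

# if entropy_low; then echo "entropy pool is low, generate some disk activity"; fi
```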

A word of caution: use a strong passphrase to protect your key. Something randomly generated by KeePassX would do nicely.

3.1 Backing up the keys

Should something go horribly wrong with the machine initiating the backups and our GPG keys go with it, our backed up data is pretty much useless.
It would be a good idea to back up these keys to a USB stick which we can keep in a drawer.

So, following the above example, run:

 gpg --export-secret-keys keyIDNumber > duplicity_key.asc 

The key ID from the above example would be B0F54D91.
Copy duplicity_key.asc to a thumb drive and keep it secret, keep it safe.
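
If you would rather not copy the key ID by hand, it can be pulled out of the key listing. The sketch below assumes the GnuPG 1.4 listing format shown above, where the pub line looks like "pub 4096R/B0F54D91 2015-07-17"; newer GnuPG releases print a different layout and would need a different filter. The function name is made up for this example.

```shell
# short_key_id: reads a GnuPG 1.4-style key listing on stdin and prints
# the ID of the first "pub" key it finds.
short_key_id() {
    awk '$1 == "pub" { split($2, part, "/"); print part[2]; exit }'
}

# gpg --list-keys somewhere@somewhere.com | short_key_id
```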

I do not create a revocation certificate here since we are only using this key for backups, and we are not sharing it with other people.

3.1.1 Restoring your keys

Let's assume that our main backup machine broke down, or that we want to migrate the entire backup process to a different machine. It would be useful to know how to restore the keys.

Get that thumb drive out and run:

 gpg --import duplicity_key.asc 

GPG will complain, and rightly so, that it cannot validate the owner of the key. We have to mark the key as ultimately trusted ourselves.

adanaila@lucifer ~ $ gpg --edit B0F54D91
gpg (GnuPG) 1.4.16; Copyright (C) 2013 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

pub 4096R/B0F54D91 created: 2015-07-17 expires: never usage: SC
trust: unknown validity: unknown
sub 4096R/FDED2C85 created: 2015-07-17 expires: never usage: E
[ unknown] (1). John doe (duplicity_test_key) <somewhere@somewhere.com>

gpg> trust
pub 4096R/B0F54D91 created: 2015-07-17 expires: never usage: SC
trust: unknown validity: unknown
sub 4096R/FDED2C85 created: 2015-07-17 expires: never usage: E
[ unknown] (1). John doe (duplicity_test_key) <somewhere@somewhere.com>

Please decide how far you trust this user to correctly verify other users' keys
(by looking at passports, checking fingerprints from different sources, etc.)

1 = I don't know or won't say
2 = I do NOT trust
3 = I trust marginally
4 = I trust fully
5 = I trust ultimately
m = back to the main menu

Your decision? 5
Do you really want to set this key to ultimate trust? (y/N) y

pub 4096R/B0F54D91 created: 2015-07-17 expires: never usage: SC
trust: ultimate validity: unknown
sub 4096R/FDED2C85 created: 2015-07-17 expires: never usage: E
[ unknown] (1). John doe (duplicity_test_key) <somewhere@somewhere.com>
Please note that the shown key validity is not necessarily correct
unless you restart the program.
gpg> quit

Phew, back to where we started.
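
The interactive trust session can also be replaced by a one-liner, since gpg --import-ownertrust reads records of the form FINGERPRINT:LEVEL: in which level 6 stands for ultimate trust. A sketch, with a made-up helper name, that builds such a record from the fingerprint as gpg prints it:

```shell
# ownertrust_line "FINGERPRINT WITH SPACES"
# Prints FINGERPRINT:6: with the spaces removed (6 = ultimate trust).
ownertrust_line() {
    printf '%s:6:\n' "$(printf '%s' "$1" | tr -d ' ')"
}

# ownertrust_line '69C3 8D3D 9519 CF32 74D6 69E0 6DEA B8FC B0F5 4D91' | gpg --import-ownertrust
```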

4.0 Setting up Duply

On the machine which will be performing the backup, run:

 sudo apt-get install duply duplicity python-boto haveged
duply create s3_backup
vim ~/.duply/s3_backup/conf 

Since we did all the hard work above, this should be a cakewalk.
The following settings are the important ones:

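# Key the backups are encrypted to
GPG_KEY=B0F54D91
# Key the backups are signed with: the primary key B0F54D91 has usage SC,
# while FDED2C85 is the encryption-only subkey (usage E)
GPG_KEY_SIGN=B0F54D91
GPG_PW='Your secret passphrase'
GPG_PW_SIGN='Your secret passphrase'
GPG_OPTS='--compress-algo=bzip2'
# For S3: access key ID and secret access key of the duply IAM user
TARGET_USER=
TARGET_PASS=
TARGET='s3+http://duplybck/'
DUPL_PARAMS='--s3-use-new-style'
SOURCE='/mnt/to_backup'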
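
Most first-run failures come from a setting that was left empty, as TARGET_USER and TARGET_PASS are in the listing above. A rough sketch of a pre-flight check follows. The function name and the list of settings it looks at are choices made for this example, and it only does a line-based text match rather than evaluating the file the way Duply does.

```shell
# missing_settings CONF
# Prints the names of settings that are absent or empty in a Duply conf file.
missing_settings() {
    for name in GPG_KEY GPG_PW TARGET TARGET_USER TARGET_PASS SOURCE; do
        # present means: NAME= followed by something other than nothing or empty quotes
        grep -Eq "^${name}=[\"']?[^\"' ]" "$1" || echo "$name"
    done
}

# missing_settings ~/.duply/s3_backup/conf
```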

OK, so everything is set up. Let’s take it for a spin:

duply s3_backup full

If it worked, which the first time around it never will, Duplicity will start backing things up.

5.0 Debugging Duplicity

Debugging Duplicity can be a bit of a nightmare, because there are just so many places where things can fail, most of them relating to incorrect configuration.

First things first, edit the configuration file to enable detailed logging:

vim ~/.duply/s3_backup/conf
VERBOSITY=9

and run the duply command again.

If the issue is with accessing Amazon S3, check the S3 logs, which we enabled above.
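
At VERBOSITY=9 the output gets long, so it helps to save it to a file and filter it. The sketch below is only a starting point: the function name is made up, the log path in the usage comment is hypothetical, and the list of words it searches for is a guess at what failures tend to mention, not an exhaustive list of Duplicity error messages.

```shell
# suspicious_lines LOGFILE
# Prints, with line numbers, the log lines that mention common failure words.
suspicious_lines() {
    grep -inE 'error|denied|forbidden|traceback|failed' "$1"
}

# duply s3_backup full 2>&1 | tee /tmp/duply_full.log
# suspicious_lines /tmp/duply_full.log
```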

6.0 Restoring

If a backup was successful, restoring is a doddle:

duply s3_backup restore /path/to/restore/to 
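
A restore is only proven once the restored files have been compared with the originals. A minimal sketch, assuming the source directory has not changed since the backup ran (otherwise differences are expected). The function name and the /tmp/restore_test path are hypothetical.

```shell
# trees_match ORIGINAL RESTORED
# Exit status 0 when both directory trees contain the same files with the same content.
trees_match() {
    diff -rq "$1" "$2" > /dev/null
}

# duply s3_backup restore /tmp/restore_test
# trees_match /mnt/to_backup /tmp/restore_test && echo "restore matches source"
```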

7.0 Conclusion

Overall, setting Duply up and getting it to work is a lot of headache. There are plenty of guides out there which make it seem easy, and all is well and good until you run into a problem in some part of this process or need to restore something.
I wish there were an easier way to make all this work, but with enough tinkering you should be able to get a reliable backup.

8.0 Additional Resources

Duply:
Tiny solution for automated backups: duply
https://www.thomas-krenn.com/en/wiki/Backup_on_Linux_with_duply
https://aguslr.github.io/blog/2012/04/18/backups-with-duply/
Duplicity:
https://splone.com/blog/2015/7/13/encrypted-backups-using-rsync-and-duplicity-with-gpg-and-ssh-on-linux-bsd/
GPG:
http://blog.pangyanhan.com/posts/2014-03-04-gpg-how-to-trust-imported-key.html
https://www.gnupg.org/gph/en/manual/x56.html
https://gist.github.com/chrisroos/1205934
http://ekaia.org/blog/2009/05/10/creating-new-gpgkey/
