I had previously been using a DigitalOcean server for nightly backups. However, since the total size of what I wanted backed up was only around 10GB, $5/month for their service seemed a little expensive.
I thus decided on using Amazon S3 for the following reasons:
- It lets me learn what AWS is about, if only superficially
- At the time of writing, a new account gets 12 months of free usage, as long as you stay under specific bandwidth, storage and API limits
- It’s very cheap for the 10GB I want backed up, costing me about $0.50 a month
- In the future, should my backups bloat, I can move to their Glacier storage, which seems even cheaper
When I started setting this up, I had no real experience with AWS, though I had used Duply in the past. To estimate the cost of your backups, you can use the AWS calculator.
Duply is a wrapper around Duplicity which makes configuring Duplicity much easier. Essentially, all the configuration data needed to make Duplicity work can be kept in a config file, as part of a profile in your home directory.
Duplicity is an application which lets the user make encrypted, incremental backups to a server. It can be a little cumbersome to use, which is why, for the purposes of this exercise, we will be employing Duply to make a lot of the configuration easier for us.
2.0 Setting up AWS
Setting up the AWS account was fairly straightforward. Head over to Amazon AWS, click Sign Up, follow the prompts, enter your credit card details and confirm your account over the phone.
2.1 Creating an IAM User
Past that, you will need to go to My Account->Security settings->Configure Security Credentials and save the credentials for the root user. We will be creating a new user for our Duply configuration rather than using the root user.
Go back to My Account->Security settings->Configure IAM Users and create a new user named duply. This is the user we will use for our backups.
Make sure you save the duply user credentials to the machine which will initiate the backup.
2.2 Setting AWS User Permissions
Following the creation of the user, we now need to give the user permissions to access an S3 bucket.
Head over to Groups->Create New Group.
Specify the name as my_s3 and, in the Policy Type box, type S3 and select AmazonS3FullAccess.
Click next and click Create Group.
Go back to the Users tab, select the duply user, click Add User to Groups, and add the user to the newly created my_s3 group.
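If you prefer the command line, the console steps from 2.1 and 2.2 can be approximated with the AWS CLI. This is only a sketch: it assumes the aws tool is installed and configured with administrator credentials, which this walkthrough does not cover. The RUN switch is something I made up for this example; by default the script only prints the commands, and RUN=1 executes them.

```shell
#!/bin/sh
# Sketch: AWS CLI equivalents of the console steps above.
# Assumes the aws CLI is installed and configured. RUN is a hypothetical
# switch for this example: unset prints the commands, RUN=1 executes them.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Create the dedicated backup user and an access key for it.
run aws iam create-user --user-name duply
run aws iam create-access-key --user-name duply

# Create the group, attach the managed S3 policy, and add the user to it.
run aws iam create-group --group-name my_s3
run aws iam attach-group-policy --group-name my_s3 \
    --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
run aws iam add-user-to-group --user-name duply --group-name my_s3
```

Printing first and executing second makes it easy to review what would be changed in the account before anything actually happens.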
2.3 Creating the S3 bucket
The S3 bucket will be the destination of our backups. Click the Services tab at the top of the page and select S3. Click “Create a Bucket” and name the bucket duplybck, or any name whose format could be considered a valid domain name. That means no odd characters, spaces or uppercase letters; numbers are OK.
Enable logging, which may come in useful later for debugging, and create the bucket.
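The naming rule above can be checked before you ever open the console. The helper below is a rough sketch, not Amazon's own validation: is_valid_bucket_name is a name I made up, and the rules it assumes are 3 to 63 characters, lowercase letters, digits, dots and hyphens only, starting and ending with a letter or digit.

```shell
#!/bin/sh
# Sketch: rough check that a bucket name looks like a DNS-style name.
# Assumed rules: 3 to 63 characters; lowercase letters, digits, dots and
# hyphens only; first and last character must be a letter or digit.
is_valid_bucket_name() {
    name=$1
    [ "${#name}" -ge 3 ] && [ "${#name}" -le 63 ] || return 1
    # LC_ALL=C keeps the a-z range from matching uppercase in some locales.
    printf '%s\n' "$name" | LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9.-]*[a-z0-9]$'
}

for candidate in duplybck "My Bucket" backup_2015; do
    if is_valid_bucket_name "$candidate"; then
        echo "ok:  $candidate"
    else
        echo "bad: $candidate"
    fi
done
```

Only duplybck passes here; the other two fail on the uppercase letter and space, and on the underscore. With the AWS CLI installed, aws s3 mb s3://duplybck should create the bucket without going through the console.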
3.0 Setting up Encryption and Signing Certificates
Since we don’t really want our data to be tampered with, it is wise to create two certificates, one for encryption and one for signing.
Luckily GnuPG allows us to do just that.
The following represents the process I went through:
gpg --gen-key
gpg (GnuPG) 1.4.16; Copyright (C) 2013 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please select what kind of key you want:
   (1) RSA and RSA (default)
   (2) DSA and Elgamal
   (3) DSA (sign only)
   (4) RSA (sign only)
Your selection? 1
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 4096
Requested keysize is 4096 bits
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0) 0
Key does not expire at all
Is this correct? (y/N) y

You need a user ID to identify your key; the software constructs the user ID
from the Real Name, Comment and Email Address in this form:
    "Heinrich Heine (Der Dichter) <firstname.lastname@example.org>"

Real name: John doe
Email address: email@example.com
Comment: duplicity_test_key
You selected this USER-ID:
    "John doe (duplicity_test_key) <firstname.lastname@example.org>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O
You need a Passphrase to protect your secret key.

gpg: gpg-agent is not available in this session
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
................................+++++
....+++++
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
..........................+++++
+++++
gpg: key B0F54D91 marked as ultimately trusted
public and secret key created and signed.

gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0  valid:   5  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 5u
gpg: next trustdb check due at 2020-04-15
pub   4096R/B0F54D91 2015-07-17
      Key fingerprint = 69C3 8D3D 9519 CF32 74D6  69E0 6DEA B8FC B0F5 4D91
uid                  John doe (duplicity_test_key) <email@example.com>
sub   4096R/FDED2C85 2015-07-17
Note that generating a key which is 4096 bits long requires a lot of random bytes, and the Linux kernel entropy pool might run out while doing so.
If this happens and you happen to be doing this on a headless server where connecting a keyboard and a mouse to the machine is not an option, a simple
ls -lR /
in a different shell may do the trick. This should generate enough disk activity to refill the entropy pool and allow the completion of the key creation process.
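To see whether the pool really is the bottleneck, you can ask the kernel how much entropy it currently holds. A minimal sketch, assuming a Linux machine that exposes the usual /proc interface; entropy_available is a helper name made up for this example.

```shell
#!/bin/sh
# Sketch: report how much entropy the kernel pool currently holds.
# /proc/sys/kernel/random/entropy_avail is Linux-specific; on systems
# without it this prints "unknown" instead.
entropy_available() {
    pool=/proc/sys/kernel/random/entropy_avail
    if [ -r "$pool" ]; then
        cat "$pool"
    else
        echo unknown
    fi
}

echo "entropy available: $(entropy_available)"
```

Run it in a second shell while gpg is waiting. If the number stays very low, generating disk activity as described above should help; the haveged package installed in section 4.0 is another way to keep the pool topped up.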
A word of caution: use a strong passphrase to protect your key. Something randomly generated by KeePassX would do nicely.
3.1 Backing up the keys
Should something go horribly wrong with the machine initiating the backups and our GPG keys are lost, our backed up data is pretty much useless.
It would be a good idea to back up these keys to a USB stick which we can keep in a drawer.
So, following the above example, run:
gpg --export-secret-keys keyIDNumber > duplicity_key.asc
The key ID from the above example would be B0F54D91.
Copy duplicity_key.asc to some thumb drive and keep it secret, keep it safe.
I do not create a revocation certificate here since we are only using this key for backups, and we are not sharing it with other people.
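The export can be wrapped in a small helper which also saves the public key and the trust settings, so that everything needed for a restore ends up in one directory. This is a sketch rather than part of my original setup: backup_gpg_key and the destination path are made up for this example, while the gpg options used are standard ones.

```shell
#!/bin/sh
# Sketch: export the secret key, public key and trust settings for one key
# into a directory. backup_gpg_key and its arguments are hypothetical names;
# the gpg options themselves are standard.
backup_gpg_key() {
    key_id=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Make the exported files readable by the owner only.
    old_umask=$(umask)
    umask 077
    gpg --armor --export-secret-keys "$key_id" > "$dest/duplicity_key.asc" &&
    gpg --armor --export "$key_id" > "$dest/duplicity_key.pub.asc" &&
    gpg --export-ownertrust > "$dest/ownertrust.txt"
    status=$?
    umask "$old_umask"
    return "$status"
}

# Example (not run here): backup_gpg_key B0F54D91 /media/usb/gpg-backup
```

The --armor flag writes the keys as ASCII text, which matches the .asc extension and survives being copied around more gracefully than binary output.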
3.1.1 Restoring your keys
Let’s assume that our main backup machine broke down, or that we want to migrate the entire backup process to a different machine. It would be useful to know how to restore the keys.
Get that thumb drive out and run:
gpg --import duplicity_key.asc
GPG will complain, and rightly so, that it cannot validate the owner of the key. We have to mark the key as ultimately trusted ourselves.
adanaila@lucifer ~ $ gpg --edit B0F54D91
gpg (GnuPG) 1.4.16; Copyright (C) 2013 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

pub  4096R/B0F54D91  created: 2015-07-17  expires: never  usage: SC
                     trust: unknown       validity: unknown
sub  4096R/FDED2C85  created: 2015-07-17  expires: never  usage: E
[ unknown] (1). John doe (duplicity_test_key) <firstname.lastname@example.org>

gpg> trust
pub  4096R/B0F54D91  created: 2015-07-17  expires: never  usage: SC
                     trust: unknown       validity: unknown
sub  4096R/FDED2C85  created: 2015-07-17  expires: never  usage: E
[ unknown] (1). John doe (duplicity_test_key) <email@example.com>

Please decide how far you trust this user to correctly verify other users' keys
(by looking at passports, checking fingerprints from different sources, etc.)

  1 = I don't know or won't say
  2 = I do NOT trust
  3 = I trust marginally
  4 = I trust fully
  5 = I trust ultimately
  m = back to the main menu

Your decision? 5
Do you really want to set this key to ultimate trust? (y/N) y

pub  4096R/B0F54D91  created: 2015-07-17  expires: never  usage: SC
                     trust: ultimate      validity: unknown
sub  4096R/FDED2C85  created: 2015-07-17  expires: never  usage: E
[ unknown] (1). John doe (duplicity_test_key) <firstname.lastname@example.org>
Please note that the shown key validity is not necessarily correct
unless you restart the program.

gpg> quit
Phew, back to where we started.
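The interactive trust dialogue can also be skipped by feeding gpg an ownertrust record. The sketch below builds such a record from the fingerprint printed during key generation; ownertrust_line is a helper name I made up, and the value 6 is what gpg's ownertrust format uses for ultimate trust.

```shell
#!/bin/sh
# Sketch: build an ownertrust record for a key fingerprint.
# In gpg's ownertrust format the value 6 stands for ultimate trust.
ownertrust_line() {
    # Strip the spaces gpg prints inside the fingerprint.
    fpr=$(printf '%s' "$1" | tr -d ' ')
    printf '%s:6:\n' "$fpr"
}

ownertrust_line "69C3 8D3D 9519 CF32 74D6  69E0 6DEA B8FC B0F5 4D91"

# To apply it (not run here):
#   ownertrust_line "<fingerprint>" | gpg --import-ownertrust
```

This comes in handy if the restore has to happen on a headless machine or from a script, where stepping through the gpg prompt is awkward.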
4.0 Setting up Duply
On the machine which will be performing the backup, run:
sudo apt-get install duply duplicity python-boto haveged
duply create s3_backup
vim ~/.duply/s3_backup/conf
Since we did all the hard work above, this should be a cakewalk.
The following settings will be important:
GPG_PW='Your secret passphrase'
GPG_PW_SIGN='Your secret passphrase'
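For orientation, here is a sketch of how the relevant part of the profile could look. Every value is a placeholder or an assumption: the key ID comes from the example in section 3.0, the bucket name from section 2.3, and the s3+http:// URL form, the backup prefix and the source path are guesses you will need to adapt to your own setup.

```shell
# ~/.duply/s3_backup/conf (sketch; every value below is a placeholder)

# Key ID generated in section 3.0 and its passphrase.
GPG_KEY='B0F54D91'
GPG_PW='Your secret passphrase'
GPG_PW_SIGN='Your secret passphrase'

# Destination bucket and the duply IAM user's credentials.
# The s3+http:// form and the "backup" prefix are assumptions.
TARGET='s3+http://duplybck/backup'
TARGET_USER='YOUR_AWS_ACCESS_KEY_ID'
TARGET_PASS='YOUR_AWS_SECRET_ACCESS_KEY'

# Directory to back up (hypothetical path).
SOURCE='/home/youruser'
```

Since this file holds both the GPG passphrase and the AWS secret key in plain text, it is worth making sure only your own user can read it.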
OK, so everything is set up. Let’s take it for a spin:
duply s3_backup full
If it worked, which first time around it never will, duplicity will start backing things up.
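Once a manual run works, the nightly job I mentioned at the start can be a thin wrapper around Duply's backup command, which should pick a full or incremental backup by itself. This is a sketch under assumptions: nightly_backup and the log paths are made up for this example, and scheduling it (via cron or similar) is left to you.

```shell
#!/bin/sh
# Sketch: run a backup for a profile and log whether it succeeded.
# nightly_backup and the log paths are hypothetical names for this example.
nightly_backup() {
    profile=$1
    log=${2:-/tmp/duply_$profile.log}
    if duply "$profile" backup >> "$log" 2>&1; then
        echo "$(date) backup of $profile succeeded" >> "$log"
    else
        status=$?
        echo "$(date) backup of $profile failed with status $status" >> "$log"
        return "$status"
    fi
}

# Example (not run here): nightly_backup s3_backup /var/log/duply_s3_backup.log
```

Passing the exit status back to the caller means whatever schedules the job can notice a failed run, rather than the failure sitting quietly in a log file.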
5.0 Debugging Duplicity
Debugging Duplicity can be a bit of a nightmare, because there are so many places where things can fail, most of them related to incorrect configuration.
First things first, edit the configuration file to enable detailed logging:
and run the duply command again.
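The logging setting in question is most likely Duply's VERBOSITY option. A sketch, assuming the conf template's scale, which runs from 0 (errors only) to 9 (debug):

```shell
# In ~/.duply/s3_backup/conf: raise the log level to the most detailed setting.
# Assumption: 9 is Duply's debug level (the conf template lists 0 to 9).
VERBOSITY=9
```

Remember to turn it back down once things work, since debug output gets long very quickly.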
If the issue is with accessing Amazon S3, check the S3 logs, which we enabled above.
If a backup was successful, restoring is a doddle:
duply s3_backup restore /path/to/restore/to
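A backup is only as good as its restore, so it pays to rehearse one into a scratch directory now and then. The helper below is a minimal sketch: restore_check is a made-up name, and it only counts the restored files rather than comparing them against the originals.

```shell
#!/bin/sh
# Sketch: restore a profile into a scratch directory and count the files.
# restore_check is a hypothetical helper name for this example.
restore_check() {
    profile=$1
    scratch=$(mktemp -d) || return 1
    if duply "$profile" restore "$scratch/restore"; then
        count=$(find "$scratch/restore" -type f | wc -l | tr -d ' ')
        echo "restored $count files into $scratch/restore"
    else
        echo "restore of $profile failed" >&2
        rm -rf "$scratch"
        return 1
    fi
}

# Example (not run here): restore_check s3_backup
```

If you only need a single file or directory back, Duply also has a fetch command which takes a path inside the backup and a destination.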
Overall, setting Duply up and getting it to work is a lot of headache. There are plenty of guides out there which make it seem easy, and all is well and good until you run into a problem in some part of this process or need to perform a restore.
I wish there was an easier way to enable all this to work, but with enough tinkering you should be able to get a reliable backup.