Automatic Backups to Amazon S3 are Easy
Push important files to the cloud with s3cmd and cron
You have good reason to backup your files. Amazon S3 is a cost-effective storage option. It doesn't take the place of a dedicated drive that you own, it can be useful for redundancy nonetheless. With a few easy command-line steps (plus some pre-requisites), you can set up your machine to automate backups to S3 in no time.
- An Amazon web services account and your Amazon access credentials
s3cmd: command line interface to S3.
cron is pretty standard on unix-based systems. As of this writing,
s3cmd should be straightforward:
# Mac users $ brew install s3cmd # Linux $ yum install s3cmd # or $ apt-get install s3cmd
gpg: opensource encryption program
First you'll need to configure s3cmd:
s3cmd --configure. Have your Amazon access key and secret key at the ready.
If you plan to store sensitive data on S3, enter the path to your
s3cmd will encrypt your data before transferring from your machine to S3. It also decrypts when downloading to your machine. Keep in mind, encrypted files won't be readable by others with direct access.
Here's a sample result:
$ s3cmd --configure Enter new values or accept defaults in brackets with Enter. Refer to user manual for detailed description of all options. Access key and Secret key are your identifiers for Amazon S3 Access Key: xxxxxxxxxxxxxxxxxxxx Secret Key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Encryption password is used to protect your files from reading by unauthorized persons while in transfer to S3 Encryption password: xxxxxxxxxx Path to GPG program: /usr/local/bin/gpg When using secure HTTPS protocol all communication with Amazon S3 servers is protected from 3rd party eavesdropping. This method is slower than plain HTTP and can't be used if you're behind a proxy Use HTTPS protocol [No]: Yes New settings: Access Key: xxxxxxxxxxxxxxxxxxxx Secret Key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Encryption password: xxxxxxxxxx Path to GPG program: /usr/local/bin/gpg Use HTTPS protocol: True HTTP Proxy server name: HTTP Proxy server port: 0 Test access with supplied credentials? [Y/n] Y Please wait... Success. Your access key and secret key worked fine :-) Now verifying that encryption works... Success. Encryption and decryption worked fine :-) Save settings? [y/N] y Configuration saved to '$HOME/.s3cfg'
Now all you need is a file to backup and an S3 bucket to store it.
Let's say you're a web developer like me and you want to back up your MySQL or Postgres development data. First, generate the backup file (you may need to add database credentials command-line options, of course):
# mysql $ mysqldump my_app_development > backup-`date +%Y-%m-%d`.sql # or postgres $ pg_dump my_app_development > backup-`date +%Y-%m-%d`.sql
You can use
s3cmd to create a bucket. This is essentially a top-level directory in your S3 account. Since bucket names must be unique to all S3 users, you won't be able to call it something like "backups". It's helpful to use a prefix like your email or handle.
Creates an S3 bucket called 'myname-backups':
$ s3cmd mb s3://myname-backups
Now you're ready to deliver. Encrypt and send your sql dump file to your new S3 bucket:
$ s3cmd put backup-2014-02-01.sql s3://myname-backups/backup-2014-02-01.sql --encrypt
You can verify it's in the bucket:
$ s3cmd ls s3://myname-backups/ 2014-02-01 22:32 1109702 s3://myname-backups/test/backup-2014-02-01.sql
And retrieve it (with automatic decryption when performed on your machine):
s3cmd get s3://myname-backups/backup-2014-02-01.sql
s3cmd supports a wide range of configuration options beyond those entered during the setup phase.Once set, your global configuration is editable in your
.s3cfg file, typically saved in your home directory. You can also set options at the command line.
Backing up is good but automatic, recurring backups are even better; like saving money, it's more likely to happen when you make a computer do it for you.
Let's add a cron task:
#!/usr/bin/env bash TIMESTAMP=$(date +%Y-%m-%d) TEMP_FILE=$(mktemp tmp.XXXXXXXXXX) S3_FILE="s3://myname-backups/local/data/backup-$TIMESTAMP" pg_dump directory_development > $TEMP_FILE s3cmd put $TEMP_FILE $S3_FILE --encrypt rm "$TEMP_FILE"
Save this in a directory for your local scripts, like
$HOME/bin/database_backup.sh and add execute permissions with
chmod +x ~/bin/database_backup.sh.
To edit your crontab,
crontab -e, and set it to run everyday at 10PM:
# Backup database to S3 daily 0 22 * * * /Users/myname/bin/database_backup.sh