As I centralise more and more digital content on my home server, my need for a decent backup strategy has dramatically increased. I now keep photographs, emails, music, video and the backups from all my other systems on that server, and losing that data would be a catastrophe.
Initially I ran my server with a RAID5 array, simply to achieve a large enough disk store at a reasonable cost. This gave me a little protection against an individual disk failure, even though I had no simple way to take cost-effective backups. However, technology moves rapidly, and I’ve now moved away from the complexity of a RAID array to a large single disk. However, should that disk fail, my entire server would fail, and all my data would be lost.
Initially I tried to mitigate this risk by using a second large hard drive in a USB caddy as a backup medium. Initially I simply plugged the drive in once a week, and manually copied over all the files I wanted to take a backup of. However, with nearly 1.5TB of data, it took hours to complete (not terribly practical) and to make matters worse, I kept forgetting to run the backup. Which was hopeless.
What I needed was something that took incremental backups – ie, only copy the things that had changed. So I decided to use Duplicity, since it was installed as the default backup program on Ubuntu. It first takes a complete initial backup, and then only captures the changes to any of those files. Which sounds great, but problems became apparent immediately; my initial backup took 6 days (yes, days) to complete. I hoped that having got that done, subsequent incremental backups would be significantly faster. And to be fair, the next backup was; but it still took nearly 4 days to complete.
This wasn’t working out well; my server was spending all it’s time (trying) to run backups (none of which were very chronologically consistent), and because my backup device was running all that time, it was aging (and therefore as susceptible to failure) as the main drive. Worse, rather than forgetting to start my backups, I was also now actively dissuaded from starting them. Overall, I would be as well off simply running the two drives in a RAID1 configuration and accepting the complexity and failure rates.
So after some research I switched to rsnapshot, which uses rsync and hard links to create a series of snapshots (as the name suggests) where there is only ever one instance of a given version of any file in the backup. This is exactly the same approach that Apple take with their Time Machine product. It both saves an enormous amount of space on the backup device, and is relatively fast (very fast compared to Duplicity!) in operation, taking only 30 minutes or so to process my 1.5TB.
At the same time, I have installed the disk that was in the USB caddy into my server as a second drive. This means I don’t have to remember to connect it to my server and power it up, and makes it easy to automate the backup process with a script and cron. However, it put me back in the situation where both the main and backup drives were spinning all the time, aging together, potentially failing close to one another. The solution to this is to use some fairly low level disk utilities to spin down the backup drive, and then mark it offline so it cannot be accidentally spun back up again. The backup script brings the disk back online and mounts it (spinning it up) prior to starting a backup, and then spins it down and takes it offline again afterwards.
For the curious, the commands to do that are (as root):
/bin/echo "running" > /sys/block/sdb/device/state
/bin/mount /dev/sdb1 /media/backup
/sbin/hdparm -Y /dev/sdb
/bin/echo "offline" > /sys/block/sdb/device/state
I’ve also segregated the data I want to backup into two lists – stuff that changes a lot and needs frequent backups, and the rest. I then take two sets of backups; every day I take a “frequent” backup, and keep those for 14 days. Then in addition, once a week I backup everything and keep a rotation of 4 of those weekly backups, which feed into a monthly rotation of backups that are then kept for 12 months. So eventually I will have backups of (only) my frequently changing data covering the last 14 days, plus backups of everything, made on the last four Sundays, plus a further 12 backups of everything, made on the first Sunday of every month, for the last year. 30 backups in total.
The drawback (of course) is that this only protects me against hardware failure. There is no off-machine or off-site backup involved in this. So if my machine were to catch fire, it would be game over. However, if I look critically at the data, I could in extremis either stand the loss of the data or (with enough effort or money) reproduce everything except the photographs. So we also keep copies of the photographs (and only the photographs) in 2 separate cloud services, because as the title says, I want “to be sure, to be sure” 😉