How I (might) roll: off-site backups

Knowing is half the battle

According to Backblaze, June is Backup Awareness Month, so my timing on this post is superb. Everyone who uses a computer knows (or should know) they’re supposed to be making regular backups of their precious data, yet hardly anyone actually does make regular backups. I don’t mean to brag, but I’m way past that. I’ve been kicking around a few ideas on backups for quite some time now, and I think I’ve finally got a decent solution worked out.

I’m posting my thoughts here in an attempt to get some feedback, and also for anyone else who’s in the same boat and might end up finding this later on. The technical implementation will focus mostly on Macs, since that’s what my wife and I use. You could extrapolate from this article and apply the same techniques to nearly any OS, though. I’ll have a Linux box (or two) in the mix as well, but it doesn’t really change anything.

Back that thang up (cash money, something, something)

The Macs in the house are already doing regular Time Machine backups to a QNAP NAS (which is pretty awesome in its own right, by the way). This handles the “oops, I deleted a file” scenario, and it works great over 802.11n and GbE. It’s automatic, so we don’t have to think about it. As long as the computers are powered on, Time Machine happily runs every hour in the background. The weakest link in this setup is the NAS.

Even though the NAS is running RAID 5, RAID is not backup, and it never will be. The array can sustain a single drive failure and (hopefully) keep going long enough for me to add a new drive and rebuild. It can’t, however, sustain any number of fires, thefts, or zombie attacks and keep going. For that kind of protection, we specifically need off-site backups.

The underground bunker I don’t have

There are many options when it comes to off-site backups, and that’s probably why it’s taken me so long to come up with something that will work for us. Initially, Jungle Disk seemed like the best option, and I’ve used it since 2008. It’s backed by Amazon S3, which is highly redundant, and you only pay for what you use ($0.15/GB/month, plus bandwidth in and out). Jungle Disk is quite nice, and it’s also cross-platform (Windows/Mac/Linux). Unfortunately, it has some pretty strong cons stacked against it:

  • Initial upload takes forever over residential broadband connections
  • Limitless scalability, limitless cost (500 GB would cost $75/month)
  • Recovery could take almost as long as the initial backup
  • Cost is ongoing for as long as your data is backed up

Our DSL connection at home is roughly 7 Mbps down and 1 Mbps up, but even with cable, those numbers aren’t looking any better (12/1 or 16/2 or something close to that). Dropbox (which uses Amazon S3 itself) is great, but it suffers from the same slow pipe problem. In fact, any “cloud-based” backup solution needs to be ruled out for that reason. Mozy, Backblaze, CrashPlan, SugarSync, SpiderOak, etc. are all out.
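To put a rough number on the slow pipe problem, here is a back-of-the-envelope sketch in Python. The function name is made up for illustration, and it assumes decimal units (1 GB = 10^9 bytes), a link that stays fully saturated the whole time, and no protocol overhead, so real transfers would be slower. The 500 GB figure is the same example size used in the cost bullet above.

```python
def transfer_days(size_gb: float, link_mbps: float) -> float:
    """Days needed to move size_gb over a link running at link_mbps.

    Assumes decimal units and a perfectly saturated link with zero
    overhead, so treat the result as a best case.
    """
    bits = size_gb * 1e9 * 8              # gigabytes -> bits
    seconds = bits / (link_mbps * 1e6)    # megabits per second -> bits per second
    return seconds / 86_400               # seconds -> days


if __name__ == "__main__":
    # 500 GB over the 1 Mbps uplink and the 7 Mbps downlink quoted above
    print(f"upload:   {transfer_days(500, 1):.1f} days")
    print(f"download: {transfer_days(500, 7):.1f} days")
```

Even in this best case, the initial upload works out to more than six weeks of continuous transfer.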

Low-tech doesn’t necessarily mean no-tech

What we need, then, is a low-tech solution. And the answer lies in cheap hard drives. Seriously. At the time of this post, 1 TB WD Caviar Green drives are $59.99 on Amazon and Newegg. That comes out to about $0.06/GB, but it’s not a recurring monthly cost. You pay that once and you’re done with it. Add in a USB/eSATA external enclosure for $20, and then multiply the whole thing by two for the total cost.
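For anyone who wants to check the math, here is a small sketch comparing the one-time drive cost against the S3 pricing mentioned earlier. The prices are the ones quoted in this post; the function names are hypothetical, and the comparison ignores S3 bandwidth fees, drive replacements, and the value of my time, so take the break-even figure as a rough guide only.

```python
import math

DRIVE_PRICE = 59.99      # 1 TB WD Caviar Green, price quoted in the post
ENCLOSURE_PRICE = 20.00  # USB/eSATA enclosure, price quoted in the post
S3_RATE = 0.15           # $/GB/month quoted in the post (bandwidth ignored)


def sneakernet_cost(sets: int = 2) -> float:
    """One-time cost of `sets` drive-plus-enclosure pairs."""
    return round(sets * (DRIVE_PRICE + ENCLOSURE_PRICE), 2)


def cloud_monthly_cost(size_gb: float) -> float:
    """Recurring monthly storage cost for size_gb at the quoted S3 rate."""
    return round(size_gb * S3_RATE, 2)


def break_even_months(size_gb: float, sets: int = 2) -> int:
    """Whole months of cloud storage that cost at least as much as the drives."""
    return math.ceil(sneakernet_cost(sets) / cloud_monthly_cost(size_gb))


if __name__ == "__main__":
    print(sneakernet_cost())          # two drives plus two enclosures
    print(cloud_monthly_cost(500))    # the 500 GB example from earlier
    print(break_even_months(500))
```

Under these assumptions, two drives and two enclosures pay for themselves within the first few months of storing 500 GB.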

“Multiply by two?” you ask. Yes, by two. We’re skipping the internet and going straight to the sneakernet, baby. Here’s how it works. Once a week (probably on Sunday night), I’m going to take images of each computer using SuperDuper. I’ll have to do this twice the very first time—once for each hard drive. Then, I’m going to bring one drive to work and leave it there, along with its power supply for the external enclosure. Every Monday, I’ll rotate the drives. This way, there’s always a backup off-site, and it’s never more than a week old.
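The rotation is simple enough to keep in my head, but here is a minimal sketch of it for anyone who would rather print a schedule. The drive labels "A" and "B" and the function name are made up for illustration; the only rule taken from the plan above is that the freshly imaged drive goes to work each Monday and the other one comes home.

```python
from datetime import date, timedelta


def rotation_schedule(first_monday: date, weeks: int, drives=("A", "B")):
    """Return a list of (monday, drive_going_offsite, drive_coming_home).

    The drive going off-site holds the images taken the night before;
    the drive coming home gets overwritten the following Sunday.
    """
    if first_monday.weekday() != 0:  # Monday is 0 in Python's datetime
        raise ValueError("first_monday must fall on a Monday")

    schedule = []
    for week in range(weeks):
        monday = first_monday + timedelta(weeks=week)
        offsite = drives[week % len(drives)]
        home = drives[(week + 1) % len(drives)]
        schedule.append((monday, offsite, home))
    return schedule


if __name__ == "__main__":
    for monday, offsite, home in rotation_schedule(date(2024, 1, 1), 4):
        print(f"{monday}: take drive {offsite} to work, bring drive {home} home")
```

With two drives the labels simply alternate every week, which is the whole point: one copy is always out of the house.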

The important media files on the NAS will get backed up in the same way with a third+ drive. I say third+ because the NAS has 4 TB of protected storage, which is more than a single drive can hold. Those NAS backups will probably happen monthly, and there won’t be any rotation on those drives. I’ll probably bring the drives home on a Friday, run the backups, and then keep them in the safe until they go back to work with the others on Monday. That leaves a brief single point of failure (i.e., the house burns down and melts the safe), but I’m only talking about movies, music, and TV shows here.

Getting some closure on enclosures

One last thought before I wrap it up. I had considered using bare drives (sans external enclosures) with my drive toaster, but I don’t like that idea as much. I’d still need to buy cases to transport the drives, and the risk of electrostatic discharge or some other kind of damage is much higher. In my opinion, it makes more sense to put that money towards enclosures. Plus, I can keep both the enclosure and its power supply off-site. If the house did burn down, I’d have to go out and buy another drive toaster. It’d be the least of my worries.

What sayest thou?

Thoughts? Suggestions? Any glaring omissions on my part? Anyone already doing something like this? Use the comments to describe any backup victories (or failures) you’ve experienced.

  1. i think you have a sound strategy, provided no earthquakes, tornadoes or massive city fires happen to wipe out the whole city.

    i personally don't have offsite backups and it doesn't sit well with me. unfortunately, maintaining backups can be expensive. i already laid down a ton of cash for a windows home server and hard drives to fill it, so i don't really have any money left to buy more hard drives and enclosures.

    when i finally do, i'm pretty sure i'll be using sneakernet just like you.

    • True. If Tucson were destroyed, all of my backups would be destroyed as well. Then again, since I’m in Tucson 98% of the time, I’d also be destroyed, and thus, probably wouldn’t care too much about not having any remaining backups. ;-) My most critical files (up to 50 GB of them, anyway) are encrypted and stored in my Dropbox account.

      My friend Vince pointed out that I should really have dual-redundancy backups since a 1 TB drive could easily fail during a recovery operation. I think I’ll add another drive to the set so that one disk at work is 1 week old and another is 2 weeks old.