Backups are Useless

November 30th, 2005

At the very bottom of a long list of sobering lessons to be learned in the aftermath of Hurricane Katrina is one having to do with the security of digital data.

Katrina Lesson #1432: Backups are Useless

I don’t have precise statistics, but I’d venture to say that most people don’t make regular backups. Between the dearth of reliable, easy to use software, and the relative technical naivete of the computer-using public, backups are simply not on the radar in most folks’ lives. So when the unexpected happens, most people are simply left to start over without the benefit of their accumulated data. All those pictures of the kids at Little League, gone. All the ambitious spreadsheet plans that show you might actually afford to own a home, gone. All the email you exchanged with your sweetheart before you had a clue that you’d one day be married, gone.

That’s sad.

But what’s really sad is that the small percentage of people who do make the heroic effort of backing up are kidding themselves if they think they’re much better protected. There are many, many ways a person can lose data. Too many to count, yet the backup strategies typically employed by “plain folks” protect against only two: hard drive failure and careless erasure. In a world where hard drive reliability is at an all-time high, and most data sits in the “Trash Can” or “Recycle Bin” for some time before being permanently deleted, these particular dangers may be much less probable than other risks that are obliviously under-addressed.

Are you backing up your important data? Where does your backup live? If your house erupts into flames while you’re out shopping, will the smoke from your backups mingle with the smoke of your originals? The proximity that most backed up data keep with their originals is an extremely risky component of most backup strategies.

Many people have taken to using redundant hard drives as their main backup strategy. One problem with this solution is that your data, which is probably by itself unattractive to a thief, suddenly has implicit street value. If a burglar enters your house and, God forbid, steals your computer with all of your original data on it, will they also be stealing your backup? If you’re using a redundant internal drive, you’ve just lost both in one fell swoop. If you’re using an external drive, you may not be much luckier. A glimmering FireWire enclosure screams “take me” much louder than a stack of anonymous DVD discs in a filing cabinet. Your family photos may be protected from mold and oxidation in their digital format, but the chances of a thief stealing your dusty old photo album are about a billion times lower than their walking away with your shiny new iPod.

Another risk of the “redundant hard drive” strategy is that, at any given time, both your backup and originals are hooked up to an extremely powerful, unpredictable power source that could cause havoc at any moment. Surge protectors assuage some of the fear, but my completely unscientific and paranoid assessment is that “there are still no guarantees.” The problem is, whatever your computer is plugged into, your backup is probably plugged into the same thing. This means that when the freak bolt of lightning crawls up through the wires and obliterates your computer, your backup hard drive has just melted as well.

Depressing. Isn’t it? Backups are useless!

Well, they’re not useless. They’re just highly under-useful. Of all the possible scenarios involving data loss, only these two, hard drive failure and accidental erasure, spring to mind as scenarios that are actually addressed by most strategies. All the infinitely diverse other scenarios fall into the “gosh, I sure hope that doesn’t happen!” category. This means that despite your well-intentioned backup plans, you are either oblivious to the real dangers or else live in constant paranoid anxiety about how you will cope in the absence of your data. All that anxious fear is unhealthy! You need to address this!

Making Backups Useful

So how do we improve our backup strategies? How can we Katrina-proof our data? Unfortunately, there is no sure-fire strategy for preventing data loss. No matter what you do, there is an imaginable scenario that you didn’t account for. Perhaps you lost the data 5 minutes before your regularly scheduled backup would have saved it. Perhaps the Earth is destroyed to make way for an intergalactic superhighway. You can’t prevent all data loss, but you can take major steps to ensure survival of all but the most catastrophic of scenarios.

The components of a successful backup strategy include regularity and remoteness. Most people make a gallant effort of satisfying the first, but do nothing to satisfy the second. Your data backup is useless if not remote from the original. What does it mean to be remote from the original? This is highly contextual and can mean a number of things. On a very philosophical level it means that the backed-up data should exist in as different a form as possible from the original. If the original is on a hard disk, the backup should not be on a hard disk. If the original is vulnerable to fire and flood, the backup should be safe from fire and flood. If the original is in Atlanta, the backup should not be in Atlanta! We want to get the backed up data as far away from the original as possible, and we want it to be ideally as different from the original as possible (while still containing the same information). If a hard-drive eating fungus descends upon the earth, you’ll be happy to know that your backups are all on tape. And if a tape-eating fungus should arrive instead, you’ll be glad to know that your hard drive is in fact such a good backup of your backup, that it’s already backed up your future backup! Or something!
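To make the two components concrete, here is a minimal Python sketch of a backup-plan audit. Everything in it is an assumption made for illustration: the `BackupCopy` fields, the `audit` function, and the seven-day freshness threshold are hypothetical, and a stale copy is deliberately not counted toward remoteness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data. All fields are hypothetical, for illustration only."""
    medium: str    # e.g. "hard disk", "dvd", "tape"
    city: str      # where the copy physically lives
    age_days: int  # days since this copy was last refreshed


def audit(original, backups, max_age_days=7):
    """Return a list of weaknesses; an empty list means both components are met."""
    problems = []
    # Regularity: at least one backup must be recent enough (assumed threshold).
    fresh = [b for b in backups if b.age_days <= max_age_days]
    if not fresh:
        problems.append(f"regularity: no backup newer than {max_age_days} days")
    # Remoteness: a recent copy should live somewhere else...
    if not any(b.city != original.city for b in fresh):
        problems.append("remoteness: no recent backup in a different city")
    # ...and in a different form than the original.
    if not any(b.medium != original.medium for b in fresh):
        problems.append("remoteness: no recent backup on a different medium")
    return problems


if __name__ == "__main__":
    original = BackupCopy("hard disk", "Atlanta", 0)
    plan = [BackupCopy("hard disk", "Atlanta", 1)]
    for problem in audit(original, plan):
        print(problem)
```

Run against a plan like the one at the bottom, a day-old copy on a second hard disk in the same city, the audit passes on regularity and reports both remoteness problems.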

Practically speaking, many levels of remoteness can be built into your backup strategy. How easy or difficult these levels are to introduce is highly dependent on your personal situation, and how much data you have to back up. If you happen to have a fire-proof safe in your house, you can make your data remote in terms of fire vulnerability by popping the backup into the safe. If you are only backing up a small amount of data, it might make sense to upload it to a server at your ISP or on a friend’s computer. If you’re shopping for a web hosting provider, you might be wise to go out of your way to choose one that is not local. When the ball of flame engulfs your town, your data will be safely stored on your web account halfway around the world.

Remoteness is easier to achieve when the data is stored on inexpensive and lightweight media. If you make regular backups to DVD, why not add a weekly backup that gets mailed off every Monday to your parents’ house, or somebody else you know on the other side of the country? You can encrypt the data if it’s sensitive, so you don’t have to worry about your parents’ house-guests getting cozy with your data. Just getting the data into their unread pile of mail gets it “safe.” You want your backup data to live a dramatically different lifestyle than you. While you’re freezing your butt off in Ontario, and the screen of your computer flickers on and off to the rhythm of the power surging, you can think of your backup data living it up on the beach in San Diego.
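The weekly mail-off could start from something like the Python sketch below, which packs a folder into a dated archive and computes a checksum you can keep at home to verify the disc later. The function name, the file naming scheme, and the choice of SHA-256 are assumptions made for this example. Encryption is left out because Python’s standard library has no built-in file encryption, so that step would need a separate tool.

```python
import hashlib
import tarfile
from datetime import date
from pathlib import Path


def make_weekly_archive(source_dir, out_dir, today):
    """Pack source_dir into a dated .tar.gz; return (archive path, SHA-256 hex digest).

    out_dir should live outside source_dir, or the archive would try to include itself.
    """
    year, week, _ = today.isocalendar()
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: one archive per ISO week.
    archive = out_dir / f"backup-{year}-W{week:02d}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source_dir, arcname=source_dir.name)
    # Hash the finished archive in chunks so large backups do not fill memory.
    sha = hashlib.sha256()
    with archive.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha.update(chunk)
    return archive, sha.hexdigest()


if __name__ == "__main__":
    # Placeholder paths; substitute your own folders.
    path, digest = make_weekly_archive(
        Path.home() / "Documents", Path.home() / "outgoing-backups", date.today()
    )
    print(path, digest)
```

Naming by ISO week means a second run in the same week overwrites the earlier archive, which suits a once-a-week mailing; a different scheme would be needed to keep several archives per week.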

In the aftermath of a mind-bogglingly catastrophic event like Katrina, many people have been reminded to revise their personal disaster plans. Most plans focus on the preservation of personal safety, and rightly so. Your backed up data will be worthless to you if you and your family don’t survive to enjoy it. But the last thing you’ll want to learn after being successfully pulled to safety from the wreckage of your home and town is that you’ll be starting your life over sans data. We invest so much into these little ones and zeroes, we owe it to ourselves to protect them as much as we would any other incredibly valuable possession. The overwhelming unlikeliness of such disasters makes it easy to “hope for the best,” but doing so is offensively irresponsible.

I am not typing this from a high horse. Until this blog entry makes its way from my Massachusetts fingers to its California database, it’s as vulnerable as the rest of my data. I take a moment now to look down at my redundant FireWire hard drive. Lazily backed up now more than a week ago. I think of my source code, copied a few days ago to my California server – at least I think I got all of it! I consider the music files I laboriously imported from my CD collection. Not backed up, because I can “always reimport them.” Shudder! I agree: it’s hard to do this right. My strategy as it exists today is unacceptable. It’s time-consuming. It’s difficult. It’s not Katrina-proof.

My strategy is useless. I’m betting yours is, too. It’s time we made them useful.

9 Responses to “Backups are Useless”

  1. Todd Ransom Says:

    I do rsync backups to a local server every 30 minutes. I do nightly off-site backups of anything I consider critical.

    When thinking about backups you should prioritize your data. Don’t back up the entire system because off-site backups become prohibitive in terms of storage and bandwidth cost. Besides, you can re-install your OS in half an hour. Why would you choose to back it up instead? It doesn’t make financial sense unless the cost of that downtime justifies it.

    Anything that I absolutely cannot afford to lose is backed up off-site nightly using Apple’s Backup tool. It’s easy and cost effective if you do not have that much data to back up. I back up my entire Documents directory (which contains all of my writing projects, personal documents and my Subversion repository for source code) as well as my development projects (to capture things that have not been committed to svn yet) and preferences.

    It’s not that hard if you think about the problem logically. Are your MP3s really that important if your house burns down? No, because they are 99% likely to be replaceable by insurance money. Only the truly irreplaceable stuff (photos, dev and writing projects, etc.) should be shuffled off-site and for most people that should be a relatively small chunk of data.


  2. Nat Says:

    Hard disks may be more reliable than ever, but that’s like saying that the human lifespan has gone up — you’re still going to die, and so is your disk. Actuarially speaking, the disk will die first, then you, then your pet tortoise.

    I would welcome better software with big wet kisses, but reliable offsite backups seem more a function of much faster bandwidth than any other factor (especially as long as “broadband” remains asymmetrical). When that changes, backups will change, just as quickly as SOHO users like myself were glad to stop buying tape drives.

    Even blessed with speedy pipes, I strongly suspect affordable, invisible RAID on desktop machines would rescue more from the bit bucket than the availability of cheap offsite network-based backup tools and infrastructure, despite the continuing risks of theft and local catastrophe. I’m not crazy about SuperDuper because I’m in that 1% who demands incremental backups, but the transparency of ubiquitous RAID would still be a giant leap forward.

    In the meantime, it’s perhaps a touch flamboyant to brand as useless a system that “only” protects you from the plurality or majority form of data loss, all risks considered. When a Katrina comes, I expect the little league photos will be among the least of my concerns.

  3. Jeff Says:

    I certainly believe in off-site storage. However, I think that same-city storage is sufficient for most people. Each person has to weigh the importance of the data and the likelihood of losing it. Even in New Orleans, you wouldn’t necessarily have lost both of your storage sites. Besides, people knew that the hurricane was coming and had time to evacuate. It’s easy enough to throw a few DVDs in a bag and take them with you. Have a disaster kit ready to go, including your backups.

    I don’t think that my parents would be happy about receiving a DVD every week (assuming that everything I wanted to back up could fit on one DVD). At the end of a year, they’d have at least 52! Some could be thrown away after a while, but then I’d be forcing my parents to organize and search through the DVDs.

    My strategy is to burn an encrypted disk image of my home folder to DVD a few times a week and store half of the backups at the office. It’s possible that a tornado could destroy both my home and my office yet leave me alive. That scenario is pretty darn improbable, though, and it’s the most probable of the worst-case scenarios. I’m willing to take that risk.

  4. Faried Nawaz Says:

    If you use rsync for backups, you might like rsnapshot. It uses some filesystem tricks to let you store incremental backups of a directory/filesystem without using a lot of disk space.

    For remote, distributed, encrypted (buzzwords!) backups, try allmydata. Windows only, for now.

  5. Michael Tsai - Blog - Making Backups Useful Says:

  6. foresmac Says:

    I backed all of my photography on a removable FW drive. I only pulled it out to back up the photography and kept it put away otherwise. Recently, the hard drive in my PB melted down and I needed it replaced. Since it happened right at the beginning of the semester, I satisfied myself that everything was on the external drive, and I would get back to transferring back to my PB during Christmas break. Then I would buy a big stack of DVDs and start backing up all 60GB of pictures (something like 20,000 or so…). You know, a backup for the backup.

    Nothing could have prepared me for what actually happened to my drive: my girlfriend’s nephew found it and tossed it out of a third story window.

    Three years of my career, gone in about 6 seconds.

    Yes, backup is hard, it’s a pain in the ass, and it often seems like no matter what you do the data will find a way to destroy itself, if just to spite you.

  7. Daniel Jalkut Says:

    Ouch! Well, thanks for sharing that nightmare of a story, foresmac. I’m really sorry for your loss.

  8. dvb Says:

    The idea of having a physical hard disk (or even DVD) in your home with your data on it is insufferably quaint.

    Your home/computer should maybe have a terabyte of local cache, but the remaining petabytes of your accumulated life work and memories should be managed by the Data Office, that’s what you’ll be paying them for! (Historical note: For obscure legal reasons, when Google took over maintenance of the Library of Congress, they had to adopt the commercially neutral name of the “United States Data Office”.)

    Remember, too, you won’t need your “own” copies of the Star Wars movies and all that iTunes crap you “downloaded” back in the 200x’s.

    Should run about $10 a month for typical users.

    Cases of actual data loss will be extremely, extremely, extremely rare, and often settled out of court for around $1 to $10 million. Small consolation for your tragic loss, at that. (Still, one can predict the occasional settlement in the billions.)

    Me? Today? 2005? Yup, stack of shiny FireWire drives right next to the computer. No iPod, though.

  9. macophilia » Die dezentrale Sicherheitskopie Says:

    […] Backing up one’s own data is no longer necessary, especially since very few people make adequate backups of their own data, and even fewer also keep an external copy of it; without one, backups are not much use, as the hurricane victims show. […]

