A while back, a friend of mine asked if I could help him recover his data and digital photos. His PC had had problems and the manufacturer's support team had recommended he use the "restore" disc that came with the PC. They said his data would be preserved, but it was not. He lost a great deal of important data and invaluable photos of children and family.
It convinced me I needed to become more serious about my own backups. While I have always maintained backups of my text and documents, digital photography and digital music have greatly increased the size of my own dataset. In working through the issues involved, I gathered a great deal of data I thought I would share.
Background
In the world of Mac backup solutions, I think most solutions differ in five generalized types of functionality:
Cloning vs. Snapshots
Solutions creating clones or duplicates help you maintain an exact copy of your most recent data. They do not help you maintain a history of snapshots or changes. Because they do not keep track of multiple versions of your files, they can be simpler and faster to deal with in a data loss situation. In short, if you have a clone of your disk, you can be back up and running quickly after a disk failure. However, a clone probably will not help you find a file you deleted or changed a month ago.
A snapshotting solution will keep different time-based versions of your data. If you work on a single data file over the course of a month, a snapshot system will keep copies of different versions of that file at various points in its change history.
Onsite vs. Offsite/Internet
Onsite backups are stored at the same physical location as the original data. This is inherently dangerous. In case of a fire or flood, your backup would be lost along with the original data.
Offsite backups consist of backups stored at any location different than the location of the original data. At a bare minimum, it is desirable to move your backup away from your computer to help in case your equipment is stolen. At the other end of the offsite spectrum are enterprise backup solutions provided over the Internet. In that case, your data is likely stored far away in a secure and protected bunker.
The trade-offs are fairly clear: the closer your backup is, the easier it is to use or lose. Internet services are good for data protection, but your ability to get your data to them will be limited by your upload speed to the Internet.
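To get a feel for how much upload speed matters, here is a rough back-of-the-envelope sketch in Python. The function name and the example numbers (a 100 GB library and a 1 Mbit/s uplink) are my own assumptions for illustration, and the estimate ignores protocol overhead, compression, and throttling.

```python
def upload_hours(dataset_gb: float, upload_mbps: float) -> float:
    """Estimate the hours needed to upload dataset_gb gigabytes
    over a link that sustains upload_mbps megabits per second."""
    if upload_mbps <= 0:
        raise ValueError("upload speed must be positive")
    # 1 GB = 1000 MB and 1 byte = 8 bits, so convert to megabits.
    megabits = dataset_gb * 1000 * 8
    seconds = megabits / upload_mbps
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical example: 100 GB of photos over a 1 Mbit/s uplink.
    print(f"{upload_hours(100, 1.0):.1f} hours")
```

With those example numbers the first upload takes a bit over nine days of continuous uploading, which is why the initial backup to an Internet service is usually the painful one.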
Manual vs. Automatic
Some backup solutions are automatic and go about their business without any human intervention, and some do absolutely nothing until told to do so. As humans are generally the least reliable part of a backup system, automatic is usually better than manual.
Backup Medium
While I am assuming your original data resides on a hard disk or disks, your backup could be in many different formats. Some people make backups of their data to optical disc such as CD or DVD or to a flash key. Many modern backup systems back up to hard disks or tapes, and Internet solutions are "black boxes" in that you don't really know (or care) how they store your data (although it is likely stored on hard disks in a SAN environment).
Full Backups vs. Differential
Most backup programs will back up your data in one of two ways: 1) a full backup that is a new and complete copy of your data, or 2) a differential or incremental backup consisting only of data that has changed or been added since the last backup. A full backup can take longer to make because it must copy all your data, whether it has changed recently or not. However, a differential or incremental backup can take longer and be more difficult to restore from, because you need the last full backup plus the later partial backups to have all of your data.
While everyone would like to be dealing with full backups in a recovery situation, datasets are becoming so large that frequent full backups are impractical in some cases.
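The difference between the two approaches mostly comes down to which files get selected for copying. Here is a minimal Python sketch of that selection step. It assumes "changed" can be judged by a file's modification time, which is a simplification, and the function name files_to_back_up is hypothetical rather than taken from any particular backup program.

```python
import os
from typing import List, Optional


def files_to_back_up(root: str, last_backup_time: Optional[float]) -> List[str]:
    """Return the files under root that a backup run would copy.

    last_backup_time is a Unix timestamp. Pass None for a full backup
    (every file is selected); otherwise only files modified after that
    time are selected, as in an incremental run.
    """
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if last_backup_time is None or os.path.getmtime(path) > last_backup_time:
                selected.append(path)
    return sorted(selected)
```

Calling files_to_back_up(folder, None) lists everything, while passing the time of the previous run lists only what has changed since then. The restore-side cost is the flip side: the partial lists only make sense when combined with the full backup they build on.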
Solutions
SuperDuper!
SuperDuper is an excellent cloning application. It can automatically maintain a bootable backup of your boot disk and/or maintain clones of your data disks. This functionality can be invaluable if downtime is your biggest fear. If you have maintained a bootable clone of your disk and have a failure, you could be back in business by simply booting from your clone.
It has recently been updated to v2.5 for Leopard (Mac OS X 10.5) and Time Machine compatibility. The basic functionality of manually cloning drives is free, and the full app is only US$27.95.
iBackup and rsync
"rsync" is an open source utility providing file synchronization on many platforms. It provides too many features to mention, but it is mainly used for cloning particular datasets/folders and keeping them up-to-date by only overwriting the files that have changed. This makes it very efficient.
iBackup expands on and provides a GUI to rsync. It allows you to use the GUI to select what files you want to sync and to schedule the synchronization to occur automatically. It is free for non-commercial use and relatively inexpensive for business use.
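To illustrate the "only overwrite what has changed" idea, here is a simplified Python sketch. It is not rsync itself: it imitates rsync's default quick check (same size and same modification time means the file is skipped) and leaves out everything else, such as deleting files that no longer exist in the source or transferring only the changed parts of a file. The function names are my own.

```python
import os
import shutil
from typing import List


def needs_copy(src: str, dst: str) -> bool:
    """Return True if dst is missing or differs from src in size or mtime."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)


def sync_folder(src_root: str, dst_root: str) -> List[str]:
    """Copy new or changed files from src_root into dst_root.

    Returns the relative paths of the files that were actually copied.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        target_dir = os.path.join(dst_root, rel_dir)
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            if needs_copy(src, dst):
                # copy2 also copies the modification time, so the next
                # run sees the file as unchanged and skips it.
                shutil.copy2(src, dst)
                copied.append(os.path.relpath(src, src_root))
    return sorted(copied)
```

Running sync_folder twice in a row on an unchanged folder copies nothing the second time, which is where the efficiency comes from.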
Time Machine and Time Capsule
Time Machine is Apple's new backup solution in Leopard (Mac OS X 10.5), providing snapshot-style backups of your data to a separate disk on an automatic basis. Once you have activated it, it checks for changes every hour and archives those changes. As Apple's web site says, "Time Machine saves the hourly backups for the past 24 hours, daily backups for the past month, and weekly backups for everything older than a month."
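The retention schedule in that quote can be sketched as a small thinning function. This is only my reading of the quoted policy, not Apple's implementation: I am assuming "the past month" means 30 days, that days and weeks are calendar days and ISO weeks, and that the earliest snapshot in each day or week is the one that survives.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def snapshots_to_keep(snapshots: Iterable[datetime], now: datetime) -> List[datetime]:
    """Thin a set of snapshot times: keep everything from the last
    24 hours, one per day for the last 30 days, one per week before that."""
    keep = []
    seen_days = set()
    seen_weeks = set()
    # Oldest first, so the earliest snapshot in each bucket is kept.
    for snap in sorted(snapshots):
        age = now - snap
        if age <= timedelta(hours=24):
            keep.append(snap)  # hourly tier: keep every snapshot
        elif age <= timedelta(days=30):
            day = snap.date()
            if day not in seen_days:  # daily tier: one per calendar day
                seen_days.add(day)
                keep.append(snap)
        else:
            week = snap.isocalendar()[:2]  # (ISO year, ISO week number)
            if week not in seen_weeks:  # weekly tier: one per ISO week
                seen_weeks.add(week)
                keep.append(snap)
    return keep
```

Feeding it three days of hourly snapshots, for example, keeps every snapshot from the last day plus a single snapshot for each earlier day.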
Time Capsule is Apple's combination of an AirPort Extreme Base Station and a hard drive. It provides storage and is available for use by Time Machine over your wired or wireless B, G, or N network. This allows you to use Time Machine to back up all the Macs on your network to one device, and it comes in 500GB and 1TB models.
ZFS
ZFS is Sun's open source file system, which provides many useful features. I include it here because it can provide many backup-like features, and Apple may be relying on it in future releases of Mac OS X. It can provide filesystem snapshotting and RAID features across different hard disks. Right now, it is probably too complex for most consumers to use.
While ZFS is not a backup solution per se, I have included it for completeness, and because it is interesting.
RAID
RAID is a redundant array of inexpensive disks. Various RAID configurations can be used to protect your data by redundantly storing it automatically on more than one disk during live use or used to speed data transfer by transferring data from more than one disk at a time during live use. Some RAID configurations offer a combination of increased speed and increased redundancy.
The most basic RAID configuration for our purposes is RAID 1 which consists of "mirroring" your data across two drives transparently. Basically, any time data is written to one disk it is also written to the other. If one disk fails, the other still has 100% of the data. This can be expensive, because two disks are holding data that would fit on one disk.
RAID is not really a backup solution, but the redundancy it can provide can improve your situation in case of a hardware failure.
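As a toy illustration of RAID 1 mirroring, the Python sketch below models each disk as a dictionary of numbered blocks. It is purely conceptual: real RAID works at the block-device level in hardware or in the operating system, and the class and method names here are made up for the example.

```python
class MirroredVolume:
    """Toy model of RAID 1: every write goes to both disks."""

    def __init__(self) -> None:
        # Each "disk" is just a dict mapping block number -> data.
        self.disks = [dict(), dict()]
        self.failed = [False, False]

    def write(self, block: int, data: bytes) -> None:
        # Write the same data to every disk that is still working.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def fail_disk(self, index: int) -> None:
        # Simulate a hardware failure: the disk and its contents are gone.
        self.failed[index] = True
        self.disks[index].clear()

    def read(self, block: int) -> bytes:
        # Read from the first disk that is still working.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block]
        raise IOError("both disks have failed")
```

After a write, failing either disk still leaves the data readable from the other one. Note that an overwrite or deletion is mirrored just as faithfully as any other change, which is one reason a mirror does not replace a backup.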
.Mac Backup
Backup is Apple's (not very creatively named) backup program for .Mac subscribers. It can perform automatic differential backups to local disks, optical discs, and your iDisk.
Mozy
Mozy is an online backup solution for PC and Mac users, and it is now owned by enterprise storage vendor EMC. It is free for datasets up to 2GB and inexpensive for unlimited storage. In either case, Mozy will store your backup data going back 30 days. The Mozy client software allows you to select the files to be stored as well as a few other options.
As with any Internet-based service, Mozy will be limited by your Internet connection speed.
S3 and Jungle Disk
S3 is Amazon's Simple Storage Service, and it provides unlimited storage in a pay-as-you-go format. Once you have an account, Amazon will charge you a relatively small amount of money for the data you upload, download, or leave stored on their service. The storage is fairly free-form, so it does not directly provide any backup features.
Jungle Disk is an application front-end for S3 and provides easy access to S3 features.
CrashPlan
CrashPlan is a slightly different animal. It is a software package that allows you to use another computer running CrashPlan software as a storage location. You and a friend could provide an agreed-upon amount of backup space to each other, for example. It is a friend-to-friend (as opposed to anonymous P2P) solution providing offsite backup and working on Mac, Linux, and Windows platforms.
What are you using to back up your data?