Bruce's Do-Over Week 1: Photo Archive
Kristin is participating in a genealogy “do-over”: she is starting from an empty tree and re-entering information from scratch, doing everything the way she now knows is correct. We are taking this opportunity to look at several aspects of how we work on family history. My job is backup and recovery, and I'm revisiting that.
We use Macs and have Time Machine configured, and we have tested it under fire. That is our first level of backup. I also clone disks using Clonezilla. Both kinds of backup drive are occasionally swapped with drives in a safe deposit box. The third backup is through Backblaze, a cloud backup service that offers unlimited storage for $5/month per machine, but only for internal and USB-attached disk drives.
The problem is that our photos are stored on a Western Digital Wordbook II NAS device. It has mirrored drives and has proven to be quite reliable, but there is no convenient way to back it up for offsite disaster recovery. As we become more dependent on digital photography and scanned images, I needed to come up with something better for storing photos.
Linux and OS X have long had the ability to set up software-based RAID arrays, and Windows 8 has this capability as well. It was time to get a pair of high-capacity USB drives, configure them in a RAID array and transfer all of our digital images to the USB RAID array, where Backblaze would back them up. RAID (Redundant Array of Inexpensive Disks) is a term for combining disks into a high-reliability and, in some configurations, high-performance disk system. In a mirrored configuration (RAID 1), both drives in the pair must fail before you lose data. (RAID 0, by contrast, is striping: it improves speed but offers no redundancy.) RAID 1 protects you from disk drive failures, but not from doing stupid things like deleting a file. It is not a substitute for offsite backup, which protects you from fire, flood and theft.
Setting up a RAID 1 Array (Mirrored Disks)
On OS X, go to Applications->Utilities->Disk Utility. Select the RAID tab, and add your two empty drives to a mirrored RAID set. It is really quite simple, and there are a number of videos on YouTube describing the process.
On Windows 8 and above, you will use the Disk Management tool to configure the mirrored array. Search on YouTube for a video describing the process.
On Linux, there are many web sites that describe how to set up RAID on the various distributions. Search on “mdadm raid” for instructions on how to set up a RAID array for your machine. Setting up and using a RAID array is quite easy, but getting it set up as the boot device can be challenging.
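As a rough illustration of what those mdadm instructions usually boil down to, the sketch below only prints the command for a two-disk mirror rather than running it, because creating an array destroys whatever is on the member disks. The array name /dev/md0 and the member names /dev/sdX and /dev/sdY are placeholders, not devices from this setup, and the helper name mirror_command is made up for the example.

```shell
#!/bin/sh
# Build (but do not run) an mdadm command that mirrors two disks (RAID 1).
# All device names passed in are placeholders chosen by the caller.
mirror_command() {
  array=$1    # array device to create, e.g. /dev/md0
  disk_a=$2   # first member disk
  disk_b=$3   # second member disk
  printf 'mdadm --create %s --level=1 --raid-devices=2 %s %s\n' \
    "$array" "$disk_a" "$disk_b"
}

# Print the command for review; run it by hand as root once the names are right.
mirror_command /dev/md0 /dev/sdX /dev/sdY
```

Printing first and running by hand is a deliberate choice here: a typo in a device name is much cheaper to catch on screen than after the disk has been overwritten.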
Copy Files from Old Drive to New Drive
For copying the files to the new drive, the rsync command is really the best choice, since it picks up where it left off if it is interrupted and you restart it. That is a likely event for a file transfer that may take a day or two.
rsync is built into both Linux and OS X, but to use the command on Windows you will need to install Cygwin, a collection of Linux programs that have been ported to Windows.
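Because a restarted rsync skips files that already arrived, the restart itself can be automated. The following is a minimal sketch, not the procedure used here: the function name copy_until_done, the retry count and the delay are all assumptions made for illustration.

```shell
#!/bin/sh
# Re-run rsync until it exits successfully or the attempts run out.
# Hypothetical helper: name, retry count and delay are illustrative only.
copy_until_done() {
  src=$1            # source directory
  dst=$2            # destination directory
  tries=${3:-5}     # maximum number of attempts
  delay=${4:-10}    # seconds to wait between attempts
  i=1
  while [ "$i" -le "$tries" ]; do
    # -a: archive mode; --partial: keep partly transferred files for the next run
    rsync -a --partial "$src" "$dst" && return 0
    echo "rsync stopped (attempt $i of $tries); retrying" >&2
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example call (paths are placeholders):
#   copy_until_done /Volumes/old_disk/ /Volumes/new_disk/
```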
Once the RAID array was defined, I used the df -h command (native in Linux and OS X, available in Cygwin on Windows) to list the volumes and mount points that needed to be copied:
mymac:iMovieCapture myuser$ df -h
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/disk0s2 931Gi 748Gi 183Gi 81% 196093759 47886983 80% /
devfs 198Ki 198Ki 0Bi 100% 686 0 100% /dev
map -hosts 0Bi 0Bi 0Bi 100% 0 0 100% /net
map auto_home 0Bi 0Bi 0Bi 100% 0 0 100% /home
/dev/disk5 3.6Ti 2.7Ti 921Gi 76% 367565852 120752474 75% /Volumes/RAID_0_Media
/dev/disk10s2 1.8Ti 1.2Ti 648Gi 66% 318433606 169861060 65% /Volumes/iMovieCapture
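If there are many volumes, the mount points can be pulled out of the df output rather than read by eye. This is a small sketch under two assumptions: the mount point is the last field on each line, and it contains no spaces. The function name list_volumes is made up for the example.

```shell
#!/bin/sh
# Read df output on stdin and print the mount points under /Volumes, one per line.
# Assumes the mount point is the last field and contains no spaces.
list_volumes() {
  awk 'NR > 1 && $NF ~ /^\/Volumes\// { print $NF }'
}

# Typical use:
#   df -h | list_volumes
```

Run against the listing above, it should print /Volumes/RAID_0_Media and /Volumes/iMovieCapture and skip the system entries.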
With a list of directories, I copied the files using the rsync command:
rsync -arvpogt /Volumes/myuser /Volumes/RAID_0_Media/myuser | tee rsync_disk_copy.log
The arguments select archive mode (a) and verbose output (v). Archive mode already implies copying recursively down into subdirectories (r) and preserving permissions, ownership and group (pog) and modification timestamps (t), so spelling those out is harmless but redundant. The tee command sends the output to the console and also creates a log file that you can review after the fact.
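Two details of a copy like this are easy to trip over, so here is a hedged sketch of a wrapper that handles them. First, a source path without a trailing slash makes rsync create a directory of that name inside the destination, while a trailing slash copies the contents. Second, rsync writes its error messages to stderr, which a plain pipe to tee does not capture. The function name copy_volume and the log file argument are assumptions for illustration.

```shell
#!/bin/sh
# Copy the contents of one directory into another and log everything rsync says.
# Hypothetical helper; the name and argument order are illustrative only.
copy_volume() {
  src=${1%/}/   # force exactly one trailing slash: copy the contents of src
  dst=${2%/}/   # same normalisation for the destination
  log=$3        # log file to write
  # 2>&1 sends error messages into the pipe so they reach the log as well.
  # Note: the pipeline's exit status is tee's, not rsync's.
  rsync -av "$src" "$dst" 2>&1 | tee "$log"
}

# Example call (paths are placeholders):
#   copy_volume /Volumes/old_disk /Volumes/new_disk copy.log
```

Adding rsync's -n (dry run) option to the command is a cheap way to check what would be copied before committing to a transfer that runs for days.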
The copy process took about two days to complete, followed by some work to review the log file and fix some permissions problems.
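Reviewing a verbose log of that size by hand is slow, so filtering it for error lines can help. The sketch below assumes that rsync's messages use their usual "rsync: ..." and "rsync error: ..." prefixes and that stderr was captured in the log; the function name log_errors is made up for the example.

```shell
#!/bin/sh
# Print the lines of an rsync log that look like errors.
# Assumes the usual "rsync: ..." / "rsync error: ..." prefixes and that
# stderr was written to the log file.
log_errors() {
  grep -E '^rsync( error)?: ' "$1"
}

# Example call:
#   log_errors rsync_disk_copy.log
```

The function exits non-zero when nothing matches, which makes it easy to use in a script as a check that the log is clean.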