Sunday, November 14, 2010

Lazyman's back up - the only kind that works! (Part 1)

Back up, and back up often, right?  We all do it, don't we?  Yeah, I didn't do it for a while, either.  I did get better at it once my daughter was born.  Those pictures of the first few years of her life are irreplaceable, so I became diligent in ensuring they were as safe as I could make them. Since I had a machine in our living room that served as our media center and file server, I would back up to that.  When out of town, I'd always have my laptop with me to serve as my "offsite backup".

The problem was, I did everything manually, and that's the fastest way into Mistakeville I've found (most folks take I-380S and then get off at Exit 19; that's the long way). Whenever I'd want to back up, I'd look to see what backups I had out there on the network. Since I did it haphazardly, I'd have a few different copies of previous dumps from my laptop. Since I never knew which was the most current version, I'd just create a new dump. At one point I had about a dozen copies of the same data occupying half a terabyte of drive space.

It was absurd, and one day I finally decided to get it under control. I went through and consolidated them as best I could as I MOVED them from the server onto my laptop. At that point, the laptop contained the master copy of my data. I then used Microsoft's Synctoy to sync it back to the server. Synctoy gets a bad rap from folks, but it really is a simple, easy-to-use tool. I created a sync pair for my data and scheduled it to run automatically once a week. I kept my backups down to just one copy and life was good.

Fast forward to today.  New OS means new paradigms.  Since the laptop's been traveling with me a bit more, using it as an off site backup still works.  But, there might come a time where I'll want to leave it at home when I'm at work.  Worse, if there's a disaster in the house (more on this topic at another time), I'm probably not going to make an effort to get my laptop!  So, I need an additional backup I can keep permanently out of the house.  Since I had an extra 200G hard drive lying around, and that's more than enough to hold my REALLY critical data, I ordered an inexpensive USB enclosure to put it in.  Once a week, I'll bring it into the house and do a backup, but the remainder of the time it will live in the car.  I need to figure out how to handle the very cold winters we have 'round here, though.  A week in the car could get it down to absolute zero!

That's a detail for another day, though.  Today, we just want to get it set up so I can use it to back up my laptop quickly, easily AND securely.  This drive's going to be in my car, so there's always a chance it could "disappear" one day.  I don't want my data so easily absconded with!  Fortunately, there's an easy solution: Truecrypt, a free, open source encryption application.  It's cross-platform, so no matter where my brain leads me in terms of OSes in the future, I'll be able to read it.

Installing on Ubuntu's a doddle: download from here, follow the instructions. Now we create an encrypted volume. I'm not going to take you through this step as Truecrypt's docs are top-notch. I will mention that I created the volume on my Windows box and I did so for a reason: I wanted it formatted as NTFS. My plan is for this to be accessible regardless of the platform or machine I plug it into. NTFS support has become much more ubiquitous, so that's the route I went. I could've probably gotten it formatted with Ubuntu, but it was simplest to just plug it into a Windows box and do it. The final detail: create a keyfile for the disk and tell Truecrypt to use it. You'll see why later, but for now Truecrypt's docs on keyfiles are the best place for a howto.
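
Truecrypt's own keyfile generator is the documented route, but as far as I know any file can serve as a keyfile. So here's a minimal sketch of making one by hand from random bytes. The function name make_keyfile and the 64-byte default are my own illustrative choices, not anything Truecrypt requires, and the keyfile still has to be attached to the volume through Truecrypt itself.

```shell
#!/bin/sh
# make_keyfile PATH [BYTES]: write BYTES random bytes (default 64) to PATH
# and restrict the file to its owner.
# Hypothetical helper for illustration; not part of Truecrypt.
make_keyfile() {
    keyfile_path=$1
    keyfile_bytes=${2:-64}
    # The subshell keeps the tightened umask from leaking into the caller.
    (umask 077 && dd if=/dev/urandom of="$keyfile_path" bs=1 count="$keyfile_bytes" 2>/dev/null)
    # Covers the case where the file already existed with looser permissions.
    chmod 600 "$keyfile_path"
}

# Example, using the keyfile path from the mount script in this article:
# make_keyfile "$HOME/.bin/rosekey"
```

Keep a copy of the keyfile somewhere other than the laptop; if it's lost, the password alone won't open the volume.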

Everything's in place, now we just need to make it really easy to use.  First, I create a couple of scripts to automate the process with.  I'll be writing a number of scripts to manage my machine with, so I'm creating a .bin directory in my home directory. Since a lot of these scripts will be for automated processes, I'm going to hold off adding it to my PATH for the moment.  Also, I'm putting it in my home directory instead of /usr/bin for a reason which you'll see with the first one I write for this process.

We start it with: joe ~/.bin/mountrose


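#!/bin/sh

# Mount the encrypted partition using both the password and the keyfile.
truecrypt /dev/rosewill1 /mnt/rosewill --password=ComplexPassword --keyfiles=/home/tonyk/.bin/rosekey --protect-hidden=no --fs-options="uid=1000,gid=1000"

# Wipe shell history so the password doesn't leave a trail.
# Note: history is a bash builtin, so a plain /bin/sh may not provide it.
rm -rf ~/.bash_history
history -c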


Yes, I use joe.  I like joe.  You also now have one question and one complaint, I think.  The answer to the question: the maker of my external drive enclosure is named Rosewill.  As for the complaint: I'm not worried about putting the password for the volume in the script because my home directory is encrypted.  If someone were to steal my laptop and the enclosure, they'd have to decrypt my home directory to get this script to get my password to get it open.  Of course, at that point, they have the primary copy of the data anyway.  The makers of Truecrypt do make a good point we have to acknowledge.  If someone should get onto my machine while it's backing up, they could see the whole command line passed to the Truecrypt process and thus be able to get the password that way.  Again, at that point my machine's been compromised while I'm logged in with the home directory decrypted.  I'm done no matter what. Since the plan is for this to be plugged in once a week for only a few minutes, I'm not as concerned about this particular security issue.
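
To make the command-line exposure concrete, here's a small sketch of how anyone on the box could read another process's arguments. It assumes a Linux /proc filesystem with default settings; show_args is a name I made up for the demo.

```shell
#!/bin/sh
# show_args PID: print the full command line of a running process.
# Linux-only: reads /proc/PID/cmdline, where arguments are NUL-separated.
# Hypothetical helper, for demonstration purposes.
show_args() {
    tr '\0' ' ' < "/proc/$1/cmdline"
    echo
}

# Example: while the mount is in progress, something like
#   show_args "$(pidof truecrypt)"
# would print the --password=... argument along with everything else.
```

That's the whole attack, which is why the window matters: the password is only readable this way while the Truecrypt command is actually running.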

So, this script mounts the encrypted partition we created before and then wipes history to make sure there isn't a trail with my password, just in case.  When I started playing around with this, I thought I'd be able to just use the password option on the command line, but I found it absolutely would not work unless I put all of the other options in, including using a keyfile. I don't know why that is, nor do I feel like digging far to find out. The remaining two options are:

--protect-hidden=no tells Truecrypt that there's no hidden partition it needs to be concerned about

--fs-options="uid=1000,gid=1000" tells it to set the newly mounted volume as owned by my user account so I can work with the files.
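
The 1000 values are what Ubuntu typically hands the first user account, but that's an assumption worth checking on your own machine. A minimal sketch that builds the option string from an account name (fs_options_for is a made-up helper name):

```shell
#!/bin/sh
# fs_options_for USER: print a "uid=N,gid=N" string for the given account,
# in the form the --fs-options flag above expects.
# Hypothetical helper for illustration.
fs_options_for() {
    printf 'uid=%s,gid=%s\n' "$(id -u "$1")" "$(id -g "$1")"
}

# Example, with the account name used in this article:
# fs_options_for tonyk
```

Passing the account name explicitly matters here: a script launched by the system rather than by you runs as root, so a bare id -u would report 0 instead of your own uid.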

We need another command to unmount the disk when backup's done. As mentioned above, I'm not adding my personal .bin directory to my path until I'm sure I need it, and the command to unmount is only one line, so I decide instead to create a Bash alias. Append the following to the end of your .bash_aliases file:

alias urose="truecrypt -d /mnt/rosewill"
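
If you'd rather the unmount command not complain when nothing is mounted, a shell function can check first. This is only a sketch: is_mounted is a hypothetical helper, and it assumes a Linux /proc/mounts and a mount point without spaces in its name.

```shell
#!/bin/sh
# is_mounted DIR [MOUNTS_FILE]: succeed if DIR is listed as a mount point.
# MOUNTS_FILE defaults to /proc/mounts; the second parameter exists mainly
# so the check can be tried against a sample file.
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Hypothetical guarded replacement for the urose alias:
# urose() { is_mounted /mnt/rosewill && truecrypt -d /mnt/rosewill; }
```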

Now we have the mounting and unmounting of the disk automated.  Of course, being the lazy fellow I am, I certainly don't want to have to TYPE these commands when I want to back up.  I want it to do even that trivial task for me.  S'okay.  Computers are here to do our work for us, remember?  Besides, having the computer do this means I literally have to remember to plug the thing in once a week.  Hmmmm...I think I just came up with an excuse to build a robot!

Back to the matter...how do we automate this process?  Fortunately, this part's trivial.  We write what's known as a udev rule. udev is the Linux subsystem that detects hardware and creates device nodes for it. When you hot-plug a device, udev reads its pre-defined rules to determine what to do with it. We're just adding one of our own that's particular to this single, unique device. First, we find out the perts* about the drive by plugging it in, turning it on and using udevadm to find out more info:

udevadm info -a --name=/dev/sdb

This will give you TONS of useful information about your drive, but we only need a little bit. Specifically, I need to know the manufacturer and model number of what just got plugged in so I can ensure it's the disk I want and not some USB flash drive or something. Now, you'll notice this info is broken up into sections that begin "looking at parent device". This is because udevadm looks at the device and then everything that links this device to the OS. Since this is a USB enclosure, that means: the IDE-USB interface, the USB interface, the USB hub, the USB-PCI bridge, etc, etc. I'm most interested in the section below. Keep in mind, when you're writing udev rules, the attributes you're looking at must all reside in one of these blocks. You can't pull the vendor attribute from the SUBSYSTEMS=="usb" section and the model from the SUBSYSTEMS=="scsi" one. The reason I chose the block below is that it had all of the relevant data I need.


looking at parent device '/devices/pci0000:00/0000:00:0b.1/usb1/1-1/1-1:1.0/host6/target6:0:0/6:0:0:0':
KERNELS=="6:0:0:0"
SUBSYSTEMS=="scsi"
DRIVERS=="sd"
ATTRS{device_blocked}=="0"
ATTRS{type}=="0"
ATTRS{scsi_level}=="0"
ATTRS{vendor}=="Initio "
ATTRS{model}=="WD2000JB-00EVA0 "
ATTRS{rev}=="1.06"
ATTRS{state}=="running"
ATTRS{timeout}=="30"
ATTRS{iocounterbits}=="32"
ATTRS{iorequest_cnt}=="0x48"
ATTRS{iodone_cnt}=="0x48"
ATTRS{ioerr_cnt}=="0x1"
ATTRS{modalias}=="scsi:t-0x00"
ATTRS{evt_media_change}=="0"
ATTRS{dh_state}=="detached"
ATTRS{queue_depth}=="1"
ATTRS{queue_type}=="none"
ATTRS{max_sectors}=="240"
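
Since the full output runs to pages, a filter helps pick out the lines that matter. A minimal sketch, assuming the output format shown above; udev_perts is a name I invented for it.

```shell
#!/bin/sh
# udev_perts: read `udevadm info -a` output on stdin and keep only the block
# headers plus the subsystem, vendor and model lines.
# Hypothetical helper for illustration.
udev_perts() {
    grep -E 'looking at (parent )?device|SUBSYSTEMS?==|ATTRS?\{(vendor|model)\}=='
}

# Usage (the device name may differ on your machine):
# udevadm info -a --name=/dev/sdb | udev_perts
```

Each "looking at" header in the filtered output marks the start of a new block, which makes it easy to see which vendor and model values sit together with which subsystem.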


We now have all we need to create a rule. First, su and joe /etc/udev/rules.d/10-local.rules. We only need one line for this:


KERNEL=="sd*", SUBSYSTEMS=="scsi", ATTRS{vendor}=="Initio", ATTRS{model}=="WD2000JB-00EVA0", SYMLINK+="rosewill%n", RUN+="/bin/sh /home/tonyk/.bin/mountrose"


What that's saying is "when you see a device that the kernel labels a scsi disk, from the vendor Initio with this model number, symlink its devnode as rosewill. Also, since this is a disk, symlink the devnodes for each partition as rosewill1, rosewill2, etc. and then run the script /home/tonyk/.bin/mountrose using /bin/sh".

So, why do we do it like this? Let's say, for example, I've plugged in a USB flash drive prior to plugging in this hard drive. There's a chance that flash drive might become /dev/sdb and my backup hard drive /dev/sdc. Can't really write a script for variable disk names, now can I? By setting up the udev rule in this way, this disk will ALWAYS be accessible via /dev/rosewill and its partitions via /dev/rosewillX. I don't need to poke and prod as part of my script, I'm always going to have the right devnode. As to why I launch the script this way, for some reason a bash script doesn't honor the shebang properly when run by udev. I could not find any way to run this script by simply calling it. No biggie, it works.
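
One thing to be aware of: as far as I can tell, KERNEL=="sd*" matches the whole disk as well as each partition, and the rule has no ACTION filter, so udev may launch the script more than once per plug-in (and again on removal). A minimal sketch of a guard, relying on the ACTION and DEVTYPE variables udev puts in the environment of RUN programs; should_mount is a hypothetical name.

```shell
#!/bin/sh
# should_mount: succeed only for the "add" event of a partition.
# ACTION and DEVTYPE are set by udev in the environment of RUN programs.
# Hypothetical helper for illustration.
should_mount() {
    [ "${ACTION:-}" = "add" ] && [ "${DEVTYPE:-}" = "partition" ]
}

# Hypothetical first line for the mountrose script:
# should_mount || exit 0
```

With that in place the disk-level event and any remove events exit quietly, and only the partition's add event goes on to the mount.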

And now, the moment of truth...plug it in, turn it on and.../mnt/rosewill is mounted, and I can see and modify the files. Woo hoo! Now we have my media set up! Kewl!

At this point, I'm going to conclude this article and leave the remainder for Part 2. In that article, I'll present the full script the system runs when the disk is plugged in and reasons for some of the choices I've made.

*perts = Pertinent details.  I'm trying it out, what do you think?
