Friday, August 25, 2006

Performancing for Firefox

So, I've decided to try and spend a little more time with my blogs. My peeps deserve that, after all. They do love me for a reason, so I have to keep giving it up! :)

But, blogging can be a pain in the ass...enter Performancing for Firefox. It's a nifty little extension for Firefox that lets you blog pretty easily. Just install it, restart FF and you're off. You then set up your accounts and you're done! It puts a little button on your status bar, and when you click it, your window splits in half and you blog in the bottom half. There's even an add-on to publish to Myspace blogs. Pretty cool stuff!

Monday, August 14, 2006

Foxyproxy...nice!

Been a while, huh? Wonder if anyone has me in their RSS feeds to see if I ever update this again. Hehehehe. Well, I'm going to try and be a bit more regular with all of my blogs; let's see how that works out.

Anyway, wanted to stop in and tell you how cool this Firefox extension is...FoxyProxy. I used to use the SwitchProxy Tool because I was using my laptop both at work and home. SPT allowed me to right-click and use the proxy at work, then right-click and use no proxy at home. Made life easier...until Firefox 1.5 came out and the developer stopped working on SPT. Fortunately, at about the same time I got a real desktop at work, and didn't need to switch proxies anymore, so I wasn't concerned...

...until a couple of weeks back when they "upgraded" the proxies at work. Now they're up, they're down, they're left, they're right. Half the time I can't even get out. Fortunately, there are five of them, so if one's down I might be able to get out on another. Back to needing a proxy-switching util...FoxyProxy, what a wonderful tool. See, after upgrading the proxies, the rulesets aren't the same on them anymore. So, for example, the primary proxy I use (which is closest to my location and fastest) won't allow me to go to RPGNow to see what's new. However, the default one used by most folks will allow it. What a PITA, no matter how easy FP makes it, to have to switch back and forth between proxies for one or two sites.

Not so! FoxyProxy allows you to set patterns for URLs, so it can use different proxies depending on where you want to go. So, I use my "close" proxy for all URLs, unless the URL matches "*.rpgnow.com/*". Then, it uses the user-default proxy. All of this happens behind the scenes and I don't even have to think about it. Nice!

If you need a proxy solution, this is so the way to go, folks.

Wednesday, February 15, 2006

Little webmin thing

Webmin: think of it as a web-based MMC for your Linux box. That oughta piss off a lot of people, huh? :) It's a great tool once you get it set up. Fortunately, on Fedora it's as simple as yumming it down, and it'll come preconfigured for the way things are set up on that OS. Otherwise, I recommend following the docs for installation. It's as simple as putting it into the directory it will eventually run from (think /opt!) and running setup.

Now, as we know, I like to have as much of my network and systems available on the web as possible. The problem is, Webmin uses its own internal webserver, and that server runs on port 10000 by default. Dang! That ain't gonna work from the office, and I hate typing in URLs that include port numbers. Fortunately, there are good instructions on setting it up behind or with Apache here. However, I learned the hard way that they ain't complete (once I finish this post, I'll send them an update). I use the third method provided on that page, where Apache acts as a reverse proxy. This hasn't really worked very well for me in the past, but I finally figured out why, and here's the solution:

Firstly, my Apache only runs in SSL mode; it doesn't accept connections on port 80. When Webmin is installed, it too sees that SSL is available and apparently enables it for itself, so it expects you to make https requests. If you keep getting "this server is running in SSL mode, try https://..." after setting up your Webmin, you have this problem. To fix it, this should be the setup in httpd.conf:

# The [P] proxy flag needs mod_rewrite switched on
RewriteEngine On
RewriteRule ^/webmin(.*) https://127.0.0.1:10000$1 [P,L]
# ProxyPass /webmin/ https://127.0.0.1:10000/
ProxyPassReverse /webmin/ https://127.0.0.1:10000/
SSLProxyEngine On

(Note: I use rewrite rather than ProxyPass; I've left the commented ProxyPass version in place. If you already have RewriteEngine On elsewhere in the config, you don't need it twice.)

You see, without that, the proxy makes plain HTTP requests to Webmin, but Webmin wants SSL. You also need the SSLProxyEngine On statement so Apache is allowed to make SSL requests to the backend (also missing from the docs).
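
A quick way to sanity-check the proxying without a browser in the way is curl (the hostname is a placeholder; -k skips certificate validation for self-signed certs):

# Through Apache: should return Webmin's login page, not the SSL-mode error
curl -k https://www.example.com/webmin/
# Straight at Webmin's internal server, for comparison
curl -k https://127.0.0.1:10000/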

That's it. A simple solution to a problem that's been bugging me for some time.

Monday, February 6, 2006

Bash completion

You know what's funny? I "grew up" using both Linux and Windows. Unlike most people, I find Windows a whole lot easier to use, a whole lot more stable, and generally a better day-to-day OS. Linux is fun to play with, but I get so aggravated trying to do the simplest stuff some days. It's just not worth the level of aggravation! The latest thing that was bugging me has also been bugging me for years, I just never got around to seeing if there was a solution. Turns out a solution's only been around for a short while anyway...

Believe it or not, there IS a command line in Windows. Not only that, but if you've got a clue, it can be quite a powerful tool. Since NT 3.1, the shell processor of choice has been cmd.exe. Command.com was available, but it was a DOS-emulation mode that lacked certain features, the most important of which, to me, is command line completion. See, I type really fast. I think the last test I took had me at about 85 wpm. Not bad for someone who can't touch-type. :) The problem is, I can't type as fast as I can think, so anything that helps me type faster is critical.

Now, bash has command completion, too, but it works significantly differently from the cmd.exe version. With cmd.exe, if I hit tab, it cycles through my choices. So, if I want to enter "cd \program files", I'd hit "cd \P-tab" and be there. Bash works the same, unless there's more than one file or directory that begins with "p" in the root. With cmd.exe, tabbing again would cycle through the "p" filenames and then I could hit enter to execute the command. With bash, it bitches; then if I hit tab again it lists all of the choices, and then I have to continue typing until I've pretty much narrowed it down to the file I want.

Personally, I prefer the cmd.exe way of doing things. And, it's not because "that's what you're used to". I've used both shells over the years equally enough to make an informed decision, and I find the Windows version faster and easier. Apparently I'm alone in this thinking because in searching for a way to make bash act like cmd.exe I found a lot of people bitching about how it doesn't act like bash. Just yet another example of how *nix nerds are incapable of learning something that they're not used to. It's that whole "projection of shortcomings" thing. :)

Anywho, I did find the answer here. It seems it's a simple (and poorly documented) matter of adding the following four lines to your /etc/inputrc. In my case (currently using Gentoo; that's a story for another day), I didn't need to make any changes other than adding these. I've split the configs from the comments since Blogger squinches the pages down.

set completion-ignore-case on
# Ignore case when doing completion

set mark-directories on
# Completed dir names have a slash appended

set visible-stats on
# List ls -F for completion

"\C-i": menu-complete
# Cycle through ambiguous completions instead of list

#set show-all-if-ambiguous on
# List possible completions instead of ringing bell

You can edit the file and test it using bind -f /etc/inputrc to activate your changes immediately.
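
One caveat if you'd rather test per-user first: the same lines work in ~/.inputrc, but once that file exists, readline reads it instead of /etc/inputrc, so pull the system file back in explicitly:

$include /etc/inputrc
# Per-user overrides go below the include
"\C-i": menu-complete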

BTW, if your command shell doesn't currently support this feature, that's because it's disabled by default. Here's a post about enabling it, with the obligatory clueless and wrong Windows bashing (enabling this feature does NOT cause anything to crash. It never has, it never will. I've enabled and used this feature on literally THOUSANDS of Windows boxes and never had a problem with it).
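
(If memory serves, the setting in question is "HKCU\Software\Microsoft\Command Processor\CompletionChar". Set that DWORD to 9, the tab character, open a new cmd.exe, and completion's on. PathCompletionChar works the same way for directory names.)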

Tuesday, January 31, 2006

Kernel patching for fun and profit

One of the nice things about Linux is that if a feature you need isn't available, you can write your own. The obvious drawback, of course, is you need to be a programmer with some mad skillz in order to do that. Fortunately, there are a few of those out there, and here are some of the kernel patches I sometimes use and where to get them. I'm thinking of starting a kernel patch repository here because frankly there isn't one anywhere else!

Suspend2
Those of us with laptops know how nice suspending to disk can be. I use it every day. I suspend my laptop before I leave the office, and when I plug it in back at home, I'm right back where I was half an hour before. This comes in real handy when you're reading something online and want to finish when you get home. Anyway, this is the patch that emulates Windows' hibernation feature. Not a small or easy change to implement, but worth it in the end.

OpenMosix & OpenSSI
These allow you to turn all of the computers in your location into one big supercomputer. Sort of...Beowulf-style clustering is the most well-known Linux parallel-processing hack. The problem with Beowulf is you have to write your software to directly support parallel processing. Your software does all the work; Beowulf just lets it do it. OpenMosix & OpenSSI do this at the kernel level, meaning your kernel moves your processes between machines. You don't get the level of performance you do with Beowulf, but for things like render farms, this comes in quite handy. Unfortunately, neither supports the 2.6 kernel yet, but OpenMosix is pretty close.

User Mode Linux
Essentially, this is a console-mode VMware. It allows you to compile a special kernel that can be run ("booted") just like a regular executable. Good for testing software before you put it into production.

USB/IP Project
Easily one of my favorites. And the fact that it works on recent kernels so I can actually use it doesn't hurt much, either. :) Basically, this is just a driver that emulates a USB hub. The hub, however, transmits/receives all traffic to/from the USB device connected to it over an IP-based network. In other words, connect a USB device to your machine, and any other machine with this driver can use it as if it were a local device. Imagine a USB-enclosed hard drive that everyone can mount and back their stuff up to. Or, a desk in your house with a server that lives underneath. You plug your scanner into it, and to use it you just take your wireless laptop over and start working. No plugging things in just to work. Sweet!
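
However you get them, the mechanics of applying a kernel patch are the same everywhere. A minimal walkthrough (version numbers and filenames here are just examples):

cd /usr/src/linux-2.6.15
# Always dry-run first; rejects mean the patch doesn't match your tree
patch -p1 --dry-run < ../suspend2-for-2.6.15.patch
patch -p1 < ../suspend2-for-2.6.15.patch
# Enable the new feature, then rebuild as usual
make menuconfig
make && make modules_install install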

Thursday, January 26, 2006

A....P....C....DEAD!

I tell you, I've got to stop writing this damn thing. I've had more serious troubles with my server since I started writing about how to do stuff. It's almost like every time I make some progress, something else comes along to force my hand...

The other night, we came home from work to find the house dark. My house is automated with HomeSeer. I don't have a shitload of tasks set up, but the most important one, turning on the outside and living room lights just before sunset, apparently hadn't run. There's usually only one reason for that: loss of power to the server. The server's plugged into a huge surge protector* along with the TV and other electronic equipment in the living room, and a few weeks back I moved it from the floor to an inaccessible spot behind the equipment rack. Beeper, the cat, likes to sleep back there 'cause of the heater and it's pretty isolated. Unfortunately, the huge switch on the protector was too easily hit by the fat cat. If I couldn't get to my mail at any point in the day, I knew it was due to her. :)

But, as we moved closer to the house, we could hear the alarm beeping. Uh-oh. I pulled out my handy Husky pocket flashlight, and took a tour around the house, peeking in the windows and such. The house was secure, and I could see the clock flashing on the oven. Power outage. Fucking RG&E. Well, at least it wasn't anything serious.

Now, as the three people who read this blog know, I've got my drives set up in RAID arrays. But, that don't help much when your drives have become corrupted, or you corrupt the array yourself. I'll spare you the details because, frankly, I'm not 100% sure what I did, or why I had to do it. Suffice it to say, two hours later, I'd pretty much had enough with computers for life!

The next night, I ran to CompUOverpay and grabbed a 350VA APC Back-UPS ES. I made sure it was supported under Linux before buying, of course. :) It's not a bad little UPS for $40. Considering you're also slightly better protected from power surges, I'd recommend it as a good investment.

Anywho, fortunately setting it up is easy as pie. The first thing you need to do is install apcupsd. This is a reasonably simple install on pretty much any distro. On Fedora, it's as easy as "yum install apcupsd". Typically, you'd take the time to verify your UPS was being recognized by hotplug before bothering to set up the daemon, but I figured "fuck it". So far, Fedora's been pretty good at that stuff, so let's barrel on!

Even better than expected, the rpm containing apcupsd was already pre-configured for a USB UPS (prolly 'cause that's the most common kind now. Ya think?). So, for shits and giggles I typed "apcaccess" and was rewarded with tons of useful info!

APC : 001,034,0884
DATE : Thu Jan 26 16:10:35 EST 2006
HOSTNAME : someplace.oranother.com
RELEASE : 3.12.1
VERSION : 3.12.1 (06 January 2006) redhat
UPSNAME : someplace.oranother.com
CABLE : USB Cable
MODEL : Back-UPS ES 350
UPSMODE : Stand Alone
STARTTIME: Wed Jan 25 20:44:09 EST 2006
STATUS : ONLINE
LINEV : 120.0 Volts
LOADPCT : 68.0 Percent Load Capacity
BCHARGE : 100.0 Percent
TIMELEFT : 3.9 Minutes
MBATTCHG : 5 Percent
MINTIMEL : 3 Minutes
MAXTIME : 0 Seconds
LOTRANS : 088.0 Volts
HITRANS : 138.0 Volts
ALARMDEL : Always
BATTV : 13.5 Volts
LASTXFER : No transfers since turnon
NUMXFERS : 0
TONBATT : 0 seconds
CUMONBATT: 0 seconds
XOFFBATT : N/A
STATFLAG : 0x07000008 Status Flag
MANDATE : 2005-02-16
SERIALNO : XXXXXXXXXX
BATTDATE : 2000-00-00
NOMBATTV : 12.0
FIRMWARE : 00.e5.D USB FW:e5
APCMODEL : Back-UPS ES 350
END APC : Thu Jan 26 16:11:29 EST 2006

Yaay! (I took this at 4 PM the next day, so that's why the battery's so well charged.) I see I don't get a whole lot of runtime before the battery dies, though. The drawbacks of using a dual-proc server. But, 4 minutes is more than enough time to gracefully shut down the server and hopefully protect my data and such.
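
By the way, those MBATTCHG/MINTIMEL/MAXTIME lines map to the shutdown thresholds in /etc/apcupsd/apcupsd.conf; they decide when apcupsd pulls the trigger. Mine looks roughly like this (directive names from my install; double-check your own copy):

UPSCABLE usb
UPSTYPE usb
DEVICE
# Shut down when charge drops to 5% (MBATTCHG)...
BATTERYLEVEL 5
# ...or when estimated runtime hits 3 minutes (MINTIMEL)...
MINUTES 3
# ...or after this many seconds on battery; 0 disables it (MAXTIME)
TIMEOUT 0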

The first thing I need to address is the fact that I've got a W2K3/Exchange 2003 virtual machine running. That needs to be shut down gracefully first to minimize damage to the database. I have a copy of GSX Server, but unfortunately, the newest version of GSX doesn't support machines built with the newest version of Workstation. I've tried a couple of times to wedge it in there, but finally decided to wait for a new GSX. (I know, there are plenty of ways to do it, and I've tried a few with no success for various reasons. Don't bother, it's not that important at the moment.) Well, here's the problem: once VMware came out with their "server" products, they removed the ability to shut down machines gracefully at host shutdown (you used to be able to put a line in the VMX file telling it to hibernate the machine on SIGHUP). Since it's a GUI app, I can't just script it, so I'd need a tool to do so, and I looked at a couple. None really did easily what I needed it to do (essentially: bring focus to that window, hit ctrl-Z).

Then, I remembered an easier solution: telnet. W2K3 includes a telnet server, and while I have it disabled by default, that's easy enough to change! So, I enabled and started the service and ran this on the Linux host:

autoexpect -f serversdn.exp telnet hostname

Expect is a nifty little scripting language with a specific purpose: automate other console apps. It's perfect for scripting a telnet session because you can tell it "wait for 'ogin:' and then send the username". Autoexpect simplifies this further. You tell it the name of the file to save your tasks to, and then the command you want it to run. When you're done, you have an expect script that needs no more than a tiny bit o' tweaking to get you up and running.

So, I scripted it to telnet into the server, shut down the Exchange services** and then shut down the machine:


set force_conservative 0  ;# set to 1 to force conservative mode even if
                          ;# script wasn't run conservatively originally
if {$force_conservative} {
	set send_slow {1 .1}
	proc send {ignore arg} {
		sleep .1
		exp_send -s -- $arg
	}
}

set timeout -1
spawn telnet server
match_max 100000
expect "login: "
send -- "administrator\r"
expect "password: "
send -- "easypass\r"

expect "Administrator>"
send "net stop MSExchangeIS /y\r"

expect "Administrator>"
send -- "net stop MSExchangeMTA /y \r"

expect "Administrator>"
send -- "net stop MSExchangeSA /y \r"

expect "Administrator>"
send -- "net stop WinHttpAutoProxySvc /y\r"

expect "Administrator>"
send -- "net stop HomeSeerService /y\r"

expect "Administrator>"
send -- "tsshutdn 0 /powerdown /delay:0\r"

interact


Does it work? Oh, hell yeah it works! I had to do a little tweaking of the server first, though. On the first few passes, it took two minutes and forty-five seconds to shut down. Since I've got just under four minutes of battery power, that might not leave enough time to shut the box down. Fortunately, I've got a little experience with Winders, too...

Open regedit, and change the following:


"HKCU\Control Panel\Desktop\AutoEndTasks" change from "0" to "1"

"HKCU\Control Panel\Desktop\WaitToKillAppTimeout" This one defaults to 20000 milliseconds, I believe. Change it to 2000.

"HKCU\Control Panel\Desktop\HungAppTimeout" Same as above.

Duplicate the above entries for HKEY_USERS\.DEFAULT so they'll apply to new users as well.

Finally, change "HKLM\System\CurrentControlSet\Control\WaitToKillServiceTimeout" to 2000 as well.


The difference? The Exchange VM now shuts down in one minute and ten seconds. That's a whole lot better, huh?

Now, all I need to do is tell apcupsd what to do when the power goes out, and BOOM! everything shuts down easy as pie. This part's easy enough to figure out. Edit /etc/apcupsd/apccontrol and put your shutdown commands in the various case blocks.
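
Mine boils down to something like this (a sketch; the expect script's path is from my setup, and your apccontrol's layout may differ a bit):

case "$1" in
    doshutdown)
        # Down the Exchange VM gracefully before the host goes
        /usr/bin/expect -f /etc/apcupsd/serversdn.exp
        # apccontrol's stock shutdown line follows
        ${SHUTDOWN} -h now "apcupsd initiated shutdown"
        ;;
esac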

I did a test run by pulling the cord on the UPS. Within a couple of seconds, I watched the VM shut down and turn itself off. The Linux box then followed soon after without too much issue. I had to tweak the timings as the VM didn't entirely shut down fast enough, but I think I've got it all set now.

Oh, one final step: go into your BIOS and look for a setting called "Restore on AC/Power Loss". Change it to "Full On" or "Power On". ATX-based machines don't automatically power back on by default, but changing this setting will make it happen. That way, if the power's only out for a short time, your machine'll be back up and running when you come back!


* I don't put a lot of stock in surge protectors. Even the best triacs used to clamp the circuit are generally not fast enough to stop a lightning bolt from killing Stevie and his siblings. However, I WILL generally spend the extra $10-20 and get a good one 'cause they usually come with guarantees that cover zapped equipment. :)

**This is a single machine acting as domain controller and Exchange server. In that combo, it's best to shut down your Exchange services before you shut down the box. If you take the machine down without doing that, it'll enter a race condition where it tries to shut the services down, but it can't query the domain controller properly because that's going down too...the short of it is, in this condition, it can take 30-40 minutes for the box to shut itself down. I don't got that kind of time. Oh, and to prevent accidentally doing it when I'm in the machine, I've removed the Shutdown command from the start menu via a policy and replaced it with a batch file that does it right. Where possible, always put a cover over the power switch. ;-)

Monday, January 23, 2006

Rules to entice Open Source adoption

Over the years, I've seen some pretty consistent mistakes made by a large portion of the open source community that have forced me to stay out of it. I think the biggest issue is open sourcers seem to think everyone's a mind reader who just KNOWS everything there is to know about their product, and if you don't, you shouldn't be using Linux anyway. Well, if that's your attitude, fuck you and go somewhere else. It never ceases to amaze me that people will put their software out on the web for others to use, and then bitch when people ask them questions. For those who are interested in having people use their software, read on....

Rule #1. Screenshots should be useful and viewable. Screenshots go a long way to telling people about your software. They can see if it's laid out well, and if it has the features they need in a way that makes them easy to use. Sometimes they'll even tell you more about what the software does. Now, I know it's antithetical to the *nix philosophy of "GUI bad, command line GOOD", but too bad. If you don't want to look, don't look. More often than not, I can decide if a particular piece of software will work for me just by looking at it.

Now, that being said, please read this part: SMALLER IS BETTER! This is the other thing that blows my mind. Wanna piss off an open sourcer? Send them a mail in HTML format. You'll get so much crap about sending "bloated" mail. Then, what do they do? They take a screenshot of their entire 1600x1200, 32-million-color desktop to show you their new tray widget. Three hours after the screenshot finishes downloading, you can then decide if you want to go further. Seriously, cut 'em down. Resize the app to the minimum size necessary to show the functionality, take a screenshot of it, and then put it through the Gimp to cut down on the colorspace, and perhaps compress it. Hell, crop out anything that's not your app while you're at it. It is not necessary to see every icon on your desktop in order to see the new word processor you wrote.
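
If firing up the Gimp for every screenshot is too much effort, ImageMagick can do the whole cleanup from the command line (filenames and geometry here are obviously examples):

# Grab just the app window, knock the palette down to 64 colors,
# and let PNG compression do the rest
convert fulldesktop.png -crop 640x480+200+150 +repage -colors 64 screenshot.png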

Rule #2. Docs are a great way of populating your website with useful information. I used to use LFS, and how often did I find a package that SEEMED to fit my needs, only to find it required hundreds of packages installed, or really didn't fit those needs at all? Too often. You wanna save me some time and just link your INSTALL and README files on your page? They're plain text, so they're not going to kill your space or bandwidth, and they'll save me a ton of time. I shouldn't have to download a package, untar it, and then go into an editor just to find out the prereqs for it.

Rule #2a. Know your prereqs.
In an ideal world, every developer would have an LFS machine around so they know exactly what libs need to be installed in order for their software to work. There's nothing worse than having a long build appear to finish successfully, only to have the executable panic the machine when run. If they fail to test build it that way first...up against the wall!

Rule #3. About first, history second. The first thing on your website should NOT be a changelog. It doesn't help me to know that you "modified mallocs to use less memory" if I don't know what your software does. Tell me what it is, then put the history and changelogs elsewhere. If I'm interested, I'll look. And, a real-world example or two can sometimes go a long way to helping me decide. Occasionally, I'll come across a project and after looking at it for some time, still have no idea what it does or why I'd use it. I have, more than once, come across a project and then dismissed it only to be directed to it by another site that says "try this, you'll love it!" When I look again, I slap my head. I don't like slapping my head. While we're on this topic, put in an "English" changelog, too. Instead of the above entry, simply say "this version uses less memory".

Rule #4. Not everyone is a programmer. For the love of Cthulhu, take pride in what you do! Not everyone can develop software. It's a gift you have, don't take it lightly! If someone asks a question, "look at the source" is not an answer unless they specifically hand you a code block and say "how does this work?"

Rule #5. Don't use Sourceforge. I'm sorry, but I hate SF. It wouldn't be so bad if I didn't have to fight my way through thirty different extremely slow-loading pages just to download the tarball. Besides, no one uses SourceForge right, anyway. Think I'm being a dick? You find me ONE project on there that actually uses the "Docs" link. Most of the "Project Home Page" links point to an empty index, if you're lucky.

Rule #6. Your software really isn't that revolutionary; work with someone else. How many window managers are there now? The trove at Freshmeat tells me it's got 130 projects in that category. Is that really necessary? Really? You mean to tell me no one can create a basic, simple fucking window manager and then add functionality in via plugins? Beyond window managers, though, let's look at some others: instant messengers, how 'bout using the same config file as one of the others? The information in there should be the same, so if I want to install Gaim and a command-line IM, I can, and not have to worry about how they're each configured. Would it really be that hard to keep my username/password for each network in just ONE location? For a group that goes nuts about some of the choices Microsoft has made with Windows, you dorks have sure made a lot more mistakes in terms of simplicity and flexibility!

See, the problem is too many choices is NOT a strength. You can spout your stale rhetoric about that all you want, but that don't make it true. Too many choices, especially the apples-to-oranges choices offered, only make life more difficult. We have all of these tools and such, and that's great, but they all have different purposes and functionality. You can't just choose one over the other easily...especially when you want to choose on functionality, but you're stuck with KDE-type apps or something, and what you want is only available in Gnome. Now I gotta install another fucking environment just to use YOUR app. No thanks.

Okay, this rant's over. Seriously, just make it easier on people. Complexity for complexity's sake is a stupid way to put your software out there.

Wednesday, January 4, 2006

Your drive is dead, you are so screwed

Remember me saying how you really need to protect your data? Boy, do I know how to predict failure or what?! :) Seriously, I had a panic attack over the last couple of days as I started getting mails from mdadm:

A DegradedArray event had been detected on md device /dev/md0.

Yipes! So, I catted /proc/mdstat and found that all three arrays were showing a degraded state, with one drive missing from each. hdg was not showing in any of the arrays. An mdadm -Q /dev/hdg confirmed that it was not part of any array.
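
For anyone playing along at home, the triage commands (device names are from my box):

cat /proc/mdstat            # quick overview; look for missing members in each array
mdadm --detail /dev/md0     # full per-array state, including failed/removed disks
mdadm -Q /dev/hdg           # quick check: is this disk part of any array?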

A quick inspection of /var/log/messages shows:

Jan 4 16:09:47 alfred kernel: ide: failed opcode was: unknown
Jan 4 16:09:47 alfred kernel: hdg: task_out_intr: status=0x50 { DriveReady SeekComplete }
Jan 4 16:15:47 alfred kernel: hdg: dma_intr: error=0x84 { DriveStatusError BadCRC }


BadCRC?! BADCRC!?!? This hard drive is just over a month old!! It better not be failing. So, I did me some searching, just to be sure I didn't need to go through the trouble of trying to get this drive out and shipped back to the manufacturer (especially since I JUST threw out the damn box!). I figured I'd give it a little test to make sure the drive itself wasn't bad. Since I knew it wasn't part of any array, I could play with it how I wanted. So, I repartitioned it and created a new ext2 filesystem on it. I then copied a large amount of data to and from the drive, with no errors in /var/log/messages. Hmmm...

Unfortunately, you know all that rhetoric about how Linux support is just BUSTLING on the Internet? How it's so much better than commercial support? Yeah, I've heard it, too...After about two hours of searching, I finally came to the conclusion that this error did not indicate a dying drive, but a problem with either the driver, the card, the cable or the drive (as in, one of these things was not entirely compatible). In other words, there were lots of opinions out there on what these messages meant, but no real information. Most blamed it on the kernel ("it works fine with a 2.4 kernel"), some blamed it on the drive's manufacturer ("if it can't keep up with DMA requests, you'll get that error. Get a new drive"), others said it was APIC ("add noapic to your boot options"). Even on the kernel-dev list, a number of people had posted the exact same problem; few found a solution. None of their solutions worked for me.

I'll spare you the exact details on everything I had to do to fix this, but suffice it to say I believe the problem was that I had two different-speed drives on the same controller card. It didn't make any sense to me either, since each drive was on its own channel on the card. The real reason for it was more likely a combo of that and the fact that I'm using these drives in an array (quite a few of the folks with this issue were using arrays). So, I issued the following:

hdparm -X udma3 /dev/hde
hdparm -X udma3 /dev/hdg

This drops the drives down to UDMA mode 3 (ATA/44, 44MB/s). So far, after half an hour, I haven't seen any errors recur in the event log. I'll try moving them up to udma4 (ATA/66, 66MB/s) at some point, but for now the RAID1 array is rebuilding and I'm not seeing the error, so I'm pretty sure this is a valid fix.
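
To double-check what mode a drive is actually running at, hdparm will report it (the active mode is the starred one in the list):

hdparm -i /dev/hde | grep -i udma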

Update: it's the next day, and I had no errors last night or this morning. The arrays are still holding, so I think we're good. I modified /etc/sysconfig/harddisks to use the following command line on each of my drives at boot time:

hdparm -X udma3 -d1 -c3 /dev/hdx

The drives will be a little slower and not running at max efficiency, but I can deal. For the most part, these drives are for storage, and I don't need quick like a bunny access. I think this confirms my theory of it being an md driver issue. If one drive is faster than the other, md should "wait" for the other to catch up. Of course, that would probably introduce many other timing issues in a driver that's designed to minimize data loss.

This situation could have been a LOT worse had I not set up those arrays (ignoring the fact that it probably wouldn't have occurred had it not been for the arrays...). The machine just chugged along nicely without a burp, even with one drive missing. It's also a good thing I told mdadm to send me alert e-mails. The machine would have plodded along without me ever knowing the drive had failed. I would have found out the seriously hard way: when one of the others died, too...