May 17, 2006

A no good, horrible, very bad day

You ever have one of those days where absolutely nothing goes your way? That pretty much sums up my day today -- just about every computer that I touched did something odd, unexpected, unexplainable, or just altogether crashy. Worst of all, all of my problems occurred with my home computing infrastructure, which not only provides a home for this blog, but is also my gateway to the Internet.

It all started this morning, when I decided to tinker before going to work. After last night's hardware swap, I had the machine re-compile FreeBSD while I slept. When I checked things out this morning, everything looked good. The compile went well and the machine hadn't crashed. So, I rebooted in order to go into single user mode, so that I could install the new FreeBSD bits that I had compiled. On the way to single user mode, I decided to actually power the machine off, so that I could put the case back on and move it into its home under my desk.

The case installed, I powered back up. But instead of booting into FreeBSD, my Dell Precision Workstation 420 failed to POST. This means that it died before it even started to load FreeBSD. Several frantic reboots later, not only was the problem not fixed, but now my helpful Dell BIOS was displaying this fantastic error message:

Dell_Vmgr_BIOS_Error.jpg
What a helpful error message, thanks Dell!

Not knowing what the heck that error meant, and being in a rush, I decided to try moving my hard drives and RAID card to a third machine that I had lying around (old computers never die -- they just "lie around"). This machine has been known to be temperamental in the past, but I bought the motherboard at a computer flea market for only $35, so I can't really complain.

After moving all of the necessary parts over, things got off to an inauspicious start when this third machine (let's call it "Frankenbox") wouldn't even light up the monitor. Some jiggling of cables provided an image, and everything seemed fine, at first:

Frankenbox_FreeBSD_starting_to_boot.jpg
As you can see, FreeBSD 6.1 is starting to boot just fine...

However, as the kernel started to load, I suddenly noticed that the text on the screen seemed odd somehow. It actually appeared to be garbled, which is something that I have never, ever seen before - not on FreeBSD, Linux, DOS, or heck, even Windows. Check it out for yourself:

Frankenbox_garbled_boot_text.jpg
All of those repeating characters and spelling errors are NOT how FreeBSD normally boots. :(

Thoroughly appalled and dejected by this turn of events, I resolved to stop tinkering with my computers and go to work (it was hard to let go, especially with this sort of problem unsolved, but I managed to do it). While at work, I googled the error that I got from my Dell machine, and found out that it had to do with the Dell BIOS not properly recognizing some PCI cards. The fix was to pull the cards, reboot to clear the error condition, and then put them back in.

So after work, I began tinkering with the Dell again. And after not too much effort, I managed to get it booting FreeBSD off of the RAID controller again. Hooray! I was even able to reboot after finishing my FreeBSD upgrade, and everything seemed fine.

But then, I got cocky once again. I powered the machine off in order to put the case back on (deja vu, anybody?). And once again, I was greeted with my Dell failing to POST. More messing around ensued, and the basic story at this point is that the sun, the moon, and the stars all have to be aligned properly in order for the Dell BIOS to accept this 3ware RAID card that I bought. And unfortunately, I still haven't figured out the magic alignment that will get the machine to boot whenever I see fit.

So, after all of this time wasted, I am back to where I started -- I put the RAID card and hard drives back into the old machine, and I'm back in business. Here are the lessons that I have learned:

  • Don't knock software RAID - you certainly won't have all of these crazy hardware compatibility problems. I wasn't really too interested in it before, but based upon the experience that I'm having with this 3ware card, I have suddenly gotten a lot more interested.
  • Also, don't knock late '90s technology - the Intel PR440FX motherboard is quite formidable, as it is in the only machine that I own that works reliably with a RAID card from 2006.
  • I was hoping to wait until next year to buy a new server machine (so that I can get my hands on a nice, low-wattage Core Duo machine), but now it's looking like I might have to invest in some new hardware a mite sooner.
  • And finally, computers suck.
-Andy. Posted by andyr at May 17, 2006 11:58 PM