Recent Posts

Archives

Categories

Recent Comments


« I am the Dominator | Main | bit-slapped by karma »

Stupid is as Stupid does.

By Jesse Morgan | January 19, 2006

I don’t even know where to begin. This morning has sucked and it’s no ones fault but my own. It started with the DMA crap- the one machine where it would really help is the fileserver. I needed to simply change the chipset driver, recompile the kernel, and reboot. This should be simple for even a semi-competent linux users.

However, I am a moron. Each mistake I make, I’m gonna put a little * next to it.

So, first thing I decide is “well, since I’m changing kernels, I might as well upgrade*[1]“. so I grab the 2.6.14-r5 source, make oldconfig, and set it up all perfect. I add the lines to my grub config and sit on it for a few days before going through with it.

I notice that the grub config is rather old… like original configuration and note “that’s odd. I must not have known how to deal with scsi when I built this machine, and changed over to lilo. oh well, I know what I’m doing now*[2], so I’ll just install grub to the MBR”

flash forward to this morning. I’ve decided to reboot, and with an hour before work, I figure that’s more than enough time to troubleshoot if problems arise*[3].

I reboot…. notice it’s one of the older bios’ that counts ram. there 640 megs in this machine, so it takes a while… it also has a scsi controller, so it has to count through all of that. all in all, a very slow boot. This is important to remember- it takes 2 minutes of fricken waiting to get through the reboot process.

Nothing happens.

“eh? That’s not good.”

This by the way, is the worst thing you could ever possibly hear me utter. So I reboot. Two minutes later, it does the same thing.

“Crap, lemme just get a recovery CD out and fix it…. CRAP.”*[4] The machine has 1 scsi drive with the OS, and 4 IDE HDs for storage- I took out the CDROM when I added the 200gig HD.

alright… ok, I see what happend… I told grub to check hd0 instead of hd4 for the root directory *[5]- normally it would be hd0(the first ide device) but since I’m using scsi, it comes up as hd4.

“lemme try booting off of hda1 instead of the scsi drive…” well, I couldn’t remember what order they were installed in, and I didn’t want to really crack the case, so I tried booting ALL 4 IDE DRIVES, keeping in mind each try takes not only a 2 minute boot, but I can’t access the bios until it runs through the boot process and the scsi process, then a reset and boot with new settings. so each drive takes a minimum of 4 minutes, plus me mucking with the bios.

Well, nothing works- each one goes to a dead screen. “Well, I guess I can use a boot floppy…”*[6]*[7] I get a floppy from jackie, and start busting out my cdroms looking for a bootimage… Couldn’t use lynx because the fileserver is also the internal dhcp and dns server*[8] couldn’t find one on the redhat CD I had… couldn’t find one on the gentoo CD I had… couldn’t find my suse Cds… couldn’t find a usable one on my ancient slackware CDs (copywrite 1995)… finally found one on a burned slackware CD from who knows when… probably 2002. find the img, dd it to the floppy, and pop the floppy in the fileserver.

I then start to think ahead*[9] “well, if this doesn’t work, I better locate an extra CDROM and pull a HD.” So I boot the machine, change the boot order to take the floppy (another 4 minutes), find an extra CDROM that is of questionable repair, and begin to slide the side panel off.*[10] The machine shuts down. I jiggle the power cable. I press the power button… nothing. The only thing I saw was the power button was blinking.

I freak out.*[11] No if, ands or buts about it.

Jackie asks me “maybe the power supply fell off”. Now I know what she means- she meant two things- “maybe the power cable came loose”, and “maybe the power supply failed.” I know this and I love her very much, but at that moment in time she inspired a very deep and dark rage inside of me.

I took it out on everything that was laying on top of that case.*[12] First I gently moved things like my ceramic change jar, and the foo-foo carpet cleaning powder that got set there they last time the carpet was vaccumed. Then I started throwing things: The nic card that I was meaning to put in the machine; the spare USB dongle; the cane of carpet cleaner that got put there the last time the cats yacked on the carpet.

Then I swept everything else off. Jackie was yelling at me at this point, something about calming down. I grabbed the side panel and threw it on the pile of shit on the floor. my hands were shaking. There was no fucking way this power supply just died on me. The inside of the case was covered in dust, so I sprayed it out with a can of air*[13] and checked all the cables. everything is as it should be. There is no reason that the machine should have powered off when I opened the case….

then I saw the case sensor. it was an old server box, and I forgot it was there. I didn’t even know it worked. well, apparently when this latch is triggered, it powers off the machine. so I followed it, in the hope I could rip it from the mobo and boot back up… it lead to the front of the machine. so I had to peel off the front cover- there was a second button. with both buttons freed, the blinky light went off. I pulled the power cable to make sure everything was cycled correctly, reattached the power and it booted up fine. Finally, I can boot off the floppy and get this fixed….

Jackie left for work at this point- I mumbled an apology, knowing I was now late for work because I was gonna miss the bus. my one hour was up.

the machines boots, and starts to read the floppy….
nothing.

“fuckit, I’m hooking up the CDROM.” I spend a few moments trying to read the jumper settings of the HD I’m trying to replace with a CDrom. it’s the primary slave. I set the CDROM accordingly. I jump through the bios hoops. I get to the CDROM command line.

I start to calm down. I fix the grub command lines to use hd4, and then attempt to install grub to /dev/sda (the scsi drive.)

Grub balks. I don’t know why, but it does not like hd4. I try for 10-15 minutes, looking in a lunux+ certification book for grub, I look in a fedora bible for hints, I look everwhere. I finally check my LPIC 1 Exam Cram book and determine this is total crap, I did everything perfectly- for some ungodly reason it doesn’t like my scsi drive.

“screw it, I’ll just use lilo.” I adjust the lilo.conf, reload lilo to /dev/sda (LILO didn’t have a problem with it) and shut it down. I disconnect the CDROM, put the face plate back on, power it up and wait…

Sweet, sweet lilo boot screen- how I’ve missed you.

I boot up using the new kernel and hold my breath. everything works. my workstation comes back to life (remember, my this was the fileserver I totalled- my /home was on it.)

I notice I have 10 minutes to get ready to go to catch the next bus. Grinning, I put the side cover back on… *[14]

the machine dies. the power light blinks.

“AAAAAAAAAAAAAAAARRRRRRJGHH I HATE THIS MACHINE ”

I pull the power plug, remove the side panel, then remove the face plate. I press the power button a couple of times for good measure, reconnect everything and power it back up. It all comes up properly. I’m throwing on clothes and getting prepared while it’s booting. I haven’t combed my hair, I haven’t eaten, but I have no time now. I hit the lights, lock the door and run to catch the bus. I just barely made it- I got to work half an hour later than normal.

Now, I’ve counted about 14 or so stupid things here. Bonus points to the person who can name what I did wrong at each of those points.

(And yes VP, I know I you don’t know what a DMA is because you have a Mac.)

Topics: IRC, Jackie, Linux, Open Source, Rant, Work |

3 Responses to “Stupid is as Stupid does.”

  1. VP|bofh Says:
    January 19th, 2006 at 9:37 am

    Dear sir, your post is too long. Please provide a summary and banking informations. Thx. Bye.

  2. K_F Says:
    January 19th, 2006 at 10:08 am

    I share your pain. hda on kfs001 died on me and I’ve spent the day moving stuff over to secondary systems. oh, kfs001 is 700km away from me. fricking great

  3. Jesse Morgan Says:
    January 19th, 2006 at 4:46 pm

    ok, since no one took me up on the offer, here are the mistakes I made:

    1) “well, since I’m changing kernels, I might as well upgrade” - This is a bad idea on so many levels it’s not funny. Always do one thing at a time, that way, if something breaks, you can track it down easier. I was lucky that changing kernels didn’t cause a problem, but I got burned on several others

    2) “I know what I’m doing now” - If you have to tell yourself this then, no, you really don’t.

    3) “I’ve decided to reboot, and with an hour before work, I figure that’s more than enough time to troubleshoot if problems arise” - never ever ever EVER do something when you have a time limit like that- that is tempting fate.

    4) “Crap, lemme just get a recovery CD out and fix it…. CRAP.” - always make sure you have recovery media on hand THAT WILL WORK.

    5) “I told grub to check hd0 instead of hd4 for the root directory” - Had I been paying attention, I would have noticed it didn’t DETECT a hd4. I should have tested.

    6) “Well, I guess I can use a boot floppy…” Make sure you keep at least one working floppy around. not a crappy used one, a new, fresh one.

    7) “Well, I guess I can use a boot floppy…” Make sure you have a floppy Image on hand for booting off a floppy

    8) “Couldn’t use lynx because the fileserver is also the internal dhcp and dns server” - make sure if you’re gonna trash your dns server, to have the secondary up and running.

    9) “I then start to think ahead” - …think, but don’t act. acting ahead gets you in trouble.

    10) “and begin to slide the side panel off” - just leave the damn thing alone. if you can’t make sure that it doesn’t have a case trigger to kill the machine.

    11) “I freak out.” - freaking out does not help. it makes things worse. proven fact.

    12) “I took it out on everything that was laying on top of that case.” - smashy-smashy may make things feel better at first, but it can get expensive.

    13) “The inside of the case was covered in dust, so I sprayed it out with a can of air” - If you’re panicky and pissed off, blowing a giant duststorm back into your face and setting off your allergies is NOT going to help you.

    14) “Grinning, I put the side cover back on…” - pay attention to wtf you are doing, you tool.

Comments