Archive for January, 2006
Stupid is as Stupid does.
I don’t even know where to begin. This morning has sucked and it’s no one’s fault but my own. It started with the DMA crap- the one machine where it would really help is the fileserver. I needed to simply change the chipset driver, recompile the kernel, and reboot. This should be simple for even a semi-competent linux user.
However, I am a moron. Each mistake I make, I’m gonna put a little * next to it.
So, first thing I decide is “well, since I’m changing kernels, I might as well upgrade*”. so I grab the 2.6.14-r5 source, make oldconfig, and set it up all perfect. I add the lines to my grub config and sit on it for a few days before going through with it.
I notice that the grub config is rather old… like original configuration and note “that’s odd. I must not have known how to deal with scsi when I built this machine, and changed over to lilo. oh well, I know what I’m doing now*, so I’ll just install grub to the MBR”
flash forward to this morning. I’ve decided to reboot, and with an hour before work, I figure that’s more than enough time to troubleshoot if problems arise*.
I reboot…. notice it’s one of the older BIOSes that counts ram. there’s 640 megs in this machine, so it takes a while… it also has a scsi controller, so it has to count through all of that. all in all, a very slow boot. This is important to remember- it takes 2 minutes of fricken waiting to get through the reboot process.
“eh? That’s not good.”
This by the way, is the worst thing you could ever possibly hear me utter. So I reboot. Two minutes later, it does the same thing.
“Crap, lemme just get a recovery CD out and fix it…. CRAP.”* The machine has 1 scsi drive with the OS, and 4 IDE HDs for storage- I took out the CDROM when I added the 200gig HD.
alright… ok, I see what happened… I told grub to check hd0 instead of hd4 for the root directory* - normally it would be hd0 (the first ide device) but since I’m using scsi, it comes up as hd4.
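For anyone following along, here’s a rough sketch of why the scsi drive ends up as hd4. It assumes grub legacy simply numbers drives from zero in whatever order the BIOS reports them, and that the BIOS lists the four IDE drives ahead of the scsi one. The helper name `grub_legacy_names` and the device labels `hdb` through `hdd` are made up for illustration; only hda and sda come from this post, and the real order can shift whenever the BIOS boot order changes.

```python
def grub_legacy_names(bios_order):
    """Map drives, in the order the BIOS reports them, to grub legacy names.

    grub legacy counts from zero and does not care whether a drive is
    IDE or scsi; only the BIOS order matters.
    """
    return {dev: f"(hd{i})" for i, dev in enumerate(bios_order)}


# Assumed layout: four IDE storage drives first, then the lone scsi OS drive.
drives = ["hda", "hdb", "hdc", "hdd", "sda"]
names = grub_legacy_names(drives)

for dev, grub_name in names.items():
    print(f"/dev/{dev} -> {grub_name}")
```

Under that assumption the OS drive is the fifth disk, so pointing grub at hd0 sends it to the first IDE storage drive instead.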
“lemme try booting off of hda1 instead of the scsi drive…” well, I couldn’t remember what order they were installed in, and I didn’t want to really crack the case, so I tried booting ALL 4 IDE DRIVES, keeping in mind each try takes not only a 2 minute boot, but I can’t access the bios until it runs through the boot process and the scsi process, then a reset and boot with new settings. so each drive takes a minimum of 4 minutes, plus me mucking with the bios.
Well, nothing works- each one goes to a dead screen. “Well, I guess I can use a boot floppy…”** I get a floppy from jackie, and start busting out my cdroms looking for a bootimage… Couldn’t use lynx because the fileserver is also the internal dhcp and dns server*. couldn’t find one on the redhat CD I had… couldn’t find one on the gentoo CD I had… couldn’t find my suse CDs… couldn’t find a usable one on my ancient slackware CDs (copyright 1995)… finally found one on a burned slackware CD from who knows when… probably 2002. find the img, dd it to the floppy, and pop the floppy in the fileserver.
I then start to think ahead* “well, if this doesn’t work, I better locate an extra CDROM and pull a HD.” So I boot the machine, change the boot order to take the floppy (another 4 minutes), find an extra CDROM that is of questionable repair, and begin to slide the side panel off.* The machine shuts down. I jiggle the power cable. I press the power button… nothing. The only thing I saw was the power button was blinking.
I freak out.* No ifs, ands or buts about it.
Jackie asks me “maybe the power supply fell off”. Now I know what she means- she meant two things- “maybe the power cable came loose”, and “maybe the power supply failed.” I know this and I love her very much, but at that moment in time she inspired a very deep and dark rage inside of me.
I took it out on everything that was lying on top of that case.* First I gently moved things like my ceramic change jar, and the foo-foo carpet cleaning powder that got set there the last time the carpet was vacuumed. Then I started throwing things: The nic card that I was meaning to put in the machine; the spare USB dongle; the can of carpet cleaner that got put there the last time the cats yacked on the carpet.
Then I swept everything else off. Jackie was yelling at me at this point, something about calming down. I grabbed the side panel and threw it on the pile of shit on the floor. my hands were shaking. There was no fucking way this power supply just died on me. The inside of the case was covered in dust, so I sprayed it out with a can of air* and checked all the cables. everything is as it should be. There is no reason that the machine should have powered off when I opened the case….
then I saw the case sensor. it was an old server box, and I forgot it was there. I didn’t even know it worked. well, apparently when this latch is triggered, it powers off the machine. so I followed it, in the hope I could rip it from the mobo and boot back up… it led to the front of the machine. so I had to peel off the front cover- there was a second button. with both buttons freed, the blinky light went off. I pulled the power cable to make sure everything was cycled correctly, reattached the power and it booted up fine. Finally, I can boot off the floppy and get this fixed….
Jackie left for work at this point- I mumbled an apology, knowing I was now late for work because I was gonna miss the bus. my one hour was up.
the machine boots, and starts to read the floppy….
“fuckit, I’m hooking up the CDROM.” I spend a few moments trying to read the jumper settings of the HD I’m trying to replace with a CDROM. it’s the primary slave. I set the CDROM accordingly. I jump through the bios hoops. I get to the CDROM command line.
I start to calm down. I fix the grub command lines to use hd4, and then attempt to install grub to /dev/sda (the scsi drive.)
Grub balks. I don’t know why, but it does not like hd4. I try for 10-15 minutes, looking in a linux+ certification book for grub, I look in a fedora bible for hints, I look everywhere. I finally check my LPIC 1 Exam Cram book and determine this is total crap: I did everything perfectly- for some ungodly reason it doesn’t like my scsi drive.
“screw it, I’ll just use lilo.” I adjust the lilo.conf, reload lilo to /dev/sda (LILO didn’t have a problem with it) and shut it down. I disconnect the CDROM, put the face plate back on, power it up and wait…
Sweet, sweet lilo boot screen- how I’ve missed you.
I boot up using the new kernel and hold my breath. everything works. my workstation comes back to life (remember, this was the fileserver I totalled- my /home was on it.)
I notice I have 10 minutes to get ready to go to catch the next bus. Grinning, I put the side cover back on… *
the machine dies. the power light blinks.
“AAAAAAAAAAAAAAAARRRRRRJGHH I HATE THIS MACHINE”
I pull the power plug, remove the side panel, then remove the face plate. I press the power button a couple of times for good measure, reconnect everything and power it back up. It all comes up properly. I’m throwing on clothes and getting prepared while it’s booting. I haven’t combed my hair, I haven’t eaten, but I have no time now. I hit the lights, lock the door and run to catch the bus. I just barely made it- I got to work half an hour later than normal.
Now, I’ve counted about 14 or so stupid things here. Bonus points to the person who can name what I did wrong at each of those points.
(And yes VP, I know you don’t know what a DMA is because you have a Mac.)
I am the Dominator
so I created a list of things I needed to do, and started incrementally going through it, finishing little projects and problems that I’ve been meaning to get to. I’ve got a whole lot done in the past week:
- got squirrelmail working properly
- fixed the DMA on both workstations
- Fixed Jaxon’s cdrom
- Set up Kmail
- Fixed the DNS issues
- Swapped HDs around
- got spanish dictionary working for jackie in OOo
- Set up LDAP
Some of these have been floating in my head for years (ldap) while others have been broken for just as long (DMA, squirrelmail). It feels good to accomplish something, no matter how trivial.
So I’m running out of space. on everything.
my workstation has a 40 gig split in half with 1/2 linux and half a long dead windows partition (I got this drive in 2001 or thereabouts.)
the windows partition has been repurposed, but it’s still a pain in the ass. since the windows partition appears first on the drive, there was no way to wipe it and merge it with the other partition (easily). The problem I was having was large software installs (Neverwinter Nights, World of Warcraft) required a LOT of space- 5 gigs each. It adds up quickly.
Jackie’s machine is in the same situation- there’s barely enough room for neverwinter nights.
When I took the job at CSX, I needed a redhat install to tinker with- so I went out and purchased a 200gig and put it in my workstation.
My webserver has a 10 gig drive, which has been holding for a good 2 or 3 years. Well, the webserver had a bit of an accident a week or 3 ago and its HD filled up. so I’m left in a bit of a pickle.
So here’s what I did. I moved all the contents of the 40gig in my workstation to the 200 gig. took a bit of tinkering, but it’s done. it should be ok for a good long while.
I’m going to put my 40 gig in jackie’s machine, and transfer everything of hers onto my HD with a single partition. She has windows at work, so she really doesn’t need the partition anymore- I don’t think she’s used it in a long time anyways.
finally, I’m going to place jackie’s 40 gig in the webserver and transfer the contents of the 10 gig to the 40 gig.
I’ll then shelve the 10 gig.
sounds simple, huh?
Update: 1:00 pm
So I got my workstation taken care of- now I just need to transfer jaxon’s stuff around- I’ll wait for jackie to get back before attempting that though.
Update: 9:00 pm
Jackie’s workstation, jaxon, has transitioned smoothly. It’s now running on my hard drive and she has 20 gigs of free space.
P-nut the webserver is next. While P-nut is down, morgajel.net, morgajel.com and myjaxon.com will be down too, along with the audio stream and IRC.
Update: 12:45 am
P-nut is up and running. I switched him over to grub (rather than lilo), set up a boot partition and increased his swap space.
it is gratifying to see this:
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/hda3    36G  6.3G    28G   19%  /
/dev/hda1   183M   16M   157M   10%  /boot
before it was at 6.3G out of 9.2G, and that was AFTER I cleaned.
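As a quick sanity check on those numbers, here’s a small sketch of how the Use% column appears to be worked out. It assumes df’s usual behaviour of dividing used space by used plus available (not by total size) and rounding up. The function name `use_percent` is made up for illustration, and the “before” figure is just 6.3G as a share of the old 9.2G total, since the old Avail value isn’t given here.

```python
import math


def use_percent(used, avail):
    """Approximate df's Use% column: used / (used + avail), rounded up."""
    return math.ceil(100 * used / (used + avail))


# Values taken from the df output above (same units within each row).
root = use_percent(6.3, 28)    # /dev/hda3, in gigabytes
boot = use_percent(16, 157)    # /dev/hda1, in megabytes

# Rough "before" figure: share of the old drive's total size, not df's Use%.
old_fill = round(100 * 6.3 / 9.2)

print(f"/ is at {root}%, /boot is at {boot}%, old drive was roughly {old_fill}% full")
```

Both rows come out matching the Use% column, which is a decent hint that the assumption holds for this output.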
So I think my work on this project is done. I’d like to figure out why DNS is still not working properly, but that can wait for tomorrow.
End of an Era
I finally let go of draccus.net.
it was a sad day. I’ve not used it in over a year, and it’s just sat unused. I’ve replaced it with morgajel.net for the most part. I figured it was time to move on.
So what’s the history of draccus.net? well, it was the first domain I ever bought. it was my first real website. It was there when I was hosting on my CSIS account, it was there when I set up my first list server at Brookmeadow, and it was there when I proposed to jackie. Lotta history there. I’ve still got all of the data that was on it, and maybe I’ll bother going through and back-log posting it.
eh, we’ll see.
draccus.net is dead. long live morgajel.net.