08:30am: Wake up.
09:00am: Get out of bed.
10:00am: Arrive at work -- a little early. =) (1)
10:15am: Alphapager erupts with messages: "RED: web3.somewhere.net: (SVC DOWN: HTTP/80) VRFYd 3x. CTLCODE 23D3F4C" .. ah. The automated Big Brother system doing its job again. A quick look illuminates the problem -- some moron added a vhost entry and didn't remember to close the directive before restarting. It's a quick, easy fix.
10:17am: I hit #00 on my phone to activate the company-wide intercom: "All Technical departments: web3 is currently down. This will affect domains starting with M-R. ETA for uptime is 5 minutes. ETA for full repair is 2 hours. Thank you." (2)
10:23am: Technical supervisor comes into the network room laughing. In his hand is a small tape. They record most incoming calls to the ISP for quality control purposes (translation: so they can scream at the techs). He shoves the tape in the radio on my desk and cues it to the appropriate spot. He hits play, and I gain new respect for Dave, one of the techs on the tape:

Dave: Thanks for calling inter.net, this is Dave, can I have your account name please.
Customer: BigSukr is my account name. Hey. WTF happened to your server? I have a domain hosted there and it's down. Nancy.com. What the hell. We pay you guys good money.
Dave: Yes sir. Ah yes, your account is hosted on web3. Excellent. Well sir, we are currently performing a systems upgrade on that webserver. We are doubling the RAM to 1 gig, adding a new 32 gig SCSI drive in addition to the existing drives, and adding a second Xeon P3/600 to that machine. We think you'll be pleased with the performance of the machine once we are finished.
Customer: Hmmm. Well. When will you guys be finished with that then?
Dave: Oh, about 5 minutes sir.
Customer: Great! hey, thanks guys. You guys are great.
Dave: No problem sir. We do what we can.
Customer: Thanks again.
Dave: Absolutely. *click*

Wow. That was good. Very good. hehehe. He should have been in government. =)

(1) Ah, 'tis good to be part of the network team. While billing and tech support are kept to strict regimens regarding their attendance, this restriction does not apply to the network guys. On the rare occasion that management really does ask, "why aren't you here at 9am?" - be sure to answer: "I was rebuilding the MTA aliases to avoid having to fsck due to IOCTL kernel errors on the FreeBSD box until 2am this morning." This will buy you additional time as their brain cells overload. Never mind that you need only type "newaliases" as root to do this 2-second job -- they don't know that. heh.

(2) Never make it seem as easy as it is. Give yourself extra time to make the fix, and 'save the day'. Always. You'll thank yourself when reviews for raises come up, and you've 'saved the day' and 'worked wonders' a myriad of times recently.

What happens here is simple. Get your favorite brand of music, a good set of headphones and plant yourself in front of the machine for a while. Usually you know how to fix it, you just can't think of it right away. Ben Folds Five or R.E.M. have proven to aid me immensely in such mental explorations.

Headphone design is very important -- it must allow you to hear system sounds, but conveniently ignore screaming people right behind you.

Here's what to do:

  1. DON'T PANIC! Take a deep breath and remember that it's not your server that's down.
  2. Put them on hold, immediately. This is important. From experience I know that it is ten times as difficult to fix a problem with a customer yammering in your ear. They will inevitably distract you with irrelevancies: how much money they are paying for your service, questions about software that you haven't touched for years ("So would you recommend Windows 95 or Windows 98 for my home machine?"), and amusingly inaccurate guesses as to what the problem is.
  3. Check the obvious things first. Do not assume that the customer has, or that they did it right if they did.
  4. Trust your intuition. If you get a feeling that you should check such-and-such a system, it's worth a look.
  5. If you can't figure it out, take the client off hold and lie. Tell them it's a hardware problem and you'll need more time. Get them off the phone at all costs. Then go to lunch.
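
For step 3, the "obvious thing" in the web3 outage above was a vhost block that somebody forgot to close. Here is a rough Python sketch of that kind of check. The function name `unclosed_sections` and the sample config are made up for illustration, and it only understands the simple one-tag-per-line layout; on a real Apache box, `apachectl configtest` is the tool that does this properly.

```python
import re

# Matches a line holding only an opening tag like <VirtualHost *:80>
# or a closing tag like </VirtualHost>. Assumes one tag per line.
TAG = re.compile(r"^\s*<(/?)([A-Za-z][A-Za-z0-9_]*)[^>]*>\s*$")


def unclosed_sections(config_text):
    """Return (line_number, name) pairs for mismatched section tags.

    Stray closing tags are reported with a leading "/" in the name;
    sections that were opened but never closed are reported as-is.
    """
    stack = []     # sections currently open, innermost last
    problems = []  # closing tags that did not match the innermost section
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # skip comments
        match = TAG.match(line)
        if not match:
            continue  # ordinary directive, not a section tag
        closing, name = match.group(1), match.group(2).lower()
        if not closing:
            stack.append((lineno, name))
        elif stack and stack[-1][1] == name:
            stack.pop()
        else:
            problems.append((lineno, "/" + name))
    return problems + stack


if __name__ == "__main__":
    # Hypothetical vhost entry with the closing tag missing.
    sample = "<VirtualHost *:80>\n    ServerName www.example.com\n"
    for lineno, name in unclosed_sections(sample):
        print(f"line {lineno}: unmatched <{name}>")
```

Run it against the config before restarting the server, not after the pager goes off.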
