You know the one if you manage computers for a living. You've got a server that's been running great for months, years even, and you haven't changed anything recently, but all of a sudden it's being cranky. And although you know you haven't changed anything, you also know you never take that as an answer from your users: if the system is behaving differently, something changed, and since you are the one who does things to it, it's probably you, even though you know nothing changed...
This one (a NetWare box, if you care, but that's really not the point) just stopped letting people log in to it on Friday, right before lunch. Then, it wouldn't come back up right--we ended up doing fun things to it (removing namespaces on a NetWare volume) to get it to boot at all. We had the helpdesk telling people to go to lunch and just take a chill pill.
We eventually got it booted, and it stayed running the rest of the day. I go home, and just as I'm crawling into bed, my wife says, "Honey, is that your pager?" It's on vibrate mode in my nightstand, so it's audible but not jarring. I said, "No, it's just the trucks" (we live near a state highway with a stoplight and frequently hear the rumble of downshifting trucks), but I checked, and sure enough, it had died again.
So I go and do more things to it. Now, mind you, we have no indication of what's going wrong; it's just locking up cold, with no pattern to what happened beforehand. It could be a hardware problem, or a side effect of something we did weeks ago that the system just got around to caring about, or just completely random. I do more stuff and it seems fine, so I go home at 2 am.
Then it crashes again the next night. This time the pager woke neither of us; I found it in the morning and drove in on a brilliant sunny day to reboot it and do some more stuff.
On Monday we check into buying one of those cool remote management cards that let you reboot the machine remotely. I live half an hour away.
It's running now, but I keep waiting for it to die. I've got more diagnostics running, I've disabled some programs, and I'm hoping some cry for help will come before the hang, if it hangs at all. We shall see.