Re: mute crash / DDB



"jpd" <read_the_sig@xxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:4h4b2rF1q2r04U1@xxxxxxxxxxxxxxxxx
> Begin <IradnQyJctWRcDHZnZ2dnUVZ8qednZ2d@xxxxxxxxx>
> On 2006-07-06, Steve at fivetrees <steve@xxxxxxxxxxxxxxxxxxxxx> wrote:
>> These machines, and their predecessors, and *their* predecessors, have
>> been very reliable - apart from the odd event such as this every few
>> months.
>
> I don't know if your boxes are on a UPS, but if it really is that
> sporadic and apparently fairly independent of the hardware, it even
> might be anomalies in the power. If you have logs of previous incidents,
> how regular are they, really?

They are indeed on a UPS. It's the machine that's taking most of the load
(active webserver) that goes mute every once in a long while (in the old
days, it'd reboot rather than die, as previously noted).

Re regularity: somewhere around 4-6 months. My guess (as I've said) is that
it's a resource issue. Note that I've seen this issue (assuming it's the
same issue, which seems likely) over several generations of both OpenBSD and
hardware [1]. I wasn't too worried about it when the symptom was a reboot;
I'm slightly more concerned now that it freezes, since it knocks out the
webserver etc. until I notice and/or start getting phone calls. I have
monitoring enabled via my colo provider, but since this works on the basis
of pings, it doesn't help :(.

I note recent discussion on the misc@ list re "3.9 freeze" - which exactly
describes what I'm seeing - i.e. completely dead but still responds to
pings.
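
Since a box in this state still answers pings, a check that actually talks to the web server would notice the freeze where ICMP monitoring does not. The following is only a minimal sketch of that idea, not something from the provider's monitoring: the function name `service_alive`, the default port and the timeout are all assumptions chosen for illustration. It treats the service as up only if an HTTP status line comes back, because a completed TCP connect alone may not prove that userland is still running.

```python
import socket


def service_alive(host, port=80, timeout=5.0):
    """Return True only if something answers an HTTP request on host:port.

    Hypothetical helper for illustration. A successful connect is not
    treated as proof of life; the server process has to send back at
    least the start of an HTTP status line within the timeout.
    """
    request = "HEAD / HTTP/1.0\r\nHost: {}\r\n\r\n".format(host).encode("ascii")
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)   # also bound the wait for the reply
            sock.sendall(request)
            reply = sock.recv(64)      # the status line is enough
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
    return reply.startswith(b"HTTP/")
```

Run from a second machine every few minutes (from cron, say), a call such as `service_alive("www.example.org")` returning False could trigger a mail or a page; the host name here is a placeholder.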

[1] Except 2.6, on which I managed to get around 480 days of uptime. I'm a
bit more proactive on patches and controlled reboots these days ;). Back
then I ran a custom kernel; I've double-checked for significant differences,
but I'll repeat the exercise.

Steve
http://www.fivetrees.com

