SUMMARY: automatic poweroff after high temperature detected
- From: Bob Vickers <bobv@xxxxxxxxxxxxx>
- Date: Mon, 05 Jun 2006 14:38:55 +0100 (BST)
Thanks go to the various people who replied, mostly saying the same thing:
software-triggered poweroff on a 4100 is impossible, but you can at least
cause the system to halt after a failure as follows:
# /sbin/consvar -s auto_action HALT
# /sbin/consvar -a
I have included Peter Stern's reply verbatim as he went into quite a bit
of detail:
I do not think that there is any command which will cause your system to
power off. Even when you shutdown the system with 'shutdown -h' it
doesn't power off by itself. I think that the best you can do is:
setenv auto_action = halt
which will stop your system at the console prompt (>>>).
Then at least, the cpus won't be working generating heat and it won't
keep shutting down and rebooting.
And yes, you should be worried about the damage heat may cause.
Hopefully, nothing bad happened. but....
By the way, you can set ENVMON_USER_SCRIPT to some script which gets
executed when the system gets hot, e.g. sends you email. Then you can
try to solve the problem or power off the machine until you can.
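As a rough illustration of the ENVMON_USER_SCRIPT idea above, here is a minimal sketch of such a script. Everything in it is an assumption rather than something from the replies: the file name envmon_alert.sh, the recipient (root), the use of mailx, and a POSIX-compatible shell. It does not rely on any arguments being passed by envmond.

```shell
#!/bin/sh
# Hypothetical alert script for ENVMON_USER_SCRIPT.
# The recipient, mail command and script name are placeholders.

ADMIN="${ADMIN:-root}"        # assumed recipient address
MAILER="${MAILER:-mailx}"     # assumed mail command

# Build the alert text from a host name and a timestamp.
build_message() {
    printf 'High temperature event reported by envmond on %s at %s.\n' "$1" "$2"
    printf 'Check the machine room cooling and power off the system if necessary.\n'
}

# Mail the alert to the administrator.
send_alert() {
    build_message "$(hostname)" "$(date)" |
        "$MAILER" -s "envmond: high temperature on $(hostname)" "$ADMIN"
}

# Send mail only when run under the installed name, so the
# functions can be loaded and tried out without sending anything.
case "${0##*/}" in
    envmon_alert.sh) send_alert ;;
esac
```

The script would then be registered by setting ENVMON_USER_SCRIPT to its full path with envconfig; check envconfig(8) on your system for the exact syntax before relying on it.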
Original question:
This weekend we had an air conditioning failure and the temperature in the
machine room rose dramatically. The result was that every 18 minutes
envmond shut down our AlphaServer 4100, only for it to reboot. This
carried on for about 27 hours until eventually (I presume) the hardware
detected an error and refused to come up.
The machine was still extremely hot the following morning (about 12 hours
later) so I'm not certain the power got turned off even then.
I am worried about the damage that might be caused by this, also the heat
being generated which increases the problem for the other servers in the
room. Is it possible to configure the system so that when a high
temperature is detected the machine shuts down and powers itself off? In
fact I don't mind if any kind of crash causes a power off, because Tru64
is so reliable we never get crashes (touch wood!).
Our relevant variables are:
# /usr/sbin/envconfig -q
ENVMON_CONFIGURED = 1
ENVMON_HIGH_THRESH = 40
$ /sbin/consvar -l |grep -i boot
auto_action = BOOT
boot_dev = rz25
bootdef_dev = rz25
booted_dev = rz25
boot_osflags = A
booted_osflags = A
boot_reset = OFF
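
Given output in the "name = VALUE" form shown above, a small check along the following lines could warn when auto_action is still set to something other than HALT. This is only a sketch: the function names get_var and check_auto_action are made up for illustration, and it assumes an awk that supports -v.

```shell
#!/bin/sh
# Sketch: warn when auto_action is not HALT.
# Assumes "name = VALUE" lines, as in the consvar -l listing.

# Print the value of one console variable read from stdin.
get_var() {
    awk -v name="$1" '$1 == name && $2 == "=" { print $3; exit }'
}

# Report whether the given auto_action value leaves the
# system at the console prompt after a failure.
check_auto_action() {
    case "$1" in
        HALT|halt) echo "ok: auto_action is $1" ;;
        *) echo "warning: auto_action is ${1:-unset}; the system may reboot after a failure" ;;
    esac
}

# Typical use on the machine itself:
#   check_auto_action "$(/sbin/consvar -l | get_var auto_action)"
```

With the values listed above this would print the warning, since auto_action is BOOT.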
Dept of Computer Science, Royal Holloway, University of London