Re: Method for monitoring metastat over multiple machines
- From: "D G Teed" <donald.teed@xxxxxxxxx>
- Date: Wed, 19 Dec 2007 10:42:36 -0400
An early response to this posting suggested replacing the expect script
with ssh keys. I totally agree. We normally do use keys,
and I shouldn't have suggested a solution with a hardwired password.
I blame a quick and dirty one-off script I made for checking the date across
dozens of different systems (around the DST changes) for leading
me to the expect-based solution. Please don't follow my
bad example.
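
For anyone who wants the key-based version, here is a minimal sketch of how the checkmetastat loop from the original post (quoted below) might look without expect. It assumes the monitoring host's public key is already in metatest's authorized_keys on every Solaris host, and that logging in still runs the metacomplain shell as before. The names check_host, check_all, checkmetastat-keys and SSH_CMD, and the 10 second timeout, are illustrative choices and not part of the original scripts.

```bash
#!/bin/bash
# Sketch only: key-based variant of the checkmetastat loop.
# SSH_CMD is a hypothetical override so the logic can be tried without real hosts.
SSH_CMD=${SSH_CMD:-ssh}

check_host() {
    local host=$1 output rc=0
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    output=$($SSH_CMD -o BatchMode=yes -o ConnectTimeout=10 "metatest@$host" 2>/dev/null) || rc=$?
    # ssh itself exits 255 when the connection or authentication fails.
    if [ "$rc" -eq 255 ]; then
        echo "$host: ssh failed, metastat not checked"
        return 2
    fi
    # Any maintenance/erred line in the remote output means a problem.
    if printf '%s\n' "$output" | grep -Eiq 'maintenance|erred'; then
        echo "metastat error (software RAID) is found in $host"
        return 1
    fi
    echo "$host is OK for metastat check (software RAID)"
    return 0
}

check_all() {
    local host status=0
    for host in "$@"; do
        check_host "$host" || status=1
        echo " "
    done
    return "$status"
}

# Runs only when hosts are given on the command line, for example:
#   checkmetastat-keys host1 host2 host3
if [ "$#" -gt 0 ]; then
    check_all "$@"
fi
```

Unlike the expect version, this reports an unreachable host separately instead of counting it as healthy.
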
SNMP is a suggested alternative, but the daemon isn't configured everywhere,
while ssh is, so that influenced my choice to solve this via ssh.
If anyone wants to document an SNMP based solution to meta RAID
monitoring (or provide a link to a howto), please do.
On Dec 19, 2007 9:25 AM, D G Teed <donald.teed@xxxxxxxxx> wrote:
Yesterday I came up with a method for monitoring meta RAID
devices for possible failures from one central script.
On each Solaris machine running software RAID, I've made a small
script to look for metastat status terms indicating a problem.
This is /usr/local/sbin/metacomplain on each Solaris host:
#!/bin/sh
/usr/sbin/metastat | /usr/bin/egrep -i 'maintenance|erred'
If no status such as maintenance or erred is present, the script prints nothing.
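
Since the central script depends on it, it may help to see what this egrep filter prints and what exit status it returns. The sketch below feeds the same pattern two hand-written lines in the style of metastat state output; they are illustrative only, not captured from a real machine, and the function name filter_problems is made up for the example.

```bash
#!/bin/bash
# Same pattern as metacomplain, but reading stdin instead of running metastat.
filter_problems() {
    egrep -i 'maintenance|erred'
}

# Hand-written sample lines (illustrative only, not real metastat output).
healthy='State: Okay'
degraded='State: Needs maintenance'

# No matching line: nothing is printed and egrep exits 1.
rc=0
printf '%s\n' "$healthy" | filter_problems || rc=$?
echo "healthy input: exit status $rc"

# A matching line: it is printed and egrep exits 0.
rc=0
printf '%s\n' "$healthy" "$degraded" | filter_problems || rc=$?
echo "degraded input: exit status $rc"
```

So an exit status of 0 from the filter means a problem line was found, and 1 means the output was clean.
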
The next step was to make a user "metatest" which had this script as its
/etc/passwd shell on each Solaris host. I gave this user a consistent
password on each machine.
metatest:x:1004:1::/home/metatest:/usr/local/sbin/metacomplain
This lets the check run simply by logging in over ssh, while
keeping the account's abilities narrow (preferable when the password
is hardwired into a script). I tested scp and sftp with this
user and they are unable to do anything (good).
Using a for loop, with expect for providing the password (handled
by /usr/local/bin/respondpw), one script can check the status of all
Solaris hosts. This is set up on a Linux host:
/usr/local/bin/checkmetastat:
#!/bin/bash
for systems in host1 host2 host3 host4 host5 host6 host7 host8 host9
do
# egrep exits 0 when a problem line matches, 1 when nothing matches
/usr/local/bin/respondpw metatest@$systems pass-secret | egrep -i 'maintenance|erred'
problem=$?
if [ "$problem" = "0" ]
then
echo "metastat error (software RAID) is found in $systems"
else
echo "$systems is OK for metastat check (software RAID)"
fi
echo " "
done
/usr/local/bin/respondpw:
#!/usr/bin/expect -f
# wrapper to make ssh command not require password
# username is passed as 1st arg, passwd as 2nd, command is third
set password [lindex $argv 1]
spawn ssh [lindex $argv 0] [lindex $argv 2]
expect "*assword:"
send "$password\r"
expect eof
So that is my contribution to the Solaris community if anyone likes it.
I'm sure it could be modified into a nagios script, but I think
cron can do this check often enough.
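
For anyone who does want to try the Nagios route, here is a rough sketch of how one host's result might be mapped onto the usual Nagios plugin exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN). It assumes key-based ssh to the metatest account rather than the expect wrapper; the function name report_nagios and the message wording are made up for the example.

```bash
#!/bin/bash
# Sketch only: turn one host's metacomplain result into a Nagios-style status.
# Arguments: host name, ssh exit status, captured ssh output.
report_nagios() {
    local host=$1 ssh_rc=$2 output=$3
    # ssh exits 255 when it could not connect or authenticate.
    if [ "$ssh_rc" -eq 255 ]; then
        echo "UNKNOWN: could not reach $host over ssh"
        return 3
    fi
    # Same pattern that metacomplain uses on the Solaris side.
    if printf '%s\n' "$output" | grep -Eiq 'maintenance|erred'; then
        echo "CRITICAL: metastat reports a problem on $host"
        return 2
    fi
    echo "OK: no maintenance or erred state on $host"
    return 0
}

# Example call (host1 is a placeholder):
#   output=$(ssh -o BatchMode=yes metatest@host1); report_nagios host1 "$?" "$output"
```
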
If there are things anyone would suggest to improve it, I welcome
comments and criticism. I know my code isn't the most terse
or efficient, but I prefer making it easier for junior admins
to understand if they need to make modifications.
--Donald
sunmanagers mailing list
sunmanagers@xxxxxxxxxxxxxxx
http://www.sunmanagers.org/mailman/listinfo/sunmanagers