Method for monitoring metastat over multiple machines



Yesterday I came up with a method for monitoring meta RAID
devices for possible failures from one central script.

On each Solaris machine running software RAID, I've made a small
script to look for metastat status terms indicating a problem.

This is /usr/local/sbin/metacomplain on each Solaris host:

#!/bin/sh
/usr/sbin/metastat | /usr/bin/egrep -i 'maintenance|erred'

If no status such as maintenance or erred appears, the script prints nothing.
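
The filter's exit status carries the same information as its output, and that is what a calling script can test. A minimal sketch of this, using made-up, abbreviated metastat-style lines (real output is longer and varies by release); the function name has_raid_problem is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: reads metastat-style text on stdin.
# Returns 0 if a problem keyword is present, 1 if not
# (this is egrep's own exit status convention).
has_raid_problem() {
    egrep -i 'maintenance|erred' >/dev/null
}

# Illustrative input only, not captured from a real system.
healthy='d0: Mirror
    Submirror 0: d10
      State: Okay'
degraded='d0: Mirror
    Submirror 0: d10
      State: Needs maintenance'

if printf '%s\n' "$healthy" | has_raid_problem; then
    echo "healthy sample: problem keyword found"
else
    echo "healthy sample: quiet"
fi

if printf '%s\n' "$degraded" | has_raid_problem; then
    echo "degraded sample: problem keyword found"
else
    echo "degraded sample: quiet"
fi
```

Note that the match is on keywords only, so any other line containing "maintenance" or "erred" would also trigger it.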

The next step was to make a user "metatest" which had this script as its
/etc/passwd shell on each Solaris host. I gave this user a consistent
password on each machine.

metatest:x:1004:1::/home/metatest:/usr/local/sbin/metacomplain

This lets the check run simply by logging in to the account over ssh,
while keeping what the account can do narrow (preferable when the
password is hardwired into a script). I tested scp and sftp against
this user and neither can do anything (good).
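
One way to confirm the account looks the same on every host is to check the seventh (shell) field of its passwd entry. A sketch, with the hypothetical function name shell_field; it reads passwd-format lines on stdin so it can be tried on sample text first, and it assumes a POSIX sh (read -r may not exist in older Bourne shells):

```shell
#!/bin/sh
# Hypothetical helper: print the login shell (7th colon-separated field)
# of the user named in $1, from passwd-format lines on stdin.
shell_field() {
    while IFS=: read -r name pw uid gid gecos home shell; do
        if [ "$name" = "$1" ]; then
            echo "$shell"
        fi
    done
}

# Try it on the entry shown above.
entry='metatest:x:1004:1::/home/metatest:/usr/local/sbin/metacomplain'
printf '%s\n' "$entry" | shell_field metatest
```

On a real host the same function could be fed the password file directly, for example: shell_field metatest < /etc/passwd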

Using a for loop, with expect providing the password (handled
by /usr/local/bin/respondpw), one script can check the status of all
Solaris hosts. This is set up on a Linux host:

/usr/local/bin/checkmetastat:

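#!/bin/bash
# Log in to each host as metatest; the login shell there runs metacomplain.
for systems in host1 host2 host3 host4 host5 host6 host7 host8 host9
do
    /usr/local/bin/respondpw "metatest@$systems" pass-secret | egrep -i 'maintenance|erred'
    # egrep exits 0 when it matched a problem keyword, 1 when it found none.
    # Note: a host that cannot be reached also produces no match.
    problem=$?
    if [ "$problem" = "0" ]
    then
        echo "metastat error (software RAID) is found in $systems"
    else
        echo "$systems is OK for metastat check (software RAID)"
    fi
    echo " "
done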
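
If the host list grows, it may be easier to keep it in a file than in the for line. A sketch of that variation, which is not part of the setup described here: the function name read_hosts is hypothetical, and the file format (one host per line, with # comments) is an assumption. The demo uses a temporary file and only prints what it would check.

```shell
#!/bin/bash
# Hypothetical helper: print host names from the list file given as $1,
# skipping blank lines and lines that start with '#'.
read_hosts() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}

# Demo with a throwaway list file; host names are placeholders.
list=$(mktemp)
printf '%s\n' '# Solaris hosts with software RAID' 'host1' '' 'host2' > "$list"

for systems in $(read_hosts "$list")
do
    echo "would check metatest@$systems"
done

rm -f "$list"
```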

/usr/local/bin/respondpw:

#!/usr/bin/expect -f
# wrapper so the ssh command does not prompt for a password
# user@host is passed as 1st arg, password as 2nd, optional command as 3rd
set password [lindex $argv 1]
spawn ssh [lindex $argv 0] [lindex $argv 2]
expect "*assword:"
send "$password\r"
expect eof

So that is my contribution to the Solaris community if anyone likes it.
I'm sure it could be modified into a Nagios script, but I think
cron can do this check often enough.
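
For a cron run it may help to print only the hosts with errors, since cron normally mails whatever a job prints and sends nothing when the job is silent. A sketch, assuming the message wording used by checkmetastat; the function name only_errors is hypothetical:

```shell
#!/bin/bash
# Hypothetical filter: keep only the error lines from checkmetastat output.
only_errors() {
    grep 'metastat error'
}

# Sample lines in the format checkmetastat prints (host names are placeholders).
printf '%s\n' \
    'host1 is OK for metastat check (software RAID)' \
    ' ' \
    'metastat error (software RAID) is found in host2' | only_errors
```

In practice the filter would sit after the real script, for example: /usr/local/bin/checkmetastat | only_errors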

If there are things anyone would suggest to improve it, I welcome
comments and criticism. I know my code isn't the most terse
or efficient, but I prefer making it easy for junior admins
to understand if they need to make modifications.

--Donald
_______________________________________________
sunmanagers mailing list
sunmanagers@xxxxxxxxxxxxxxx
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


