Re: harddisk diagnostics
From: Ben (ben_at_en-ninguna-parte.com)
Date: Sun, 18 Apr 2004 08:31:59 -0400
Michael Tosch wrote:
> In article <firstname.lastname@example.org>, Irenicus <email@example.com> writes:
> > Hi guys,
> > any command to check health of hard disk. Recently I've got some bus
> > errors, any commands for this also?
> Check /var/adm/messages for disk errors.
> Errors stating "Vendor ..." mean the disk has reported a problem.
> Further diagnostics in Solaris are available in the format utility:
>   format> analyze
>   analyze> read
> If you're having SCSI transport errors, you should look for
> a disk firmware upgrade, or Solaris disk and SCSI driver updates.
> If a program aborts with
> bus error
> segmentation fault
> it is usually a bug in the program!
> Michael Tosch
> IT Specialist
> HP Managed Services Germany
> Phone +49 2407 575 313
> Mail: firstname.lastname@example.org
And make sure you've got a good/recent backup.
Try a physical inspection if possible. Reseat the drive. Check your
disk or backplane cabling for crimps/breaks.
On SCSI disks you can use
format -> <disk> -> defect -> grown
Grown defects should be zero on a healthy disk. Check it periodically to
see if there's an increase.
iostat -E prints per-device soft, hard, and transport error counters.
See the manpage for other things to try/watch.
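
If you want just the counters out of iostat -E, something like the
following might do as a rough sketch. The function names
(summarize_errors, sample_output) are made up for illustration, and the
sample text is invented to resemble the usual "Soft Errors: ... Hard
Errors: ... Transport Errors: ..." line, so check it against what your
own box prints before trusting the field positions.

```shell
#!/bin/sh
# Sketch: reduce `iostat -E` output to "device soft hard transport".
# Assumes the per-device summary line looks like:
#   sd0  Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
summarize_errors() {
    # Fields on that line: $1=device $4=soft $7=hard $10=transport
    awk '/Soft Errors:/ { print $1, $4, $7, $10 }'
}

# Invented sample data, NOT output from a real system.
sample_output() {
    cat <<'EOF'
sd0       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: EXAMPLE  Product: EXAMPLE Revision: 0001 Serial No: 000000
sd1       Soft Errors: 2 Hard Errors: 5 Transport Errors: 1
Vendor: EXAMPLE  Product: EXAMPLE Revision: 0001 Serial No: 000001
EOF
}

# On the real machine you would run: iostat -E | summarize_errors
sample_output | summarize_errors
```

Each output line is the device name followed by its soft, hard, and
transport error counts, which is easy to save somewhere and compare
later.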
Run detailed OBP diags to see if the system board/controller is having
an issue or failing. Controllers and system boards can and do go bad,
too, so it may not be a disk problem in itself.
If the physical inspection doesn't turn up anything, watch for increases
in the grown defect list and the iostat -E error counters over a few days
of normal use, provided performance isn't clearly nose-diving. Weigh the
evidence and decide what to do. On a production box, or any box really,
frequent errors or a steady increase in those counters make a pretty
clear case for replacement.
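
To watch the counters over a few days, one option is to save iostat -E
output to a file now and again and compare snapshots. The sketch below
is only an illustration: report_increases is a hypothetical helper, the
snapshot contents are invented, and it assumes an awk that supports -v
and the "in" operator (on Solaris that probably means nawk or
/usr/xpg4/bin/awk rather than the old /usr/bin/awk).

```shell
#!/bin/sh
# Sketch: compare two saved `iostat -E` snapshots and list devices whose
# soft, hard, or transport error counters have gone up.
# Usage: report_increases OLD_SNAPSHOT NEW_SNAPSHOT
report_increases() {
    awk -v first="$1" '
        /Soft Errors:/ {
            if (FILENAME == first) {
                # Older snapshot: remember the counters per device.
                soft[$1] = $4 + 0; hard[$1] = $7 + 0; trans[$1] = $10 + 0
            } else if ($1 in soft) {
                # Newer snapshot: report only if some counter increased.
                if ($4 + 0 > soft[$1] || $7 + 0 > hard[$1] || $10 + 0 > trans[$1])
                    printf "%s: soft +%d hard +%d transport +%d\n", \
                        $1, $4 - soft[$1], $7 - hard[$1], $10 - trans[$1]
            }
        }
    ' "$1" "$2"
}

# Invented snapshots for demonstration; real ones would come from
# something like: iostat -E > /var/tmp/iostat.monday
old=/tmp/iostat_old.$$
new=/tmp/iostat_new.$$
cat > "$old" <<'EOF'
sd0       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
sd1       Soft Errors: 2 Hard Errors: 5 Transport Errors: 1
EOF
cat > "$new" <<'EOF'
sd0       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
sd1       Soft Errors: 2 Hard Errors: 9 Transport Errors: 1
EOF
report_increases "$old" "$new"
rm -f "$old" "$new"
```

Devices with unchanged counters print nothing, so empty output means no
increase between the two snapshots.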