Re: A1000: Determining bad disk
From: Mr. Johan Andersson (johan_at_solace.mh.se)
Date: 09/29/03
- Next message: Mr. Johan Andersson: "Re: A1000: drivutil vs. raidutil"
- Previous message: Ben: "Re: Any tool like rsync that uses ssh/scp as it's transport???"
- In reply to: Vikas Agnihotri: "Re: A1000: Determining bad disk"
- Next in thread: Darren Dunham: "Re: A1000: Determining bad disk"
Date: Mon, 29 Sep 2003 08:51:13 +0200
On Sat, 27 Sep 2003, Vikas Agnihotri wrote:
> On Thu, 25 Sep 2003 18:33:09 GMT, Darren Dunham <ddunham@redwood.taos.com>
> wrote:
>
> >>> I am seeing some SCSI transport failures in /var/adm/messages on one of
> >>> my LUNs. The A1000 has all RAID5 luns.
> >>>
> >>> I suspect the disk is going bad.
> >
> > Why? If you do, you should run rm6 and run a healthcheck.
>
> I don't like the rm6 GUI; the CLI equivalent is 'healthck', right? I did a
> 'healthck -a' and got 'Optimal'. I didn't expect anything else.
Well, if you get Optimal, then the A1000 itself thinks it's OK.
> I don't know how thorough 'healthck' is anyway. Say the disk was going bad,
> and I knew about it proactively, I could, on-demand, mark the drive failed
> using 'drivutil' and take the reconstruction hit when I want to instead of
> waiting for it to happen anytime!
If a disk were failing, the A1000 would probably give you a few events
anyway; it's quite good at kicking bad drives. That's why you use RAID5, so
that it CAN kick a drive without you losing data. If you have a hot spare
activated, that's even better.
> How about 'parityck', is that a more exhaustive disk check?
Yes and no. It checks the parity of the RAID5 LUN, which it happens to do
by reading all the data on the disks. That does test the disks in a way,
but it's really the data on them that gets checked.
> Anyway, in this particular case, as it turned out, my SCSI errors were due
> to the "disconnected tagged commands", for which Sun support suggested that
> I consider reducing 'set sd:sd_max_throttle' (in /etc/system) to something
> like 10 (default is 256) or so.
Yup, it's in the best practice for A1000-A3500.
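For reference, the change Sun support suggested would look roughly like the
fragment below. This is only a sketch: the value 10 is the one quoted above,
not a figure that is right for every setup, and /etc/system changes take
effect only after a reboot.

```
* /etc/system: cap the number of commands the sd driver queues per LUN.
* 10 is the value Sun support suggested in this thread; tune it for your setup.
set sd:sd_max_throttle=10
```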
> Is this common practice to throttle down the 'sd' driver with the RAID
> A1000? Is this because the disks are too fast for the sd driver? [Or is it
> the other way around?]
I'll leave that for a SCSI expert, but basically it doesn't have to do with
slow or fast, but rather with how many SCSI commands you can "queue" to the
controller. Don't remember all the facts, but I'm sure someone else does
:-)
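To make the queueing point concrete, here is a rough sketch of a common rule
of thumb (not something stated in this thread): since the throttle applies per
LUN, divide the number of commands the controller can hold by the number of
LUNs behind it. The function name and the figures 64 and 6 are made up for
illustration; look up the real queue depth for your own array. It needs a
POSIX shell such as ksh for the arithmetic.

```shell
# Hypothetical helper: split a controller's command queue evenly across LUNs.
# Both inputs are assumptions; they are not values taken from the A1000 docs.
per_lun_throttle() {
    queue_depth=$1   # commands the controller can hold at once (assumed)
    lun_count=$2     # LUNs sharing that controller
    echo $(( queue_depth / lun_count ))   # integer division, rounds down
}

# Example with made-up numbers: 64 queue slots shared by 6 LUNs.
per_lun_throttle 64 6
```

With those made-up inputs the result is 10, which happens to be in the same
range as the value Sun support suggested.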
/Johan A