Failed drive in Raidshelf - hint needed

From: Christian Wessely (christian.wessely_at_uni-graz.at)
Date: 08/28/03

    Date: Thu, 28 Aug 2003 08:40:44 +0200
    To: tru64-unix-managers@ornl.gov
    
    

    hello admin wizards,

    this is not strictly a Tru64 question, but I'll risk posting it here anyway :o)

    this morning I found that on my mini-cluster (2x DEC Alpha
    1000/366 with 2x HSZ40 and Raidshelf) one of the disks of the main RAID
    set (6x4 GB RAID 5) has obvious problems (amber LED blinking); the
    acoustic alarm on the HSZ40 controllers also went off.

    Fortunately, the system did what it was supposed to do: it moved to one of
    the defined spares, so everything works fine at the moment.

    some questions remain, however:
    1) I tried to use HSZTERM to find out WHY the disk (200/0/0) has failed,
    but a show failedset full does not reveal much beyond the fact that
    the disk is now part of FAILEDSET :o\ - are there any other commands that
    I can use? And is there any known way to "revive" the disk? The cluster is
    no longer under service ...

    2) It happens that our two spare disks are located in the row above the
    main set and in the first two columns - i.e. the main RAID set is 100/0/0
    to 600/0/0, the spares are 110/0/0 and 210/0/0.
    The failed disk is the 200, and the controller has chosen the 210 as the
    hot spare.
    I always thought the controller would use the first disk of the spareset
    for reconstructing the RAID set - but it used the second (same column as
    the failed disk).
    So my naive question is: does a spare necessarily have to be in the same
    column as the main disk, so that - if e.g. the 500 were to fail now - the
    controller would not find the remaining spare on 110?

    thanks for any hint

    CW

    -- 
    YS, CW
    -----------------------------
    Christian Wessely
    http://www-theol.uni-graz.at
    
