Strange messages during cluster reboot

From: Erdei Tamás (Erdei.Tamas_at_lnx.hu)
Date: 05/30/03

    Date: Fri, 30 May 2003 14:21:29 +0200
    To: "Sun Managers List" <sunmanagers@sunmanagers.org>
    
    

    Hi all,

    I am building a new cluster based on two Netra 20/T4 servers and two D2
    storage arrays, using Solaris 8, Sun Cluster 3.0 and DiskSuite 4.2.1.
    The disks in each D2 box are striped together, and the two boxes are mirrored
    onto each other. I think this is a fairly standard setup. (The disk slicing
    is the default created by SDS, e.g. slice 7 for the SDS metadb (2
    clusters) and the rest for slice 0.)
    Everything went well during the install: I installed SC 3.0, set the
    cluster quorum to one of the disks in the first D2, then set up the disk group
    (metaset), metadb, stripes and mirroring. I created and mounted the filesystem
    on the new disk group successfully.
    The system seems to be working properly, except for some strange system
    messages during a node or cluster reboot:

    May 30 00:40:27 n2 scsi: WARNING: /pci@8,700000/pci@3/scsi@4/sd@8,0 (sd67):
    May 30 00:40:27 n2 Error for Command: read(10) Error Level: Informational
    May 30 00:40:27 n2 scsi: Requested Block: 0 Error Block: 0
    May 30 00:40:27 n2 scsi: Vendor: FUJITSU Serial Number: 0303X78024
    May 30 00:40:27 n2 scsi: Sense Key: Unit Attention
    May 30 00:40:27 n2 scsi: ASC: 0x29 (<vendor unique code 0x29>), ASCQ: 0x2, FRU: 0x0

    This message gets repeated a few times.
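
A rough way to check how often the warning repeats, and for which devices, is to count the WARNING header lines per device path. The following is a minimal Python sketch, not part of the original setup: the function name count_scsi_warnings, the regular expression, and the sample list (the header line from the log above, repeated for illustration) are all assumptions.

```python
import re
from collections import Counter

# Matches the first line of an sd driver warning, with or without the
# "[ID ... kern.warning]" tag that some syslog configurations add.
WARNING_RE = re.compile(
    r"scsi: (?:\[ID [^\]]*\] )?WARNING: (?P<path>\S+) \((?P<inst>sd\d+)\):"
)


def count_scsi_warnings(lines):
    """Return a Counter mapping (device path, instance name) to warning count."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[(match.group("path"), match.group("inst"))] += 1
    return counts


# Illustrative input only: the header line from the log, repeated twice.
sample = [
    "May 30 00:40:27 n2 scsi: WARNING: /pci@8,700000/pci@3/scsi@4/sd@8,0 (sd67):",
    "May 30 00:40:27 n2 Error for Command: read(10) Error Level:",
    "May 30 00:40:27 n2 scsi: WARNING: /pci@8,700000/pci@3/scsi@4/sd@8,0 (sd67):",
]

for (path, inst), count in count_scsi_warnings(sample).items():
    print(f"{inst} {path}: {count}")
```

In practice the lines would come from the system log file rather than a hard-coded list; if only one device path shows up, the warnings are indeed limited to a single disk.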
    According to a SCSI sense key table, this message only means that the disk was
    reset (probably because of the reboot of the other node). The strange thing
    is that this message is generated only for the disk that is set as the
    cluster quorum device.
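
The lookup behind that reading can be written down as a small table. This is a minimal Python sketch: the dictionary holds only the 0x29 entries from the standard SCSI additional sense code assignments, and the names ASC_ASCQ and describe are hypothetical. The driver prints 0x29 as a "vendor unique code", but the value is a standard one.

```python
# Standard SCSI additional sense code (ASC) / qualifier (ASCQ) pairs for
# ASC 0x29. Only this family is listed because it is the one in the log.
ASC_ASCQ = {
    (0x29, 0x00): "Power on, reset, or bus device reset occurred",
    (0x29, 0x01): "Power on occurred",
    (0x29, 0x02): "SCSI bus reset occurred",
    (0x29, 0x03): "Bus device reset function occurred",
    (0x29, 0x04): "Device internal reset",
}


def describe(asc, ascq):
    """Return the text for an ASC/ASCQ pair, or a marker if it is not in the table."""
    return ASC_ASCQ.get((asc, ascq), f"unlisted code {asc:#04x}/{ascq:#04x}")


# Values taken from the log line "ASC: 0x29 (...), ASCQ: 0x2".
print(describe(0x29, 0x2))
```

With a sense key of Unit Attention, the pair 0x29/0x2 reports that a SCSI bus reset happened since the last command, which is consistent with the other node resetting the shared bus during its reboot.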

    Strange messages are generated on the booting node as well:

    May 30 00:41:51 n1 cl_runtime: [ID 606467 kern.warning] WARNING: CMM: Initialization for quorum device /dev/did/rdsk/d4s2 failed with error EACCES. Will retry later.
    May 30 00:41:53 n1 cl_runtime: [ID 847496 kern.warning] WARNING: CMM: Reading reservation keys from quorum device /dev/did/rdsk/d4s2 failed with error 2.
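
The two error values in these messages can be looked up in the standard errno table. This is a minimal Python sketch, assuming that the bare "error 2" in the second message is an ordinary errno value; the messages themselves do not confirm that, and the helper name errno_info is hypothetical.

```python
import errno
import os


def errno_info(code):
    """Return (symbolic name, description) for a numeric errno value."""
    name = errno.errorcode.get(code, "unknown")
    return name, os.strerror(code)


# EACCES is named in the first message; 2 is the bare number from the second.
for code in (errno.EACCES, 2):
    name, text = errno_info(code)
    print(f"{code}: {name} ({text})")
```

Under that assumption, error 2 is ENOENT ("No such file or directory") and EACCES is a permission failure. EACCES would fit the other node still holding a reservation on the quorum disk while this node boots, though that is only one possible reading.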

    I am new to Sun Cluster, so I am not sure whether these messages are harmless
    and can be ignored, or whether I missed something during the install. Could
    someone help me interpret these messages, or confirm that they are harmless?

    What I do not understand is that I can only select a whole disk as the quorum
    device, not a specific slice. The software selects slice 2 automatically,
    which is the overlap slice on normal disks, but this slice is not defined on
    the shared disks (it has 0 length). So where does the SC software store quorum
    info (reservation keys?) on the quorum disk? Does it use cluster 0? Is it
    possible that this conflicts with the SDS metadb, which is also stored on
    cluster 0?

    Apart from these strange messages, the cluster seems to work correctly. After
    the reboot, both nodes and the quorum disk are online (according to scstat),
    and the mirror and stripes are in the Okay state.

    I appreciate any help. I searched through the docs, list archives and usenet
    groups, but found nothing relevant.

    Thanks,
    Tamas

    -----------------------------------------------------------
    Tamas Erdei E-mail: erdei.tamas@lnx.hu
    Systems Engineer
    LNX Ltd.
    -----------------------------------------------------------

