Cluster crash

From: Rudolf Gabler (rug_at_usm.uni-muenchen.de)
Date: 03/18/05

    Date: Fri, 18 Mar 2005 12:40:30 +0100
    To: tru64-unix-managers@ornl.gov
    
    

    Hi managers,

    My three-member cluster running V5.1B PK4 crashed after one member lost one
    of its DIMMs. cluster_root was badly damaged when I tried to repair it with
    fixfdmn (from a rescue installation).
    So I rebuilt cluster_root exactly as documented in the "Troubleshooting
    Clusters" section of the Cluster Administration manual: I recreated it on
    the same disk and restored it from a fresh backup.
    I have a rescue system on which the hardware view is the same as on the
    cluster, and I can mount any of the filesystems from it.

    When I try to boot the first member with the original configuration,
    specifying the major and minor device numbers:

         vmunix cfs:cluster_root_dev1_maj=19 cfs:cluster_root_dev1_min=277 clubase:cluster_expected_votes=1 clubase:cluster_qdisk_votes=0

    I get:
         ..
         Waiting for cluster mount to complete
         panic (cpu 0): cfs_issue_localroot_do_mount: namei on boot partition mp failed

    The same happens if I boot the first member without the clubase: kernel
    attributes. That member then waits until another member brings the cluster
    to quorum, and it crashes with the same panic message as soon as the second
    member reaches cluster connect.
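
    For reference, the quorum behaviour in the two boot attempts can be
    sketched roughly as follows. This is only an illustration: it assumes the
    quorum formula given in the TruCluster documentation, quorum votes =
    round_down((cluster_expected_votes + 2) / 2), and it assumes one vote per
    member and no quorum disk vote (the real vote assignment of this cluster
    is not stated here). The function names are hypothetical.

```python
def quorum_votes(expected_votes: int) -> int:
    """Votes needed for quorum, assuming the documented TruCluster formula
    round_down((cluster_expected_votes + 2) / 2)."""
    return (expected_votes + 2) // 2


def has_quorum(current_votes: int, expected_votes: int) -> bool:
    """True if the votes currently present are enough to form the cluster."""
    return current_votes >= quorum_votes(expected_votes)


if __name__ == "__main__":
    # Boot with clubase:cluster_expected_votes=1 (assumed: the member has 1 vote).
    print("expected=1, one member up:", has_quorum(1, 1))

    # Boot without the override (assumed: 3 members, 1 vote each, expected=3).
    print("expected=3, one member up:", has_quorum(1, 3))
    print("expected=3, two members up:", has_quorum(2, 3))
```

    Under these assumptions a single member forms the cluster on its own when
    cluster_expected_votes=1, while with the default of three expected votes
    it has to wait for a second member, which matches the two cases above.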

    I googled for the cfs_issue error without any success.

    Does anyone have any advice?

    Best regards,

    Rudolf Gabler

