SUMMARY: restore a crashed test cluster

lars.rieneck_at_tellabs.com
Date: 04/13/04

    Date: Tue, 13 Apr 2004 10:39:24 +0200
    To: tru64-unix-managers@ornl.gov
    
    

    Hello all

    Thanks to Joachim Jaeckel, Martin Rønde Andersen and Bob Collins
    for the very fine feedback.

    Problem: I have a 5.1B cluster with dupatch 3. To test our crash restore
    procedure, I took a full backup to tape with vdump and then zeroed all
    disks, including the quorum disk. I booted from the UNIX CD-ROM and
    restored the following file domains/filesets:

            1) cluster_root#root
            2) cluster_usr#usr
            3) cluster_var#var
            4) root1_domain#root

    I then went back to the boot prompt and tried to start the first member
    with the following command:

    >>> boot -fl ai dkb100.1.0.15.0

    At the interactive boot prompt I entered the following kernel
    parameters, so that the member could boot without a quorum disk:

    Enter: <kernel_name> [option_1 ... option_n]
      or: ls [name]['help'] or: 'quit' to return to console
    Press Return to boot 'vmunix'
    # vmunix clubase:cluster_expected_votes=1 clubase:cluster_qdisk_votes=0

    The boot procedure started, but stopped after the following messages:

    Waiting for cluster mount to complete
    clsm: checking for peer configurations
    clsm: initialized
    CNX QDISK: Successfully claimed quorum disk, adding 0 vote.

    Solution 1: Bob Collins sent me a procedure for restoring a cluster
    onto different hardware, which was very useful. Thanks.

    Solution 2: When I booted from the UNIX CD-ROM, the disks got different
    device names than they had on the restored system. I had to rename the
    disk devices so that their numbering matched the system I wanted to
    restore.

    Solution 3: When I back up the cluster, the backup includes the
    disklabels. It is a good idea to keep the cnx partition on the member
    boot disk; that spared me from setting the
    cfs:cluster_root_dev1_min/maj values during the boot sequence.

    Solution 4: When the cluster members are booted, both members MUST be
    booted with the following command:

    vmunix clubase:cluster_expected_votes=1 clubase:cluster_qdisk_votes=0

    When both members are up, the quorum disk has to be deleted before it
    can be created again.
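
    As a rough illustration of why these boot parameters matter, here is a
    small Python sketch of the quorum arithmetic. It assumes the formula
    from the TruCluster documentation as I recall it:
    quorum votes = round_down((cluster_expected_votes + 2) / 2).
    The function names and the example configuration (two members with one
    vote each plus a one-vote quorum disk) are illustrative assumptions, not
    taken from this cluster.

```python
def quorum_votes(expected_votes: int) -> int:
    """Votes needed for quorum, assuming round_down((expected + 2) / 2)."""
    return (expected_votes + 2) // 2


def has_quorum(current_votes: int, expected_votes: int) -> bool:
    """True if the votes currently present reach the quorum threshold."""
    return current_votes >= quorum_votes(expected_votes)


# Hypothetical normal setup: 2 members + 1 quorum disk = 3 expected votes.
# A single member (1 vote) with the quorum disk wiped cannot reach quorum.
lone_member_normal = has_quorum(current_votes=1, expected_votes=3)

# With cluster_expected_votes=1 and cluster_qdisk_votes=0 on the boot line,
# the same single member is enough to form the cluster.
lone_member_override = has_quorum(current_votes=1, expected_votes=1)

print(lone_member_normal, lone_member_override)
```

    Under these assumptions a lone member cannot form the cluster while
    three votes are expected and the quorum disk is gone, which would fit
    the need to lower the expected votes on the boot line.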

    /Lars


