Re: hdisks missing

From: Klaus Oberle (Klaus.Oberle_at_LINDE-MH.DE)
Date: 04/25/03

    Date:         Fri, 25 Apr 2003 15:32:10 +0200
    To: aix-l@Princeton.EDU
    
    

    >Hdisks will not be configured if they are in use by another machine.
    >Yes you will need to varyoff the hdisks from the production node in
    >order for the failover machine to configure them.

    Is it possible to break the disk reservations with "varyonvg -b -u"
    from the production node - or is this too risky?

    >My question would be: how did you lose them in the first place?
    >After simply shutting down and restarting the node, the devices that
    >were configured before should have stayed configured.

    I am not absolutely certain that the current situation has existed only
    since the upgrade. I remember that this has occurred occasionally in the
    past: when a node was rebooted it SOMETIMES lost the other node's hdisks,
    but I always got them back with rmdev and cfgmgr.

    From: Bruce Zimmer <b.r.zimmer@WORLDNET.ATT.NET>
    Sent by: IBM AIX Discussion List <aix-l@Princeton.EDU>
    Date: 25.04.2003 14:56
    To: aix-l@Princeton.EDU
    Cc:
    Subject: Re: hdisks missing
    Please reply to: IBM AIX Discussion List

    Hdisks will not be configured if they are in use by another machine.
    Yes you will need to varyoff the hdisks from the production node in
    order for the failover machine to configure them. My question would be:
    how did you lose them in the first place? After simply shutting down and
    restarting the node, the devices that were configured before should
    have stayed configured.

    Bruce Zimmer
    Central Data Systems
    (248) 615-4644 (direct)
    (248) 320-1175 (cell)
    bzimmer@centraldata.com

    -----Original Message-----
    From: IBM AIX Discussion List [mailto:aix-l@Princeton.EDU] On Behalf Of
    Klaus Oberle
    Sent: Friday, April 25, 2003 6:44 AM
    To: aix-l@Princeton.EDU
    Subject: Re: hdisks missing

    Thanks Simon,

    > I guess from your post that you have a resource group running on each
    > node, in mutual takeover. So some disks are used by one node, some by
    > the other when everything's running normally. At the moment, each node
    > is OK - taken in isolation - so the actual disk drives must be working.

    YES.

    They are two MCA Highnodes (Node1, 7x24 production + Node7, testbox)
    connected to one 7133-020. Each node has two SSA Enhanced Adapters with
    identical FW (3202). On all adapters only the A ports are used, and each
    node can see pdisk0 to pdisk7 in one loop and pdisk8 to pdisk15 in a
    second loop.

    To clarify the HW upgrade:
    We inherited a Highnode (fully populated with 8 procs and 4GB of RAM)
    from another company. Our Node1 had only 4 procs and 2GB, so we decided -
    together with our IBM-TA - to replace its complete CPU/RAM section with
    the one from the inherited node. The I/O part of Node1 (including
    cabling) was therefore left untouched. After this modification, and once
    Node1 had booted successfully, we plugged the 2GB of RAM from the "old"
    Node1 into Node7.

    maymap shows the loops correctly at both nodes and lscfg lists all
    pdisks at both nodes.

    I did a "varyoffvg testvg" at Node7 and removed all hdisks at Node1
    owned by Node7 with rmdev -dl. Then I ran cfgmgr, which brought the
    disks back:

    hdisk4 00061189b103c28c testvg
    hdisk5 00061189b66695a2 testvg
    hdisk6 00061189b66699e4 testvg
    hdisk7 00061189b66840af testvg
    hdisk8 00201586ae7f0a89 prodvg
    hdisk9 00201586ae7f0d9f prodvg
    hdisk10 00201586ae7f10a2 prodvg
    hdisk11 00062764f9e07176 prodvg
    hdisk12 00061189b0fce6b7 testvg
    hdisk13 00061189b0fcfd25 testvg
    hdisk14 00061189b0fd03cb testvg
    hdisk15 00061189b10026fd testvg
    hdisk16 00061189b1002a6e prodvg
    hdisk17 0020158654365297 prodvg
    hdisk18 00061189b1002e16 prodvg
    hdisk19 00062764f9e075b3 prodvg
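
A listing like the one above can be tallied per volume group, which makes it easier to compare what each node sees. This is only a minimal sketch: the function name count_by_vg is made up for illustration, and it assumes the three-column lspv layout shown above (hdisk name, PVID, volume group).

```shell
# Count hdisks per volume group from lspv-style output on stdin.
# Assumes three whitespace-separated columns: name, PVID, volume group.
count_by_vg() {
    awk 'NF >= 3 { n[$3]++ } END { for (vg in n) print vg, n[vg] }' | sort
}

# Excerpt of the Node1 listing above, used as sample input.
count_by_vg <<'EOF'
hdisk4 00061189b103c28c testvg
hdisk8 00201586ae7f0a89 prodvg
hdisk9 00201586ae7f0d9f prodvg
hdisk12 00061189b0fce6b7 testvg
EOF
```

On a live node the same function could be fed with "lspv | count_by_vg". Given the full sixteen-line listing above it should report 8 disks each for prodvg and testvg.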

    However, this didn't help at Node7. After cfgmgr the hdisks are still
    missing. Of course, I cannot vary off prodvg, but I believe that's not
    necessary, is it?
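
The sequence described above (after "varyoffvg testvg" on Node7, remove the Node7-owned hdisks on Node1, then rerun cfgmgr) can be written out as a dry run. This is a sketch only: plan_rediscover is a hypothetical helper name, and it prints the commands rather than executing them, so the list can be reviewed before anything is removed.

```shell
# Dry run: print the commands instead of executing them.
# Arguments: the hdisk names to remove and rediscover.
plan_rediscover() {
    for disk in "$@"; do
        echo "rmdev -dl $disk"   # delete the device definition
    done
    echo "cfgmgr"                # rescan and reconfigure devices
    echo "lspv"                  # check which hdisks came back
}

# The testvg disks as Node1 names them in the listing above.
plan_rediscover hdisk4 hdisk5 hdisk6 hdisk7 hdisk12 hdisk13 hdisk14 hdisk15
```

Printing first is a deliberate choice here: hdisk numbering can differ between nodes, so the names should be checked against lspv on the node where the commands will actually run.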

    /klaus

    From: "Green, Simon" <Simon.Green@EU.ALTRIA.COM>
    Sent by: IBM AIX Discussion List <aix-l@Princeton.EDU>
    Date: 25.04.2003 11:35
    To: aix-l@Princeton.EDU
    Cc:
    Subject: Re: hdisks missing
    Please reply to: IBM AIX Discussion List

    I guess from your post that you have a resource group running on each
    node, in mutual takeover. So some disks are used by one node, some by
    the other when everything's running normally. At the moment, each node
    is OK - taken in isolation - so the actual disk drives must be working.

    I can't really think of anything which would definitely cause the sort
    of problem you're seeing, but here are a few things to check: maybe one
    of them will suggest something to you.

    What sort of SSA drawer is it? If it's a 7133-020 or D40, how is it
    cabled and how are the bypass cards set?

    What does SSA Link Verification tell you? (From the diagnostic Service
    Aids.) Run "maymap" if you have it. Although you have not made any
    deliberate changes to the SSA loop it's possible that the cables were
    disconnected in order to gain access to the node for the upgrade. Are
    you certain everything got put back in the right place?

    Do you still have all of the volume groups defined on both systems? (If
    you've been deleting and re-defining disks, you'll probably need to
    export and re-import some of these.)

    What are the microcode levels of the adapters? Make sure that they're
    both the same.

    Did you re-boot the two nodes simultaneously? I have had problems -
    particularly with old MCA nodes using Enhanced 4-port Adapters - that if
    two nodes in the same loop try to configure their SSA devices at the
    same time strange things can happen, including devices going missing.
    Always stagger a reboot - even if it's only by half a minute or so.

    I think I'd want to shutdown both nodes, then reboot just one of them
    and examine the SSA devices BEFORE re-starting HACMP. If you have HACMP
    starting automatically, disable that temporarily. Once one node is OK,
    boot the second one. Only when both nodes' SSA config is OK should you
    start HACMP.
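
The "examine the SSA devices before re-starting HACMP" step can be made slightly more mechanical by counting how many devices are in the Available state. This is a rough sketch under assumptions: count_available is a made-up helper, it assumes the device state is the second column of lsdev output, and the sample lines below are hypothetical placeholders rather than output from the nodes in this thread.

```shell
# Count devices reported as "Available" in lsdev-style output on stdin.
# Assumes the device state is the second whitespace-separated column.
count_available() {
    awk '$2 == "Available" { n++ } END { print n + 0 }'
}

# Hypothetical sample lines, for illustration only.
count_available <<'EOF'
pdisk0 Available LOC SSA Physical Disk Drive
pdisk1 Defined   LOC SSA Physical Disk Drive
EOF
```

On a node, "lsdev -Cc pdisk | count_available" and "lsdev -Cc disk | count_available" would give the pdisk and hdisk counts to compare after each staggered boot. In the setup described in this thread, each node should see sixteen pdisks (pdisk0 to pdisk15).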

    Simon Green
    Altria ITSC Europe s.a.r.l.


    > -----Original Message-----
    > From: Klaus Oberle
    > Sent: 24 April 2003 12:01
    > To: aix-l@Princeton.EDU
    > Subject: hdisks missing
    >
    >
    > Hi *,
    >
    > I have an HACMP cluster consisting of two old SP Highnodes (AIX 4.3.3,
    > ML 08) which share one SSA drawer. Recently they were both upgraded by
    > adding additional procs and memory from other obsolete Highnodes. After
    > the upgrade, both machines came up and the cluster applications run
    > fine. The problem is that "lspv" on each node lists only the hdisks
    > which belong to the active VG of that node - hdisks from the other node
    > are no longer there. On the other hand, every node can see, besides its
    > own pdisks, the pdisks that belong to the other node. (OK - cabling or
    > anything else wasn't changed during the hardware upgrade.)
    >
    > To get the missing hdisks back (for proper failover), I removed them
    > first (rmdev -dl hdiskX ..) and ran "cfgmgr", without success. The
    > hdisks are still missing. Any hints how to solve this???


