Re: OpenVMS Management Station - cluster storage ????
From: Keith Parris (keithparris_NOSPAM_at_yahoo.com)
Date: 9 Jul 2004 15:30:05 -0700
email@example.com (Dave Baxter) wrote in message news:<firstname.lastname@example.org>...
> As far as I am aware, I have my SAN/CLUSTER configured for high
> redundancy, and therefore I am trying to figure out why my system had
> such a bad time when all that happened was that a controller failed.
Based on your description, it certainly sounds like you have a
fully-redundant configuration (dual HBAs, dual fabrics, dual
controller-pairs, shadowing across controller pairs). So things should
have worked. When they don't, it usually takes some in-depth
analysis with the support folks to figure out what went wrong.
They'll look at things like patch status on VMS, SAN configuration
(and error counters on switches), firmware levels on all the hardware,
the settings of timeout values such as the SYSGEN parameter
SHADOW_MBR_TMO and the member timeouts that can be set with DCL
commands, error logs, console logs, and so forth, and get you an
answer if it's humanly possible to do so.
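
If you want to gather the timeout settings yourself before calling
support, something along these lines should do it. This is only a
sketch: DSA12: is taken from the message you quoted, and what
SHOW DEVICE/FULL reports will vary with your VMS version.

```dcl
$! Display the shadowing timeout parameters currently in effect.
$ MCR SYSGEN SHOW SHADOW_MBR_TMO
$ MCR SYSGEN SHOW SHADOW_SYS_TMO
$!
$! Display the shadow set and the state of each of its members.
$ SHOW DEVICE/FULL DSA12:
```

Run these on each node in the cluster and compare the results, since
SYSGEN parameters are set per node.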
> When a disk controller fails, it is supposed to failover all of the
> drives it has responsibility for to its partner.
Yes, it is supposed to. I've run into a few cases over the years where
it hasn't, but that's the very reason you shadow between different
controller pairs -- that should have covered that unlikely case.
> The message I received from the Management station was
> "PROD11: Shadow set DSA12: has no member device on NODE01"
> This implies to me that the Shadow set has no members !!? Is this
> what it is really saying ???
Things must be very bad if NODE01 can't see ANY of the shadowset
members. Any chance your dual SAN fabrics got accidentally connected
into one fabric?
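
A quick way to see what NODE01 itself thinks is to ask it directly.
The commands below are a sketch; the DG prefix assumes the usual Fibre
Channel disk naming ($1$DGAnnn), so substitute your own device names
if yours differ.

```dcl
$! Run on NODE01: show the shadow set and whichever members it has.
$ SHOW DEVICE DSA12:
$!
$! List the Fibre Channel disks this node can see at all.
$ SHOW DEVICE DG
```

If the member disks show up in the second listing but not under
DSA12:, the problem is more likely in shadowing than in the SAN paths.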
It's clear that it is going to take much deeper analysis than the
output from the Management Station to get to the bottom of this.
Console output from both the VMS systems and the HSG controllers (I
hope it was saved) and error logs both on VMS and the controllers are
going to be crucial pieces of the puzzle.