DIFVOLMNT (%X0072832C), then bugcheck when booting any node.
From: Galen (gspamtackett_at_yahoo.com)
Date: 10/03/03
Date: 3 Oct 2003 04:47:26 -0700
We've gotten into this situation with our cluster twice recently. (I'm
not referring to a VOLALRMNT error, which is a different numerical
status.)
Configuration is:
* A single OpenVMS Alpha V7.3-1 system disk which is very current on
patches.
* System disk lives on an HSG80, reached via a SAN core switch.
* Satellites do not have any shared-storage connections (i.e. no DSSI,
no Fibre Channel, no shared SCSI).
* 13 boot servers and 7 satellites, all Alphas.
* Running StorageWorks RAID software (not sure how relevant).
In both cases, we had recently run CLUSTER_CONFIG to add a new server
node. However, in each case, the new node had no physical LAN
connection (fiber not hooked up) and took a CLUEXIT bugcheck after a
few minutes.
Each time, shortly after the CLUEXIT, we got the node's LAN connection
working and re-ran CLUSTER_CONFIG. Just after the new node reached the
point where it reports there's no pagefile on the system disk
(%SYSINIT-I-PAGEFILE), it reported:
%SYSINIT-E-Error mounting system device, status = 0072832C
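For readers who want to check that this status is not VOLALRMNT, the hex value can be split into the standard OpenVMS condition-value fields (severity in bits 0-2, message number in bits 3-15, facility number in bits 16-27). The following is only a sketch in Python; the function name decode_condition is made up for illustration, and it does not look up message text the way the system message files do.

```python
def decode_condition(value):
    """Split a 32-bit OpenVMS condition value into its standard fields."""
    # Severity codes as defined for OpenVMS condition values.
    severities = {0: "W", 1: "S", 2: "E", 3: "I", 4: "F"}
    return {
        "severity": severities.get(value & 0x7, "?"),   # bits 0-2
        "message_number": (value >> 3) & 0x1FFF,        # bits 3-15
        "facility_number": (value >> 16) & 0xFFF,       # bits 16-27
        # Bit 15 set means the message is facility-specific, not system-wide.
        "facility_specific": bool(value & 0x8000),
    }


fields = decode_condition(0x0072832C)
print(fields)
```

For 0072832C this gives severity F (fatal) and facility number 114 (hex 72), which is consistent with a fatal, facility-specific MOUNT message such as DIFVOLMNT.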
We checked these things:
* No other clusters with same cluster ID (we only have one other
cluster)
* All systems have VAXCLUSTER set to 2.
* The volume label on the system disk has not been changed since the
cluster was last booted.
The only solution we've found is to reboot the cluster (not a pleasant
option, of course).
But we're just as concerned to find out what's causing this. I suspect
that the CLUEXIT during CLUSTER_CONFIG somehow is involved but have
only a little circumstantial evidence, as described here.
HP software support and the maintainer of the MOUNT code have given us
a little script to periodically check the volume's SCB and report if
its checksum changes. Beyond that, they're out of ideas right now.
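The script itself is not reproduced here. Purely to illustrate the idea of polling a block and reporting when its checksum changes, here is a sketch in Python. Everything in it is an assumption: read_block is a hypothetical callable that returns the raw bytes being watched, and SHA-256 stands in for whatever checksum the real script compares. It is not the HP-supplied script and does not read an SCB.

```python
import hashlib
import time


def watch_for_change(read_block, interval_seconds=60, max_polls=None,
                     sleep=time.sleep):
    """Poll read_block() until its digest changes.

    Returns (poll_number, old_digest, new_digest) on the first change,
    or None if max_polls polls pass without any change.
    read_block is a hypothetical callable returning bytes.
    """
    previous = hashlib.sha256(read_block()).hexdigest()
    polls = 0
    while max_polls is None or polls < max_polls:
        sleep(interval_seconds)
        polls += 1
        current = hashlib.sha256(read_block()).hexdigest()
        if current != previous:
            # Report which poll saw the change so it can be matched
            # against cluster events (e.g. a CLUEXIT) by time.
            return polls, previous, current
    return None
```

Logging the time of the first change would allow it to be lined up against the CLUEXIT and CLUSTER_CONFIG runs described above.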
(FYI, the bad connections occur because our fiber cable plant is very
badly documented, has a lot of old labels, and some of the fibers have
been damaged at one time or another. But this is another issue.)
Thanks for any help or suggestions,
Galen