Re: Cluster mystery: one-way MSCP disk serving?

sms_at_antinode.org
Date: 01/30/04


Date: Fri, 30 Jan 2004 10:22:13 -0600 (CST)


   Answers to recent questions, and a few more data:

--------

   Are you being served? Yes, for the disks of interest:

ALP $ show device /full dka0

Disk ALP$DKA0:, device type FUJITSU MAA3182SC, is online, mounted, file-oriented
    device, shareable, served to cluster via MSCP Server, error logging is
    enabled.
[...]

ALP2 $ show device /full dka0

Disk ALP$DKA0:, device type FUJITSU MAA3182SC, is online, mounted, file-oriented
    device, shareable, available to cluster, error logging is enabled.
[...]
  Volume is also mounted on ALP.

Disk ALP2$DKA0:, device type SEAGATE ST118202LC, is online, mounted, file-
    oriented device, shareable, served to cluster via MSCP Server, error logging
    is enabled.
[...]

   I chose MSCP_SERVE_ALL = 4 to prevent serving a CD-R/RW drive
(ALP$DKA500) which I like to leave turned off most of the time. If it
happens to be on at start-up, and if it is MSCP-served, the error log
gains a lot of complaints when it's turned off. The intent was to use
SET DEVICE /SERVED if/when another hard disk was added, or if any reason
arose to serve the CD-ROM drive (ALP$DKA400). Currently the only disks
of interest _are_ the system disks, so "Serve the system disk" is
exactly what I want.
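
   For anyone following along, MSCP_SERVE_ALL is a bit mask. Here is a
minimal Python sketch of decoding it. Only "4 = serve the system disk"
is confirmed in this post; the meanings given for bits 0 and 1 are my
recollection of the SYSGEN parameter documentation and should be checked
against SYSGEN HELP for your VMS version. The function name is made up
for illustration.

```python
# Bit meanings: value 4 is confirmed in this post; values 1 and 2 are an
# assumption recalled from the SYSGEN documentation, so verify them.
MSCP_SERVE_ALL_BITS = {
    1: "serve all available disks",
    2: "serve locally attached disks",
    4: "serve the system disk",
}


def decode_mscp_serve_all(value):
    """Return the descriptions of the low three bits that are set in value."""
    return [text for bit, text in MSCP_SERVE_ALL_BITS.items() if value & bit]


print(decode_mscp_serve_all(4))
```

With a value of 4 only the system-disk entry is returned, which matches
the intent described above: serve the system disk by default, and use
SET DEVICE /SERVED for anything else.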

> I hope you're using different values for SCSSYSTEMID on the 2 nodes.

   Oh, ye of little confidence. As I said, "suitably adjusted". SYSGEN
SHOW /SCS differences:

ALP $ diff /merged = 0 ALP.SCS ALP2.SCS
************
File SYS$SYSROOT:[SYSMGR]ALP.SCS;1
    4 SCSCONNCNT 21 40 2 32767 Entrie
******
File SYS$SYSROOT:[SYSMGR]ALP2.SCS;1
    4 SCSCONNCNT 14 40 2 32767 Entrie
************
************
File SYS$SYSROOT:[SYSMGR]ALP.SCS;1
    8 SCSSYSTEMID 1119 0 -1 -1 Pure-n
    9 SCSSYSTEMIDH 0 0 -1 -1 Pure-n
   10 SCSNODE "ALP " " " " " "ZZZZ" Ascii
******
File SYS$SYSROOT:[SYSMGR]ALP2.SCS;1
    8 SCSSYSTEMID 1118 0 -1 -1 Pure-n
    9 SCSSYSTEMIDH 0 0 -1 -1 Pure-n
   10 SCSNODE "ALP2 " " " " " "ZZZZ" Ascii
************
[...]
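
   The check behind that question (no two nodes sharing an SCSSYSTEMID
or SCSNODE) can be sketched in a few lines of Python. The values are
copied from the SYSGEN listings above; WUSS is left out because its
values are not shown in this post. The function name is hypothetical.

```python
from collections import Counter

# Values copied from the SYSGEN SHOW /SCS differences above.
nodes = {
    "ALP": {"SCSSYSTEMID": 1119, "SCSNODE": "ALP"},
    "ALP2": {"SCSSYSTEMID": 1118, "SCSNODE": "ALP2"},
}


def duplicates(nodes, parameter):
    """Return the values of `parameter` that more than one node uses."""
    counts = Counter(settings[parameter] for settings in nodes.values())
    return sorted(value for value, n in counts.items() if n > 1)


for parameter in ("SCSSYSTEMID", "SCSNODE"):
    print(parameter, "duplicates:", duplicates(nodes, parameter))
```

An empty list for both parameters means no clash among the nodes listed.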

--------

   The consensus is that the disks are MSCP-served:

ALP $ show device /served
       MSCP-Served Devices on ALP 30-JAN-2004 10:36:28.90

                                             Queue Requests
Device:          Status     Total Size     Current    Max    Hosts
    ALP$DKA0     Online       35680750           0      0        1
[...]

ALP2 $ show device /served
       MSCP-Served Devices on ALP2 30-JAN-2004 10:37:49.83

                                             Queue Requests
Device:          Status     Total Size     Current    Max    Hosts
    ALP2$DKA0    Avail        35566480           0      0        0
[...]

   The difference between "Online" and "Avail" could be interesting.

--------

> If the first node doesn't have a process named "CONFIGURE"
> running then it's not listening.
>
> $ @sys$system:startup "CONFIGURE"

   Neither system has a process named "CONFIGURE". One of us seems to
be confused about what to expect from 'SYS$SYSTEM:STARTUP.COM
"CONFIGURE"'.

--------

> PWS 500a is NOT a supported platform for OpenVMS.

   Yeah, yeah. Since I added the Qlogic SCSI card, it says it's a
500au. Who ya gonna believe, the firmware or the plastic gewgaw?

   Also, as reported a long time ago, I have the same problem with my
(almost supported) VAXstation 3100 M38, running VMS V7.2.
("%WBM-I-WBMINFO Deleting all bitmaps mastered by this node." I suppose
I should upgrade.)

   Now (WUSS serves everything, by the way):

ALP $ show device dk

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
ALP$DKA0:               Mounted              0  VMS073ALP     13835920   907   3
ALP$DKA400:             Online wrtlck        0

ALP2 $ show device dk

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
ALP$DKA0:               Mounted              0  VMS073ALP     13835780     5   3
ALP2$DKA0:              Mounted              0  VMS073ALP2    13343820   601   2
ALP2$DKA400:            Online wrtlck        0
WUSS$DKA200:            Mounted              0  VMS062WUSS      630252     1   2
WUSS$DKA300:            Mounted              0  WUSS_SCSI_3    2630232     1   2
WUSS$DKA400:            Online               0

WUSS $ show device dk

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
ALP$DKA0:               Mounted              0  VMS073ALP     13835780     2   3
ALP2$DKA0:              Mounted              0  VMS073ALP2    13343820     1   2
WUSS$DKA200:            Mounted              0  VMS062WUSS      630252   341   2
WUSS$DKA300:            Mounted              0  WUSS_SCSI_3    2630232     5   2
WUSS$DKA400:            Online wrtlck        0

   So, newcomers to the cluster work fine (see everything), but the
first one appears to be blind/deaf to the newcomers.
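
   To make the asymmetry explicit, here is a small Python sketch that
works out which other hosts' disks each node can see. The device names
are copied from the three SHOW DEVICE DK listings above. Treating the
text before the "$" as the serving host is an assumption that holds for
these listings; the function name is made up for illustration.

```python
# Device names copied from the three SHOW DEVICE DK listings above.
seen_by = {
    "ALP": {"ALP$DKA0", "ALP$DKA400"},
    "ALP2": {"ALP$DKA0", "ALP2$DKA0", "ALP2$DKA400",
             "WUSS$DKA200", "WUSS$DKA300", "WUSS$DKA400"},
    "WUSS": {"ALP$DKA0", "ALP2$DKA0",
             "WUSS$DKA200", "WUSS$DKA300", "WUSS$DKA400"},
}


def remote_hosts_seen(node, devices):
    """Return the other hosts whose disks appear in this node's listing."""
    # Assumption: the prefix before "$" names the host that owns the disk.
    hosts = {name.split("$", 1)[0] for name in devices}
    return sorted(hosts - {node})


for node, devices in seen_by.items():
    print(node, "sees disks from:", remote_hosts_seen(node, devices))
```

ALP comes back with an empty list, while ALP2 and WUSS each list the
other two hosts.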

--------

   For a good time, I tried SET DEVICE /SERVED on the tape drives on
these systems with similar results:

ALP $ show device /full mk

Magtape ALP$MKA200:, device type EXABYTE EXB-8505SMBANSH2, is online, record-
    oriented device, file-oriented device, served to cluster via TMSCP Server,
    error logging is enabled, controller supports compaction (compaction
    disabled), device supports fastskip (per_io).
[...]

ALP2 $ show device /full mk

Magtape ALP$MKA200:, device type EXABYTE EXB-8505SMBANSH2, is online, file-
    oriented device, available to cluster, error logging is enabled.
[...]

Magtape ALP2$MKA500:, device type EXABYTE EXB-8505SMBANSH2, is online, file-
    oriented device, served to cluster via TMSCP Server, error logging is
    enabled, controller supports compaction (compaction disabled), device
    supports fastskip (per_io).
[...]

WUSS $ show device /full mk

Magtape ALP$MKA200:, device type EXABYTE EXB-8505SMBANSH2, is online, file-
    oriented device, available to cluster, error logging is enabled, controller
    supports compaction (compaction disabled).
[...]

Magtape ALP2$MKA500:, device type EXABYTE EXB-8505SMBANSH2, is online, file-
    oriented device, available to cluster, error logging is enabled, controller
    supports compaction (compaction disabled).
[...]

   Again, everyone says he's serving his tape drive, but only the
non-first cluster members see remote tape drives.

--------

> Not a cluster licensing problem perhaps? [...]

   Everyone has an active VMSCLUSTER (and/or VAXCLUSTER) license loaded,
and they're not NO_SHARE, so I don't see a problem there. Also, in case
of conflict, I'd expect the first one up to do better than the others,
but it's the opposite. Only the first one up seems to be handicapped.

--------

   It's still a mystery.

------------------------------------------------------------------------

   Steven M. Schweda (+1) 651-699-9818
   382 South Warwick Street sms@antinode-org
   Saint Paul MN 55105-2547


