SUMMARY: Sun Cluster 3.2, did devices over 100



Hi all,

Thanks for the responses. There were only three (Francisco Mauro Puente,
Dean Ross-Smith, Martin Pre_laber), but they were enough to help me find my
way.

Dean pointed me to the did manual page, which mentions that did devices are
dynamically generated in groups of 100 at a time: the next 100 are
generated only when there are more disks available to the cluster than there
are did devices. Hence nothing over 100 for me.
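
To see quickly which did instances have no device node, something like the
following rough Python sketch might help. It is not a Sun Cluster tool; the
function names are made up for illustration, and it assumes node names of
the form dN or dNsM (for example d35s2), such as a listing of /dev/did/rdsk
would give.

```python
import re

# Matches did node names such as "d35" or "d35s2"
# (instance number, optional slice suffix). Assumed naming, see above.
DID_NAME = re.compile(r"^d(\d+)(?:s\d+)?$")


def did_instances(names):
    """Return the set of did instance numbers found in a list of node names."""
    instances = set()
    for name in names:
        match = DID_NAME.match(name)
        if match:
            instances.add(int(match.group(1)))
    return instances


def missing_instances(wanted, names):
    """Return the wanted instance numbers that have no node in names, sorted."""
    return sorted(set(wanted) - did_instances(names))


# Hypothetical listing: nodes exist up to d100 only.
listing = ["d1s2", "d35s2", "d100s2"]
print(missing_instances([35, 101, 123], listing))  # [101, 123]
```

On a real node the listing could come from os.listdir() on the did device
directory instead of the hard-coded example.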

Martin suggested setting the cluster back into install mode and configuring
the hdlm so that it doesn't show any disks.

In the end, this is what I did:
1. Got our storage admin to remove all disks from the cluster, then rebooted.
2. Cleared the did device tree, got rid of all did devices and all
   references to anything to do with the storage, then rebooted.
3. Made sure everything was cleaned of the hdlm devices and storage devices
   (it wasn't, so I repeated the cleaning, successfully the second time),
   then rebooted.
4. Had the disks from storage made available again, reconfigured the hdlm,
   then rebooted.
5. Reconfigured the did devices, then rebooted because the devices weren't
   immediately accessible.
6. Renamed the did devices again with lower numbers, then rebooted again.

Now everything seems to be at least partly ok. Somehow I had that win...
feeling.

It also seems there is some kind of bug with the hdlm that we are using. It
rears its head with the message "No such device" when trying to access any
disk with format, although I can actually access, partition, and put file
systems on all the disks. Let's see if this changes in a newer version of
hdlm.

Thanks and regards
Markus




On Wednesday 21 November 2007, Markus Mayer wrote:
Hi all,

I'm having a hard time with Sun Cluster 3.2 in a two node cluster. We have
a Hitachi AMS500 and HDLM 5.9.0.0 on Solaris 10 update 3, with all patches
up to the start of September installed. I have 30 disks from the storage
array made available, and to keep an overview of which disk came from which
pool on the array and what it is for, I renamed the did devices on the
disks (e.g. didadm -t d35:d101). The renaming procedure went through
without any problems; however, since rebooting, I can do nothing with these
did devices any more. For example, the boot messages for one of these
disks are:

Nov 21 13:12:48 wombat Cluster.CCR: /usr/cluster/bin/scgdevs: Could not register disk
Nov 21 13:12:48 wombat Cluster.CCR: /usr/cluster/bin/scgdevs: Could not stat /dev/global/dsk/d123s2:
Nov 21 13:12:48 wombat Cluster.CCR: No such file or directory
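
With 30 disks there are a lot of these messages, so to collect the affected
devices from the boot log, a small Python sketch along these lines could be
used. It only assumes the "Could not stat /dev/global/dsk/dNsM" wording
shown above; the function name is made up for illustration.

```python
import re

# Pattern for the scgdevs message quoted above. Only the wording
# "Could not stat /dev/global/dsk/dNsM" is assumed.
STAT_FAILURE = re.compile(r"Could not stat (/dev/global/dsk/d(\d+)s\d+)")


def failed_devices(log_text):
    """Return sorted unique (instance, path) pairs for 'Could not stat' messages."""
    # Collapse all whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    return sorted({(int(num), path) for path, num in STAT_FAILURE.findall(flat)})


sample = (
    "Nov 21 13:12:48 wombat Cluster.CCR: /usr/cluster/bin/scgdevs: Could not\n"
    "stat /dev/global/dsk/d123s2:\n"
    "Nov 21 13:12:48 wombat Cluster.CCR: No such file or directory\n"
)
print(failed_devices(sample))  # [(123, '/dev/global/dsk/d123s2')]
```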

didadm -c or -R returns no errors, so it thinks everything is ok.

I have tried everything I can think of to access the disks, or to remove
them from the device tree, to rename them again to device numbers under
100, to clean the did device tree, and so on, all with no success (didadm
-C then -r, devfsadm -Cv, didadm -t d101:d35).

I have examined the problem and seen that there are no did device entries
(links) in /dev/did greater than d100: the devices go from did/d0 to
did/d100, and there is nothing for did/d101 onwards. The same applies
in /devices/pseudo/did. According to Sun documentation, however, these
should be dynamically generated.

I can see two possibilities here:
1. Find a way to get cluster to generate the missing device entries.
2. Find a way to remove all the problem entries and go back to something
more conservative.
So far I have not been able to find a way to do either of these. I have
considered simply deleting the troublesome entries by hand, but I don't
know what consequences this would have.

I would be grateful to anyone who could point me in a direction to either
get cluster to delete the troublesome entries and start again, or generate
the missing entries.

Thanks
Markus
_______________________________________________
sunmanagers mailing list
sunmanagers@xxxxxxxxxxxxxxx
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


