SUMMARY: Sun Cluster 3.2, did devices over 100
- From: Markus Mayer <mymaillists@xxxxxx>
- Date: Mon, 26 Nov 2007 15:17:14 +0100
Hi all,
Thanks for the responses, only three (Francisco Mauro Puente, Dean Ross-Smith,
Martin Pre_laber), but sufficient to help me find my way.
Dean pointed out the did manual page, where it mentions that did devices are
dynamically generated in groups of 100 at a time - the next 100 are
generated only when there are more disks available to the cluster than there
are did devices, hence nothing over 100 for me.
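
As a rough illustration of the batching Dean described, the sketch below models how many did instance numbers would have device nodes for a given disk count. This is a simplified model built only from the wording above, not from the actual did driver; the function names and the inclusive upper bound (chosen to mirror the d0 to d100 range reported in the original question) are assumptions.

```python
import math

BATCH = 100  # batch size as described in the did man page summary above


def did_nodes_generated(disk_count: int, batch: int = BATCH) -> int:
    """Highest did instance number with a device node under the assumed model.

    Assumption: one batch always exists, and another batch is added only
    when the number of disks exceeds the instances already generated.
    """
    if disk_count < 0:
        raise ValueError("disk_count must be non-negative")
    return max(1, math.ceil(disk_count / batch)) * batch


def instance_has_node(instance: int, disk_count: int, batch: int = BATCH) -> bool:
    """True if the given instance number falls inside the generated range."""
    return 0 <= instance <= did_nodes_generated(disk_count, batch)


if __name__ == "__main__":
    # 30 disks, as in the original question: d35 is covered, d101 is not.
    print(instance_has_node(35, 30), instance_has_node(101, 30))
```

Under this model, renaming d35 to d101 on a cluster with only 30 disks points at an instance number for which no device node has been generated.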
Martin suggested setting the cluster back into install mode and configuring
hdlm so that it doesn't show any disks.
In the end, I got our storage admin to remove all disks from the cluster. I
then rebooted, cleared the did device tree, and got rid of all did devices and
all references to anything on the storage. After another reboot I checked that
everything was clean of hdlm devices and storage devices (it wasn't, so I
repeated the cleaning, successfully the second time) and rebooted again. Once
the disks from storage were made available again, I reconfigured hdlm,
rebooted, reconfigured the did devices, rebooted because the devices weren't
immediately accessible, renamed the did devices again with lower numbers, and
rebooted once more. Now everything seems to be at least partly ok. Somehow I
had that win... feeling.
It also seems there is some kind of bug in the hdlm version that we are using.
It rears its head with the message "No such device" when trying to access any
disk with format, although I can actually access, partition, and put file
systems on all the disks. Let's see if this changes in a newer version of
hdlm.
Thanks and regards
Markus
On Wednesday 21 November 2007, Markus Mayer wrote:
Hi all,
I'm having a hard time with Sun Cluster 3.2 in a two node cluster. We have
a Hitachi AMS500 and HDLM 5.9.0.0, Solaris 10 update 3, all patches until
the start of September installed. I have 30 disks from the storage array
made available, and to try to keep an overview of what disk came from which
pool on the array and for what purpose it was, I renamed the did devices on
the disks (eg: didadm -t d35:d101). The renaming procedure went through
without any problems, however since rebooting, I can do nothing with these
did devices any more. For example, the boot messages for one of these
disks are:
Nov 21 13:12:48 wombat Cluster.CCR: /usr/cluster/bin/scgdevs: Could not register disk
Nov 21 13:12:48 wombat Cluster.CCR: /usr/cluster/bin/scgdevs: Could not stat /dev/global/dsk/d123s2:
Nov 21 13:12:48 wombat Cluster.CCR: No such file or directory
didadm -c or -R returns no errors, so it thinks everything is ok.
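
To get a quick overview of which did instances are affected, the boot messages can be scanned for the scgdevs "Could not stat" lines. The following is a minimal sketch that assumes the messages look like the excerpt above; the function name and the regular expression are illustrative, and it tolerates the message being wrapped across lines as it is in some mail clients.

```python
import re

# Matches e.g. "scgdevs: Could not stat /dev/global/dsk/d123s2:"
# \s+ allows the message to be wrapped across lines.
STAT_FAILURE = re.compile(
    r"scgdevs:\s+Could\s+not\s+stat\s+/dev/global/dsk/d(?P<inst>\d+)s\d+"
)


def failed_did_instances(log_text: str) -> list[int]:
    """Return the sorted, de-duplicated did instance numbers that
    scgdevs reported it could not stat."""
    seen = {int(m.group("inst")) for m in STAT_FAILURE.finditer(log_text)}
    return sorted(seen)


if __name__ == "__main__":
    sample = (
        "Nov 21 13:12:48 wombat Cluster.CCR: /usr/cluster/bin/scgdevs: "
        "Could not stat /dev/global/dsk/d123s2:\n"
        "Nov 21 13:12:48 wombat Cluster.CCR: No such file or directory\n"
    )
    print(failed_did_instances(sample))
```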
I have tried everything I can think of to access the disks, to remove them
from the device tree, to rename them again to device numbers under 100, to
clean the did device tree, and so on, all with no success (didadm -C
then -r, devfsadm -Cv, didadm -t d101:d35).
I have examined the problem and seen that there are no did device entries
(links) in /dev/did greater than d100, the devices go from did/d0 to
did/d100, there is nothing for did/d101 and onwards. The same applies
in /devices/pseudo/did. According to Sun documentation, these should
however be dynamically generated.
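
The check described above can be sketched as a small helper that compares the entry names found under the did device directory against the instance numbers the disks were renamed to. This is only a sketch: the entry-name pattern (d<N> with an optional s<slice> suffix) and the function names are assumptions, and the list of names would have to come from a directory listing on the affected node.

```python
import re

# Accepts names such as "d35" or "d35s2"; anything else is ignored.
NODE_NAME = re.compile(r"^d(\d+)(?:s\d+)?$")


def present_instances(names):
    """Collect the did instance numbers that appear in a directory listing."""
    found = set()
    for name in names:
        match = NODE_NAME.match(name)
        if match:
            found.add(int(match.group(1)))
    return found


def missing_instances(names, wanted):
    """Return the wanted instance numbers that have no entry in the listing."""
    have = present_instances(names)
    return sorted(i for i in set(wanted) if i not in have)


if __name__ == "__main__":
    # Simulated listing covering d0 to d100, as observed in the question.
    listing = [f"d{i}s2" for i in range(0, 101)]
    print(missing_instances(listing, [35, 101, 123]))
```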
I can see two possibilities here:
1. Find a way to get cluster to generate the missing device entries.
2. Find a way to remove all the problem entries and go back to something
more conservative.
So far I have not been able to find a way to do either of these. I have
considered simply deleting the troublesome entries manually by brute force;
however, I don't know what consequences this would have.
I would be grateful to anyone who could point me in a direction to either
get cluster to delete the troublesome entries and start again, or generate
the missing entries.
Thanks
Markus
_______________________________________________
sunmanagers mailing list
sunmanagers@xxxxxxxxxxxxxxx
http://www.sunmanagers.org/mailman/listinfo/sunmanagers