880 panicking
From: Anshuman Kanwar (anshuman_at_expertcity.com)
Date: 06/18/03
To: "'sunmanagers@sunmanagers.org'" <sunmanagers@sunmanagers.org>
Date: Wed, 18 Jun 2003 05:17:37 -0700
Hi Managers,
I have a two-node cluster setup:
2 x V880 cross-connected to 2 x 3310 arrays via LVD SCSI, running Solaris 9 + Sun
Cluster 3.0 + Solaris Volume Manager.
After a deliberate reboot the machine is panicking. It looks like a SCSI bus
issue. The SCSI bus probe (probe-scsi-all) also appears to hang partway
through. Outputs are attached below.
This looks like bad hardware, right? Probably the HBA. Has anyone seen this
happen with SC 3.0 before?
I saw no log messages while the node was running.
Thanks,
-ansh
--------------------
{3} ok probe-scsi-all
/pci@9,700000/pci@2/scsi@5
/pci@9,700000/pci@2/scsi@4
Target 0
Unit 0 Disk SUN StorEdge 3310 0325
Unit 1 Disk SUN StorEdge 3310 0325
/pci@8,600000/SUNW,qlc@2
LiD HA LUN --- Port WWN --- ----- Disk description -----
0 0 0 500000e01020a481 FUJITSU MAN3735F SUN72G 0604
1 1 0 500000e0101fc9f1 FUJITSU MAN3735F SUN72G 0604
2 2 0 500000e0101fd021 FUJITSU MAN3735F SUN72G 0604
6 6 0 50800200001c4fe9 SUNW SUNWGS INT FCBPL9226
3 3 0 500000e0102008c1 FUJITSU MAN3735F SUN72G 0604
4 4 0 500000e010201f51 FUJITSU MAN3735F SUN72G 0604
5 5 0 21000004cf2b91b7 SEAGATE ST373405FSUN72G 0638
/pci@8,700000/scsi@1
Script interrupt: Reserved phase
Fatal SCSI error at script address 8 Unexpected disconnect
Arbitration Complete
Script interrupt: Reserved phase
Fatal SCSI error at script address 8 Unexpected disconnect
Arbitration Complete
Script interrupt: Reserved phase
Fatal SCSI error at script ad
> -----Original Message-----
> From: Anshuman Kanwar
> Sent: Tuesday, June 17, 2003 11:51 PM
> To: 'jh1@sun.com'
> Subject: case no 63595964
>
> << File: cluster1.txt >>
---------------------
Sun Fire 880, No Keyboard
Copyright 1998-2002 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.7.5, 20480 MB memory installed, Serial #53149406.
Ethernet address 0:3:ba:2a:fe:de, Host ID: 832afede.
Rebooting with command: boot
Boot device: /pci@8,600000/SUNW,qlc@2/fp@0,0/disk@w500000e01020a481,0:a
File and args:
SunOS Release 5.9 Version Generic_112233-04 64-bit
Copyright 1983-2002 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
WARNING: forceload of misc/md_trans failed
WARNING: forceload of misc/md_raid failed
WARNING: forceload of misc/md_hotspares failed
WARNING: forceload of misc/md_sp failed
configuring IPv4 interfaces: ce0.
Hostname: v880-1
WARNING: /pci@8,700000/scsi@1 (glm0):
Resetting scsi bus, got incorrect phase from (0,0)
WARNING: /pci@8,700000/scsi@1 (glm0):
timeout on bus reset interrupt
WARNING: glm0: fault detected in device; service unavailable
WARNING: glm0: timeout on bus reset interrupt
Could not open /dev/rdsk/c0t6d0s2 to verify device id.
No such device or address
device id for '/dev/rdsk/c1t5d0' does not match physical disk's id.
The drive may have been replaced
Booting as part of a cluster
NOTICE: CMM: Node v880-1 (nodeid = 1) with votecount = 1 added.
NOTICE: CMM: Node v880-2 (nodeid = 2) with votecount = 1 added.
NOTICE: CMM: Quorum device 1 (/dev/did/rdsk/d8s2) added; votecount = 1,
bitmask of nodes with configured paths = 0x3.
WARNING: CMM: Initialization for quorum device /dev/did/rdsk/d8s2 failed
with error EACCES. Will retry later.
NOTICE: clcomm: Adapter hme0 constructed
NOTICE: clcomm: Path v880-1:hme0 - v880-2:hme0 being constructed
NOTICE: clcomm: Adapter ge0 constructed
NOTICE: clcomm: Path v880-1:ge0 - v880-2:ge0 being constructed
NOTICE: CMM: Node v880-1: attempting to join cluster.
SUNW,pci-gem0: Using Gigabit SERDES Interface
SUNW,pci-gem0: Auto-Negotiated 1000 Mbps Full-Duplex Link Up
NOTICE: clcomm: Path v880-1:ge0 - v880-2:ge0 being initiated
NOTICE: clcomm: Path v880-1:ge0 - v880-2:ge0 online
NOTICE: CMM: Node v880-2 (nodeid: 2, incarnation #: 1049240903) has become
reachable.
WARNING: CMM: Reading reservation keys from quorum device /dev/did/rdsk/d8s2
failed with error 2.
NOTICE: CMM: Cluster has reached quorum.
NOTICE: CMM: Node v880-1 (nodeid = 1) is up; new incarnation number =
1055914729.
NOTICE: CMM: Node v880-2 (nodeid = 2) is up; new incarnation number =
1049240903.
NOTICE: CMM: Cluster members: v880-1 v880-2.
NOTICE: CMM: node reconfiguration #3 completed.
NOTICE: CMM: Node v880-1: joined cluster.
Could not open /dev/rdsk/c0t6d0s2 to verify device id.
No such device or address
device id for '/dev/rdsk/c1t5d0' does not match physical disk's id.
The drive may have been replaced
The system is coming up. Please wait.
WARNING: md: d1105: (Unavailable) needs maintenance
checking ufs filesystems
/dev/rdsk/c1t3d0s0: is logging.
/dev/rdsk/c1t2d0s0: is logging.
/dev/md/rdsk/d1115: is logging.
NOTICE: clcomm: Path v880-1:hme0 - v880-2:hme0 being initiated
NOTICE: clcomm: Path v880-1:hme0 - v880-2:hme0 online
/dev/rdsk/c1t0d0s7: is logging.
/dev/md/rdsk/d1006: is logging.
starting rpc services: rpcbind done.
Setting netmask of lo0:1 to 255.255.255.255
Setting netmask of ce0 to 255.255.255.0
Setting netmask of hme0 to 255.255.255.128
Setting netmask of hme0:2 to 255.255.255.252
Setting netmask of ge0 to 255.255.255.128
Setting default IPv4 interface for multicast: add net 224.0/4: gateway
v880-1
syslog service starting.
obtaining access to all attached disks
System dump time: Tue Jun 17 22:33:37 2003
savecore: not enough space in /var/crash/v880-1 (362 MB avail, 1711 MB
needed)
Jun 17 22:39:29 v880-1 savecore: not enough space in /var/crash/v880-1 (362
MB avail, 1711 MB needed)
volume management starting.
Jun 17 22:39:35 v880-1 metadevadm: Unnamed device detected. Please run
devfsadm && metadevadm -r to resolve.
Executing devfsadm
Executing metadevadm -r
Unable to resolve unnamed devices for volume management.
Please refer to the Solaris Volume Manager documentation,
Troubleshooting section, at http://docs.sun.com or from
your local copy.
panic[cpu1]/thread=2a100125d40: BAD TRAP: type=31 rp=2a100125500
addr=308000cd500 mmu_fsr=0
sched: trap type = 0x31
addr=0x308000cd500
pid=0, pc=0x108cdf8, sp=0x2a100124da1, tstate=0x880001607, context=0x0
g1-g7: 149e400, 7ffff, 0, 1, 1, 0, 2a100125d40
000002a100125230 unix:die+a4 (31, 2a100125500, 308000cd500, 0, 0, 0)
%l0-3: 0000000000000000 00000300000cd508 000002a100125500 000002a1001253f8
%l4-7: 0000000000000031 000000000000045b 000000000115ccd8 0000030006cb18b0
000002a100125310 unix:trap+874 (2a100125500, 0, 10000, 10200, 308, 1)
%l0-3: 0000000000000001 0000000000000000 0000000001437888 0000000000000031
%l4-7: 0000000000000006 0000000000000001 0000000000000000 0000000000000000
000002a100125450 unix:ktl0+48 (300000cd508, 0, 20, 7fffffff8, 0,
300056cd848)
%l0-3: 0000000000000000 0000000000001400 0000000880001607 000000000102aaf8
%l4-7: 000000000147d7c0 0000000001437800 0000000000000000 000002a100125500
000002a1001255a0 unix:kstat_rele+20 (ffffffffffffffff, 3, 4, 30000373f28,
76, 140e000)
%l0-3: 0000000001428f48 0000030006cb18b0 0000000000000000 00000300056ae076
%l4-7: 00000300001dea48 00000300001deb10 0000000000000076 00000300001dee8a
000002a100125650 md:md_layered_close+d4 (ffffffffffffffff, 3, 4, 0, 0, 10)
%l0-3: ffffffffffffffff 0000000000000002 0000030000373f28 00000000ffffffff
%l4-7: 00000000ffffffff 0000000000000076 00000300056ad850 0000030000361438
000002a100125700 md_stripe:stripe_close_all_devs+dc (30005631be4,
30005631be4, 20, 76, 30007377000, 0)
%l0-3: 0000000000000001 0000000000000001 0000000000000002 0000000000000000
%l4-7: 0000000000000002 0000030005631b88 00000300056ad850 0000030005631bf8
000002a1001257b0 md_stripe:stripe_close+88 (5500000451, 3, 4, 30000373f28,
2, 0)
%l0-3: 0000000000000000 0000000000000002 00000300056a7f30 0000030005631b88
%l4-7: 00000300056a7f30 0000000000000451 0000000000000002 0000000000000001
000002a100125860 md_mirror:mirror_probe_close_all_devs+b8 (5500000451,
300001e3338, 1, 1, 300001e30e8, 300001e3144)
%l0-3: 00000000012e2b00 0000000000000001 0000000000000002 000000000000ffff
%l4-7: 00000300001e30e8 00000300001e3144 00000300001e3318 00000300001e319c
000002a100125910 md_mirror:mirror_probe_dev+208 (108, 45b, 1437888, 1437888,
0, 0)
%l0-3: 0000000000000004 000000000000045b 0000000000000001 000000000000ffff
%l4-7: 00000300056a7a50 000000000000045b 00000300001e30e8 00000300001e3144
000002a1001259d0 md:md_probe_one+40 (300136a2048, 2a100125d40, 20, 148a0a8,
2a100125d40, 0)
%l0-3: 00000000012e99dc 000000000149e570 0000030007bbfe50 ffffffffffffffff
%l4-7: 0000000001400090 000000000142d5b8 0000000001442400 0000030005633740
000002a100125a80 md:md_daemon+220 (0, 149e540, 1437888, 1437888, 148a0b2, 0)
%l0-3: 00000000011c6994 00000300136a2048 0000000000000000 000002a10012bd40
%l4-7: 000000000149e570 000000000149e568 0000030000010e00 0000030005633840
syncing file systems... done
dumping to /dev/md/dsk/d1001, offset 4195221504, content: kernel
9% done
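
For triage, a console capture like the one above can be reduced to its WARNING lines grouped by source, which makes it easier to see that most of them come from one controller. Below is a minimal sketch in Python. It assumes the console output has been saved as text. The function name `collect_warnings` and the `SAMPLE` string are illustrative only; the sample lines are copied from the boot output above.

```python
import re
from collections import defaultdict

# "WARNING: <device path> (<driver instance>):" with the message on the next line
PATH_FORM = re.compile(r"^WARNING: \S+ \((\w+)\):\s*$")
# "WARNING: <name>: <message>" with the message on the same line
NAME_FORM = re.compile(r"^WARNING: (\w+): (.+)$")
# Any other "WARNING: <message>" line
BARE_FORM = re.compile(r"^WARNING: (.+)$")


def collect_warnings(console_text):
    """Group WARNING messages by driver instance or subsystem name."""
    groups = defaultdict(list)
    lines = [ln.strip() for ln in console_text.splitlines()]
    for i, line in enumerate(lines):
        m = PATH_FORM.match(line)
        if m:
            # The message text follows on the next line of the capture.
            nxt = lines[i + 1] if i + 1 < len(lines) else ""
            groups[m.group(1)].append(nxt)
            continue
        m = NAME_FORM.match(line)
        if m:
            groups[m.group(1)].append(m.group(2))
            continue
        m = BARE_FORM.match(line)
        if m:
            groups["other"].append(m.group(1))
    return dict(groups)


# Illustrative sample: lines taken from the console output above.
SAMPLE = """\
WARNING: forceload of misc/md_trans failed
WARNING: /pci@8,700000/scsi@1 (glm0):
Resetting scsi bus, got incorrect phase from (0,0)
WARNING: /pci@8,700000/scsi@1 (glm0):
timeout on bus reset interrupt
WARNING: glm0: fault detected in device; service unavailable
WARNING: glm0: timeout on bus reset interrupt
WARNING: md: d1105: (Unavailable) needs maintenance
"""

if __name__ == "__main__":
    for source, messages in collect_warnings(SAMPLE).items():
        print(source, len(messages))
```

Grouping by driver instance rather than by full device path keeps the two glm0 message forms (with and without the path) in the same bucket.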
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers