Re: HELP - stale partitions



Looks like a rootvg hdisk. Try running diag disk verify to find out which
hdisk is bad.

Bill

-----Original Message-----
From: IBM AIX Discussion List [mailto:aix-l@xxxxxxxxxxxxx] On Behalf Of
Hans-Dieter Kutz
Sent: Friday, December 15, 2006 3:01 AM
To: aix-l@xxxxxxxxxxxxx
Subject: Re: HELP - stale partitions

On Fri, Dec 15, 2006 at 12:51:30PM +0200, Zvi Bar-Deroma wrote:
Hi,

[after years without posting, it's my second post on different issues
within an hour :-)]

I have a serious problem which I have difficulties resolving, on a
critical machine (our NFS server).

Environment:

IBM pSeries 7028-6C1/p610, AIX 5.2 ML05. rootvg is mirrored (hdisk0 &
hdisk1, 18GB each). This setup has not changed for the last 5 years.
The latest ML was installed around 5/2005. No apparent disk/SCSI/storage
problems ever with this machine.

Last week (Dec. 6th) "stale partition" messages began to appear for all
rootvg LVs, along with error messages about hdisk0. After consulting my
dealer's technical support, we tried to unmirror and then re-mirror rootvg.
Mirroring failed, as can be seen below:

Before command completion, additional instructions may appear below.

0516-1296 lresynclv: Unable to completely resynchronize volume.
The logical volume has bad-block relocation policy turned off.
This may have caused the command to fail.
0516-934 /etc/syncvg: Unable to synchronize logical volume hd6.
0516-932 /etc/syncvg: Unable to synchronize volume group rootvg.
0516-1124 mirrorvg: Quorum requirement turned off, reboot system for this
to take effect for rootvg.
0516-1126 mirrorvg: rootvg successfully mirrored, user should perform
bosboot of system to initialize boot records. Then, user must modify
bootlist to include: hdisk1 hdisk0.



Then we replaced hdisk0 after breaking the mirror, hoping to remirror
after adding the new hdisk0 to rootvg. The system wouldn't even add
hdisk0 to rootvg ..

Next we did an alt_disk_install from hdisk1 onto hdisk0, rebooted from
hdisk0 and replaced hdisk1 (so now both disks are new). Note that since
the alt_disk_install filesets were not installed, we installed them, but
unfortunately forgot to install the fixes for them and used just the base
level from the AIX 5.2 CDs, giving a level of 10 for these filesets. We
then managed to mirror rootvg, but after about 30 hours the machine
rebooted, and I have a zillion messages of the form:

EAA3D429 1215095906 U S LVDD PHYSICAL PARTITION MARKED STALE

Immediately after the restart the following message appeared:
9D035E4D 1214165306 P S SYSVMM DATA STORAGE INTERRUPT, PROCESSOR

The detailed message for that event was:

LABEL: DSI_PROC
IDENTIFIER: 9D035E4D

Date/Time: Thu Dec 14 16:53:23 IST
Sequence Number: 3019
Machine Id: 0056BC9A4C00
Node Id: aeserv
Class: S
Type: PERM
Resource Name: SYSVMM

Description
DATA STORAGE INTERRUPT, PROCESSOR

Probable Causes
SOFTWARE PROGRAM

Failure Causes
SOFTWARE PROGRAM

Recommended Actions
IF PROBLEM PERSISTS THEN DO THE FOLLOWING
CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
DATA STORAGE INTERRUPT STATUS REGISTER
0000 0000 0000 0000
SEGMENT REGISTER, SEGREG
4000 0000 0000 0000
DATA STORAGE INTERRUPT ADDRESS REGISTER
2000 71A6 F000 0000
EXVAL
2FF3 8DD0 0000 0000




After that, messages of this type began:
0BA49C99 1214165606 T H scsi0 SCSI BUS ERROR
followed by the physical partition stale and operator notification messages.
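
For anyone reading the errpt summary lines quoted above: the second column
is a timestamp in MMDDhhmmYY form, so 1214165306 is Dec 14 2006, 16:53,
which matches the Date/Time in the detailed DSI_PROC entry. Below is a
minimal Python sketch that splits such lines into fields and decodes the
timestamp, which makes it easier to put the events in order. The function
name and the dictionary keys are made up for illustration, and treating the
two-digit year as 20xx is an assumption.

```python
from datetime import datetime


def parse_errpt_summary(line):
    """Split one errpt summary line into its columns (hypothetical helper)."""
    # Columns: IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
    ident, stamp, etype, eclass, resource, description = line.split(None, 5)
    # TIMESTAMP is MMDDhhmmYY; slice it by position.
    month, day = int(stamp[0:2]), int(stamp[2:4])
    hour, minute = int(stamp[4:6]), int(stamp[6:8])
    year = 2000 + int(stamp[8:10])  # assumption: all entries are from 20xx
    return {
        "id": ident,
        "when": datetime(year, month, day, hour, minute),
        "type": etype,
        "class": eclass,
        "resource": resource,
        "description": description.strip(),
    }


# The three summary lines quoted in this thread.
LINES = [
    "EAA3D429 1215095906 U S LVDD PHYSICAL PARTITION MARKED STALE",
    "9D035E4D 1214165306 P S SYSVMM DATA STORAGE INTERRUPT, PROCESSOR",
    "0BA49C99 1214165606 T H scsi0 SCSI BUS ERROR",
]

# Print the events oldest first.
for entry in sorted((parse_errpt_summary(l) for l in LINES),
                    key=lambda e: e["when"]):
    print(entry["when"], entry["class"], entry["resource"], entry["description"])
```

Sorted this way, the DSI_PROC entry comes first, the SCSI bus error three
minutes later, and the stale-partition entry the next morning.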


At the moment, lspv does NOT show hdisk0, and lsvg rootvg shows 1 active
and 1 stale partition.


If required I'll send the detailed logs, but I didn't think it's right
to send them to everyone on the list ...


I'd appreciate any help, in particular concerning what's wrong: is it
s/w or h/w? If h/w, then what? SCSI? SCSI backplane?

Regards,
/Zvika



~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~
Zvika Bar-Deroma

Systems and Network manager
Phone: (+972)-4-829-2706 ; Fax : (+972)-4-829-2315
Faculty of Aerospace Engineering, Technion, Haifa 32000, Israel

e-mail : zvika@xxxxxxxxxxxxxxxxxxxxx

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~

Hello Zvika,
please show:
# lsvg rootvg
# lsvg -p rootvg
# lsvg -l rootvg
# lspv

hd6 is your paging space. Maybe it wasn't mirrored.
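
One way to check this from the lsvg -l rootvg output: with two copies, each
LV should show twice as many PPs as LPs, and the LV STATE column shows
whether a copy is stale. A rough Python sketch follows. The function name is
hypothetical, and the sample listing is invented for illustration (it is not
Zvika's output); the code only assumes the usual lsvg -l column order of
LV NAME, TYPE, LPs, PPs, PVs, LV STATE, MOUNT POINT.

```python
def check_mirroring(lsvg_l_output, copies=2):
    """Return (lv_name, problem) pairs found in `lsvg -l` text (hypothetical helper)."""
    problems = []
    for line in lsvg_l_output.splitlines():
        fields = line.split()
        # Data rows have numeric LPs/PPs in columns 3 and 4; this skips
        # the "rootvg:" line and the column header.
        if len(fields) < 6 or not (fields[2].isdigit() and fields[3].isdigit()):
            continue
        name, lps, pps, state = fields[0], int(fields[2]), int(fields[3]), fields[5]
        if pps < copies * lps:
            problems.append((name, "not fully mirrored"))
        if "stale" in state:
            problems.append((name, "stale"))
    return problems


# Invented sample output, for illustration only.
SAMPLE = """\
rootvg:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
hd5                 boot       1     2     2    closed/syncd  N/A
hd6                 paging     32    32    1    open/syncd    N/A
hd4                 jfs2       4     8     2    open/stale    /
"""

for lv, problem in check_mirroring(SAMPLE):
    print(lv, problem)
```

In the invented sample, hd6 has as many PPs as LPs, so it has only one copy,
and hd4 is mirrored but has a stale copy.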

Cheers,
ku

--
Han Solo:
This is not going to work.
Luke Skywalker:
Why didn't you say so before?
Han Solo:
I did say so before!


