Re: SYSDUMP.DMP corruption
From: Robert Deininger (rdeininger@mindspring.com)
Date: 04/01/03
From: rdeininger@mindspring.com (Robert Deininger) Date: Tue, 01 Apr 2003 08:11:56 -0500
In article <91947A84607D9D48B8E674A5FAB54DA63CAE78@tahiti.tinuk.com>,
"Steve Spires" <Steve.Spires@torex.com> wrote:
>We had a crash on one of our systems which produced the following;
>
>CENTRAL $ anal/crash
>_Dump File: sysdump.dmp
>
>
>
>OpenVMS (TM) Alpha system dump analyzer
>..analyzing a compressed selective memory dump...
>
>%SDA-W-LINKTIMEMISM, link time of SYS$COMMON:[SYS$LDR]SYS$BASE_IMAGE.EXE;2 (28-MAR-2002 14:19) does not match link time of image in system dump (23-JAN-2001 08:35)
>%SDA-W-SDALINKMISM, link time of SYS$BASE_IMAGE built into SDA$SHARE (28-MAR-2002 14:19) does not match link time of image in system dump (23-JAN-2001 08:35)
>%SDA-W-VERSMISM, version mismatch with image SYS$COMMON:[SYS$LDR]SYS$BASE_IMAGE.EXE;2
The warnings above are side-effects of applying ECOs (and not rebooting?),
I think. These are annoying, but not fatal to crash analysis.
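To make the mismatch concrete, here is a small Python sketch that compares the two link times quoted in the warnings. It is not part of SDA. The helper name parse_vms_time is made up for illustration, and it assumes the DD-MMM-YYYY HH:MM format shown in the messages above.

```python
from datetime import datetime

# Month abbreviations as they appear in VMS timestamps.
MONTHS = {"JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
          "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12}

def parse_vms_time(text):
    """Parse a timestamp such as '28-MAR-2002 14:19' (hypothetical helper)."""
    date_part, time_part = text.split()
    day, mon, year = date_part.split("-")
    hour, minute = time_part.split(":")
    return datetime(int(year), MONTHS[mon.upper()], int(day), int(hour), int(minute))

on_disk = parse_vms_time("28-MAR-2002 14:19")  # SYS$BASE_IMAGE.EXE;2 on disk
in_dump = parse_vms_time("23-JAN-2001 08:35")  # image recorded in the dump

gap = on_disk - in_dump
print(f"on-disk image is newer than the dumped one: {on_disk > in_dump}")
print(f"difference: {gap.days} days")
```

The on-disk image being newer than the one in the dump is what you would expect if a patch replaced the file and the system kept running the old image.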
>Dump taken on 15-MAY-**** 05:40:33.27
>** Invalid bugcheck code **
This could be an invalid dump file, or it may be due to a mismatched SDA image.
>SDA> clue crash
>
>Crashdump Summary Information:
>------------------------------
>Crash Time: 15-MAY-**** 05:40:33.27
>Bugcheck Type: INVEXCEPTN, Exception while above ASTDEL
>Node: 00 `%=A5=D7)=F1=A0=E9V7.2-1 =FDComp(Standalone)
>CPU Type: Compaq AlphaServer ES40
>VMS Version: -1 ....
>Current Process: MS_DSP_48......................................................................
More indications that SDA doesn't understand this dump file very well.
>Current Image: DSA100:[SYS0.SYSCOMMON.][SYSEXE]DSM.EXE
>Failing PC: FFFFFFFF.800EA728 PROCESS_MANAGEMENT+1C728
>Failing PS: 20000000.00000800
>Module: PROCESS_MANAGEMENT (Link Date/Time: 23-JAN-2001 08:37:04.46)
>
>
>
>Anyone seen this before?
Don't know offhand. If you can log a service call, HP can check a
database to see if this crash signature has been seen before. If they
find a match, they can often supply a fix right away. If the crash is
new, of course it takes a while to find the cause and make a fix. If the
problem isn't reproducible, it can take a very long time. Self-diagnosis
of these crashes is not recommended. If you have a support contract,
don't be shy, log the call.
But since you're asking in C.O.V, you probably don't have a support contract.
INVEXCEPTN crashes are often caused by corruption of privileged data
structures. The crash PC inside PROCESS_MANAGEMENT, along with map and
listing files for that image, would likely indicate what data structure is
busted. But finding the cause of the corruption is the fun part. If you
don't have the listings kit, you don't really have a chance.
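As a rough illustration of the arithmetic involved, the Python sketch below takes the failing PC and the symbolized offset from the CLUE CRASH output above and backs out the address that the offset is relative to. The helper name parse_sda_address is made up for illustration. How that offset lines up with a particular psect in the map file depends on how the image was loaded, so treat the result as a starting point for the lookup, not an answer.

```python
def parse_sda_address(text):
    """Convert an SDA-style 64-bit address like 'FFFFFFFF.800EA728' to an int."""
    return int(text.replace(".", ""), 16)

failing_pc = parse_sda_address("FFFFFFFF.800EA728")  # 'Failing PC' field
offset = 0x1C728  # from 'PROCESS_MANAGEMENT+1C728'

# Address that SDA used as the base when it symbolized the PC.
image_base = failing_pc - offset

print(f"failing PC : {failing_pc:016X}")
print(f"offset     : {offset:X}")
print(f"image base : {image_base:016X}")
```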
SHOW CRASH usually has useful summary information for this type of
bugcheck. The invalid exception is often an ACCVIO, and SHOW CRASH is
helpful there.
In these crashes, the "current process" and "current image" are likely
just victims, and might not be directly responsible for the bug.
Some images have "_MON" variants that have additional diagnostics or
tracing built in. PROCESS_MANAGEMENT has a _MON variant on my V7.3
system. The SYSTEM_CHECK parameter can be used to activate all the _MON
images (and more goodies besides), but often slows performance too much
for production systems, and may change timing so much the bug goes away.
Another alternative is to rename selected SYS$COMMON:[SYS$LDR]..._MON.EXE
images to SYS$SPECIFIC:[SYS$LDR]....EXE (without the _MON) so they will be
active after the next reboot. All of this is usually done at the request
of, and with assistance from, the HP support folks.
And you'd want to make sure your dump file is big enough, working, etc.
I'd verify this, reboot, and force a crash to verify that the dump file
is readable.
With all of your diagnostics in place, you sit back and wait for the crash
to recur. When it happens, you hope there are enough clues to find the
cause. Lather, Rinse, Repeat, as the saying goes.
If you go through all this on your own, you'll probably decide the HP
support contract is not so expensive after all.
>Alphaserver ES40 running VMS 7.3 with the following patches [I know some
>of them are replaced now - I have yet to fit in updating the patches,
>but will do so if required for this problem];
>
>VMS73_BACKUP-V0100
>VMS73_CLUSTER-V0200
>VMS73_DCL-V0200
>VMS73_DDTM-V0100
>VMS73_DRIVER-V0200
>VMS73_F11X-V0100
>VMS73_FIBRE_SCSI-V0300
>VMS73_INIT-V0100
>VMS73_LIBRTL-V0200
>VMS73_LMF-V0100
>VMS73_RMS-V0300
>VMS73_SHADOWING-V0200
>VMS73_SYS-V0400
>VMS73_SYSINI-V0100
>VMS73_SYSLOA-V0200
>VMS73_UPDATE-V0100
Offhand, I'd guess at least half of the "rating 1" patches might fix bugs
that could cause an INVEXCEPTN bugcheck. Best advice for a do-it-yourself
shop would be to install each rating 1 patch, unless the release notes
show a reason not to install it.
(Actually, best advice would be to upgrade to V7.3-1, plus patches.)
-- Robert