RE: Cluster hang -- Getting Crash Dump

From: Stuart, Ed (Ed.Stuart_at_austinenergy.com)
Date: 03/18/04


Date: Thu, 18 Mar 2004 13:46:45 -0600

Believe it or not, the Alphas have a console command to initiate a crash and
write a dump. At the console command prompt, enter: crash

Ed
**Please apply a generous amount of all the usual disclaimers here.**

> -----Original Message-----
> From: dave.baxter@bannerhealth.com
> [mailto:dave.baxter@bannerhealth.com]
> Sent: Thursday, March 18, 2004 11:30 AM
> To: Info-VAX@Mvb.Saic.Com
> Subject: Cluster hang -- Getting Crash Dump
>
>
> In the past month, I have had two occasions when my (2-node ES40,
> OVMS731) cluster has hung. All of the symptoms point to a probable
> Quorum Hang (quite possible, since I don't have a Quorum Disk);
> however, there are some indications that this might not be the case.
> (Note: VOTES = 1 each, EXPECTED_VOTES = 1)
>
> 1. Neither node crashed (so no quorum loss there).
> 2. The cluster uses two fully independent GB ethernet interconnects
> (switches), which are private and do not connect to the network.
> 3. My other cluster (which uses the same interconnect, i.e. the same
> pair of switches) was unaffected by the hang.
> 4. On examination (i.e. after driving in from home to take care of
> the problem), all link lights are green on the GB Switches. This
> would seem to rule out the interconnects as the source of any quorum
> loss. And even if they did somehow simultaneously lose their
> connection and cause a loss of quorum, would quorum not be
> restored when the interconnect links reestablished?
>
> I would really appreciate any comments/suggestions here.
> (Please don't start berating me about the lack of a Quorum
> Disk unless you think it would have avoided this problem, and
> can explain why).
>
> On a second, equally important issue: in order to break the
> hang I had to HALT the nodes (one at a time with Ctrl/P).
> (Comment: again, with the votes set as above, this should
> have released the hang on the other node; it didn't!)
>
> Because the nodes were HALTed, they didn't automatically
> generate a Crash Dump, so I don't have anything to diagnose
> with (the Error Log contains no indications of problems, and
> neither does Operator.log).
>
> The crash dump file SYSDUMP.DMP is set up off the system
> disk, on an internal drive, (and is correctly set up in
> sysgen and at the console level).
>
> HOW CAN I FORCE A DUMP ONCE I AM AT THE "P00>>" CONSOLE PROMPT?
>
> I am sure that I remember being told that there is a
> command that can be entered at the "P00>>" prompt that
> forces a dump of the registers. I would really appreciate it
> if anyone can give me this information. This is probably more
> important to me than the cause of the hang since, at the moment,
> I have nowhere to go except to endlessly analyse the symptoms
> in my head (this leads to insanity, ultimately).
>
> The only other information I can think of which might be
> useful is to mention that I am running the Cerner Millennium
> Clinical Application, with Oracle 8.1.7.4.
>
> Thanks
>
> Dave.
>
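
On the vote arithmetic quoted above (VOTES = 1 each, EXPECTED_VOTES = 1):
here is a minimal Python sketch of the quorum rule as I understand it from
the OpenVMS Cluster documentation, i.e. quorum = (EXPECTED_VOTES + 2) / 2
rounded down, with the connection manager taking the largest value computed
from each member's EXPECTED_VOTES and from the total votes actually present.
The function names are hypothetical, and this is an illustration under those
assumptions, not a diagnosis of the hang.

```python
def quorum_from(votes: int) -> int:
    """Quorum derived from a vote count: (votes + 2) / 2, rounded down."""
    return (votes + 2) // 2


def cluster_quorum(expected_votes_settings, member_votes):
    """Largest quorum implied by any member's EXPECTED_VOTES setting
    or by the total votes of the current members (assumed rule)."""
    candidates = [quorum_from(ev) for ev in expected_votes_settings]
    candidates.append(quorum_from(sum(member_votes)))
    return max(candidates)


def has_quorum(votes_present: int, quorum: int) -> bool:
    """A cluster keeps running only while votes present >= quorum."""
    return votes_present >= quorum


# Two nodes, VOTES = 1 each, EXPECTED_VOTES = 1 on both (values from the post).
q = cluster_quorum(expected_votes_settings=[1, 1], member_votes=[1, 1])
print("quorum with both nodes up:", q)

# One node halted: the survivor contributes a single vote.
print("survivor has quorum:", has_quorum(votes_present=1, quorum=q))
```

Under those assumptions the two-node cluster runs with a quorum of 2, so a
lone surviving node holding 1 vote would block rather than continue, which
may be why halting one node did not release the other.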


