[HPADM] RE: RE: Reboot after panic: , isr.ior = 0'240003.0'14c521f4

From: Krajcovic, Jakub (jakub.krajcovic_at_hp.com)
Date: 04/28/04

    Date: Wed, 28 Apr 2004 21:26:24 +0200
    To: "Taylor, Vince" <Vince.Taylor@logicacmg.com>
    
    

    Awesome help Vince,

    The q4 debugger is useless, because the kernel is not prepared for
    debugging (as i found out), but what you wrote seems to make sense :-)
    (no really, great help, thx a lot). I am just compiling a caselog, where
    i will be putting this info, and i'll see how it turns up. I'll keep you
    guys informed, and when this is resolved, i will compile a summary, like
    i was asked earlier...

    thanks for the great help people, I'm starting to feel like I'm on the
    gentoo ml :-)

    -----Original Message-----
    From: Taylor, Vince [mailto:Vince.Taylor@logicacmg.com]
    Sent: Wednesday, April 28, 2004 7:15 PM
    To: Krajcovic, Jakub
    Cc: Taylor, Vince
    Subject: RE: [HPADM] RE: Reboot after panic: , isr.ior =
    0'240003.0'14c521f4

    Hi,
     
    I think you have a runway bus problem.
     
    You didn't mention what sort of server it was (or if you did I missed
    it), but I am guessing it is a K370/380/570/580 class server. According
    to the K class service manual an HPMC code of 5xy8 (which all of yours
    are) is a Bus Transaction problem, in particular it is a "Processor
    Memory bus broad fault". The "x" is the MID number (Master ID), and the
    "y" is the bus number.
     
    The MID maps to CPU number (almost). MID 0 == CPU 0, up to MID 3 == CPU
    3, and then it jumps to MID 6 == CPU 4, and MID 7 == CPU 5. An MID of 4
    means it is system board IOA 0, and MID 5 means it is system board IOA
    1. If the MID was 6 or 7, then it would be the HSC expansion I/O slots.
     
    The "y" is the bus number, and I would suspect this is the problem, as 0
    is the runway bus, and all your error numbers have this as an unchanging
    "feature". I do know that HP have had some bad problems a long time ago
    with the runway bus on K class servers, and we swapped out multiple CPUs
    several years ago to fix the problem.
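
The decoding described above can be sketched in Python. This is only an illustration under stated assumptions: the function names and the `MID_NAMES` table are made up for this example and do not come from any HP tool, and the MID mapping is copied from the message as written, not checked against the K class service manual. The message gives two readings for MID 6 and 7, so the table keeps both.

```python
# Mapping as given in the message above (unverified). MID 6 and 7 are
# described there both as CPUs 4/5 and as HSC expansion I/O slots.
MID_NAMES = {
    0: "CPU 0",
    1: "CPU 1",
    2: "CPU 2",
    3: "CPU 3",
    4: "system board IOA 0",
    5: "system board IOA 1",
    6: "CPU 4 (or HSC expansion I/O, per the message)",
    7: "CPU 5 (or HSC expansion I/O, per the message)",
}


def decode_bus_code(code):
    """Split a 0x5xy8 chassis code into (mid, bus).

    Returns None when the code does not have the 5xy8 shape.
    """
    if not 0 <= code <= 0xFFFF:
        return None
    if (code >> 12) & 0xF != 0x5 or code & 0xF != 0x8:
        return None
    mid = (code >> 8) & 0xF   # the "x" digit: Master ID
    bus = (code >> 4) & 0xF   # the "y" digit: bus number
    return mid, bus


def describe(code):
    """Return a one-line, human-readable reading of a chassis code."""
    decoded = decode_bus_code(code)
    if decoded is None:
        return f"0x{code:04x}: not a 5xy8 bus transaction code"
    mid, bus = decoded
    who = MID_NAMES.get(mid, "unknown MID")
    # Per the message, bus number 0 is the runway bus.
    bus_name = "runway bus" if bus == 0 else f"bus {bus}"
    return f"0x{code:04x}: MID {mid} ({who}), {bus_name}"
```

Calling `describe()` on each code from the tombstone output (for example `describe(0x5408)`) gives the MID and bus reading for that code; codes that do not fit the 5xy8 pattern, such as 0xcbf0, are reported as not matching.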
     
    Cheers,
     
    Vince.

    -----Original Message-----
    From: Krajcovic, Jakub [mailto:jakub.krajcovic@hp.com]
    Sent: Wednesday, April 28, 2004 17:45
    To: rick.favro@philips.com
    Cc: hpux-admin@dutchworks.nl
    Subject: [HPADM] RE: Reboot after panic: , isr.ior = 0'240003.0'14c521f4

    Hi Rick, thanks for the great step-by-step, but unfortunately i didn't
    find any error codes 0x2???. This is the output of grep "HPMC Chassis"
    ts99 :
     
    bbnesd1:/var/tombstones (root) grep "HPMC Chassis" ts99
    HPMC Chassis Codes = 0xcbf0 0x5008 0x5408 0x5508 0xcbfb
    HPMC Chassis Codes = 0xcbf0 0x5108 0x5408 0x5508 0xcbfb
    HPMC Chassis Codes = 0xcbf0 0x520b 0x5408 0x5508 0xcbfb
    HPMC Chassis Codes = 0xcbf0 0x5308 0x5408 0x5508 0xcbfb
    HPMC Chassis Codes = 0xcbf0 0x5608 0x5408 0x5508 0xcbfb
    HPMC Chassis Codes = 0xcbf0 0x5708 0x5408 0x5508 0xcbfb

    So, now it looks like the q4 debugger is what i need.. So could you
    please be so kind and send me the procedure, so i can have a look into
    this beast :-)
     
    thanks, jakub

    -----Original Message-----
    From: rick.favro@philips.com [mailto:rick.favro@philips.com]
    Sent: Wednesday, April 28, 2004 6:28 PM
    To: Krajcovic, Jakub
    Subject: Re: [HPADM] Reboot after panic: , isr.ior = 0'240003.0'14c521f4

    Jakub:

            This is generally a processor problem. There are 2 files to
    check. One is the /var/tombstones/ts99 file. Do a grep "HPMC Chassis"
    ts99 and you'll see a line for each processor. Each line should begin
    with 0xcbf0 and end with 0xcbfc. You are looking for a 0x2??? type of
    error code. That is the indication that that processor is the one with
    the problem.

    Also, check for a /var/adm/crash/crash.X directory. Inside that
    directory there will be several files. You can run the q4 debugger on
    the vmunix file to tell if the crash was caused by an HPMC error, which
    would confirm that it is a processor problem. An HPMC (High Priority
    Machine Check) is a non-recoverable processor problem, while an LPMC
    (Low Priority Machine Check) is recoverable by the processor.
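
The tombstone check described above could be scripted roughly as follows. This is a hedged sketch, not an HP procedure: the helper names are invented for this example, and it assumes only what the message states, namely that each record starts with the text "HPMC Chassis Codes =" and that a 0x2??? code marks the suspect processor. The record index is simply the order in which records appear in the input; whether that order matches the CPU number is an assumption.

```python
import re

HEX_CODE = re.compile(r"0x[0-9a-fA-F]{4}\b")
MARKER = "HPMC Chassis Codes ="


def split_records(text):
    """Return one list of chassis codes per 'HPMC Chassis Codes =' record.

    Whitespace is normalised first, so records that were wrapped across
    lines (as in pasted mail) are still split correctly.
    """
    flat = " ".join(text.split())
    return [
        [code.lower() for code in HEX_CODE.findall(chunk)]
        for chunk in flat.split(MARKER)[1:]
    ]


def find_0x2_codes(text):
    """Return (record_index, codes) for every record holding a 0x2??? code."""
    hits = []
    for index, codes in enumerate(split_records(text)):
        matches = [code for code in codes if code.startswith("0x2")]
        if matches:
            hits.append((index, matches))
    return hits
```

In practice the input would be the contents of /var/tombstones/ts99 (or the saved grep output) read into a string. An empty result from `find_0x2_codes()` means no 0x2??? code was present in any record.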

    If you need the q4 debugger procedure, let me know, and I can send you a
    copy of what we do.

    Hope this helps,

    Rick Favro.

            

    "Krajcovic, Jakub" <jakub.krajcovic@hp.com>

    Sent by:
    hpux-admin-owner@DutchWorks.nl

    04/28/2004 09:15 AM

            
            To: <hpux-admin@dutchworks.nl>
            cc: (bcc: Rick Favro/ATL-BTL/MS/PHILIPS)
            Subject: [HPADM] Reboot after panic: , isr.ior =
    0'240003.0'14c521f4

            Classification:

    Hello HP-UX admins,

    I just joined this list, and already a nasty question for all of you
    great guys (and gals possibly :-)) out there:

    I am working on an incident that happened on one of our servers, and i
    would like to ask for help, because frankly, i have no clue as to what
    to do next.

    A server restarted and this error message:

    16:36 Wed Apr 28 2004. Reboot after panic: , isr.ior =
    0'240003.0'14c521f4

    appeared in the /etc/shutdownlog file. Now can anyone here please direct
    me where to look for a possible cause, or does anyone know the answer to
    this problem? dmesg and /var/adm/syslog/syslog.log only show the output
    of the boot (which was successful), and contain no mention of the
    possible problem.

    thanks in advance

    jakub

    Jakub Krajcovic
    HP EMEA MSDD UXPS BTV
    jakub.krajcovic@hp.com

    --
                ---> Please post QUESTIONS and SUMMARIES only!! <---
           To subscribe/unsubscribe to this list, contact
    majordomo@dutchworks.nl
          Name: hpux-admin@dutchworks.nl     Owner:
    owner-hpux-admin@dutchworks.nl
    Archives:  ftp.dutchworks.nl:/pub/digests/hpux-admin       (FTP, browse
    only)
               http://www.dutchworks.nl/htbin/hpsysadmin   (Web, browse &
    search)
    This e-mail and any attachment is for authorised use by the intended
    recipient(s) only. It may contain proprietary material, confidential
    information and/or be subject to legal privilege. It should not be
    copied, disclosed to, retained or used by, any other party. If you are
    not an intended recipient then please promptly delete this e-mail and
    any attachment and all copies and inform the sender. Thank you.
    
