SUMMARY: How to find the cause of my problem?

From: Bernt Christandl (beb_at_MPA-Garching.MPG.DE)
Date: 07/02/03

    To: sunmanagers@sunmanagers.org
    Date: Wed, 02 Jul 2003 11:14:49 +0200
    
    

    Dear managers,

    I asked how to find the hidden problem that causes my Ultra 60
    to hang (or crash?) from time to time, so that the only remedy
    left is a power-cycle reboot (see my original question attached
    below).

    I got several helpful answers and wish to thank again all those
    who tried to help me. Nevertheless, yesterday my Ultra 60 showed
    this strange behaviour 3 times and I still have no hints, data, or
    messages from that machine :(

    In the meantime I have a console attached and a script running
    that runs "top" and "ps -ef" every 3 seconds and saves the output
    to disk, but I can't find anything abnormal in that data...
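
    A possible variation on such a monitoring script, as a minimal
    sketch only: it assumes a Python 3.7+ interpreter is available on
    the machine, and the function names and the log path in the usage
    note are made up for illustration. The one idea it adds is forcing
    every snapshot to disk with fsync, so that the last samples before
    a hang are less likely to be lost in a buffer. It sticks to
    "ps -ef" because "top" needs a batch-mode option that differs
    between versions.

```python
import os
import subprocess
import time


def snapshot(commands, log_path):
    """Append one timestamped snapshot per command and force it to disk."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a") as log:
        for cmd in commands:
            log.write("=== %s  %s ===\n" % (stamp, " ".join(cmd)))
            try:
                result = subprocess.run(
                    cmd, capture_output=True, text=True, timeout=10
                )
                log.write(result.stdout)
            except (OSError, subprocess.TimeoutExpired) as exc:
                # Record the failure instead of stopping the monitor.
                log.write("command failed: %s\n" % exc)
        # Push Python's buffer to the OS, then ask the OS to write it out.
        log.flush()
        os.fsync(log.fileno())


def watch(commands, log_path, interval=3.0, count=None):
    """Take a snapshot every `interval` seconds; loop forever if count is None."""
    taken = 0
    while count is None or taken < count:
        snapshot(commands, log_path)
        taken += 1
        time.sleep(interval)
```

    Something like watch([["ps", "-ef"]], "/var/tmp/hang-watch.log")
    would then run until the machine stops; the timestamp of the last
    complete snapshot gives a rough time for the hang.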

    I have again patched that machine (7_Recommended) this morning
    (3 new patches, including a kernel patch) and at the moment it is running.

    Those who answered my question suggested

    -> to attach a console to catch any "last" console messages
       (that failed; nothing was printed...)

    -> to set up a script that saves potentially useful parameters
       (so far this has revealed nothing to me...)

    -> joe.fletcher@btconnect.com said:
       The problem sounds like a watchdog reset
       which is generally hardware related.

       Do I have such a "watchdog"? I don't know how to tell...
       (Is this a Sun default?)

    -> to get a core dump
       (so far this has failed too: "limit" reports "coredumpsize
        unlimited" and I have enough disk space available, but no
        core file shows up.)

    -> Dominic Clarke <dominicc@foe.co.uk> said:
       I wonder if you have power saving inadvertently configured -
       have a look at the manual page for powerd and for power.conf

       Yes, I have a power.conf and powerd is running.
       But why should only that machine suffer from power problems?
       (Assuming the machine's power supply is not "the" problem.)

    -> "Williams, Mario" <mw180013@exchange.DAYTONOH.NCR.com>
       said (among other ideas):
       > check your network table

       And yes, to me the output of "netstat -rn" looks normal, as it
       should, and not essentially different from that of my other
       Ultra 60...

    -> that I may have a failing network interface...
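
    For the routing-table comparison mentioned above, one way to make
    "not essentially different" checkable is to save the "netstat -rn"
    output from both machines and diff the two. This is a sketch under
    the assumption that the outputs have already been saved as text;
    the function name and the host labels are hypothetical. Entries
    that contain each host's own addresses will of course show up as
    differences.

```python
import difflib


def diff_outputs(text_a, text_b, name_a="host-a", name_b="host-b"):
    """Return unified-diff lines between two saved command outputs."""
    return list(
        difflib.unified_diff(
            text_a.splitlines(),
            text_b.splitlines(),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )
```

    An empty result means the two outputs are identical; otherwise the
    lines starting with "-" and "+" show what differs between them.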

    With best regards,

    Bernt Christandl

    --------------------------------------------------------------------

    My original question:

    Dear managers,

    I have an Ultra 60 running Solaris 7, with all recommended and
    security patches as of 2 weeks ago.
    ( SunOS sun-5 5.7 Generic_106541-24 sun4u sparc SUNW,Ultra-60 )

    The machine normally runs fine, but about once a month, like
    this morning, I come in to find that it does not "communicate"
    at all: no answers to ssh, ping or NFS requests, and no output
    or response on the console either. (My console is connected to a
    terminal server, so I can't see the last screen of messages...)

    Then my only idea is a power-cycle and this reboots the machine
    without problems.

    Afterwards I'm not able to find anything that gives me a hint
    about what may have happened: no messages in /var/adm/messages,
    no crash dumps, no core files, nothing that I can find.
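
    Even without an error message, the timestamps in a syslog file
    can narrow down when the machine went quiet: the longest silence
    between two consecutive entries should end at the reboot. A
    minimal sketch, assuming the usual "Mon dd HH:MM:SS" prefix at
    the start of each line and an English locale for month names; the
    function names are hypothetical, and because syslog lines carry
    no year, the year has to be passed in (a file spanning New Year
    would need extra care).

```python
from datetime import datetime


def parse_stamp(line, year):
    """Parse the leading 'Mon dd HH:MM:SS' stamp; return None if absent."""
    try:
        return datetime.strptime(
            "%d %s" % (year, line[:15]), "%Y %b %d %H:%M:%S"
        )
    except ValueError:
        return None


def largest_gap(lines, year):
    """Return (gap, before, after) for the longest silence, or None."""
    stamps = [
        s for s in (parse_stamp(line, year) for line in lines)
        if s is not None
    ]
    best = None
    for before, after in zip(stamps, stamps[1:]):
        gap = after - before
        if best is None or gap > best[0]:
            best = (gap, before, after)
    return best
```

    Fed the lines of /var/adm/messages, the "before" value would be
    the last moment the machine is known to have been logging, which
    can then be matched against cron jobs or backups scheduled around
    that time.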

    During the boot, the filesystem check reports that all(!)
    filesystems are stable, despite my powering off without a shutdown.

    Not being a Sun/Solaris guru myself, what can I try in order to
    find out what kind of problem this machine has? (And we don't
    have a service contract with Sun.)

    With best regards,

    Bernt Christandl
    _______________________________________________
    sunmanagers mailing list
    sunmanagers@sunmanagers.org
    http://www.sunmanagers.org/mailman/listinfo/sunmanagers

