Re: Intel i8xx watchdog driver

From: Doug Ambrisko (ambrisko_at_ambrisko.com)
Date: 03/25/04

    To: Scott Long <scottl@freebsd.org>
    Date: Thu, 25 Mar 2004 14:01:19 -0800 (PST)
    
    

    Scott Long writes:
    | In reading the code, it appears that it is indeed an ICHx service and
    | not limited to just i8xx chipsets. I have a few issues with how the
    | probe and attach are done, and I'm addressing these in a private mail
    | right now. It's funny that I was reading the Intel ICH5 docs last night
    | and didn't come across this feature at all.

    I haven't looked at the code but have worked with this feature.
     
    | I'm not sure if I like the idea of auto-reboot on second expire, unless
    | it is configurable (i.e. you can turn that feature off depending on the
    situation). I understand its purpose though.

    Not sure what you mean by that. The second expire means you didn't
    catch a hang so you must be dead. If the system is dead then you want
    it to reboot. You do need a way to cancel this and I think PHK's API
    allowed that. I used his early Soekris WD a while back. The real
    purpose of the TCO timer & SIS timer is that a BIOS can set aggressive
    timings and back off if the system fails to boot. This way a user doesn't
    have to tune the BIOS. A lot of chipsets have it and I've seen it
    in some Winbond Super I/O chips. However, we can use it for free for
    our purposes :-)
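
To make the two-stage behaviour concrete, here is a minimal user-space
sketch in C. It is only a model of the idea discussed above (first expire
is the chance to catch the hang, second expire reboots, and the whole thing
can be cancelled). The struct, the function names, and the tick-based
timing are all hypothetical; this is not the TCO register interface and
not PHK's watchdog API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a two-stage watchdog; not a real chipset layout. */
enum wd_action { WD_NONE, WD_FIRST_EXPIRE, WD_REBOOT };

struct two_stage_wd {
    unsigned timeout;     /* ticks per stage */
    unsigned remaining;   /* ticks left in the current stage */
    bool first_expired;   /* true once the first stage has fired */
    bool enabled;         /* false after cancel or after a reboot request */
};

void wd_init(struct two_stage_wd *wd, unsigned timeout)
{
    assert(timeout > 0);
    wd->timeout = timeout;
    wd->remaining = timeout;
    wd->first_expired = false;
    wd->enabled = true;
}

/* Reload the timer: the system proved it is alive, so start over. */
void wd_reload(struct two_stage_wd *wd)
{
    wd->remaining = wd->timeout;
    wd->first_expired = false;
}

/* Cancel the watchdog entirely, e.g. when auto-reboot is not wanted. */
void wd_cancel(struct two_stage_wd *wd)
{
    wd->enabled = false;
}

/* Advance the timer by one tick and report what the hardware would do. */
enum wd_action wd_tick(struct two_stage_wd *wd)
{
    if (!wd->enabled)
        return WD_NONE;
    if (--wd->remaining > 0)
        return WD_NONE;
    wd->remaining = wd->timeout;
    if (!wd->first_expired) {
        wd->first_expired = true;   /* last chance to catch the hang */
        return WD_FIRST_EXPIRE;
    }
    wd->enabled = false;            /* nobody reacted: reboot */
    return WD_REBOOT;
}
```

A handler that runs on WD_FIRST_EXPIRE and calls wd_reload() keeps the
system up; if nothing reacts, the next full timeout yields WD_REBOOT.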
     
    | > A nice concept would be to have a SW watchdog based on the clock tickle
    | > the HW watchdog. If the SW watchdog goes off you get a panic.
    | >
    | > Interesting caveats are having the watchdogs going off while in kgdb/ddb.
    |
    | Talking with PHK about this now. There seems to be a growing need for a
    | mechanism that can inform registered listeners that DDB is about to be
    | entered. It's just a random thought in my brain right now, need more
    | time to flesh it out.

    Perfect.

    consmute is another. What's the point of entering DDB if the console
    is muted :-( I have code so if DDB or panic is entered then consmute
    is disabled for that period. Basically I just switched consmute into
    a function that can be called from the DDB/panic routines. Now there
    are reasons to really mute the console no matter what.

    It would be cool if we could layer all this. One reason I did a HW
    watchdog to enforce a SW watchdog is that various HW watchdogs have
    different time ranges. The SW watchdog can easily run within that
    range. Then a user-land app can set a long SW watchdog and not
    have to worry about whether it gets starved for a while, triggering a
    false reboot. For example, who cares if the watchdog is set for 15 minutes?
    That is better than having to drive to a co-lo, etc. on a Sunday morning :-(
    Your chance for a false trigger is greatly reduced.
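
The layering can be sketched in user-space C as follows. This is a model
under stated assumptions, not the production code mentioned in this mail:
time is counted in abstract ticks, the struct and the lwd_* names are
hypothetical, and "panic" and "reset" are just flags. The clock tick
tickles the short-range HW timer as long as the long SW timer has not run
out; the user-land app only has to pat the SW timer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a SW watchdog enforced by a HW watchdog. */
struct layered_wd {
    unsigned hw_timeout;    /* short, limited by the hardware range */
    unsigned hw_remaining;
    unsigned sw_timeout;    /* long, chosen by the user-land app */
    unsigned sw_remaining;
    bool panicked;          /* SW watchdog went off */
    bool hw_reset;          /* HW watchdog went off */
};

void lwd_init(struct layered_wd *w, unsigned hw_timeout, unsigned sw_timeout)
{
    assert(hw_timeout > 1 && sw_timeout > 0);
    w->hw_timeout = hw_timeout;
    w->hw_remaining = hw_timeout;
    w->sw_timeout = sw_timeout;
    w->sw_remaining = sw_timeout;
    w->panicked = false;
    w->hw_reset = false;
}

/* Called by the user-land app whenever it gets to run. */
void lwd_pat(struct layered_wd *w)
{
    w->sw_remaining = w->sw_timeout;
}

/* Called from the clock tick, i.e. only while the kernel is alive. */
void lwd_clock_tick(struct layered_wd *w)
{
    if (w->panicked)
        return;                       /* stop tickling after a panic */
    if (--w->sw_remaining == 0) {
        w->panicked = true;           /* SW watchdog expired: panic */
        return;
    }
    w->hw_remaining = w->hw_timeout;  /* tickle the HW watchdog */
}

/* Models the hardware countdown, which runs even if the kernel hangs. */
void lwd_hw_tick(struct layered_wd *w)
{
    if (w->hw_remaining > 0 && --w->hw_remaining == 0)
        w->hw_reset = true;
}
```

With hw_timeout = 3 and sw_timeout = 900, an app starved for 500 ticks
causes nothing, an app silent for 900 ticks gets a panic, and a kernel
whose clock stops is reset by the hardware after 3 ticks.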

    | > I have code for the SIS 630 chipset that I can give to anyone interested.
    |
    | More support is wonderful! I don't have any hardware to test it on, though.

    I don't have direct access to HW anymore but have working code. I've been
    watching the HW watchdog stuff and can add it once some of these issues
    have been resolved.

    All of my current code is quick hacks to get around immediate issues but
    it is in production.

    Doug A.
    _______________________________________________
    freebsd-hackers@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
    To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
