Re: panic in propagate_priority w/ postgresql under heavy load

From: Koen Martens (fbsd_at_metro.cx)
Date: 09/19/05

    Date: Mon, 19 Sep 2005 21:35:44 +0200
    To: Vinod Kashyap <vkashyap@amcc.com>
    
    

    Vinod Kashyap wrote:
    > You seem to be booting off of a 9000 (twa) controller and not 7000/8000
    > (twe). It could be because of a 9000 firmware bug that you are not being
    > able to get the dump. The firmware wrongly interprets physical address
    > 0x0 as invalid during dumps, and fails the operations. This bug will be
    > fixed in future firmware releases.

    Ok, it's been a while, here is an update on this.

    I ran a heavily instrumented kernel on the server for two weeks, and
    it did not crash in that time. I then took out the WITNESS and
    KDB/DDB options, because the decreased performance was a bit of a
    nuisance, but I retained the ability to obtain a crash dump. For
    that I had to limit physical memory to 1.8GB with hw.physmem in
    loader.conf, because swap and physical memory are both 2GB and the
    dump has to fit on the swap device. A test with 'reboot -d' gave me
    a core dump.
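
    For anyone wanting to reproduce the dump setup described above, it
    amounts to roughly the following. This is only a sketch: the value
    1800M approximates the 1.8GB limit, and the swap device name
    /dev/da0s1b and the /var/crash directory are assumptions that will
    differ per system.

    ```conf
    # /boot/loader.conf
    # Cap usable RAM below the size of the swap partition so a full
    # dump fits (1800M is an approximation of the 1.8GB mentioned above).
    hw.physmem="1800M"

    # /etc/rc.conf
    # Hypothetical swap device; use the swap partition on your own disk.
    dumpdev="/dev/da0s1b"
    # Directory where savecore(8) stores the dump on the next boot.
    dumpdir="/var/crash"
    ```

    After a reboot, 'reboot -d' should write a dump that savecore picks
    up on the way back up, which is the test mentioned above.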

    Without the debug stuff in the kernel, it crashed within two days
    with the same story: a postgresql process, in function
    propagate_priority. However, no dump was written to disk :(

    Furthermore, I've been seeing the same crash (in propagate_priority)
    on another box, in mysql processes. Both servers seem to panic every
    2-3 days. I have another server with the exact same hardware
    configuration, but it is idle most of the time. I haven't seen
    that one crash yet.

    I now think it is a bug in the twa driver, so I'll have to dig
    into that. It also seems to be some sort of concurrency or
    otherwise timing-sensitive issue, because slowing the kernel down
    with debug code seems to avoid the panic. But as I am completely
    new to the FreeBSD kernel and don't even know what turnstiles are,
    I imagine I will have a hard time. So if anyone can offer some
    help, please :)

    Ok, thanks for your attention,

    Koen

    -- 
    K.F.J. Martens, Sonologic, http://www.sonologic.nl/
    Networking, hosting, embedded systems, unix, artificial intelligence.
    Public PGP key: http://www.metro.cx/pubkey-gmc.asc
    Wondering about the funny attachment your mail program
    can't read? Visit http://www.openpgp.org/
    _______________________________________________
    freebsd-hackers@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
    To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
    
