need some debugging help

From: Kenneth D. Merry (ken_at_kdm.org)
Date: 08/30/03

    Date: Fri, 29 Aug 2003 22:03:57 -0600
    To: current@FreeBSD.org
    
    
    

    I've been working on a set of patches to remove the sysctl variable creation
    from interrupt context in the cd(4) and da(4) drivers.

    To fix the problem, I've created a new taskqueue that runs in a thread
    context, instead of inside a software interrupt like the current task
    queues. (The eventual fix will involve moving the CAM probe into a
    thread; the new taskqueue is a temporary solution that will hopefully
    also work on -stable until we can change the CAM probe code.)
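
The idea of a task queue serviced by a thread can be pictured with a small userland sketch using POSIX threads. This is not the patch and not the kernel taskqueue(9) API; every name here (struct taskq, taskq_start, taskq_enqueue, taskq_stop, count_task) is hypothetical. The producer only links a task into a list and signals; the task function itself runs later in the worker thread, where sleeping or allocating is allowed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef void (*task_fn)(void *arg);

struct task {
    task_fn fn;
    void *arg;
    struct task *next;
};

struct taskq {
    pthread_mutex_t lock;
    pthread_cond_t cv;
    struct task *head, *tail;
    int shutdown;
    pthread_t thread;
};

/* Worker loop: pop tasks in FIFO order and run them in thread context. */
void *taskq_worker(void *p)
{
    struct taskq *q = p;

    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (q->head == NULL && !q->shutdown)
            pthread_cond_wait(&q->cv, &q->lock);
        if (q->head == NULL)
            break;                      /* shutdown requested and queue drained */
        struct task *t = q->head;
        q->head = t->next;
        if (q->head == NULL)
            q->tail = NULL;
        pthread_mutex_unlock(&q->lock); /* never hold the lock across the task */
        t->fn(t->arg);
        pthread_mutex_lock(&q->lock);
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* Initialize the queue and start its worker; returns 0 on success. */
int taskq_start(struct taskq *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->cv, NULL);
    q->head = q->tail = NULL;
    q->shutdown = 0;
    return pthread_create(&q->thread, NULL, taskq_worker, q);
}

/* Cheap for the producer: link the task in and wake the worker. */
void taskq_enqueue(struct taskq *q, struct task *t)
{
    assert(t->fn != NULL);
    t->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = t;
    else
        q->head = t;
    q->tail = t;
    pthread_cond_signal(&q->cv);
    pthread_mutex_unlock(&q->lock);
}

/* Let the worker drain the remaining tasks, then wait for it to exit. */
void taskq_stop(struct taskq *q)
{
    pthread_mutex_lock(&q->lock);
    q->shutdown = 1;
    pthread_cond_signal(&q->cv);
    pthread_mutex_unlock(&q->lock);
    pthread_join(q->thread, NULL);
}

/* Hypothetical example task: increment the int that arg points to. */
void count_task(void *arg)
{
    (*(int *)arg)++;
}
```

In this sketch the caller owns each struct task and must keep it valid until the worker has run it. A task structure that is freed or reused while still queued would produce exactly the kind of write to freed memory that a trash check reports.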

    I think I have everything set up correctly, but I keep getting panics inside
    the GEOM code with these patches. (Memory modified after free.) I don't
    know whether I've just exposed some race condition, or whether I've done
    something wrong.

    I've seen several different panics, all with the same root cause (memory
    modified after free), and with two different malloc types reported as the
    most recent user of the memory: GEOM and devbuf.
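
For readers unfamiliar with this class of panic, the check behind "Memory modified after free" can be pictured with a small userland sketch. It is not the kernel's mtrash_ctor code; the names trash_fill and trash_check and the JUNK value are assumptions for illustration. A chunk is filled with a known pattern when it is freed, and the pattern is verified when the chunk is handed out again. A mismatch means something wrote through a stale pointer in between, so the allocating caller in the trace is where the damage was detected, which need not be where it was caused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative fill pattern; the value is an assumption for this sketch. */
#define JUNK 0xdeadc0deu

/* At free time: overwrite the whole chunk with the junk pattern. */
void trash_fill(void *mem, size_t size)
{
    uint32_t *p = mem;

    assert(size % sizeof(*p) == 0);
    for (size_t i = 0; i < size / sizeof(*p); i++)
        p[i] = JUNK;
}

/*
 * At the next allocation: verify the pattern is intact.
 * Returns the index of the first modified 32-bit word, or -1 if the
 * chunk was left untouched while it sat on the free list.
 */
long trash_check(const void *mem, size_t size)
{
    const uint32_t *p = mem;

    assert(size % sizeof(*p) == 0);
    for (size_t i = 0; i < size / sizeof(*p); i++)
        if (p[i] != JUNK)
            return (long)i;
    return -1;
}
```

A caller would run trash_fill when releasing a chunk and trash_check before reusing it; a result other than -1 corresponds to the panic condition shown in the traces.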

    ==========================================================================
    SMP: AP CPU #1 Launched!
    Memory modified after free 0xcbd4f800(124)
    panic: Most recently used by GEOM

    cpuid = 0; lapic.id = 00000000
    Debugger("panic")
    Stopped at Debugger+0x55: xchgl %ebx,in_Debugger.0
    db> trace
    Debugger(c03e974d,0,c03fb4d8,e5e45934,100) at Debugger+0x55
    panic(c03fb4d8,c03e55db,7c,c083ac14,c083ac00) at panic+0x15f
    mtrash_ctor(cbd4f800,80,0,57a,cbd4f800) at mtrash_ctor+0x5d
    uma_zalloc_arg(c083ac00,0,102,e5e4599c,e5e4599c) at uma_zalloc_arg+0x1e4
    malloc(5b,c042fae0,102,1,c02756e4) at malloc+0xd3
    g_new_providerf(cbda62c0,cbd7b130,e5e45a3c,1,1) at g_new_providerf+0xa3
    g_slice_config(cbda62c0,2,1,0,0) at g_slice_config+0x259
    g_bsd_modify(cbda62c0,cbd7712c,e5e45c8c,10,cbd77000) at g_bsd_modify+0x382
    g_bsd_taste(c0470480,cbda5780,0,159,cbda5700) at g_bsd_taste+0x2c4
    g_new_provider_event(cbda5780,0,c03e52b7,b3,66666667) at g_new_provider_event+0xad
    one_event(e5e45d0c,c021cd85,c0485fb4,0,4c) at one_event+0x218
    g_run_events(c0485fb4,0,4c,c03d7a28,a) at g_run_events+0x15
    g_event_procbody(0,e5e45d48,c03e738b,314,34df1a6d) at g_event_procbody+0x45
    fork_exit(c021cd40,0,e5e45d48) at fork_exit+0xcf
    fork_trampoline() at fork_trampoline+0x8
    --- trap 0x1, eip = 0, esp = 0xe5e45d7c, ebp = 0 ---
    db> panic

    ==========================================================================
    SMP: AP CPU #1 Launched!
    Memory modified after free 0xcbd4f600(124)
    panic: Most recently used by devbuf

    cpuid = 0; lapic.id = 00000000
    Debugger("panic")
    Stopped at Debugger+0x55: xchgl %ebx,in_Debugger.0
    db> trace
    Debugger(c03e974d,0,c03fb4d8,e5e45af0,100) at Debugger+0x55
    panic(c03fb4d8,c03e7f50,7c,c083ac14,c083ac00) at panic+0x15f
    mtrash_ctor(cbd4f600,80,0,57a,cbd4f600) at mtrash_ctor+0x5d
    uma_zalloc_arg(c083ac00,0,102,e5e45b58,e5e45b58) at uma_zalloc_arg+0x1e4
    malloc(5a,c042fae0,102,1,c02756e4) at malloc+0xd3
    g_new_providerf(cbda74c0,cb9b0d90,e5e45bf8,1,1) at g_new_providerf+0xa3
    g_slice_config(cbda74c0,0,1,7e00,0) at g_slice_config+0x259
    g_mbr_modify(cbda74c0,cb9d3800,cbd5b200,123,0) at g_mbr_modify+0x247
    g_mbr_taste(c0470560,cbd4ee80,0,159,cbd4f580) at g_mbr_taste+0x1be
    g_new_provider_event(cbd4ee80,0,c03e52b7,b3,66666667) at g_new_provider_event+0xad
    one_event(e5e45d0c,c021cd85,c0485fb4,0,4c) at one_event+0x218
    g_run_events(c0485fb4,0,4c,c03d7a28,a) at g_run_events+0x15
    g_event_procbody(0,e5e45d48,c03e738b,314,34df1a6d) at g_event_procbody+0x45
    fork_exit(c021cd40,0,e5e45d48) at fork_exit+0xcf
    fork_trampoline() at fork_trampoline+0x8
    --- trap 0x1, eip = 0, esp = 0xe5e45d7c, ebp = 0 ---
    db> panic

    ==========================================================================
    Memory modified after free 0xcbd4f600(124)
    panic: Most recently used by devbuf

    cpuid = 0; lapic.id = 00000000
    Debugger("panic")
    Stopped at Debugger+0x55: xchgl %ebx,in_Debugger.0
    db> trace
    Debugger(c03e974d,0,c03fb4d8,e5e45bbc,100) at Debugger+0x55
    panic(c03fb4d8,c03e7f50,7c,c083ac14,c083ac00) at panic+0x15f
    mtrash_ctor(cbd4f600,80,0,57a,cbd4f600) at mtrash_ctor+0x5d
    uma_zalloc_arg(c083ac00,0,102,fe,cbd6b800) at uma_zalloc_arg+0x1e4
    malloc(60,c042fae0,102,cbd59240,c0470560) at malloc+0xd3
    g_slice_alloc(4,214,cbd4f4d4,1c2,e5e45c9c) at g_slice_alloc+0x7e
    g_slice_new(c0470560,4,cbd4f480,e5e45c98,e5e45c9c) at g_slice_new+0x6f
    g_mbr_taste(c0470560,cbd4f480,0,159,cbd4f580) at g_mbr_taste+0x90
    g_new_provider_event(cbd4f480,0,c03e52b7,b3,66666667) at g_new_provider_event+0xad
    one_event(e5e45d0c,c021cd85,c0485fb4,0,4c) at one_event+0x218
    g_run_events(c0485fb4,0,4c,c03d7a28,a) at g_run_events+0x15
    g_event_procbody(0,e5e45d48,c03e738b,314,34df1a6d) at g_event_procbody+0x45
    fork_exit(c021cd40,0,e5e45d48) at fork_exit+0xcf
    fork_trampoline() at fork_trampoline+0x8
    --- trap 0x1, eip = 0, esp = 0xe5e45d7c, ebp = 0 ---
    db>
    ==========================================================================
    SMP: AP CPU #1 Launched!
    Memory modified after free 0xcbd4f600(124)
    panic: Most recently used by devbuf

    cpuid = 0; lapic.id = 00000000
    Debugger("panic")
    Stopped at Debugger+0x55: xchgl %ebx,in_Debugger.0
    db> trace
    Debugger(c03e974d,0,c03fb4d8,e5e45aa8,100) at Debugger+0x55
    panic(c03fb4d8,c03e7f50,7c,c083ac14,c083ac00) at panic+0x15f
    mtrash_ctor(cbd4f600,80,0,57a,cbd4f600) at mtrash_ctor+0x5d
    uma_zalloc_arg(c083ac00,0,102,5c,cbd4f900) at uma_zalloc_arg+0x1e4
    malloc(64,c042fae0,102,cbd4f900,2) at malloc+0xd3
    g_post_event_x(c021ea70,cbd4f900,2,0,e5e45b6c) at g_post_event_x+0x54
    g_post_event(c021ea70,cbd4f900,2,cbd4f900,0) at g_post_event+0x45
    g_new_providerf(cbda3540,cb9b0b20,e5e45bf8,1,1) at g_new_providerf+0x151
    g_slice_config(cbda3540,0,1,7e00,0) at g_slice_config+0x259
    g_mbr_modify(cbda3540,cbd6c400,cbd73000,123,0) at g_mbr_modify+0x247
    g_mbr_taste(c0470560,cbd4f700,0,159,cbd4f780) at g_mbr_taste+0x1be
    g_new_provider_event(cbd4f700,0,c03e52b7,b3,66666667) at g_new_provider_event+0xad
    one_event(e5e45d0c,c021cd85,c0485fb4,0,4c) at one_event+0x218
    g_run_events(c0485fb4,0,4c,c03d7a28,a) at g_run_events+0x15
    g_event_procbody(0,e5e45d48,c03e738b,314,34df1a6d) at g_event_procbody+0x45
    fork_exit(c021cd40,0,e5e45d48) at fork_exit+0xcf
    fork_trampoline() at fork_trampoline+0x8
    --- trap 0x1, eip = 0, esp = 0xe5e45d7c, ebp = 0 ---

    ==========================================================================

    Since the panics involved either M_DEVBUF or M_GEOM, I removed all M_DEVBUF
    mallocs from the cd(4) and da(4) drivers. (None of the other affected code
    used M_DEVBUF; I created new malloc types for the cd(4) and da(4) drivers.)

    The problem didn't change, other than the exact place in GEOM that made
    the malloc call that caught the corruption.

    Anyway, I've attached the patch in question. If someone could tell me what
    (if anything) I'm doing wrong, I'd appreciate it!

    Thanks,

    Ken

    -- 
    Kenneth Merry
    ken@kdm.org
    
    
    

    _______________________________________________
    freebsd-current@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-current
    To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"


