kernel deadlock

From: Dave Dolson (ddolson_at_sandvine.com)
Date: 07/29/03

    To: "'freebsd-stable@freebsd.org'" <freebsd-stable@freebsd.org>
    Date: Tue, 29 Jul 2003 15:06:26 -0400
    
    

    We have a reproducible problem with FreeBSD 4.7 that appears to be a
    deadlock. It occurs while the system is undergoing a filesystem stress
    test.

    The machine still responds to ping, but the console and most other
    services are unresponsive. The console debugger can still be entered.
    The output of the debugger's "ps" command follows.
    I suspect that the "inode" wait channel (wchan c34ab600) is what nearly
    everything is waiting on, but I'm not sure which process is supposed to
    perform the wakeup.

    db> ps
      pid proc addr uid ppid pgrp flag stat wmesg wchan cmd
      467 e75df000 e76d6000 0 141 141 000104 3 inode c34ab600 sshd
      466 e75df1a0 e76c9000 25 147 147 000104 3 inode c34ab600 sendmail
      465 e75df340 e76c4000 0 144 144 000104 3 inode c34ab600 sendmail
      464 e75df4e0 e76be000 25 147 147 000104 3 inode c34ab600 sendmail
      463 e75df680 e76ba000 0 144 144 000104 3 inode c34ab600 sendmail
      462 e75df820 e76b5000 25 147 147 000104 3 inode c34ab600 sendmail
      461 e75df9c0 e76b0000 0 144 144 000104 3 inode c34ab600 sendmail
      460 e75dfb60 e76ac000 25 147 147 000104 3 inode c34ab600 sendmail
      459 e75dfd00 e76a7000 0 144 144 000104 3 inode c34ab600 sendmail
      458 e75dfea0 e76a3000 25 147 147 000104 3 inode c34ab600 sendmail
      457 e75e0040 e769e000 0 144 144 000104 3 inode c34ab600 sendmail
      456 e75e01e0 e7698000 25 147 147 000104 3 inode c34ab600 sendmail
      455 e75e0380 e7693000 0 144 144 000104 3 inode c34ab600 sendmail
      454 e75e0520 e768f000 25 147 147 000104 3 inode c34ab600 sendmail
      453 e75e06c0 e768b000 0 144 144 000104 3 inode c34ab600 sendmail
      452 e75e0860 e7685000 25 147 147 000104 3 inode c34ab600 sendmail
      451 e75e0a00 e7681000 0 144 144 000104 3 inode c34ab600 sendmail
      450 e75e0ba0 e767d000 25 147 147 000104 3 inode c34ab600 sendmail
      449 e75e0d40 e7678000 0 144 144 000104 3 inode c34ab600 sendmail
      448 e75e0ee0 e7671000 25 147 147 000104 3 inode c34ab600 sendmail
      447 e75e1080 e766d000 0 144 144 000104 3 inode c34ab600 sendmail
      446 e75e1220 e7669000 25 147 147 000104 3 inode c34ab600 sendmail
      445 e75e13c0 e7664000 0 144 144 000104 3 inode c34ab600 sendmail
      444 e75e1560 e7660000 25 147 147 000104 3 inode c34ab600 sendmail
      443 e75e1700 e765b000 0 144 144 000104 3 inode c34ab600 sendmail
      442 e75e18a0 e7656000 25 147 147 000104 3 inode c34ab600 sendmail
      441 e75e1a40 e7652000 0 144 144 000104 3 inode c34ab600 sendmail
      440 e75e1be0 e764c000 25 147 147 000104 3 inode c34ab600 sendmail
      439 e75e1d80 e7647000 0 144 144 000104 3 inode c34ab600 sendmail
      438 e75e1f20 e7642000 25 147 147 000104 3 inode c34ab600 sendmail
      437 e75e20c0 e763e000 0 144 144 000104 3 inode c34ab600 sendmail
      436 e75e2260 e763a000 25 147 147 000104 3 inode c34ab600 sendmail
      435 e75e2400 e7635000 0 144 144 000104 3 inode c34ab600 sendmail
      434 e75e25a0 e7630000 25 147 147 000104 3 inode c34ab600 sendmail
      433 e75e2740 e762c000 0 144 144 000104 3 inode c34ab600 sendmail
      432 e75e28e0 e7626000 25 147 147 000104 3 inode c34ab600 sendmail
      431 e75e2a80 e7621000 0 144 144 000104 3 inode c34ab600 sendmail
      430 e75e2c20 e761c000 25 147 147 000104 3 inode c34ab600 sendmail
      429 e75e2dc0 e7618000 0 144 144 000104 3 inode c34ab600 sendmail
      428 e75e2f60 e7613000 25 147 147 000104 3 inode c34ab600 sendmail
      427 e75e3100 e760c000 0 144 144 000104 3 inode c34ab600 sendmail
      426 e75e32a0 e7608000 25 147 147 000104 3 inode c34ab600 sendmail
      425 e75e3440 e7602000 0 144 144 000104 3 inode c34ab600 sendmail
      424 e75e35e0 e75fc000 25 147 147 000104 3 inode c34ab600 sendmail
      423 e75e3780 e75f8000 0 144 144 000104 3 inode c34ab600 sendmail
      422 e75e3920 e75f4000 25 147 147 000104 3 inode c34ab600 sendmail
      421 e75e3ac0 e75ee000 0 144 144 000104 3 inode c34ab600 sendmail
      420 e75e3c60 e75ea000 25 147 147 000104 3 inode c34ab600 sendmail
      419 e75e3e00 e75e6000 0 144 144 000104 3 inode c34ab600 sendmail
      418 dc358ea0 e75dc000 25 147 147 000104 3 inode c34ab600 sendmail
      417 dc359040 e75d7000 0 144 144 000104 3 inode c34ab600 sendmail
      416 dc3591e0 e75d1000 25 147 147 000104 3 inode c34ab600 sendmail
      415 dc359380 e75cd000 0 144 144 000104 3 inode c34ab600 sendmail
      414 dc359520 e75c8000 25 147 147 000104 3 inode c34ab600 sendmail
      413 dc3596c0 e75c4000 0 144 144 000104 3 inode c34ab600 sendmail
      412 dc359860 e75bf000 25 147 147 000104 3 inode c34ab600 sendmail
      411 dc359a00 e75ba000 0 144 144 000104 3 inode c34ab600 sendmail
      410 dc359ba0 e75b6000 25 147 147 000104 3 inode c34ab600 sendmail
      409 dc359d40 e75b2000 0 144 144 000104 3 inode c34ab600 sendmail
      408 dc359ee0 e75aa000 25 147 147 000104 3 inode c34ab600 sendmail
      407 dc35a080 e75a6000 0 144 144 000104 3 inode c34ab600 sendmail
      406 dc35a220 e75a2000 25 147 147 000104 3 inode c34ab600 sendmail
      405 dc35a3c0 e759d000 0 144 144 000104 3 inode c34ab600 sendmail
      404 dc35a560 e7598000 25 147 147 000104 3 inode c34ab600 sendmail
      403 dc35af20 e03f3000 0 144 144 000104 3 inode c34ab600 sendmail
      402 dc35a700 e2877000 0 99 99 000004 3 inode c34ab600 dhclient
      401 dc35b260 e03f0000 0 203 401 8000006 3 inode c34ab600 bash
      399 dc35aa40 e1366000 0 398 399 000014 3 FFS node c0350140 cron
      398 dc35a8a0 e135b000 0 139 139 000004 3 ppwait dc35a8a0 cron
      302 dc35abe0 e0402000 0 137 302 4004004 3 ffsvgt c03695e8 tclsh83
      277 dc35ad80 e03fe000 0 137 277 4004084 3 poll c037c1a0 tclsh83
      203 dc35b8e0 e03d6000 0 202 203 004086 3 wait dc35b8e0 bash
      202 dc35c440 e036e000 0 1 202 004186 3 wait dc35c440 login
      191 dc35c2a0 e0376000 0 1 7 000086 3 select c037c1a0 snmpd
      173 dc35b0c0 e03e8000 0 1 173 000084 3 nanslp c03646b0 siocontrol
      167 dc35b400 e03e4000 0 1 167 000084 3 nanslp c03646b0 wddt
      147 dc35b5a0 e03df000 25 1 147 2000184 3 pause e03df260 sendmail
      144 dc35b740 e03da000 0 1 144 000184 3 select c037c1a0 sendmail
      141 dc35ba80 e03d2000 0 1 141 000104 3 inode c34ab600 sshd
      139 dc35bc20 e0397000 0 1 139 000004 3 inode c35f4300 cron
      137 dc35bdc0 e0392000 0 1 137 000084 3 select c037c1a0 inetd
      122 dc35bf60 e0382000 0 1 122 000004 3 inode c34ab600 syslogd
       99 dc35c100 e037e000 0 1 99 000084 3 wait dc35c100 dhclient
        6 dc35c5e0 defd1000 0 0 0 000204 3 vlrup dc35c5e0 vnlru
        5 dc35c780 defce000 0 0 0 000204 3 syncer c037c0c8 syncer
        4 dc35c920 defcb000 0 0 0 000204 3 psleep c036487c bufdaemon
        3 dc35cac0 defc8000 0 0 0 000204 3 psleep c0372fc0 vmdaemon
        2 dc35cc60 defc5000 0 0 0 000204 3 psleep c0351e58 pagedaemon
        1 dc35ce00 dc361000 0 0 1 004284 3 wait dc35ce00 init
        0 c037b4a0 c040d000 0 0 0 000204 3 sched c037b4a0 swapper
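
To check the suspicion that almost everything is blocked on one wait channel, the listing can be tallied by (wmesg, wchan) pair. The following is only a minimal sketch: the function name tally_wait_channels is hypothetical, the sample holds a handful of rows in the format of the listing above (one with the command name wrapped onto its own line), and it assumes every wchan prints as exactly eight hex digits, as it does here.

```python
import re
from collections import Counter

# A few rows in the format of the "ps" listing; paste the full listing
# in place of SAMPLE to tally every process.
SAMPLE = """\
  467 e75df000 e76d6000 0 141 141 000104 3 inode c34ab600 sshd
  402 dc35a700 e2877000 0 99 99 000004 3 inode c34ab600 dhclient
  399 dc35aa40 e1366000 0 398 399 000014 3 FFS node c0350140 cron
  302 dc35abe0 e0402000 0 137 302 4004004 3 ffsvgt c03695e8 tclsh83
  139 dc35bc20 e0397000 0 1 139 000004 3 inode c35f4300 cron
  173 dc35b0c0 e03e8000 0 1 173 000084 3 nanslp c03646b0
siocontrol
"""

# Assumption: a wait-channel address is exactly eight hex digits.
HEX_ADDR = re.compile(r"[0-9a-f]{8}")


def tally_wait_channels(ps_text):
    """Count processes per (wmesg, wchan) pair in ddb "ps" output."""
    counts = Counter()
    for line in ps_text.splitlines():
        fields = line.split()
        # A process row starts with a decimal pid followed by proc, addr,
        # uid, ppid, pgrp, flag, stat, then wmesg and wchan.
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # header, blank line, or wrapped command name
        rest = fields[8:]
        # wmesg may contain spaces ("FFS node"), so scan for the address.
        for i in range(1, len(rest)):
            if HEX_ADDR.fullmatch(rest[i]):
                counts[(" ".join(rest[:i]), rest[i])] += 1
                break
    return counts


for (wmesg, wchan), n in tally_wait_channels(SAMPLE).most_common():
    print(f"{n:3d}  {wmesg:<10} {wchan}")
```

Run over the full listing, the pair with the largest count is the channel most processes are sleeping on. The few processes sleeping on a different channel are the natural candidates for whoever holds the contended lock.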

    The hung processes have stack traces like this one (pid 446):

    db> t 446
    mi_switch(c34ab600,1000040,0,0,ffffffff) at mi_switch+0x1c8
    tsleep(c34ab600,8,c031a54a,0,c34ab600) at tsleep+0x1d1
    acquire(c34ab600,1000040,600,c34ab600,20002) at acquire+0xbc
    lockmgr(c34ab600,1030002,defc4e6c,e75e1220,defc4e00) at lockmgr+0x2cc
    vop_stdlock(e766bd28,e766bd38,c01fa02c,e766bd28,defc4e00) at vop_stdlock+0x42
    ufs_vnoperate(e766bd28) at ufs_vnoperate+0x15
    vn_lock(defc4e00,20002,e75e1220) at vn_lock+0x9c
    lookup(e766bed0,0,e766bed0,e766bed0,e75e1220) at lookup+0x81
    namei(e766bed0,0,cb9c0a40,e766bed0,e766be18) at namei+0x19d
    vn_open(e766bed0,1,1a4,3,e75e1220) at vn_open+0x1ed
    open(e75e1220,e766bf80,0,80e3500,0) at open+0xc4
    syscall2(2f,2f,2f,0,80e3500) at syscall2+0x20d
    Xint0x80_syscall() at Xint0x80_syscall+0x2b

    The cause might be here: cron (pid 399) is sleeping in malloc(),
    apparently waiting for memory:

    db> t 399
    mi_switch(c0350140,c0363440,c0350140,c02d535c,ffffffff) at mi_switch+0x1c8
    tsleep(c0350140,2,c031a400,0,c368f228) at tsleep+0x1d1
    malloc(100,c0350140,0,c368f228,c35f4300) at malloc+0x1cd
    ffs_vget(c35dca00,ec17,e1368cbc,0,defc3900) at ffs_vget+0xa0
    ufs_lookup(e1368d20,e1368d34,c01ec562,e1368d20,e036c00a) at ufs_lookup+0xb47
    ufs_vnoperate(e1368d20,e036c00a,defc3900,e1368ef8,e1368d20) at ufs_vnoperate+0x15
    vfs_cache_lookup(e1368d78,e1368d88,c01efb71,e1368d78,defc4e00) at vfs_cache_lookup+0x2c2
    ufs_vnoperate(e1368d78,defc4e00,cb9ded00,e1368ef8,dc35aa40) at ufs_vnoperate+0x15
    lookup(e1368ed0,0,e1368ed0,e1368ed0,dc35aa40) at lookup+0x2e1
    namei(e1368ed0,0,cb9cc7c0,e1368ed0,c02d298b) at namei+0x19d
    vn_open(e1368ed0,1,1a4,3,dc35aa40) at vn_open+0x1ed
    open(dc35aa40,e1368f80,68108dec,6811b380,4) at open+0xc4
    syscall2(2f,2f,2f,4,6811b380) at syscall2+0x20d
    Xint0x80_syscall() at Xint0x80_syscall+0x2b

    Can anyone suggest what the bug might be or how to proceed with debugging?

    Thanks in advance,
    David Dolson (ddolson@sandvine.com, www.sandvine.com)


