Re: 'swap_pager: indefinite wait buffer' with swapfile

From: Kris Kennaway (kris_at_obsecurity.org)
Date: 09/13/05

    Date: Tue, 13 Sep 2005 01:43:18 -0400
    To: Kris Kennaway <kris@obsecurity.org>
    
    
    

    On Sun, Sep 11, 2005 at 03:51:57AM -0400, Kris Kennaway wrote:
    > I configured a vnode-backed md and enabled swapping on it. A few
    > hours later after moderate swap use the console showed:
    >
    > swap_pager: indefinite wait buffer: bufobj: 0, blkno: 889347, size: 8192
    > [...repeated...]
    >
    > The backing store was a sparse file, but there was ample space:
    >
    > # ls -l /data2/swapfile
    > -rw-r--r-- 1 root wheel 17179869184 Sep 11 16:50 /data2/swapfile
    > # df /data2
    > Filesystem       1K-blocks     Used    Avail Capacity Mounted on
    > /dev/stripe/data  51666218 27042730 20490192    57%    /data2
    > # swapinfo
    > Device    1K-blocks    Used    Avail Capacity
    > /dev/da0b   6297480  949304  6297480    15%
    > /dev/md41  16777216  842544 16777216     5%
    > Total      23074696 1791848 21282848     8%
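
    As a sanity check on the quoted figures, the following Python sketch
    recomputes the swap file size in 1K blocks and the capacity
    percentages. The numbers are copied from the ls and swapinfo output
    above; the helper names and the round-to-nearest-percent rule are
    assumptions made for illustration, not taken from swapinfo itself.

```python
# Figures copied from the ls and swapinfo output quoted above.
SWAPFILE_BYTES = 17179869184

# device -> (size in 1K blocks, used 1K blocks)
DEVICES = {
    "/dev/da0b": (6297480, 949304),
    "/dev/md41": (16777216, 842544),
}


def capacity_percent(blocks, used):
    """Used share of a swap device, rounded to the nearest whole percent.

    The rounding rule is an assumption; it happens to match the quoted output.
    """
    return round(100 * used / blocks)


def totals(devices):
    """Return (total blocks, total used, total minus used) over all devices."""
    blocks = sum(b for b, _ in devices.values())
    used = sum(u for _, u in devices.values())
    return blocks, used, blocks - used


if __name__ == "__main__":
    # The swap file size in 1K blocks should match the md41 device size.
    print("swapfile 1K-blocks:", SWAPFILE_BYTES // 1024)
    for name, (blocks, used) in DEVICES.items():
        print(name, f"{capacity_percent(blocks, used)}%")
    print("totals:", totals(DEVICES))
```

    The swap file size works out to the same number of 1K blocks as
    /dev/md41, and the summed size and usage agree with the Total row.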

    I think these messages appear when a large amount of memory gets
    paged out at once, since the vnode backing store is a significant
    bottleneck. I got a few dozen of them, in groups of 5 or 6, over the
    past 24 hours, and the system seemed to be fine (i.e. no immediate
    panics or filesystem corruption from lost transactions).

    However, the system has now deadlocked.

        0 c04188e0 0 0 0 0000200 [SLPQ vmwait 0xc05695e0][SLP] swapper
    db> wh 0
    Tracing pid 0 tid 0 td 0xc0418c30
    mi_switch() at mi_switch+0x2b0
    sleepq_switch() at sleepq_switch+0xf4
    sleepq_wait() at sleepq_wait+0x3c
    msleep() at msleep+0x378
    vm_wait() at vm_wait+0xa8
    scheduler() at scheduler+0x58
    mi_startup() at mi_startup+0x12c
    btext() at btext+0x34

    This is the md that I'm swapping onto:

     9151 fffff8010fbf5a80 0 0 0 0000204 [SLPQ vmwait 0xc05695e0][SLP] md41
    db> wh 9151
    Tracing pid 9151 tid 100359 td 0xfffff801047a30a0
    mi_switch() at mi_switch+0x2b0
    sleepq_switch() at sleepq_switch+0xf4
    sleepq_wait() at sleepq_wait+0x3c
    msleep() at msleep+0x378
    vm_wait() at vm_wait+0xa8
    allocbuf() at allocbuf+0x614
    getblk() at getblk+0x598
    breadn() at breadn+0x58
    bread() at bread+0x20
    ffs_balloc_ufs2() at ffs_balloc_ufs2+0xcf0
    ffs_write() at ffs_write+0x2a4
    VOP_WRITE_APV() at VOP_WRITE_APV+0x120
    mdstart_vnode() at mdstart_vnode+0x16c
    md_kthread() at md_kthread+0x1f8
    fork_exit() at fork_exit+0x94
    fork_trampoline() at fork_trampoline+0x8
    db>

    Most other processes on the system are sleeping in various states and/or
    trying to swap, e.g.:

    30588 fffff8010051ba80 0 30586 44971 0004000 [SLPQ vmwait 0xc05695e0][SLP] bsdtar
    30587 fffff800b05169f0 0 30586 44971 0004000 [SLPQ vmwait 0xc05695e0][SLP] bsdtar
    30586 fffff800d48d73e0 0 28045 44971 0004000 [SLPQ wait 0xfffff800d48d73e0][SLP][SWAP] sh
    30585 fffff80017de73e0 0 30583 45059 0004000 [SLPQ vmwait 0xc05695e0][SLP] bsdtar
    30584 fffff800425d6000 0 30583 45059 0004000 [SLPQ pipdwt 0xfffff8005088e780][SLP] bsdtar
    30583 fffff800b0517730 0 28564 45059 0004000 [SLPQ wait 0xfffff800b0517730][SLP][SWAP] sh
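
    To see at a glance where processes are stuck, the wait channels in a
    ps listing like the one above can be tallied. This is a minimal
    Python sketch: the sample lines are copied from this message, while
    the regular expression and the function name are hypothetical and
    only assume the "[SLPQ name address]" layout shown here.

```python
import re
from collections import Counter

# Matches the "[SLPQ <wchan name> <address>]" field of a ddb ps line.
SLPQ_RE = re.compile(r"\[SLPQ (\S+) 0x[0-9a-f]+\]")

# Sample lines copied verbatim from the listing above.
PS_SAMPLE = """\
30588 fffff8010051ba80 0 30586 44971 0004000 [SLPQ vmwait 0xc05695e0][SLP] bsdtar
30587 fffff800b05169f0 0 30586 44971 0004000 [SLPQ vmwait 0xc05695e0][SLP] bsdtar
30586 fffff800d48d73e0 0 28045 44971 0004000 [SLPQ wait 0xfffff800d48d73e0][SLP][SWAP] sh
30585 fffff80017de73e0 0 30583 45059 0004000 [SLPQ vmwait 0xc05695e0][SLP] bsdtar
30584 fffff800425d6000 0 30583 45059 0004000 [SLPQ pipdwt 0xfffff8005088e780][SLP] bsdtar
30583 fffff800b0517730 0 28564 45059 0004000 [SLPQ wait 0xfffff800b0517730][SLP][SWAP] sh
"""


def count_wait_channels(ps_text):
    """Count how many listed processes sleep on each wait channel name."""
    counts = Counter()
    for line in ps_text.splitlines():
        match = SLPQ_RE.search(line)
        if match:  # lines without an SLPQ field are skipped
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for name, n in count_wait_channels(PS_SAMPLE).most_common():
        print(f"{name}: {n}")
```

    For the six lines shown, half are in vmwait, the same wait channel
    the swapper and the md41 thread are sleeping on.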

    Looks like swapfiles are broken.

    Kris

    
    


