RE: kernel deadlock

From: Don Bowman (don_at_sandvine.com)
Date: 07/30/03

  • Next message: Mike Silbersack: "Followup to Luoqi Chen's 4.x PAE post; if_xl driver"
    To: Don Bowman <don@sandvine.com>, 'Robert Watson' <rwatson@freebsd.org>, Dave Dolson <ddolson@sandvine.com>
    Date: Tue, 29 Jul 2003 21:04:54 -0400
    
    

    From: Don Bowman [mailto:don@sandvine.com]
    >
    > From: Robert Watson [mailto:rwatson@freebsd.org]
    > > On Tue, 29 Jul 2003, Dave Dolson wrote:
    > >
    > > > To follow up, I've discovered that the system has exhausted its
    > > > "FFS node" malloc type.
    > ...
    > >
    > > Some problems with this have turned up in -CURRENT on large-memory
    > > machines where some of the scaling factors have been off. In
    >
    > We currently have kern.maxvnodes=70354 set (automatically scaled).
    > This is a 1GB box.
    >
    > I will try re-running the test with less.
    >
    > When it hits kern.maxvnodes, what will it do?

    After applying the fixes from RELENG_4 for kern/52425,
    I can still easily reproduce this hang without being low on memory.
    Further debugging shows that the vnlru process is waiting on
    vlrup, at the tsleep() call shown below; i.e. vnlru_nowhere is
    being incremented every 3 seconds.

    static void
    vnlru_proc(void)
    {
            ...
            s = splbio();
            for (;;) {
                    ...
                    if (done == 0) {
                            vnlru_nowhere++;
                            tsleep(vnlruproc, PPAUSE, "vlrup", hz * 3);
                    }
            }
            splx(s);
    }

    The syncer is in a vlruwk wait from getnewvnode().

    Lots of other processes are waiting on ffsvgt.

    This implies that vlrureclaim() was unable to free anything.

    I have kern.maxvnodes = 35k. As soon as I hit this value, my system locked
    up (bash on the serial shell is non-responsive, the serial driver echoes
    chars, and I can drop into ddb). Processes which don't use the filesystem
    seem to continue to run OK.

    A couple of procs are waiting on inode: env and cron. These never come
    out of that wait.

    Suggestions?

    db> ps
      pid     proc     addr uid ppid pgrp    flag stat wmesg     wchan cmd
      649 dc35a8a0 e0a32000   0  641  641  004104    3 ffsvgt c03698a8 atrun
      648 dc35a3c0 e0e36000   0  647  648  000014    3 vlruwk c0364c90 cron
      647 dc35b740 e03d4000   0  135  135  000004    3 ppwait dc35b740 cron
      646 dc35b0c0 e03ee000   0  635  101  004004    3 inode  c368ee00 env
      645 dc35ad80 e03f1000   0  212  644  004006    3 ffsvgt c03698a8 grep
      644 dc35aa40 e0400000   0  212  644  004006    3 ffsvgt c03698a8 sysctl
      641 dc35a080 e0e4c000   0  640  641  004084    3 wait   dc35a080 sh
      640 dc35a220 e0e39000   0  135  135  000084    3 piperd e037c5c0 cron
      635 dc35a560 e0e32000   0  101  101  004084    3 piperd e037cd40 sh
      456 dc35abe0 e03fc000   0  133  456 4004004    3 ffsvgt c03698a8 tclsh83
      212 dc35bdc0 e0392000   0  199  212  004086    3 wait   dc35bdc0 bash
      199 dc35c440 e036e000   0    1  199  004186    3 wait   dc35c440 login
      187 dc35c2a0 e0376000   0    1    7  000086    3 select c037c460 snmpd
      169 dc35af20 e03e7000   0    1  169  000084    3 nanslp c0364970 siocontrol
      163 dc35b260 e03e2000   0    1  163  000084    3 nanslp c0364970 wddt
      143 dc35b400 e03dd000  25    1  143 2000184    3 pause  e03dd260 sendmail
      140 dc35b5a0 e03d9000   0    1  140  000184    3 select c037c460 sendmail
      137 dc35b8e0 e03d0000   0    1  137  000184    3 select c037c460 sshd
      135 dc35ba80 e03c2000   0    1  135  000004    3 inode  c35f4400 cron
      133 dc35bc20 e0397000   0    1  133  000084    3 select c037c460 inetd
      124 dc35bf60 e0382000   0    1  124  000084    3 select c037c460 syslogd
      101 dc35c100 e037e000   0    1  101  000084    3 wait   dc35c100 dhclient
        6 dc35c5e0 defd1000   0    0    0  000204    3 vlrup  dc35c5e0 vnlru
        5 dc35c780 defce000   0    0    0  000204    3 syncer c037c388 syncer
        4 dc35c920 defcb000   0    0    0  000204    3 psleep c0364b3c bufdaemon
        3 dc35cac0 defc8000   0    0    0  000204    3 psleep c0373280 vmdaemon
        2 dc35cc60 defc5000   0    0    0  000204    3 psleep c0352118 pagedaemon
        1 dc35ce00 dc361000   0    0    1  004284    3 wait   dc35ce00 init
        0 c037b760 c040e000   0    0    0  000204    3 sched  c037b760 swapper

    _______________________________________________
    freebsd-stable@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-stable
    To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

