Re: 5.x w/auto-maxusers has insane kern.maxvnodes

From: Bruce Evans (bde_at_zeta.org.au)
Date: 05/09/04

    Date: Sun, 9 May 2004 20:10:13 +1000 (EST)
    To: Brian Fundakowski Feldman <green@FreeBSD.org>
    
    

    On Sun, 9 May 2004, Brian Fundakowski Feldman wrote:

    > Brian Fundakowski Feldman <green@FreeBSD.org> wrote:
    > > I have a 512MB system and had to adjust kern.maxvnodes (desiredvnodes) down
    > > to something reasonable after discovering that it was the sole cause of too
    > > much paging for my workstation. The target number of vnodes was set to
    > > 33000, which would not be so bad if it did not also cause so many more
    > > UFS, VM and VFS objects, and the VM objects' associated inactive cache
    > > pages, lying around. I ended up saving a good 100MB of memory just
    > > adjusting kern.maxvnodes back down to something reasonable. Here are the
    > > current allocations (and some of the peak values):

    The default for desiredvnodes is almost perfect for my main application
    of running makeworld and otherwise working with the entire src tree.
    Actually, it's too low with 512MB and almost perfect with 1024MB. The
    latter gives desiredvnodes = 70240, and there are 47742 vnodes in my
    src tree (a few hundred extras). 512MB is also not quite enough for
    caching the whole src tree (mine has 476358 1K-blocks according to
    du). In one application involving 2 src trees (slightly reduced to
    get them both cached in 1024MB of which only about 800MB is available
    for VMIO pages), I needed to increase kern.maxvnodes to 90000+ to avoid
    disk accesses for inodes. Caching them in vnodes didn't work because
    the default number of vnodes wasn't enough, and caching them in VMIO
    pages didn't work for some reason (either because I was testing a
    filesystem that was missing VMIO for metadata, or because the replacement
    policy didn't work -- when inodes are cached in vnodes and not written
    to due to mounting with noatime, they get discarded from VMIO and then
    when their vnode gets recycled they aren't cached anywhere).
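The vnode counts in the paragraph above can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted in the message; the helper name `fits` is made up for illustration, and doubling the single-tree count is an upper bound, since the two trees in the message were slightly reduced.

```python
# Figures quoted in the message above.
DEFAULT_VNODES_512MB = 33000    # approximate default reported for 512MB
DEFAULT_VNODES_1024MB = 70240   # default reported for 1024MB
SRC_TREE_VNODES = 47742         # vnodes in one src tree


def fits(needed, limit):
    """Return True if `needed` vnodes can all stay cached under `limit`."""
    return needed <= limit


one_tree = SRC_TREE_VNODES
# Upper bound for two trees; the trees in the message were slightly reduced.
two_trees = 2 * SRC_TREE_VNODES

for label, limit in [
    ("512MB default", DEFAULT_VNODES_512MB),
    ("1024MB default", DEFAULT_VNODES_1024MB),
]:
    print(label, "one tree:", fits(one_tree, limit),
          "two trees:", fits(two_trees, limit))
```

One tree fits under the 1024MB default but not the 512MB one, and two full trees exceed both defaults, which is consistent with needing 90000+ for the two-tree case.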

    Since 512MB isn't enough to cache everything for makeworld, the default
    of 33000+ vnodes won't help much, and a better target might be to cache
    everything in /sys. 15000 vnodes and a couple of hundred MB is enough
    for that unless you build too many modules or kernels.

    > > ITEM           SIZE     LIMIT    USED   FREE   REQUESTS
    > > FFS2 dinode:    256,        0,  12340,    95,   1298936
    > > FFS1 dinode:    128,        0,    315,  3901,   2570969
    > > FFS inode:      140,        0,  12655, 14589,   3869905
    > > L VFS Cache:    291,        0,      5,   892,     51835
    > > S VFS Cache:     68,        0,  13043, 23301,   4076311
    > > VNODE:          260,        0,  32339,    16,     32339
    > > VM OBJECT:      132,        0,  10834, 24806,   2681863

    I don't use ffs2 (nice to see ffs* spelled right), so I have slightly
    smaller overheads.
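To put a number on the ffs1 versus ffs2 difference, the per-item sizes from the zone listings in this message can be summed. This is a rough sketch under a loud assumption: each cached file is taken to hold exactly one vnode, one VM object, one FFS inode, one short name-cache entry, and one dinode. The listings show that this is not always true (the quoted listing has far fewer VM objects than vnodes), and cached data pages, which dominate real memory use, are not counted at all.

```python
# Per-item sizes in bytes, taken from the zone listings in this message.
ZONE_SIZES = {
    "VNODE": 260,
    "VM OBJECT": 132,
    "FFS inode": 140,
    "S VFS Cache": 68,
    "FFS1 dinode": 128,
    "FFS2 dinode": 256,
}

# Assumption (not from the message): one of each of these per cached file.
COMMON_ZONES = ["VNODE", "VM OBJECT", "FFS inode", "S VFS Cache"]


def per_file_bytes(dinode_zone):
    """Kernel bookkeeping bytes per cached file for the given dinode zone."""
    return sum(ZONE_SIZES[z] for z in COMMON_ZONES) + ZONE_SIZES[dinode_zone]


ffs1 = per_file_bytes("FFS1 dinode")
ffs2 = per_file_bytes("FFS2 dinode")

for vnodes in (15000, 33000, 70240):
    print(f"{vnodes} vnodes: ffs1 {vnodes * ffs1 / 2**20:.1f} MiB, "
          f"ffs2 {vnodes * ffs2 / 2**20:.1f} MiB")
```

Under these assumptions the ffs2 dinode adds 128 bytes per cached file on top of 728 bytes for ffs1.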

    > > (The number of VM pages allocated specifically to vnodes is not something
    > > easy to determine other than the fact that I saved so much memory even
    > > without the objects themselves, after uma_zfree(), having been reclaimed.)

    The number of VMIO pages is also hard to determine. systat's "inact"
    count gives an approximate value for the amount of VMIO memory, but
    various stats utilities' "buf" count gives a useless value. VMIO pages
    are easier to flush (unmount works for them).

    > > We really need to look into making the desiredvnodes default target more
    > > sane before 5.x is -STABLE or people are going to be very surprised
    > > switching from 4.x and seeing paging increase substantially. One more

    5.x has bloat everywhere? Is desiredvnodes the worst part of it? I haven't
    noticed its bloat especially. Not long ago (in early 4.x?), the number of
    vnodes was unbounded and there were bugs like the ufs inode allocation
    doubling due to the required amount growing for bogus reasons to just
    larger than a power of 2 (so that power of 2 allocation almost doubled it).
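The power-of-2 effect mentioned above is easy to reproduce. The sketch below is a generic illustration of a power-of-two bucket allocator, not the kernel's actual code, and the sizes 256 and 260 are hypothetical examples rather than the historical inode sizes.

```python
def pow2_bucket(size):
    """Smallest power of two that is >= size (size must be positive)."""
    bucket = 1
    while bucket < size:
        bucket *= 2
    return bucket


def wasted_fraction(size):
    """Fraction of the bucket left unused by an allocation of `size` bytes."""
    bucket = pow2_bucket(size)
    return (bucket - size) / bucket


# Hypothetical structure sizes: one exactly a power of two, one just over.
for size in (256, 260):
    print(size, pow2_bucket(size), round(wasted_fraction(size), 3))
```

Growing a structure from 256 to 260 bytes moves it from the 256-byte bucket to the 512-byte bucket, so nearly half of every allocation is wasted.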

    > > but why are they not already like that? One last good example I personally
    > > see of wastage-by-virtue-of-zfree-function is the page tables on i386:
    > > PV ENTRY: 28, 938280, 59170, 120590, 199482221
    > > Once again, why do those actually need to be non-reclaimable?

    I haven't noticed much wastage for PV ENTRY. Right now, I have only the
    following large memory consumers in uma, but the system hasn't been up
    long and the measurement is distorted by recently reading the src tree:

    %%%
    ITEM           SIZE     LIMIT    USED   FREE   REQUESTS
    FFS1 dinode:    128,        0,  52202,    33,    563854
    FFS inode:      140,        0,  52202,    46,    563854
    S VFS Cache:     68,        0,  52494,    75,    573456
    VNODE:          260,        0,  52209,    21,     52209
    2048:          2048,        0,    123,  2843,     20845
    PV ENTRY:        28,  1494920,   4438,  2282,    703502
    VM OBJECT:      132,        0,  52355,    85,    495338
    %%%

    PV ENTRY's are small, so the 2048's waste a lot more. It's hard to see
    what they are for; vmstat -z never showed as much as vmstat -m, and
    vmstat -m is not as good as it used to be.
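The comparison between PV ENTRY and the 2048 zone can be checked by multiplying each zone's item size by its USED and FREE counts. The sketch below parses the listing above as plain text; the function names are made up for illustration, and it assumes every line has the form `NAME: SIZE, LIMIT, USED, FREE, REQUESTS`.

```python
UMA_LINES = """\
FFS1 dinode: 128, 0, 52202, 33, 563854
FFS inode: 140, 0, 52202, 46, 563854
S VFS Cache: 68, 0, 52494, 75, 573456
VNODE: 260, 0, 52209, 21, 52209
2048: 2048, 0, 123, 2843, 20845
PV ENTRY: 28, 1494920, 4438, 2282, 703502
VM OBJECT: 132, 0, 52355, 85, 495338
"""


def parse_zone(line):
    """Split one 'NAME: SIZE, LIMIT, USED, FREE, REQUESTS' line."""
    name, fields = line.rsplit(":", 1)
    size, _limit, used, free, _requests = (int(f) for f in fields.split(","))
    return name.strip(), size, used, free


def zone_bytes(text):
    """Map zone name -> (bytes in used items, bytes in free items)."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, size, used, free = parse_zone(line)
        result[name] = (size * used, size * free)
    return result


zones = zone_bytes(UMA_LINES)
# Print zones sorted by bytes sitting in free items, largest first.
for name, (used_b, free_b) in sorted(
        zones.items(), key=lambda kv: kv[1][1], reverse=True):
    print(f"{name:12s} used {used_b:>10d} B  free {free_b:>10d} B")
```

With these figures the 2048 zone holds 2048 * 2843 = 5822464 bytes in free items, against 28 * 2282 = 63896 bytes for PV ENTRY.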

    > It really doesn't seem appropriate to _ever_ scale maxvnodes (desiredvnodes)
    > up that high just because I have 512MB of RAM.

    Like most things, the best value depends on the workload. Since the number
    of vnodes that can be handled scales with the amount of memory, it seems
    reasonable for the default to scale with the amount of memory. -current
    needs a larger scale factor than RELENG_4 if anything, since it has more
    files. Combined with more costs per file, it could easily need twice as
    much real memory as RELENG_4 for equivalent disk caching.

    Bruce
    _______________________________________________
    freebsd-current@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-current
    To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"

