Re: you are in an fs with millions of small files

From: Robert Watson (rwatson_at_FreeBSD.org)
Date: 06/07/05

  • Next message: Steve Kargl: "Re: buildworld fails, can't guess why"
    Date: Tue, 7 Jun 2005 17:57:02 +0100 (BST)
    To: Eric Anderson <anderson@centtech.com>
    
    

    On Tue, 7 Jun 2005, Eric Anderson wrote:

    > Julian Elischer wrote:
    >> what. all in one directory?
    >>
    >> I've only had up to 500,000 files in one directory on FreeBSD.
    >
    > The only problems I've had with a directory with millions of files is
    > things like ls -al which attempt to sort the list, but the list doesn't
    > fit into memory. Access to the files is of course very snappy.

    Ditto. I regularly use directories with tens and hundreds of thousands of
    entries as a result of manipulating very large folders with the Cyrus
    server. I run into the following two classes of problems:

    - Some applications behave poorly with large trees. ls(1) is the classic
       example -- sorting 150,000 strings is expensive, and should be avoided.
       It also requires holding all the strings in memory rather than continuing
       the iteration. fts is bad about this, so many applications that use fts
       suffer from this. With the sort issue, -f makes a big difference.

    - Some operations become more expensive -- as directories grow, the cost
       of adding new entries goes up. You'll notice this fairly
       substantially if you untar a tar file with many entries in the same
       directory -- early on, the cost of inserting a new item is very low, but
       it rapidly slows down from thousands of inserts per second to hundreds
       or less. I notice this if I restore a large Cyrus directory from
       backup.

    - UFS_DIRHASH really helps with large directory performance by reducing
       the cost of lookup, but at the cost of memory. Make sure the box has
       lots of memory.
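
    To make the ls(1) point above concrete, here is a minimal Python sketch
    contrasting the two access patterns: streaming directory entries in
    whatever order the file system returns them (roughly what ls -f does),
    versus collecting every name into memory and sorting. The function names
    and the make_demo_dir() helper are hypothetical, introduced only for
    illustration; they are not part of ls, fts, or any FreeBSD tool.

```python
import os
import tempfile
from typing import Iterator, List


def iter_names_unsorted(path: str) -> Iterator[str]:
    """Yield entry names in whatever order the file system returns them."""
    # os.scandir() is a lazy iterator: names are handed out one at a time
    # instead of being collected into a list and sorted first.
    with os.scandir(path) as entries:
        for entry in entries:
            yield entry.name


def count_entries(path: str) -> int:
    """Count entries without keeping their names in memory."""
    return sum(1 for _ in iter_names_unsorted(path))


def names_sorted(path: str) -> List[str]:
    """The costly pattern: build the full list in memory, then sort it."""
    return sorted(os.listdir(path))


def make_demo_dir(n: int) -> str:
    """Hypothetical helper: create a temp directory holding n empty files."""
    path = tempfile.mkdtemp()
    for i in range(n):
        with open(os.path.join(path, f"msg{i:06d}"), "x"):
            pass
    return path
```

    With a directory of a few hundred thousand entries, count_entries() keeps
    memory use flat, while names_sorted() has to hold and sort every name
    before returning anything.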

    All this said -- FreeBSD works really well for me with large file counts;
    I rarely hit the edge cases where there is a problem. Most problems are
    with applications, and when you are using more extreme file system
    layouts, you are typically using applications customized for that, and so
    they do the right things.
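
    One way to check whether a given box hits the insert-cost edge case
    described above is to time file creation in batches and watch whether
    later batches get slower. The sketch below is an assumption-laden
    illustration, not a benchmark anyone in this thread ran: the function
    names, file-name pattern, and batch sizes are all hypothetical, and the
    numbers will depend heavily on the file system, its options, and caching.

```python
import os
import tempfile
import time
from typing import List


def time_insert_batches(directory: str, total: int, batch: int) -> List[float]:
    """Create `total` empty files in `directory`; return seconds per batch."""
    timings = []
    for start in range(0, total, batch):
        t0 = time.perf_counter()
        for i in range(start, min(start + batch, total)):
            # Mode "x" fails if the name already exists, so every call is a
            # genuine insert of a new directory entry.
            with open(os.path.join(directory, f"f{i:08d}"), "x"):
                pass
        timings.append(time.perf_counter() - t0)
    return timings


def run_in_temp_dir(total: int, batch: int) -> List[float]:
    """Hypothetical helper: run the timing in a throwaway directory."""
    with tempfile.TemporaryDirectory() as directory:
        return time_insert_batches(directory, total, batch)
```

    Calling run_in_temp_dir(200000, 10000) and comparing the first and last
    entries of the result gives a rough picture of how insert cost changes as
    the directory grows; whether any slowdown shows up at all is something to
    measure rather than assume.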

    Robert N M Watson
    _______________________________________________
    freebsd-current@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-current
    To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"

