Re: too many files

From: Hunter, Mark (Mark.Hunter_at_ANHEUSER-BUSCH.COM)
Date: 07/30/03

  • Next message: Green, Simon: "Re: Implementing DNS on SP2"
    Date:         Wed, 30 Jul 2003 09:52:33 -0500
    To: aix-l@Princeton.EDU
    
    

    Yes, it does take a long time on jfs filesystems.
    Why?
    jfs directories keep filenames in a simple, unordered list.
    Your backup command must look up each file by name. Each lookup searches, on
    average, half the directory before finding the file: 450,000 entries. You also
    have to open each file, and most of them will not be in your in-core inode table.
    900,000 files * 450,000 comparisons per lookup = 405,000,000,000 comparisons
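
The count above follows from a simple cost model, sketched here in Python. The function names are made up for illustration, and the model assumes every name is equally likely to sit anywhere in the unordered list:

```python
def linear_lookup_cost(n_files: int) -> int:
    """Average entries examined to find one name in an unordered list."""
    return n_files // 2


def total_scan_cost(n_files: int) -> int:
    """Entries examined when every file in the directory is looked up once."""
    return n_files * linear_lookup_cost(n_files)


total = total_scan_cost(900_000)
print(f"{total:,}")  # 405,000,000,000
```

Because the per-lookup cost itself grows with the directory size, the total grows with the square of the file count.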

    On my 660 with 600MHz processors and 15GB RAM, 35275 files take about 42 cpu
    seconds to do an ls -l on - after I have forced them into the incore table.
    Double that time the first time I run the command.
    timex ls -l > /dev/null

    real 46.18
    user 41.30
    sys 2.59
    Doing the math: the work grows with the square of the file count, and
    (900,000 / 35,275)^2 is about 650, so you should take at least 650 times longer,
    or about 8 hours, to do an ls -l. Double that if you have not forced everything
    into the in-core inode table/cache (which with that many files you can't). And at
    least another 8 hours to restore the files on the other side, ignoring the file
    sizes.
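
The 8-hour figure can be reproduced with a few lines of Python. This is only a rough sketch: it assumes the elapsed time scales with the square of the number of files, and it takes the measured "real" time from the timex output as the baseline. The variable names are illustrative.

```python
measured_files = 35_275          # files in the test directory
measured_real_seconds = 46.18    # "real" time reported by timex
target_files = 900_000           # files in the problem directory

# Quadratic scaling assumption: cost ~ n * (n / 2)
scale = (target_files / measured_files) ** 2
estimated_hours = scale * measured_real_seconds / 3600

print(f"scale factor: about {scale:.0f}x")
print(f"estimated ls -l time: about {estimated_hours:.1f} hours")
```

The scale factor comes out near 650 and the estimate a little over 8 hours, in line with the numbers quoted above.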

    jfs and standard unix filesystems do not handle this many files in a directory
    well. jfs2 would be a much better choice, as its directories maintain the file
    list in sorted order. Thus, each lookup costs about log2(n) comparisons, not n/2.
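
To put log2(n) against n/2 in concrete terms, here is a small Python sketch. It treats a jfs2 lookup as a plain binary search over sorted names, which is a simplification of whatever structure jfs2 really uses on disk; the function names are invented for the example.

```python
import math


def avg_linear_probes(n: int) -> float:
    """Average comparisons for a search of an unordered list of n names."""
    return n / 2


def max_binary_probes(n: int) -> int:
    """Upper bound on comparisons for a binary search over n sorted names."""
    return math.ceil(math.log2(n))


for n in (35_275, 900_000):
    print(f"{n:>7} files: linear ~{avg_linear_probes(n):,.0f}, "
          f"binary <= {max_binary_probes(n)}")
```

For 900,000 files that is about 450,000 comparisons per lookup against at most 20.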

    On jfs2, I created 35275 files and did an ls -l. This was on an H50 with 332 MHz
    processors and 1GB RAM.
    timex ls -l > /dev/null

    real 2.53
    user 1.36
    sys 1.15

    Obviously, jfs2 is much faster with a large number of files in one directory,
    even on the slower machine.

    For your problem, backup by inode would be much faster than backup by name.
    Nothing is going to improve your restore time on jfs, but if your target can be
    made jfs2, you might be ok.

    Mark Hunter

    -----Original Message-----
    From: Nguyen, Joseph [mailto:JNguyen@WM.COM]
    Sent: Tuesday, July 29, 2003 5:35 PM
    To: aix-l@Princeton.EDU
    Subject: too many files

    I have a filesystem that contains 900,000+ files in one directory. I ran
    the following command to copy the files to another host, and it ran for a couple
    of days and then stopped. Even just running the find command takes a long time.

    find /indirectory -print | backup -iqf- | compress -c | rsh remotehost
    "(uncompress -c | restore -xf- )"

    Do you know of any other command that can speed up the copy? We tried backing up
    to tape and restoring, and that also takes days.

    Joseph

