[HPADM] SUMMARY: Large directories

From: David R Antoch (dantoch_at_csc.com)
Date: 09/16/05

    To: hpux-admin@dutchworks.nl
    Date: Fri, 16 Sep 2005 08:57:15 -0400
    
    

    Original post:

    >>>>>>
    I have a filesystem (VxFS/LVM) that contains a directory with 600,000+
    files (avg ~ 20K each... some are larger). Architecturally, an
    application no-no (but that's another issue...). As a side project,
    we're evaluating a search tool that will search through the files, and
    index them into a database. Now, management does not want to risk
    evaluating the search on the production system, so I'm attempting to copy
    the entire filesystem (51GB, they wanted to search it all) to a
    development system. The target disks are new. Both machines are 11.0
    patched to recent (within a few weeks) patch versions.

    The copy (I've tried ssh and remsh|tar pipe, as well as NFS
    find | cpio) gets to a certain point, then starts thrashing the
    target disk. iostat and sar say there's 1.7MB/sec continuous disk IO,
    and I see 200+ seeks/second (seems way too much), and only about 1
    file (20K) per 10 or 15 seconds gets copied. The issue is definitely
    due to the directory size, as anything written into that dir just grinds
    to a crawl.

    I used the VxFS defaults when building the filesystem. I was aware of
    issues like decreasing the bytes per inode etc. for many small files,
    but looking into the VxFS options, I really didn't come up with
    anything other than the defaults. (An oversight? What am I missing?)

    Is there anything I can do for a VxFS filesystem that would give
    better performance when writing large numbers of files into one directory?
    >>>>>>

    Thanks to everyone who replied:

    Tom Myers
    Marc Ahrendt
    Bill Hassell
    llicR
    John Lanier
    Shyam Hazari

    Bill Hassell verified my worst fear :)) with his (abbreviated)
    summary that hits it on the head:

    "This is one of the well-documented 'features' of large directories.
    It's an application design no-no for a reason--massively large flat
    filesystems are a very painful sysadmin issue. And as you've seen,
    you have to work around the huge delays and difficulties in searching
    through all these files.
    ...
    ...
     The VxFS filesystem is optimized out of the
    box for large or small files so there isn't anything special to do
    on the source system. On the destination, the file creation process
    will start crawling as the directory gets really big. Think of it as
    a parking lot at the SuperDome...it's easy to find a parking space
    when it's empty but it takes a lot of driving around to locate a
    spot when there are 20,000 cars parked there.
    ...
    ...
    .."
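
    The parking-lot picture can be put into rough numbers. The sketch
    below is a toy model, not a description of VxFS internals: it assumes
    each file create scans every existing entry in its directory once (a
    plain linear search), and the 60-way split is a hypothetical example
    of one subdirectory per month for five years.

```python
# Toy model (assumption): creating a file in a directory that already holds
# k entries costs k entry comparisons, i.e. a plain linear search.

def scans_to_fill(n_files: int) -> int:
    """Total entries examined while creating n_files in one flat directory."""
    # 0 + 1 + ... + (n_files - 1)
    return n_files * (n_files - 1) // 2


def scans_to_fill_split(n_files: int, n_dirs: int) -> int:
    """Same total when the files are spread evenly over n_dirs subdirectories."""
    base, extra = divmod(n_files, n_dirs)
    # 'extra' directories receive one file more than the rest.
    return extra * scans_to_fill(base + 1) + (n_dirs - extra) * scans_to_fill(base)


flat = scans_to_fill(600_000)             # one directory, as in the original post
split = scans_to_fill_split(600_000, 60)  # hypothetical: 60 monthly subdirectories
print(f"flat directory : {flat:,} entries examined")
print(f"60 subdirs     : {split:,} entries examined")
print(f"reduction      : about {flat / split:.0f}x")
```

    Under this model the flat directory costs about 1.8 x 10^11 entry
    comparisons against about 3.0 x 10^9 for the split layout, so the work
    drops roughly in proportion to the number of subdirectories.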

    In our case we decided to run the index on the production filesystem.
    It's not a heavily used filesystem anyway, with the app writing 20K
    to a couple of meg files fairly infrequently (one or two hundred per
    day) ...but they collect there over time and there's no archival setup
    yet. I'm trying to get them to store the files in subdirs named by
    year/month YYYYMM, which would pretty much alleviate the problem, and can
    support years and years of files. Definitely an application architecture
    issue.
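
    The YYYYMM layout could be retrofitted onto an existing flat directory
    along these lines. This is a minimal sketch, not what the application
    does: it assumes each file's modification time is an acceptable
    stand-in for its creation date, and the function names (bucket_name,
    rebucket) are made up for illustration. It defaults to a dry run.

```python
import os
import time


def bucket_name(mtime: float) -> str:
    """Return the YYYYMM subdirectory name for a modification time (local time)."""
    return time.strftime("%Y%m", time.localtime(mtime))


def rebucket(flat_dir: str, dry_run: bool = True) -> dict:
    """Move regular files in flat_dir into YYYYMM subdirectories.

    Returns a mapping of bucket name -> number of files (to be) moved.
    """
    # Collect names first so the directory is not modified while it is being read.
    with os.scandir(flat_dir) as entries:
        names = [e.name for e in entries if e.is_file(follow_symlinks=False)]

    counts = {}
    for name in names:
        src = os.path.join(flat_dir, name)
        bucket = bucket_name(os.stat(src).st_mtime)
        counts[bucket] = counts.get(bucket, 0) + 1
        if dry_run:
            continue
        dest_dir = os.path.join(flat_dir, bucket)
        os.makedirs(dest_dir, exist_ok=True)
        # rename() stays within one filesystem, so no file data is copied.
        os.rename(src, os.path.join(dest_dir, name))
    return counts
```

    Calling rebucket() with only a directory path reports the per-month
    counts without touching anything; pass dry_run=False to actually move
    the files.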

    However, I'd be interested to find out if there are any other filesystem
    types that are more efficient and can handle large numbers of files better...

    I'm also including some good info from all the respondents regarding
    filesystems and copying :

    - use "mkfs -m" to see how the production filesystems were built and
    compare to the filesystems you built on the development system.
     Ex: /usr/sbin/mkfs -m /dev/vg01/lvol1

    - maximize random-seek performance on the development system volume
    group by using extent-based striping across all available drives.
     Ex: /usr/sbin/lvcreate -s g -D y ...

    - try rsync as a copy method. It may be more efficient about updating
    the directory file as data files are copied into the target system.

    - As for the copy process, NFS is the least useful: it has a very
    large network overhead, and a puny 100Mbit LAN will severely limit the
    speed of the transfer.

    - find | cpio would be good if you could hook up the target disk to
    the production system. The cpio -pudlm options will perform a direct
    disk read to help bypass some of the filesystem overhead, but it will
    not be optimal.

    - use dd for a disk-to-disk copy, then run fsck on the target before
    mounting it. Keep in mind that dd will copy every block (make sure you
    use a LARGE block size such as bs=64k) regardless of whether the block
    contains files and/or directories that are changing. Ideally, this
    backup will take place with the application shut down. The best choice
    is dd using a tape, but you could also do this in a network pipe (much
    slower than tape).

    Once the copy has been restored (to the same-sized lvol on the target)
    you'll need to run fsck on the rlvol (raw) volume. Then mount the
    new filesystem and you should be ready to go.
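
    Some back-of-the-envelope arithmetic on the copy advice above. The
    sketch assumes decimal units (1 GB = 10^9 bytes), an ideal 100 Mbit/s
    link with no protocol overhead, and dd's default block size of 512
    bytes when bs= is omitted. The logical volume may also be larger than
    the 51GB quoted in the original post, so treat these as rough
    lower bounds.

```python
FS_BYTES = 51 * 10**9        # 51 GB filesystem from the original post
LINK_BITS_PER_SEC = 100e6    # 100 Mbit LAN, ideal wire speed (assumption)


def wire_seconds(n_bytes: int, bits_per_sec: float) -> float:
    """Best-case transfer time, ignoring all protocol and disk overhead."""
    return n_bytes * 8 / bits_per_sec


def blocks_needed(n_bytes: int, block_size: int) -> int:
    """Number of blocks dd has to move for a given bs= value."""
    return -(-n_bytes // block_size)  # ceiling division


print(f"wire time at 100 Mbit/s: {wire_seconds(FS_BYTES, LINK_BITS_PER_SEC) / 60:.0f} minutes")
print(f"blocks at bs=512       : {blocks_needed(FS_BYTES, 512):,}")
print(f"blocks at bs=64k       : {blocks_needed(FS_BYTES, 64 * 1024):,}")
```

    That is about 68 minutes at best over the LAN, and roughly 128 times
    fewer blocks to move with bs=64k than with the 512-byte default.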

    Thx again,
    Dave

