Re: Bad performance when accessing a lot of small files
- From: Alfred Perlstein <alfred@xxxxxxxxxxx>
- Date: Fri, 21 Dec 2007 12:16:25 -0800
* Alexandre Biancalana <biancalana@xxxxxxxxx> [071219 11:35] wrote:
Hi List,
I have a backup server running FreeBSD 7-BETA3. The CPU is an
Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz, with 3GB RAM, 10x 500GB
SATA disks and an Areca 1231-ML controller. The filesystem used to back up
my other servers locally is built on top of the ARC-1231: a 4TB (32k stripe)
zfs filesystem with gzip compression.
This machine receives backups from ~30 servers (of all kinds and
sizes: databases, fileservers, image servers, webservers, etc.) all
night, writes the last day to LTO-3 tapes and keeps some older
days on disk.
The behavior that I'm observing, and that I want your help with, is that
when the system is accessing a directory with many small files (directories
with ~1 million files of ~30kb each), the performance is very poor.
There is a lot of very good tuning advice in this thread, however
one thing to note is that having ~1 million files in a directory
is not a very good thing to do on just about any filesystem.
One trick that a lot of people use is hashing the directories themselves:
you use some kind of computation (on the file name, say) to break this
huge dir into multiple smaller dirs.
If you can figure out a hashing algorithm that fits your data, that may help you.
For instance, if you tell sendmail to use "/var/spool/mq*"
for its mail spool and you happen to have 256 directories
under "/var/spool/" named "mq000" through "mq255", it will
randomly pick a directory to dump a file in.
This makes the performance a lot better.
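To make the spool trick concrete, here is a minimal sketch in Python of the same idea: pick one of the directories matching a glob pattern at random and drop the new file there. This is not sendmail's code, only an illustration; the function names `pick_spool_dir` and `spool_file` are made up for this example, and it assumes the directories already exist.

```python
import glob
import os
import random
import tempfile


def pick_spool_dir(pattern):
    """Return one existing directory matching the glob pattern, chosen at random."""
    candidates = [d for d in glob.glob(pattern) if os.path.isdir(d)]
    if not candidates:
        raise FileNotFoundError("no directory matches %r" % pattern)
    return random.choice(candidates)


def spool_file(pattern, data):
    """Write data to a new, uniquely named file in a randomly chosen directory."""
    target = pick_spool_dir(pattern)
    # mkstemp creates the file atomically with a unique name inside target.
    fd, path = tempfile.mkstemp(dir=target)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path
```

Random placement like this works for a spool, where the consumer scans every directory anyway. If you need to find a given file again by name, the directory has to be computed from the name instead of picked at random.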
For one million files you can probably do a two-level hash;
you just have to figure out a good hashing algorithm.
If you can describe the data, I may be able to help
you come up with a hashing algorithm for it.
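As a rough sketch of what a two-level hash could look like when nothing is known about the file names, the Python below hashes the name and uses the first two pairs of hex digits as directory levels. The choice of SHA-256, the two-hex-digit width per level, and the names `hashed_path` and `store` are all assumptions made for illustration; a hash tailored to the actual data may well do better.

```python
import hashlib
import os


def hashed_path(root, name):
    """Map a file name to root/xx/yy/name, where xx and yy come from a hash of the name."""
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    # Two hex digits per level gives 256 directories at each level.
    return os.path.join(root, digest[:2], digest[2:4], name)


def store(root, name, data):
    """Write data under the hashed location, creating directories as needed."""
    path = hashed_path(root, name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return path
```

With 256 x 256 = 65536 leaf directories, one million files works out to roughly 15 files per directory. If that is more directories than you want, one hex digit per level gives 256 leaves with about 4000 files each. Because the path depends only on the name, a later lookup just calls `hashed_path` again.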
-Alfred
_______________________________________________
freebsd-performance@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to "freebsd-performance-unsubscribe@xxxxxxxxxxx"