Re: performance implications of releasing resources and context switches
- From: Ambrose Silk <silky@xxxxxxxxxxxxxx>
- Date: Mon, 28 Jul 2008 22:59:29 -0400
David Schwartz wrote:
> On Jul 28, 8:33 am, Ambrose Silk <si...@xxxxxxxxxxxxxx> wrote:
>> It seems there might be a performance benefit to forgoing the munmap()
>> and close() calls and letting them be handled by _exit() instead. As I
>> read the docs, _exit() is a single system call (thus 1 context switch)
>> which guarantees to remove all mappings, close all open files, etc.
>
> If your process is long-running, you have to clean up as you go along.
> Otherwise, you risk tying up or running out of resources. So you can
> only do this for resources that you would normally free right at the
> end of your execution.
>
> Normally speeding up the end of execution wouldn't have any
> significant benefit. The performance you care about is performance
> while you have work to do, not after all your work is done.
>
> A typical process might allocate a resource 1,000,000 times in its
> life and need to free 995,000 of them while it's running. So at most
> there's that .5% that you could let the exit implicitly free. And
> that's the .5% that you only free when you're done with what you were
> doing anyway, so it's the .5% that has the least impact on
> performance.
This is a good analysis of the typical case and I would give the same advice in most cases. However, there are exceptions to any rule of thumb. My case, if anyone is interested, is a tool which runs other, pre-existing commands under its control in order to add some value. But if it makes the child command run more than (say) 5-10% slower, nobody will bother using it and the value will be lost. I observe that it currently adds about 10%, on average, to a command's runtime and is thus right on the threshold of real pain.
The core of what it does is checksumming files. Therefore most of this extra 10% is spent opening, closing, and doing I/O on these files[*]. The actual reading of the files is unavoidable; clearly, in order to get a checksum, each block of each file must be read sequentially. So there is limited scope for optimizing the I/O itself, and I must instead look at reducing the number of system calls.
[*] A bit of an oversimplification - there are also quite a few stat() and lseek() calls.
Files are mapped rather than fopen-ed because that costs exactly 4 system calls per file (open, mmap, munmap, close) as opposed to fopen/fread/fclose which might do any number of read calls. It's also simpler, and I trust the OS to know things like the optimal block size and alignment better than I do.
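For concreteness, here is a minimal sketch of the per-file sequence described above, under some loud assumptions: checksum_file(), sum_bytes() and the release flag are hypothetical names of mine, and the Adler-32 arithmetic is only a stand-in because the real checksum algorithm isn't stated here. Note that it needs an fstat() to learn the file size, which is one of the extra calls mentioned in the footnote. With release set to 0 it skips munmap() and close() and leaves them to _exit(), which is only reasonable right before the process ends.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in checksum using Adler-32 arithmetic. This is an assumption:
 * the post does not say which algorithm the tool really uses. */
static uint32_t sum_bytes(const unsigned char *p, size_t n)
{
    uint32_t a = 1, b = 0;
    for (size_t i = 0; i < n; i++) {
        a = (a + p[i]) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

/* Hypothetical helper: checksum one file through a read-only mapping.
 * Returns 0 on success, -1 on error. If release is 0, the mapping and
 * the descriptor are deliberately left for _exit() to reclaim. */
int checksum_file(const char *path, int release, uint32_t *out)
{
    assert(path != NULL && out != NULL);

    int fd = open(path, O_RDONLY);                 /* system call 1 */
    if (fd < 0)
        return -1;

    struct stat st;                                /* size lookup: extra call */
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    if (st.st_size == 0) {                         /* mmap rejects length 0 */
        *out = sum_bytes(NULL, 0);
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    void *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);  /* call 2 */
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    *out = sum_bytes(map, len);                    /* no read() calls at all */

    if (release) {
        munmap(map, len);                          /* call 3 */
        close(fd);                                 /* call 4 */
    }
    return 0;
}
```

Calling checksum_file(path, 1, &sum) is the normal four-call path. Passing 0 for release trades those last two calls for a leaked mapping and descriptor, so in anything long-running it would hit exactly the resource exhaustion David warns about.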
I've done some analysis using "truss -c" (-c counts system calls) with results something like:
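syscall  run         seconds   calls
mmap     baseline      .731     5521
mmap     with tool    1.106    11545
munmap   baseline      .320      937
munmap   with tool     .731     3108
open     baseline      .561     3481
open     with tool    1.372     8636
close    baseline      .430     5235
close    with tool     .917    13767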
For each system call the top line is the baseline and the lower line is what happens when my tool is added into the mix. The value on the left is the cumulative number of seconds spent in that call and on the right is the total number of calls. Now, why there are ~6K extra mmap calls and only ~2K extra munmaps, and ~5K more opens vs ~8K more closes, is a very interesting question which I'm looking into. But I think this demonstrates fairly clearly that there is significant overhead within system calls (how much of this is in the context switch itself and how much is work that must be done in kernel space at some point is TBD).
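The arithmetic behind those rough figures can be sketched as follows. The struct and function names are hypothetical, and the numbers are simply the truss -c values quoted above, so the per-call averages are only as good as those counts (and they include whatever overhead truss itself adds).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One system call, measured without (base) and with (tool) the tool. */
struct syscall_row {
    const char *name;
    double base_sec;
    long base_calls;
    double tool_sec;
    long tool_calls;
};

/* Values copied from the truss -c results quoted in the post. */
static const struct syscall_row rows[] = {
    { "mmap",   0.731, 5521, 1.106, 11545 },
    { "munmap", 0.320,  937, 0.731,  3108 },
    { "open",   0.561, 3481, 1.372,  8636 },
    { "close",  0.430, 5235, 0.917, 13767 },
};
#define NROWS (sizeof rows / sizeof rows[0])

/* Calls added by the tool for one system call. */
long extra_calls(const struct syscall_row *r)
{
    assert(r != NULL);
    return r->tool_calls - r->base_calls;
}

/* Seconds added by the tool for one system call. */
double extra_seconds(const struct syscall_row *r)
{
    assert(r != NULL);
    return r->tool_sec - r->base_sec;
}

long total_extra_calls(void)
{
    long total = 0;
    for (size_t i = 0; i < NROWS; i++)
        total += extra_calls(&rows[i]);
    return total;
}

double total_extra_seconds(void)
{
    double total = 0.0;
    for (size_t i = 0; i < NROWS; i++)
        total += extra_seconds(&rows[i]);
    return total;
}

/* Print the added calls, added seconds and mean cost per added call. */
void print_overhead(void)
{
    for (size_t i = 0; i < NROWS; i++) {
        long calls = extra_calls(&rows[i]);
        double sec = extra_seconds(&rows[i]);
        printf("%-7s +%5ld calls  +%.3f s  (%.1f us per added call)\n",
               rows[i].name, calls, sec, 1e6 * sec / (double)calls);
    }
    printf("total   +%5ld calls  +%.3f s\n",
           total_extra_calls(), total_extra_seconds());
}
```

Taken together, these four calls account for about 21.9K additional calls and roughly 2.08 seconds, i.e. on the order of 95 microseconds per added call. That is a mean over the whole call, not a measurement of the context switch alone.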
A. Silk