Re: performance implications of releasing resources and context switches
- From: Silverlock <Silverlock@xxxxxxxxxxx>
- Date: Tue, 29 Jul 2008 04:49:04 GMT
Ambrose Silk wrote:
David Schwartz wrote:
On Jul 28, 8:33 am, Ambrose Silk <si...@xxxxxxxxxxxxxx> wrote:
It seems there might be a performance benefit to forgoing the munmap()
and close() calls and letting them be handled by _exit() instead. As I
read the docs, _exit() is a single system call (thus 1 context switch)
which guarantees to remove all mappings, close all open files, etc.
If your process is long-running, you have to clean up as you go along.
Otherwise, you risk tying up or running out of resources. So you can
only do this for resources that you would normally free right at the
end of your execution.
Normally speeding up the end of execution wouldn't have any
significant benefit. The performance you care about is performance
while you have work to do, not after all your work is done.
A typical process might allocate a resource 1,000,000 times in its
life and need to free 995,000 of them while it's running. So at most
there's that .5% that you could let the exit implicitly free. And
that's the .5% that you only free when you're done with what you were
doing anyway, so it's the .5% that has the least impact on
performance.
This is a good analysis of the typical case and I would give the same advice in most cases. However, there are exceptions to any rule of thumb. My case, if anyone is interested, is a tool which runs other, pre-existing commands under its control in order to add some value. But if it causes the child command to run more than (say) 5-10% slower, nobody will bother using it and the value will be lost. I observe that it currently adds about 10%, on average, to a command's runtime and is thus right on the threshold of real pain.
The core of what it does is checksumming files. Therefore most of this extra 10% is spent opening, closing, and doing I/O on these files[*]. The actual reading of the files is unavoidable; clearly, in order to get a checksum, each block of each file must be read sequentially. So there is limited scope for optimization and I must look into optimizing the number of system calls.
[*] A bit of an oversimplification - there are also quite a few stat() and lseek() calls.
Files are mapped rather than fopen-ed because that costs exactly 4 system calls per file (open, mmap, munmap, close) as opposed to fopen/fread/fclose which might do any number of read calls. It's also simpler, and I trust the OS to know things like the optimal block size and alignment better than I do.
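A minimal sketch of the four-call pattern described above, for readers who want to see it spelled out. This is not the tool's actual code: checksum_file() and sum_bytes() are hypothetical names, and the additive sum is only a stand-in because the real checksum algorithm is not stated in this thread. Note that mmap() needs the file length, so the sketch also makes one fstat() call, which is the kind of extra call the footnote above mentions.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stand-in checksum: a simple sum of all bytes. */
static uint32_t sum_bytes(const unsigned char *p, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}

/* Returns 0 and stores the checksum in *out on success, -1 on error. */
int checksum_file(const char *path, uint32_t *out)
{
    assert(path != NULL && out != NULL);

    int fd = open(path, O_RDONLY);          /* call 1: open */
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {               /* extra call: file size */
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {                  /* a zero-length mmap fails */
        close(fd);
        *out = 0;
        return 0;
    }

    size_t len = (size_t)st.st_size;
    void *p = mmap(NULL, len, PROT_READ,    /* call 2: mmap */
                   MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    *out = sum_bytes(p, len);

    munmap(p, len);                         /* call 3: munmap */
    close(fd);                              /* call 4: close */
    return 0;
}
```

Skipping the last two calls and leaving the cleanup to _exit() is the variation asked about at the top of the thread; as noted there, that is only safe for resources that would otherwise be released right at the end of the run.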
I've done some analysis using "truss -c" (-c counts system calls) with results something like:
syscall   seconds   calls
mmap         .731    5521
mmap        1.106   11545
munmap       .320     937
munmap       .731    3108
open         .561    3481
open        1.372    8636
close        .430    5235
close        .917   13767
For each system call the top line is the baseline and the lower line is what happens when my tool is added into the mix. The value on the left is the cumulative number of seconds spent in that call and on the right is the total number of calls. Now, why there are ~6K extra mmap calls and only ~2K extra munmaps, and ~5K more opens vs ~8K more closes, is a very interesting question which I'm looking into. But I think this demonstrates fairly clearly that there is significant overhead within system calls (how much of this is in the context switch itself and how much is work that must be done in kernel space at some point is TBD).
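The differences quoted above can be reproduced with a small sketch like the following. The numbers are copied from the truss table; the struct and function names are made up for illustration. Summed over the four calls, the added time comes to roughly 2.1 seconds.

```c
#include <assert.h>
#include <stdio.h>

/* One row pair from the truss -c table: baseline vs. with the tool. */
struct syscall_stat {
    const char *name;
    double base_secs;
    long   base_calls;
    double tool_secs;
    long   tool_calls;
};

static const struct syscall_stat stats[] = {
    { "mmap",   0.731,  5521, 1.106, 11545 },
    { "munmap", 0.320,   937, 0.731,  3108 },
    { "open",   0.561,  3481, 1.372,  8636 },
    { "close",  0.430,  5235, 0.917, 13767 },
};
enum { NSTATS = sizeof stats / sizeof stats[0] };

/* Extra calls made when the tool is in the mix. */
long added_calls(const struct syscall_stat *s)
{
    assert(s != NULL);
    return s->tool_calls - s->base_calls;
}

/* Extra seconds spent in this call when the tool is in the mix. */
double added_secs(const struct syscall_stat *s)
{
    assert(s != NULL);
    return s->tool_secs - s->base_secs;
}

/* Extra seconds summed over all four system calls. */
double total_added_secs(void)
{
    double total = 0.0;
    for (int i = 0; i < NSTATS; i++)
        total += added_secs(&stats[i]);
    return total;
}

/* Print the per-call deltas and the overall added time. */
void print_report(FILE *out)
{
    for (int i = 0; i < NSTATS; i++)
        fprintf(out, "%-7s +%ld calls, +%.3f s\n", stats[i].name,
                added_calls(&stats[i]), added_secs(&stats[i]));
    fprintf(out, "total   +%.3f s\n", total_added_secs());
}
```

Calling print_report(stdout) lists the extra calls and extra seconds per system call. The time figures are what truss attributes to each call, so they do not separate context-switch cost from work done in the kernel.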
A. Silk
If you analyze your numbers above, it appears that your
tool is adding about 0.8 seconds to the run time
in mmap/munmap and about 1.3 seconds
in open/close. What is the total
running time of the whole thing? If it's something
like 10 seconds then it makes sense that the
10% increase would be due to the mmap/munmap
additional overhead. If it's something like
60 seconds (66 seconds with your tool), then
mmap/munmap is not likely to be the primary
culprit.
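To make that comparison concrete, here is a rough sketch of the arithmetic. The 10 and 60 second totals are the hypothetical figures used above, not measurements, and overhead_fraction() is just an illustrative helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Share of the baseline run time that the extra seconds represent. */
double overhead_fraction(double extra_secs, double baseline_secs)
{
    assert(baseline_secs > 0.0);
    return extra_secs / baseline_secs;
}

/* Print the mmap/munmap and open/close overhead for both scenarios. */
void print_scenarios(FILE *out)
{
    const double mmap_extra = 0.8;          /* mmap + munmap, from above */
    const double open_extra = 1.3;          /* open + close, from above */
    const double totals[] = { 10.0, 60.0 }; /* hypothetical run times */

    for (int i = 0; i < 2; i++) {
        fprintf(out, "baseline %.0f s: mmap/munmap %.1f%%, open/close %.1f%%\n",
                totals[i],
                100.0 * overhead_fraction(mmap_extra, totals[i]),
                100.0 * overhead_fraction(open_extra, totals[i]));
    }
}
```

With a 10 second baseline, the mmap/munmap time alone is 8% of the run; with a 60 second baseline, all four calls together stay under 4%, which is why the total running time matters here.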
Personal opinion: In either case, I don't really
see why ONLY a 10% increase in run-time should
be that big a deal for your users, but you're the
best judge of that.
Silverlock