Re: Q: too much data too little memory
From: phil-news-nospam@ipal.net
Date: 04/18/03
- In reply to: Fei Chen: "Q: too much data too little memory"
From: phil-news-nospam@ipal.net Date: 18 Apr 2003 04:52:35 GMT
On Thu, 17 Apr 2003 13:30:16 +0100 Fei Chen <feic@stats.ox.ac.uk> wrote:
| I was wondering, with a large chunk of data that does not fit into all the
| physical memory available to a computer, is it better, to run one process
| that attempts to load all the data, and thus off loading the memory
| management work to the OS (e.g. via paging/swapping). Or is it better to
| start several processes concurrently, each of which takes in a chunk of
| data that individually does fit into memory, and thus have the OS manage
| these processes. Or there is really little difference between these two
| approaches?
You may be describing a variety of different scenarios, any of which
either fits or comes close to your description.
First of all you need to understand the nature of your problem, and whether
(and why) it actually needs to load all the data at the same time. If it
can work on just part of it at a time, then doing that might be best.
If the different parts can be worked on in parallel without depending
on results from others, then you might want to allow specifying a
maximum number of concurrent processes to run, and run a number that
is equal to or maybe one larger than the number of CPUs the machine
has.
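As a rough illustration of that "CPUs plus one" cap, here is a minimal C sketch of a fork()-based pool. It is only one way to do it, under stated assumptions: the names run_chunks(), default_max_procs() and process_chunk() are made up for this example, the data is assumed to split into independently processable chunks numbered from 0, and sysconf(_SC_NPROCESSORS_ONLN) is a widely available extension (glibc, the BSDs, macOS) rather than strict POSIX.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical placeholder: load and process one chunk that fits in RAM.
 * Returns 0 on success. Replace the body with the real work. */
int process_chunk(int chunk_index)
{
    return chunk_index >= 0 ? 0 : -1;
}

/* Suggested cap: number of online CPUs plus one.
 * _SC_NPROCESSORS_ONLN is an extension, so fall back to 1 CPU on error. */
long default_max_procs(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return (n < 1 ? 1 : n) + 1;
}

/* Run work(0) .. work(nchunks - 1), each in its own child process,
 * keeping at most max_procs children alive at any time.
 * Returns the number of chunks that failed, or -1 if fork()/wait() failed. */
int run_chunks(int nchunks, long max_procs, int (*work)(int))
{
    int next = 0, failed = 0, error = 0;
    long running = 0;

    assert(max_procs > 0);

    while (next < nchunks || running > 0) {
        if (next < nchunks && running < max_procs) {
            fflush(NULL);              /* avoid duplicating buffered output */
            pid_t pid = fork();
            if (pid < 0) {             /* stop launching, reap what is left */
                error = 1;
                nchunks = next;
                continue;
            }
            if (pid == 0)              /* child: do one chunk, then exit */
                _exit(work(next) == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
            next++;
            running++;
            continue;
        }

        /* Pool is full, or nothing is left to start: wait for one child. */
        int status;
        if (wait(&status) < 0)
            return -1;
        running--;
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed++;
    }
    return error ? -1 : failed;
}
```

A caller would do something like run_chunks(nchunks, default_max_procs(), process_chunk) and treat a nonzero return as "some chunks need attention". Each child gets its own address space, so the per-process limit applies to one chunk at a time rather than to the whole data set.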
If your machine has enough physical RAM to actually swap in all of the
data, but not enough address space to actually map it into a single VM
at the same time, such as with a 4GB limit on a 32-bit process on a
machine with perhaps 64GB of RAM and a 32GB data file, one thing you
might try is to juggle the mapping. Leave the file open and call
mmap() and munmap() as needed to keep portions of the file in the
address space. Unless there is other demand for the RAM in the system,
those unmapped pages should remain in RAM for a while in case they need
to be mapped back in again. If your OS happens to (badly) flush them
when munmap()'d, you could maybe adjust that behaviour with madvise()
or have numerous idle child processes holding mappings in place to keep
the OS from thinking nothing needs it. The question you'll have to
face is how to manage the mapping juggling so you minimize the number
of tests done to see if a given piece of data is already mapped or not.
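One possible shape for that juggling, sketched in C under some assumptions: the file is only read, a single fixed-size window is mapped at a time, and the window size is a multiple of the page size. The struct window type and the window_init(), window_at() and window_close() functions are invented for this example. With window-aligned mappings, the "is it already mapped?" test collapses to one range comparison per access.

```c
#define _FILE_OFFSET_BITS 64   /* 64-bit off_t on 32-bit glibc systems */
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: one read-only window onto a large open file. */
struct window {
    int    fd;         /* file stays open while windows come and go */
    off_t  file_size;  /* total size of the file in bytes */
    size_t win_size;   /* window length, a multiple of the page size */
    off_t  base;       /* file offset where the current mapping starts */
    size_t len;        /* length of the current mapping, 0 if none */
    char  *addr;       /* address of the current mapping */
};

/* Returns 0 on success, -1 if win_size is not a multiple of the page size. */
int window_init(struct window *w, int fd, off_t file_size, size_t win_size)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || win_size == 0 || win_size % (size_t)page != 0)
        return -1;
    w->fd = fd;
    w->file_size = file_size;
    w->win_size = win_size;
    w->base = 0;
    w->len = 0;
    w->addr = NULL;
    return 0;
}

/* Return a pointer to the byte at file offset off, remapping only when
 * off falls outside the current window. Returns NULL on error. */
const char *window_at(struct window *w, off_t off)
{
    assert(w->win_size != 0);
    if (off < 0 || off >= w->file_size)
        return NULL;

    /* The whole "already mapped?" test is this one range check. */
    if (w->len != 0 && off >= w->base && off - w->base < (off_t)w->len)
        return w->addr + (off - w->base);

    if (w->len != 0) {                 /* drop the old window first */
        munmap(w->addr, w->len);
        w->len = 0;
    }

    /* Align the new window so every offset belongs to exactly one window. */
    off_t base = off - off % (off_t)w->win_size;
    off_t rest = w->file_size - base;
    size_t len = rest < (off_t)w->win_size ? (size_t)rest : w->win_size;

    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, w->fd, base);
    if (p == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }
    w->addr = p;
    w->base = base;
    w->len = len;
    return w->addr + (off - base);
}

void window_close(struct window *w)
{
    if (w->len != 0)
        munmap(w->addr, w->len);
    w->len = 0;
}
```

The returned pointer is only valid until the next call that remaps, so code that needs two distant regions at once would want two such windows (or a small cache of them). Whether the unmapped pages actually stay in RAM is up to the OS, as noted above.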
-- 
-----------------------------------------------------------------
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| phil-nospam@ipal.net | Texas, USA | http://ka9wgn.ham.org/    |
-----------------------------------------------------------------