Re: Buffered IO and pipes
From: Logan Shaw (lshaw-usenet_at_austin.rr.com)
Date: 10/19/03
- Next message: Michael Laajanen: "Re: mozilla 1.5 binaries on blastwave"
- Previous message: chris berg: "Re: SMC"
- In reply to: Bruce Allen: "Buffered IO and pipes"
- Next in thread: Bruce Allen: "Re: Buffered IO and pipes"
- Reply: Bruce Allen: "Re: Buffered IO and pipes"
- Reply: Barry Margolin: "Re: Buffered IO and pipes"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Sun, 19 Oct 2003 08:00:37 GMT
Bruce Allen wrote:
> For the sake of this discussion, the code involves 6 processes. I'll
> call them M (master) and S1 to S5 (slaves 1-5).
>
> M does the following:
> (0-M) start
> (1-M) popen(3)'s processes S1 to S5 to read from them.
> (2-M) reads 64kB from pipes 1 to 5, and does some
> 'very fast' computation
> (3-M) outputs 64kB (let's say to /dev/null)
> (4-M) sleeps 1 second
> (5-M) returns to step 2-M
>
> S1 to S5 do the following:
> (0-S) start
> (1-S) initialize (a few seconds) and call
> setvbuf(3) on stdout (size discussion below).
> (2-S) compute a 1 MB block of output
> (long slow floating-point computation)
> (3-S) output 1 MB down stdout
> (4-S) return to step 2-S
>
> My goal is to try to ensure that when M goes from step (5-M) to step
> (2-M), the 64 kB read in step (2-M) returns as fast as possible. In
> other words that the pipes are "full" and don't block when M tries to
> read from them.
It sounds like you want the slaves to be able to write to the
master with as little waiting around as possible, so that the
slaves can get back to doing their computation.
If this is the case, I think one useful approach would be to
use select(). It will tell you exactly the set of pipes on
which data is available. This gains you a lot because you can
ensure that the master is never blocked waiting one slave
when another slave would actually be ready to write. Because
of that, you can ensure that, as long as a slave is trying to
write, the master will be ready to read its data. (Well,
assuming the master doesn't fall behind on its "very fast"
computations.)
To do this, all you've got to do is change M's sequence to
something like this:
(0-M) start
(1-M) popen() processes S1 to S5 to read from them.
(2-M) calls select() until it has read 64 kB from pipes 1 to 5.
(3-M) does the "very fast" computation
(4-M) outputs 64kB to wherever
(5-M) no sleeping of 1 second (why would you?)
(6-M) go back to step 2-M.
Notice that you're avoiding all the hassle with trying to guess
timings and buffer sizes. You have nothing to tune. It all
just happens as it should, and it's portable to various
versions of Unix, too.
By the way, why ARE you doing the sleep for 1 second? It
seems like your priority would be to make the master run
so that there's never a situation where it's not ready and
forces a slave to block (when the slave otherwise could be
doing a computation).
Also, one possible optimization is to make the master
prioritize reads from the children. If select() says
data is available from 1, 2, and 5, read data from the
lowest number. Eventually, you'll have read all the data
you need from #1, so then take it out of your list of
file descriptors ignore it while reading the data from
the others. This way, you will ensure that #1 is finished
as quickly as possible and you will delay the others just
a little bit. This is good, because it means that the
processes will be a little bit "out of phase" with each
other, meaning that there is (hopefully) always at least
one that can sit around burning CPU cycles.
Alternatively, scrap the entire model, and go with shared
memory. Create a two-way channel between the master and
each slave. (Using two pipes is the simplest, most portable
way.) Then, create a 64kB shared memory buffer for every
one of the slaves that it shares with the master.
You can control the flow of data easily, without using
locks or anything. Just pass a "token" back and forth
through the two-way channel. When the master starts a
child, the first thing it does is writes the character
"X" to the slave's pipe. This tells the slave that it
has control of the shared memory segment. When the slave
has finished its long, slow computation, it writes the
64kB of data to the buffer and writes an "X" into the
pipe that goes to the master, signalling to the master
that the master has control of the shared memory.
The big advantage here is that when the slave wants to
pass data to the master, it just reads the "X" token
from the master, does a memcpy() into the shared memory,
and then writes the "X" token back to the master. So
that's just two system calls (one read() and one write())
to pass a big block of data to the master. The read()
will never block unless the master has fallen far
behind. The write() will never block at all, because
the most that will ever sit sitting in that pipe is one
character, and that will always be buffered on absolutely
every single version of Unix ever. :-)
With the shared-memory approach, the master looks like this:
(0-M) start
(1-M) allocate a shared memory segment and *two* pipes
(one master=>slave and one slave=>master)
(2-M) fork() a slave process
(3-M) write() the letter "X" the master=>slave pipe
(4-M) if there are more slave processes to create, go to 1-M.
(5-M) for this data set, determine which slaves haven't
sent their data yet
(6-M) select() on the remaining slaves
(7-M) on the appropriate slave=>master pipe, read the letter "X"
from the pipe, copy the shared memory buffer's contents
to somewhere else, and then write "X" to the corresponding
master=>slave pipe.
(8-M) if there are slaves left for the current data set, go
to step 5-M
(9-M) do the "very fast" computation, output the data to wherever
(10-M) return to step 5-M and start waiting for the results of
the next data set.
For the slave, it looks like this:
(0-S) initialize
(1-S) compute a **** 64 kB **** block of data
(2-S) read "X" from the master=>slave pipe
(3-S) copy the 64 kB of data to the shared memory
(4-S) write "X" to the slave=>master pipe
(5-S) go back to 1-S.
If you can't do slave computations on 64 kB blocks, either
buffer data in the master (not hard) or increase the shared
memory size to 1 MB.
Since we assume that memcpy() takes a tiny amount of time
in comparison to your computations, there is very little
blocking going on because
Also, I'm really not sure whether your "long slow floating-point
computation" takes the same amount of time for every work unit.
That is, do the slaves all get equal amounts of work (measured
in CPU time) each time through? If it's always very even, the
above method will be fine. But, if slave #1 gets twice as much
work done as slave #2 does per unit time (because it's being given
easier work), you may need to buffer things in the master.
Otherwise, you could have a situation where slave #1, #2, #3,
and #5 are all done with their work units, but #4 needs
5 seconds more CPU time to finish its work unit. On a
single-processor machine, this doesn't matter at all, but
on a dual-processor machine, it means you'd have one CPU just
sitting idle while the other finishes up work unit #4. (On a
five-processor machine with five slaves, you can't do anything
about it if one of your slaves takes longer than others
consistently!)
Finally, if your "very fast" computation turns out to be
not so very fast, you can refine the process by turning
the master into just a process that combines and buffers
data. Then, it'll always be available to take data way from
the slaves. It can then pass the queued data on to a
process which does the final "very fast" computation and
writes the output.
Hope that helps.
- Logan
- Next message: Michael Laajanen: "Re: mozilla 1.5 binaries on blastwave"
- Previous message: chris berg: "Re: SMC"
- In reply to: Bruce Allen: "Buffered IO and pipes"
- Next in thread: Bruce Allen: "Re: Buffered IO and pipes"
- Reply: Bruce Allen: "Re: Buffered IO and pipes"
- Reply: Barry Margolin: "Re: Buffered IO and pipes"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|