TCP socket, shutdown, eats last few bytes in buffer (rarely)
- From: David Mathog <mathog@xxxxxxxxxxx>
- Date: Mon, 03 Nov 2008 16:10:05 -0800
On Linux 2.6.19
Open input and output TCP sockets.
Copy from input to output (a lot).
The last write() comes back without any errors.
Shut down the socket.
Usually this works. However, every so often the shutdown() seems to eat whatever the last write() sent to the transmission buffer, such that the next node never receives it. The shutdown is like this:
    if(0!=shutdown(fd, SHUT_RDWR)){
        fprintf(stderr, "Error on shutdown of network stream\n");
    }
and it never emits the error message so I assume it closed correctly. However sometimes the next node gets stuck waiting for the data from the last write(), and that node eventually times out.
Shouldn't shutdown() used this way on a TCP socket wait until the
data has been received at the other end before tearing down the connection?
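For reference, a hedged sketch of one common idiom for this situation. shutdown() does not wait for the peer to acknowledge anything, so a zero return only means the request was accepted locally. Half-closing with SHUT_WR and then reading until the peer closes its side gives the sender some evidence that the peer reached the end of the stream. The helper name close_after_peer_eof() is hypothetical (it is not from the program described here), and the sketch assumes a blocking socket and a peer that closes its end once it sees EOF.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not part of the original program.
 * Half-close the write side, wait until the peer closes its side (read()
 * returns 0), then release the descriptor.
 * Returns 0 on success, -1 on any error. */
int close_after_peer_eof(int fd)
{
    char scratch[4096];
    int rc = 0;

    /* Send FIN after all queued data; the read side stays open. */
    if (shutdown(fd, SHUT_WR) != 0) {
        close(fd);
        return -1;
    }

    /* Discard anything the peer still sends until it signals EOF. */
    for (;;) {
        ssize_t n = read(fd, scratch, sizeof scratch);
        if (n == 0)
            break;                  /* peer closed its write side */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            rc = -1;
            break;
        }
    }

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

As written this blocks for as long as the peer keeps its side open, so real code would probably want a timeout around the read loop.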
Here is a specific example of a failure. The 18th node in the chain (the error hops around and does not occur on every run), with heavy debugging turned on, shows that the output buffer emptied down to just 5016 bytes when the input socket closed:
monkey19.cluster: Flushing buffer... 2008-11-03 15:52:22.391
DEBUG PREIO RB 3 type a holds 5016 misses 1 count 204794984
DEBUG POSTIO RB 3 bytes moved 5016 count 204800000
(at this point write() has returned and all 204800000 bytes have been written)
DEBUG closing fd 7 type 6
DEBUG closing fd 8 type A <-- this is where the shutdown() is called
So far, so good, and it does not look different from any other successful run, but on the next node downstream the log file holds:
DEBUG - select - fdmax 7
DEBUG PREIO RB 2 type 9 holds 204793536 misses 2 count 204793536
DEBUG POSTIO RB 2 bytes moved 1448 count 204794984
DEBUG - select - fdmax 7
monkey20.cluster: No input data received within Timeout period of 10 seconds.
Translation: the downstream node read up to byte 204794984 of the TCP stream, whereupon it hung, received no more input, and eventually timed out. (Specifically, it does NOT see the connection closed, as that should have woken the select() rather than falling into the signal handler.) But 204794984 + 5016 = 204800000. So it is waiting for exactly the amount of the last write() on the preceding node. I have seen 4 of these, and the values vary, but it is always like this: last write() amount == missing amount on next node.
After the last input data has been read (which it knows because the input socket was closed at the other end), but before that data has been sent to the output socket, this code is executed:
    fprintf(stderr,"Flushing buffer... %s",show_time());
    signal(SIGALRM, SIG_IGN);
The show_time() function calls time(), localtime(), and gettimeofday(). It uses them to construct this format: "2008-11-03 15:52:14.590"
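The real show_time() is not shown here, so the following is only a guess at what such a routine might look like. It builds the same "YYYY-MM-DD HH:MM:SS.mmm" layout from gettimeofday() and localtime(); the name format_timestamp() and the caller-supplied buffer are assumptions made for the sketch.

```c
#include <stdio.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical stand-in for show_time(); the original is not shown.
 * Writes "YYYY-MM-DD HH:MM:SS.mmm" (local time) into buf.
 * Returns 0 on success, -1 on failure or if buf is too small. */
int format_timestamp(char *buf, size_t len)
{
    struct timeval tv;
    struct tm *tm;
    time_t secs;
    char date[20];                  /* "YYYY-MM-DD HH:MM:SS" plus NUL */
    int n;

    if (gettimeofday(&tv, NULL) != 0)
        return -1;

    secs = tv.tv_sec;
    tm = localtime(&secs);          /* not thread-safe; fine for a sketch */
    if (tm == NULL)
        return -1;

    if (strftime(date, sizeof date, "%Y-%m-%d %H:%M:%S", tm) == 0)
        return -1;

    /* Append milliseconds derived from the microsecond field. */
    n = snprintf(buf, len, "%s.%03ld", date, (long)(tv.tv_usec / 1000));
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

None of these calls touch the socket, which makes it hard to see how a routine like this could by itself disturb a later shutdown().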
The SIGALRM is canceled because up to this point it has been attached to a signal handler which goes off if more than N seconds pass with no input received. Once the input socket closes, that timer must not run any more (since definitely no more input will be found), yet the IO loop may not complete for a while as it unspools the data to the output.
I'm wondering if every so often one of these time/signal related calls interferes with the subsequent shutdown(). Or maybe there is something else one is supposed to do to be sure that the transmission buffer is empty before shutdown() is called? Some sort of "netflush"???
Thanks,
David Mathog