Re: Sockets debugging tools

From: T Koster (reply-to-group_at_use.net)
Date: 02/17/05


Date: Thu, 17 Feb 2005 10:38:27 GMT

Andrei Voropaev wrote:
> On 2005-02-16, T Koster <reply-to-group@use.net> wrote:
>>Andrei Voropaev wrote:
>>[...]
>>>You are wrong. This is perfectly possible by programming your peer not
>>>to read from the socket :) (Emulating the deadlock of the peer). In this
>>>case after the system buffers are filled, your call to send will process
>>>less (or even none) of the data. Note, system buffers are fairly large
>>>so you may need to pass lots of data before they fill up. But they do
>>>fill up. Tested :)
>>
>>Hmmm okay. So to start filling up the system's I/O buffers I just
>>freeze the client program, and watch how my program handles send's
>>inability to send?
>>
>>If so, what would happen if the client program never springs back to
>>life, and the socket just remains in a permanent would-block state?
>>After a while the OS will time-out the connection, right? Where is this
>>error raised? The tcp(7) man page only specifies ETIMEDOUT but no
>>indication of which calls raise it. Is it send and friends?
>
> Nope. The OS won't time out the connection in this case. Your
> application is responsible for handling such situations, for example by
> expecting confirmation messages from the peer within a certain time and
> closing the connection if the confirmation does not come (yep, even
> with TCP *sigh* :)

Okay. Will do.
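
A minimal sketch of that test, assuming a Unix-like system. It uses a local socketpair as a stand-in for a real TCP connection, and the helper names fill_send_buffer and writable_within are made up for illustration. One end never reads, the other sends until the kernel buffer is full, and then a select with a deadline of our own choosing decides that the peer has stalled:

```python
import select
import socket


def fill_send_buffer(sock, chunk=b"x" * 4096):
    """Send until the kernel refuses more data; return the bytes it accepted."""
    sock.setblocking(False)
    total = 0
    while True:
        try:
            total += sock.send(chunk)  # may accept fewer bytes than offered
        except BlockingIOError:        # EAGAIN / EWOULDBLOCK: buffers are full
            return total


def writable_within(sock, seconds):
    """Return True if the socket becomes writable before the deadline."""
    _, writable, _ = select.select([], [sock], [], seconds)
    return bool(writable)


# "frozen" plays the deadlocked peer: it is connected but never calls recv.
local, frozen = socket.socketpair()
queued = fill_send_buffer(local)

# The OS does not time this out for us, so the deadline is the application's.
stalled = not writable_within(local, 0.2)

local.close()
frozen.close()
```

The amount of data accepted before send would block depends on the system's buffer sizes, so the sketch only records it rather than assuming a figure.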

> As to the ETIMEDOUT error, this happens when there has been no route
> to the peer for a long time (pulled-out network cable). The time is
> defined by the OS; usually it is a few minutes. The error is returned
> when you call one of the socket functions (recv, send, etc.).
> Supposedly the indication of this error is also visible in poll
> (POLLERR) and select (err set). In that case you can check the error
> using getsockopt with the SO_ERROR option. I use this for handling
> non-blocking connections on Linux.

Aha. I'll just stick to looking out for ETIMEDOUT from the socket I/O
calls, along with the others.
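
For reference, the getsockopt route mentioned above might look roughly like this for a non-blocking connect. This is a sketch under assumptions: it targets Linux, connect_result and the 5-second deadline are hypothetical, and the demonstration provokes ECONNREFUSED on the loopback interface because ETIMEDOUT cannot be reproduced quickly:

```python
import errno
import select
import socket


def connect_result(address, timeout=5.0):
    """Start a non-blocking connect and return its final errno (0 = connected)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    try:
        err = sock.connect_ex(address)
        if err != errno.EINPROGRESS:
            return err  # completed or failed immediately
        # The socket turns writable once the connect has finished either way.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            return errno.ETIMEDOUT  # our own deadline, not the kernel's
        # SO_ERROR fetches (and clears) the pending error on the socket.
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    finally:
        sock.close()


# A socket that is bound but not listening refuses incoming connections.
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))
result = connect_result(holder.getsockname())
holder.close()
```

The same SO_ERROR query should apply after poll reports POLLERR on an established connection, in which case the value could be ETIMEDOUT.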

> [...]
>
>>In addition to the ETIMEDOUT error, I'm confused about whether I need to
>>look out for EPIPE or not. One man page (send(2)) tells me it is raised
>>when "the local end has been shut down on a connected socket." Surely
>>the local end only shuts down if I call close or shutdown on the socket,
>>right? Thus, I already know when the local end has been shut down and
>>such a socket would never be sent to anyway, or am I missing something?
>> Another man page (tcp(7)) has something else to say about EPIPE: "The
>>other end closed the socket unexpectedly or a read is executed on a shut
>>down socket." This is totally different from what send(2) has to say,
>>but sounds more likely. ip(7)'s story is similar to tcp(7), but not
>>identical, plus I find the following unreassuring note under the BUGS
>>section: "There are too many inconsistent error values." :(
>
> EPIPE is important. You can't control it. Usually you get EPIPE when you
> are writing data and the remote peer crashes. In this case your OS gets
> an EOF back from the peer and closes the "local end of the pipe". But if
> your program is not reading at that moment, it gets no indication of
> that. And during the write it'll get SIGPIPE because the pipe has
> already been closed by the OS. That's why I usually use send with the
> MSG_NOSIGNAL option to convert SIGPIPE to EPIPE. There are a few other
> situations that lead to the same scenario.

Yeah, I'm also using MSG_NOSIGNAL to suppress the SIGPIPE. So to recap:
EPIPE is send's analogue to recv returning 0. Or rather, I found out
that recv also raises SIGPIPE/EPIPE if you call it again on a socket it
has already returned 0 for. So recv returning 0 is like getting advance
notice that the connection is closed for reading and that calling recv
again will raise SIGPIPE/EPIPE, just as calling send on a socket closed
for writing raises SIGPIPE/EPIPE. Correct?
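
One quick way to check this recap is the sketch below. It assumes a Unix-like system and uses a local socketpair rather than a TCP connection. On Linux, as far as I can tell, recv simply keeps returning 0 (an empty result) after end-of-file, and only send reports EPIPE. MSG_NOSIGNAL is looked up defensively because not every platform defines it:

```python
import errno
import socket

# Fall back to 0 where MSG_NOSIGNAL is missing; Python ignores SIGPIPE anyway.
NOSIGNAL = getattr(socket, "MSG_NOSIGNAL", 0)

local, peer = socket.socketpair()
peer.close()  # the peer goes away without sending anything

first = local.recv(16)   # end-of-file: empty result, the C call returns 0
second = local.recv(16)  # asking again yields end-of-file again

try:
    local.send(b"x", NOSIGNAL)
    send_errno = 0
except OSError as exc:
    send_errno = exc.errno  # EPIPE, delivered as an error instead of a signal

local.close()
```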

>>ECONNRESET is elusive too: neither the ip(7) nor tcp(7) man pages mention
>>it. It only appears to be mentioned in send(2). Can it be expected to
>>be raised by recv also? It makes more sense to me that recv should find
>>out that a connection has been reset by the peer than send.
>
> Quite the opposite. ECONNRESET is something that happens during sending.
> When TCP attempts to deliver some data to the peer, and the peer has no
> idea what to do with the data, it sends back RST to indicate that the
> connection must be reopened. That situation is again normal for the case
> when the peer application crashes while your application is sending
> data. In practice though I've never seen this error. Usually I get
> EPIPE :) I guess this is so because computers are fast now. I call send
> to pass some data, and this goes into the system buffer. Then it is
> passed to the peer. In reply comes RST. The local end gets closed and
> now I'm trying to send the second portion. Boom. SIGPIPE :)

Got it. You send data to a computer that has rebooted in the meantime,
and it doesn't expect that data coming to that port, so it sends
something back with the RST flag set, causing ECONNRESET.
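
A reset can also be provoked without rebooting anything. The sketch below is a stand-in for the crashed-peer case, not a reproduction of it: the accepting side closes with SO_LINGER set to a zero timeout, which on Linux aborts the connection with RST instead of the usual FIN. The client only reads, yet its recv fails with ECONNRESET, which suggests recv can report this error too. The function name recv_after_peer_reset is made up for the example:

```python
import errno
import socket
import struct


def recv_after_peer_reset():
    """Return the errno seen by recv after the peer aborts the connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=5)
    server, _ = listener.accept()
    try:
        # struct linger { l_onoff = 1, l_linger = 0 }: close() sends RST.
        server.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.pack("ii", 1, 0))
        server.close()
        try:
            client.recv(1)
            return 0
        except OSError as exc:
            return exc.errno
    finally:
        client.close()
        listener.close()


reset_errno = recv_after_peer_reset()
```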

>>I would never have thought sockets programming on Linux would be so
>>poorly-documented. I'm starting to read some FreeBSD man pages to see
>>if they contain the details I'm missing.
>
> Man pages are written on the assumption that those who read them know
> how the protocol works. In other words, if you really want to write good
> networking code you have to read some good books first, and then use man
> pages only as a memory refresher or to find out the specifics of the
> implementation for a given OS :) I guess the authoritative book for
> network programming is the one by Stevens, "Unix Network Programming",
> vol. 1.

As soon as uni starts again I'll borrow it from the library.

Thanks,
Thomas


