Re: Socket errors and errno.
loic-dev_at_gmx.net
Date: 04/18/05
- In reply to: Lawrie: "Socket errors and errno."
Date: 18 Apr 2005 03:03:14 -0700
Hello Lawrie,
> I am fairly new to socket programming and am extremely paranoid about
> making sure I catch any error condition that may affect my application
> (and attempt a recovery if possible).
>
> I am using the select() function to notify my program of socket events.
[snip]
> 1> If a remote host disconnects the select function will inform me that
> the socket is ready to read/write. I am intending to use the value of
> errno if my write attempt (which will fail if disconnected) to the
> socket to diagnose the problem and trigger my application to
> re-establish the connection with the remote host.
>
> Does this seem a sensible approach?
Yes, I guess. But you must be aware of one fact: if you have received a
RST and you write to the socket, the SIGPIPE signal will be delivered to
your process, and its default action is to terminate the process. This
would happen, for instance, if the remote application has crashed.
The way to deal with this is to ignore SIGPIPE. In this case, write()
returns -1 and sets errno to EPIPE.
> 2> What error conditions should I check for to determine errors on my
> local (write) socket. If I detect errors with my local socket is it
> best to drop the socket and create a new socket before attempting to
> reconnect to the remote host?
I think the relevant question is: why would the remote host
disconnect? I can think of only 3 possibilities (perhaps I am missing
some):
1) The remote application voluntarily closes or shuts down the socket.
2) The remote application crashes.
3) There is a failure in the network path.
Unless you are actively monitoring the health of the network, you have
no "reasonable" means to detect 3) directly. [Using a timeout might
give you an indirect hint that something is going wrong.]
You can detect 2) with the EPIPE mechanism, but you can't really
recover from that failure unless you can re-spawn the remote
application or the remote application is designed to survive that kind
of failure (using e.g. a watchdog that restarts the application if it
crashes).
It might make sense to re-initiate the connection in case 1)... But why
would the remote application voluntarily close or shut down the socket
in the first place?
Perhaps detecting the EPIPE error condition and trying to reconnect
could add some more reliability to the program. But without knowing
exactly which applications are involved, it is difficult to tell.
Cheers,
Loic.