sockets, closing and TIME_WAIT
- From: "k:arel" <karelnijs@xxxxxxxxx>
- Date: 26 Apr 2006 11:04:45 -0700
For my thesis I've written a server service that is supposed to handle a
lot of clients at the same time (the precise purpose of the service
doesn't matter at this point).
Under heavy load the server can't keep up, because the sockets
aren't actually closed or TIME_WAIT is taking really long.
For example: my server should be able to handle 10 clients connecting
each second (querying for information).
/* ******************************** loop listens for clients *********************** */
while( listening ) {
    /*
     * server waits indefinitely
     */
    rset = rset_all;
    select(maxfdpl, &rset, NULL, NULL, NULL);
    /*
     * new client?
     */
    pos = -1;
    if( FD_ISSET(listen_sd, &rset) ){
        pos = getFreePosition(connections);
        //extra IF here to avoid unnecessary calls to getFreePosition() function
        if( pos != -1 ) {
            len = sizeof(client);
            recv_sd = accept(listen_sd, reinterpret_cast<struct sockaddr*>(&client), &len);
            if( recv_sd != -1 ) {
                //do something
            } else {
                throw ASICexception(msg);
            }
        } else {
            cout << "Max number of clients reached.\nWaiting until clients finish up";
            //"patch" included for cleaning up the backlog
            len = sizeof(client); // len must be reset before every accept()
            recv_sd = accept(listen_sd, reinterpret_cast<struct sockaddr*>(&client), &len);
            if( recv_sd != -1 ) {
                close_socket(recv_sd);
            }
        }
    }
}
/* ******************************** loop listens for clients *********************** */
My socket is closed this way:
int close_socket(int sd) {
fcntl(sd, F_SETFL, O_NONBLOCK);
shutdown(sd, SHUT_RDWR);
char tmp[100];
recv(sd,tmp,100,0);
return close(sd);
}
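To see what this sequence does from the peer's side, here is a minimal sketch that runs close_socket() against one end of a local socketpair(). The AF_UNIX pair and the helper name peer_sees_end_of_stream() are assumptions made only so the example runs without a network; TIME_WAIT is a TCP state and is not exercised here.

```cpp
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Same sequence as close_socket() in the post.
int close_socket(int sd) {
    fcntl(sd, F_SETFL, O_NONBLOCK);  // keep the recv() below from blocking
    shutdown(sd, SHUT_RDWR);         // stop both directions
    char tmp[100];
    recv(sd, tmp, sizeof(tmp), 0);   // result is ignored, as in the post
    return close(sd);                // release the descriptor
}

// Hypothetical helper: closes one end of a local socket pair and reports
// whether close() succeeded and the other end then reads end-of-stream.
bool peer_sees_end_of_stream() {
    int pair[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) != 0)
        return false;
    int rc = close_socket(pair[0]);
    char c;
    ssize_t n = recv(pair[1], &c, 1, 0);  // 0 means the peer has closed
    close(pair[1]);
    return rc == 0 && n == 0;
}
```

Calling peer_sees_end_of_stream() should return true on a POSIX system: the other end learns about the close as soon as shutdown() has run.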
After accept() I configure the socket:
int no = 0;
res = setsockopt(sd, SOL_SOCKET, SO_KEEPALIVE, &no, sizeof(no));
if(res < 0 )
dbg->write("Unable to set KEEPALIVE socket option", __FUNCTION__,
loglevel);
int yes = 1;
setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));
I used to set the SO_LINGER option without a delay, but the Unix FAQ
(ftp://rtfm.mit.edu/pub/usenet/news.answers/unix-faq/socket) says you
should let TCP go through TIME_WAIT so that the connection closes correctly.
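For reference, a minimal sketch of the SO_LINGER setting mentioned above, assuming "without a delay" means l_onoff = 1 with l_linger = 0. With that setting close() on a TCP socket aborts the connection with a reset instead of going through the normal close and TIME_WAIT. The helper names set_linger_zero() and linger_zero_is_set() are hypothetical and not part of the server code.

```cpp
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: enable SO_LINGER with a zero timeout on sd.
// Returns the result of setsockopt() (0 on success, -1 on error).
int set_linger_zero(int sd) {
    struct linger lg;
    lg.l_onoff = 1;   // lingering enabled
    lg.l_linger = 0;  // zero seconds: close() discards unsent data and resets
    return setsockopt(sd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

// Hypothetical helper: read the option back to confirm what is set.
bool linger_zero_is_set(int sd) {
    struct linger lg;
    socklen_t len = sizeof(lg);
    if (getsockopt(sd, SOL_SOCKET, SO_LINGER, &lg, &len) != 0)
        return false;
    return lg.l_onoff != 0 && lg.l_linger == 0;
}
```

Leaving the option at its default (l_onoff = 0) keeps the normal TCP close that the FAQ recommends.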
Now, if you look at the while loop, you'll find this:
pos = getFreePosition(connections);
This gets a free position in the array of connections; a position is
considered free if and only if its socket is no longer in use.
I test that with:
bool socket_exists(int sd) {
int res = fcntl(sd, F_GETFL, 0);
return (res != -1);
}
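A minimal sketch of what this test actually reports, assuming a POSIX system. fcntl(F_GETFL) only says whether the descriptor number is currently open in the process, so a number that was closed passes the test again as soon as a later call hands the same number out. Pipes are used here only to get descriptors without a network, and closed_number_is_reused() is a hypothetical helper.

```cpp
#include <fcntl.h>
#include <unistd.h>

// The test from the post.
bool socket_exists(int sd) {
    int res = fcntl(sd, F_GETFL, 0);
    return (res != -1);
}

// Hypothetical helper: closes a descriptor, checks that the test fails for
// it, then opens new descriptors and checks that the old number passes again.
bool closed_number_is_reused() {
    int a[2];
    if (pipe(a) != 0)
        return false;
    int old_fd = a[0];
    close(old_fd);
    bool gone = !socket_exists(old_fd);  // closed: fcntl() fails with EBADF

    int b[2];
    if (pipe(b) != 0) {
        close(a[1]);
        return false;
    }
    // New descriptors get the lowest free numbers, so old_fd comes back.
    bool reused = (b[0] == old_fd || b[1] == old_fd) && socket_exists(old_fd);

    close(a[1]);
    close(b[0]);
    close(b[1]);
    return gone && reused;
}
```

So a stored descriptor number for which socket_exists() returns true may refer to a different, newer descriptor than the one originally stored at that position.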
I've already included a "backlog patch" that closes the client
connection immediately when we can't accept it right away (because there
aren't any free positions).
The server can run for about an hour without any real problems, but after
that clients can't connect anymore because the getFreePosition()
function can't find any inactive or unused sockets.
My netstat output looks like this:
$ netstat -s -tcp
Tcp:
51165 active connections openings
39897 passive connection openings
853 failed connection attempts
23182 connection resets received
6 connections established
638389 segments received
657288 segments send out
9046 segments retransmited
4 bad segments received.
9490 resets sent
TcpExt:
723 resets received for embryonic SYN_RECV sockets
31269 TCP sockets finished time wait in fast timer
3 time wait sockets recycled by time stamp
142 packets rejects in established connections because of timestamp
15440 delayed acks sent
138 delayed acks further delayed because of locked socket
Quick ack mode was activated 1090 times
3515 times the listen queue of a socket overflowed
3515 SYNs to LISTEN sockets ignored
321 packets directly queued to recvmsg prequeue.
29383 of bytes directly received from backlog
15884 of bytes directly received from prequeue
130013 packet headers predicted
357 packets header predicted and directly queued to user
90550 acknowledgments not containing data received
104433 predicted acknowledgments
1616 congestion windows recovered after partial ack
0 TCP data loss events
71 timeouts after SACK recovery
1 timeouts in loss state
3988 other TCP timeouts
1152 DSACKs sent for old packets
29 DSACKs received
13733 connections reset due to unexpected data
93 connections reset due to early user close
136 connections aborted due to timeout
Does anybody have any idea what I could be doing wrong?
I've been searching a lot for this problem and tried varying the socket
options. None seem to resolve the problem completely...
- Follow-Ups:
- Re: sockets, closing and TIME_WAIT
- From: Maxim Yegorushkin
- Re: sockets, closing and TIME_WAIT