Re: read() returns ETIMEDOUT
- From: Mitar <mmitar@xxxxxxxxx>
- Date: Wed, 4 Feb 2009 15:35:25 +0100
Hi!
sysctl net.inet.tcp | grep keep
net.inet.tcp.keepidle: 7200000
net.inet.tcp.keepintvl: 75000
net.inet.tcp.keepinit: 75000
net.inet.tcp.always_keepalive: 1
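For reference, a small Python sketch that converts the keepalive values above into readable durations. It assumes these sysctls are expressed in milliseconds, which is an assumption on my part (it matches the usual two-hour keepidle default); the helper name describe_ms is made up for illustration:

```python
def describe_ms(value_ms):
    """Render a millisecond count as hours, whole minutes, or seconds."""
    seconds = value_ms / 1000
    if seconds >= 3600:
        return f"{seconds / 3600:g} h"
    if seconds >= 60 and seconds % 60 == 0:
        return f"{seconds / 60:g} min"
    return f"{seconds:g} s"


# Values copied from the sysctl output above (assumed to be milliseconds).
keepalive = {
    "net.inet.tcp.keepidle": 7200000,
    "net.inet.tcp.keepintvl": 75000,
    "net.inet.tcp.keepinit": 75000,
}

for name, value in keepalive.items():
    print(f"{name}: {describe_ms(value)}")
```

Under that assumption keepidle works out to 2 hours, so keepalive probing would only start on a connection that has been idle far longer than the test connection described below was alive.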
I am using FreeBSD 7.0-STABLE on the amd64 architecture with a bge network
interface. The server has an almost constant rx/tx rate of around 5 MB/s
(megabytes). I use the pf firewall and there are a lot of open connections.
For example, some pf state table statistics:
State Table                          Total             Rate
  current entries                    17042
  searches                      6752096417        14750.6/s
  inserts                         66200602          144.6/s
  removals                        66183560          144.6/s
I sent data over a TCP connection, with netcat listening on the server
side and netcat sending from the client. The connection was not fast
(around 50 kB/s (kilobytes) on average), but the transfer was stable and
steady. The server has much more bandwidth available. The connection
lasted only around 12 minutes, and only 30 MB of data had been
transferred by the time the server closed it.
The problem is repeatable (I have repeated this test many times), and
under such load it always happens. If I stop all other load on the
server, the server does not break the connection.
> (3) TCP retransmit timer reaches its full exponential backoff without being
> ACK'd. (tcp_timer_rexmt)
I believe this is the cause. I could not insert a kernel printf, as I
am unable to reboot the server at the moment, but I have been checking
the drop counters with netstat, and at the moment the connection broke
the "connections dropped by rexmit timeout" counter increased. It is
true that the counters increase almost all the time under this load,
but I believe that I timed this correctly.
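The counter comparison described above could be scripted so that the timing does not depend on watching the output by hand. A minimal Python sketch, assuming that `netstat -s -p tcp` prints the counter as a line of the form "<number> connections dropped by rexmit timeout" (the exact layout may differ between releases, and all function names here are hypothetical):

```python
import re
import subprocess

# Matches the counter line; "connection" may be singular when the count is 1.
REXMT_PATTERN = re.compile(
    r"^\s*(\d+)\s+connections? dropped by rexmit timeout\s*$",
    re.MULTILINE,
)


def rexmt_drops(netstat_output):
    """Return the rexmit-timeout drop counter from netstat -s text, or None."""
    match = REXMT_PATTERN.search(netstat_output)
    return int(match.group(1)) if match else None


def take_snapshot():
    """Run netstat once and return the current counter value (or None)."""
    result = subprocess.run(
        ["netstat", "-s", "-p", "tcp"],
        capture_output=True, text=True, check=True,
    )
    return rexmt_drops(result.stdout)


def drops_between(before, after):
    """Connections dropped by rexmit timeout between two snapshots."""
    return after - before
```

Calling take_snapshot() just before and just after the connection breaks and passing both values to drops_between() gives the number of drops in that window. On a loaded server this can still count unrelated connections, so a short window matters.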
> It would also be useful, if possible, to look at the tcpdump for the last
> portion of the connection, perhaps ideally from the second-to-last ACK from
> the remote host to the connection reset from the local end. It might be
> worth running tcpdump on both sides to see if they see the same thing -- for
> example, does one side think it's sending ACKs and the other not receive it?
I have put complete logs on the net:
http://mitar.tnode.com/Temp/timeout-tcpdump-client.txt.gz
http://mitar.tnode.com/Temp/timeout-tcpdump-server.txt.gz
The client is NATed behind a router on a different ISP.
> In the previous thread, it looked a bit like the outcome was that there was
> a memory exhaustion issue under load, and that bumping nmbclusters helped at
> least defer that problem. So it would be useful to see the output of
> netstat -m before and after (for as small an epsilon as you can make it) the
> connection is timed out. I realize capturing the above sorts of data can be
> an issue on high-load boxes but if we can, it would be quite helpful.
> Regardless of that, knowing if you're seeing allocation errors in the
> netstat -m output would be helpful.
I doubt that it is a memory issue, as I have been monitoring those
allocations and they do not come near their maximum values. The current
netstat -m output is:
10657/8228/18885 mbufs in use (current/cache/total)
8248/7388/15636/25600 mbuf clusters in use (current/cache/total/max)
8248/5994 mbuf+clusters out of packet secondary zone in use (current/cache)
1839/774/2613/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
26857K/19929K/46786K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
27072 requests for I/O initiated by sendfile
0 calls to protocol drain routines
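To compare netstat -m output taken before and after a timeout, the relevant numbers could be pulled out with a short Python sketch like the one below. It only assumes the line formats shown in the output above; the function and key names are hypothetical:

```python
import re


def parse_netstat_m(text):
    """Extract mbuf cluster usage and denied-request totals from netstat -m text."""
    stats = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Example: 8248/7388/15636/25600 mbuf clusters in use (current/cache/total/max)
        m = re.match(r"(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use", line)
        if m:
            stats["clusters_total"] = int(m.group(3))
            stats["clusters_max"] = int(m.group(4))
            continue
        # Example: 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
        m = re.match(r"(\d+)/(\d+)/(\d+) requests for mbufs denied", line)
        if m:
            stats["mbufs_denied"] = sum(int(x) for x in m.groups())
    return stats


def cluster_usage(stats):
    """Fraction of the mbuf cluster limit currently allocated (total/max)."""
    return stats["clusters_total"] / stats["clusters_max"]
```

Applied to the output above, this gives 15636 of 25600 clusters (about 61% of the limit) and zero denied mbuf requests. Running it on snapshots taken right before and right after a timeout would show whether either value moves.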
Mitar