Re: read() returns ETIMEDOUT on steady TCP connection



On Sun, 20 Apr 2008, Peter Jeremy wrote:

> Can you give some more detail about your hardware (speed, CPU,
> available RAM, UP or SMP) and the application (roughly what does the
> core of the code look like and is it single-threaded/multi-threaded
> and/or multi-process).

The current test machine is a Dell 2650 with 2 GB of RAM, quad Xeon, and onboard bge.

The application is single threaded, non-blocking multiplexed I/O based on poll(). It's relatively simple at its core -- read() from an inbound connection and write() to outbound sockets.
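For readers following the thread, here is a minimal sketch of the kind of loop body described above. It is not the actual application code: relay_once() and its parameters are hypothetical names, and short writes are simply ignored where a real program would queue the remainder.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until in_fd is readable, read one chunk and
 * copy it to every outbound descriptor.
 *
 * Returns the number of bytes read, 0 on EOF, or -1 with errno set.
 * A poll() timeout is reported as -1 with errno = EAGAIN.  A failing
 * read() (for example with ETIMEDOUT, as in this thread) is passed
 * straight back to the caller.
 */
static ssize_t
relay_once(int in_fd, const int *out_fds, size_t n_out, int timeout_ms)
{
	struct pollfd pfd = { .fd = in_fd, .events = POLLIN, .revents = 0 };
	char buf[4096];
	ssize_t n, w;
	size_t i;
	int ready;

	assert(out_fds != NULL || n_out == 0);

	ready = poll(&pfd, 1, timeout_ms);
	if (ready < 0)
		return (-1);		/* errno set by poll() */
	if (ready == 0) {
		errno = EAGAIN;		/* nothing arrived in time */
		return (-1);
	}

	n = read(in_fd, buf, sizeof(buf));
	if (n <= 0)
		return (n);		/* 0 = EOF, -1 = error (errno set) */

	for (i = 0; i < n_out; i++) {
		w = write(out_fds[i], buf, (size_t)n);
		if (w < 0) {
			if (errno == EAGAIN || errno == EWOULDBLOCK)
				continue;	/* receiver is slow: skip it here */
			return (-1);		/* e.g. ENOBUFS, EPIPE */
		}
		/* Sketch only: a short write (w < n) is not retried. */
	}
	return (n);
}
```

A real server would call something like this from its main poll() loop over all descriptors, and would also watch POLLOUT on the outbound sockets rather than skipping a slow receiver.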

>> As the number of outbound connections increases, the 'output drops'
>> increases to around 10% of the total packets sent and maintains that ratio.
>> There are no problems with network capacity.

> 'output drops' (ips_odropped) means that the kernel is unable to
> buffer the write (no mbufs or send queue full). Userland should see
> ENOBUFS unless the error was triggered by a fragmentation request.

The app definitely isn't seeing ENOBUFS; this would be treated as a fatal condition and reported.
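To make the ENOBUFS point concrete, this is roughly how a non-blocking writer can separate "try again later" from "the kernel could not buffer this". It is a sketch only; the enum and function names are made up for illustration and are not taken from the application.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical verdicts for a failed write() on a non-blocking socket. */
enum write_verdict {
	WRITE_RETRY,	/* buffer full or interrupted: wait for POLLOUT */
	WRITE_NOBUFS,	/* kernel could not buffer the data (ENOBUFS) */
	WRITE_FATAL	/* anything else: give up on the connection */
};

/* Map the errno left by a failed write() to one of the verdicts above. */
static enum write_verdict
classify_write_errno(int err)
{
	assert(err != 0);	/* only meaningful after a failed call */

	switch (err) {
	case EAGAIN:
#if EWOULDBLOCK != EAGAIN
	case EWOULDBLOCK:
#endif
	case EINTR:
		return (WRITE_RETRY);
	case ENOBUFS:
		return (WRITE_NOBUFS);
	default:
		return (WRITE_FATAL);
	}
}
```

Keeping ENOBUFS as its own verdict means it can be logged separately before the connection is torn down, which is the behaviour described above (fatal and reported).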

> I can't explain the problem but it definitely looks like a resource
> starvation issue within the kernel.

I've traced the source of the ETIMEDOUT within the kernel to tcp_timer_rexmt() in tcp_timer.c:

	if (++tp->t_rxtshift > TCP_MAXRXTSHIFT) {
		tp->t_rxtshift = TCP_MAXRXTSHIFT;
		tcpstat.tcps_timeoutdrop++;
		tp = tcp_drop(tp, tp->t_softerror ?
		    tp->t_softerror : ETIMEDOUT);
		goto out;
	}

I'm new to FreeBSD, but this seems to imply that it's reaching the limit on the number of retransmits while sending ACKs on the TCP connection receiving the inbound data? But I checked this using tcpdump on the server and could see no retransmissions.

As a test, I ran a simulation with the necessary changes to increase TCP_MAXRXTSHIFT (including adding appropriate entries to tcp_syn_backoff[] and tcp_backoff[]), and it appeared to reduce the frequency of the problem, but not to a usable level.

With ACKs in mind, I took the test back to the stock kernel and configuration, and went ahead with disabling SACK on the server and on the client which supplies the data (FreeBSD 6.1, not 7). This greatly reduced the 'duplicate acks' metric, but didn't fix the problem. The next step was to switch off delayed_ack as well, and I didn't see the problem for some hours on the test system at 850 Mbit/s output. But this hasn't eliminated it, as it happened again.

Perhaps someone with greater knowledge can help join the dots between all these symptoms?

Mark
_______________________________________________
freebsd-net@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe@xxxxxxxxxxx"


