Re: Packet loss every 30.999 seconds
- From: Mark Fullmer <maf@xxxxxxxxxxx>
- Date: Mon, 17 Dec 2007 12:57:05 -0500
Back to back test with no ethernet switch between two em interfaces,
same result. The receiving side has been up > 1 day and exhibits
the problem. These are also two different servers. The small
gettimeofday() syscall tester also shows the same ~30
second pattern of high latency between syscalls.
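The tester itself is not included in this message. As a rough illustration only, a user-space probe of this kind can be sketched in Python as below. The function names and the 1000-microsecond threshold are assumptions made for the sketch, and time.time_ns() stands in for gettimeofday(); this is not the program used for the results above.

```python
import time

def find_gaps(stamps_us, threshold_us):
    """Return (start_us, length_us) for every pair of consecutive
    timestamps that are more than threshold_us apart."""
    gaps = []
    prev = None
    for cur in stamps_us:
        if prev is not None and cur - prev > threshold_us:
            gaps.append((prev, cur - prev))
        prev = cur
    return gaps

def gap_spacing(gaps):
    """Return the time between the starts of consecutive gaps, in
    microseconds. A steady value near 30,000,000 would match the
    ~30 second pattern described above."""
    starts = [start for start, _ in gaps]
    return [b - a for a, b in zip(starts, starts[1:])]

def timestamps_us(duration_s):
    """Yield wall-clock timestamps in microseconds, as fast as the
    loop allows, for roughly duration_s seconds."""
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        yield time.time_ns() // 1000
```

Running find_gaps(timestamps_us(90), 1000) on an otherwise idle machine and passing the result to gap_spacing() would show whether long stalls recur at a fixed interval. Interpreter overhead makes this far coarser than a C loop around gettimeofday(), so the threshold would need tuning per machine.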
Receiver test application reports 3699 missed packets
Sender netstat -i:
(before test)
em1 1500 <Link#2> 00:04:23:cf:51:b7 20 0 15975785 0 0
em1 1500 10.1/24 10.1.0.2 37 - 15975801 - -
(after test)
em1 1500 <Link#2> 00:04:23:cf:51:b7 22 0 25975822 0 0
em1 1500 10.1/24 10.1.0.2 39 - 25975838 - -
total IP packets sent during test = end - start
25975838-15975801 = 10000037 (expected, 10,000,000 packet test + overhead)
Receiver netstat -i:
(before test)
em1 1500 <Link#2> 00:04:23:c4:cc:89 15975785 0 21 0 0
em1 1500 10.1/24 10.1.0.1 15969626 - 19 - -
(after test)
em1 1500 <Link#2> 00:04:23:c4:cc:89 25975822 0 23 0 0
em1 1500 10.1/24 10.1.0.1 25965964 - 21 - -
total ethernet frames received during test = end - start
25975822-15975785 = 10000037 (as expected)
total IP packets processed during test = end - start
25965964-15969626 = 9996338 (expecting 10000037)
Missed packets = expected - received
10000037-9996338 = 3699
netstat -i accounts for the 3699 missed packets, the same number
reported by the application.
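The counter arithmetic above can be rechecked with a short Python sketch. The figures are copied from the netstat -i output in this message; the variable names are made up for the illustration, and the sketch assumes no counter wrapped during the test.

```python
def delta(before, after):
    """Packets counted between two netstat -i snapshots.
    Assumes the counter did not wrap between the snapshots."""
    return after - before

# Sender, em1 network-layer row (10.1.0.2), output packets
sender_ip_out = delta(15975801, 25975838)

# Receiver, em1 link-layer row (<Link#2>), input frames
receiver_link_in = delta(15975785, 25975822)

# Receiver, em1 network-layer row (10.1.0.1), input packets
receiver_ip_in = delta(15969626, 25965964)

# Frames that reached the interface but were never counted at the IP layer
missed = receiver_link_in - receiver_ip_in
loss_percent = 100.0 * missed / receiver_link_in

print("sent:", sender_ip_out)
print("frames received:", receiver_link_in)
print("IP packets processed:", receiver_ip_in)
print("missed:", missed)
print("loss (%):", loss_percent)
```

The link-layer count matching the sender's count while the IP-layer count falls short is what places the loss between the driver and ip_input().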
Looking closer at the tester output again shows the periodic
~30 second windows of packet loss.
There's a second problem here in that packets are just disappearing
before they make it to ip_input(), or there's a dropped packets
counter I've not found yet.
I can provide remote access to anyone who wants to take a look; this
is very easy to duplicate. The ~1 day of uptime needed before the
behavior surfaces is not making this easy to isolate.
--
mark
On Dec 17, 2007, at 12:43 AM, Jeremy Chadwick wrote:
On Mon, Dec 17, 2007 at 12:21:43AM -0500, Mark Fullmer wrote:
While trying to diagnose a packet loss problem in a RELENG_6 snapshot dated
November 8, 2007 it looks like I've stumbled across a broken driver or
kernel routine which stops interrupt processing long enough to severely
degrade network performance every 30.99 seconds.
Packets appear to make it as far as ether_input() then get lost.
Are you sure this isn't being caused by something the switch is doing,
such as MAC/ARP cache clearing or LACP? I'm just speculating, but it
would be worthwhile to remove the switch from the picture (crossover
cable to the rescue).
I know that at least in the case of fxp(4) and em(4), Jack Vogel does
some thorough testing of throughput using a professional/high-end packet
generator (some piece of hardware, I forget the name...)
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
_______________________________________________
freebsd-stable@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@xxxxxxxxxxx"
_______________________________________________
freebsd-net@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe@xxxxxxxxxxx"