Re: Packet loss every 30.999 seconds
- From: Bruce Evans <brde@xxxxxxxxxxxxxxx>
- Date: Tue, 18 Dec 2007 16:43:40 +1100 (EST)
On Mon, 17 Dec 2007, David G Lawrence wrote:
>> While trying to diagnose a packet loss problem in a RELENG_6 snapshot
>> dated November 8, 2007 it looks like I've stumbled across a broken
>> driver or kernel routine which stops interrupt processing long enough
>> to severely degrade network performance every 30.99 seconds.
I see the same behaviour under a heavily modified version of FreeBSD-5.2
(except the period was 2 ms longer and the latency was 7 ms instead
of 11 ms when numvnodes was at a certain value). Now with numvnodes =
17500, the latency is 3 ms.
> I noticed this as well some time ago. The problem has to do with the
> processing (syncing) of vnodes. When the total number of allocated vnodes
> in the system grows to tens of thousands, the ~31 second periodic sync
> process takes a long time to run. Try this patch and let people know if
> it helps your problem. It will periodically wait for one tick (1ms) every
> 500 vnodes of processing, which will allow other things to run.
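
The patch itself is not reproduced in this message. As a rough user-space sketch of the idea it describes (pause briefly after every fixed batch of work so that other things can run), something like the following would do. The function name, its parameters, and the use of time.sleep are assumptions made for illustration only; the real change lives in the kernel's vnode sync path and would look different.

```python
import time


def process_in_batches(items, handle, batch=500, pause_s=0.001, sleep=time.sleep):
    """Call handle(item) for every item, pausing after each full batch.

    batch=500 and pause_s=0.001 mirror the "500 vnodes" and "one tick (1ms)"
    figures quoted above; everything else here is hypothetical.
    Returns the number of pauses taken.
    """
    pauses = 0
    for count, item in enumerate(items, start=1):
        handle(item)
        if count % batch == 0:
            sleep(pause_s)  # yield so other work gets a chance to run
            pauses += 1
    return pauses
```

With 17500 items and the defaults, this sketch would pause 35 times, adding about 35 ms of wall-clock time to the pass while capping the length of any single uninterrupted run.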
However, the syncer should be running at a relatively low priority and not
cause packet loss. I don't see any packet loss even in ~5.2 where the
network stack (but not drivers) is still Giant-locked.
Other too-high latencies showed up:
- syscons LED setting and vt switching gives a latency of 5.5 msec because
syscons still uses busy-waiting for setting LEDs :-(. Oops, I do see
packet loss -- this causes it under ~5.2 but not under -current. For
the bge and/or em drivers, the packet loss shows up in netstat output
as a few hundred errors for every LED setting on the receiving machine,
while receiving tiny packets at the maximum possible rate of 640 kpps.
sysctl is completely Giant-locked and so are upper layers of the
network stack. The bge hardware rx ring size is 256 in -current and
512 in ~5.2. At 640 kpps, 512 packets take 800 us so bge wants to
call the upper layers with a latency of far below 800 us. I
don't know exactly where the upper layers block on Giant.
- a user CPU hog process gives a latency of over 200 ms every half a
second or so when the hog starts up, and 300-400 ms after the
hog has been running for some time. Two user CPU hog processes
double the latency. Reducing kern.sched.quantum from 100 ms to 10
ms and/or renicing the hogs don't seem to affect this. Running the
hogs at idle priority fixes this. This won't affect packet loss,
but it might affect user network processes -- they might need to
run at real time priority to get low enough latency. They might need
to do this anyway -- a scheduling quantum of 100 ms should give a
latency of 100 ms per CPU hog quite often, though not usually since
the hogs should never be preferred to a higher-priority process.
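
The 800 us figure in the first item above is just the ring size divided by the packet rate. A small sketch of that arithmetic follows; the function name is made up for illustration, and it only models how long a full ring of back-to-back packets lasts, not anything the driver actually does.

```python
def ring_fill_time_us(ring_size, rate_pps):
    """Time in microseconds for rate_pps packets/second to fill ring_size slots."""
    return ring_size * 1e6 / rate_pps


if __name__ == "__main__":
    # Ring sizes quoted above: 512 in ~5.2, 256 in -current, at 640 kpps.
    for size in (512, 256):
        print(size, "entries:", ring_fill_time_us(size, 640_000), "us")
```

By the same arithmetic the 256-entry ring in -current fills in 400 us, so a 5.5 msec stall is many times longer than either ring can absorb at this rate.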
Previously I've used a less specialized clock-watching program to
determine the syscall latency. It showed similar problems for CPU
hogs. I just remembered that I found the fix for these under ~5.2 --
remove a local hack that sacrifices latency for reduced context
switches between user threads. -current with SCHED_4BSD does this
non-hackishly, but seems to have a bug somewhere that gives a latency
that is large enough to be noticeable in interactive programs.
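
For readers who want to try something similar, here is a minimal sketch of a clock-watching loop. It is not the program referred to above, whose source is not shown; the function name and parameters are assumptions. It spins reading a monotonic clock and reports the largest gap between two consecutive reads, which approximates how long the process was kept off the CPU.

```python
import time


def max_clock_gap(duration_s=1.0, clock=time.monotonic):
    """Spin on clock() for about duration_s and return the largest gap seen.

    The gap is the difference between two consecutive clock readings, in the
    clock's own units (seconds for time.monotonic).
    """
    start = prev = clock()
    worst = 0.0
    while prev - start < duration_s:
        now = clock()
        worst = max(worst, now - prev)
        prev = now
    return worst


if __name__ == "__main__":
    print("largest gap: %.3f ms" % (max_clock_gap(5.0) * 1e3))
```

Running it alongside one or two CPU hogs, at normal and at raised priority, should show whether the gaps follow the pattern described above. Results will vary with the interpreter's own overhead, so treat the numbers as relative rather than absolute.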
Bruce
_______________________________________________
freebsd-net@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe@xxxxxxxxxxx"