Re: new sk driver [was: nve timeout (and down) regression?]
- From: Pieter de Goeje <pieter@xxxxxxxxxx>
- Date: Tue, 28 Mar 2006 16:38:11 +0200
On Tuesday 28 March 2006 12:40, you wrote:
<snip>
probably you do not have enough traffic to make the box crash, or you have
less than 1/2GB of RAM in use
The box has 1GB RAM. Traffic is approx. 2-3Mbit/s.
in fact the problem does not happen on UP machines, there is only sometimes
a device timeout, which only occasionally causes rx/tx to stop
The problem is appearing on SMP machines
when you have less than 2GB of RAM the problem occurs once a day or so and
seems to depend on memory use and amount of traffic
as soon as the traffic reaches more than 1Mbit/s the crash is predictable and
you can wait to see it
The box has actually crashed once, but I am not sure it was because of the
NIC.
~> uptime
4:19PM up 3 days, 9:59, 1 user, load averages: 1.38, 1.20, 1.03
on machines with 4GB of RAM and more traffic the crash is immediate and worse
when the box has crashed under load (4-6Mbit/s) and comes back, the high
demand hits it and it crashes within minutes, or immediately as soon as the
network is up
so probably mpsafenet may help by not processing concurrent packets but
this is a workaround, not a solution (for me)
Agreed.
the last time I checked, mpsafenet=0 cut almost 1Mbit/s of traffic and the
overall performance/response was bad, a higher HZ did not resolve anything
and disabling polling made it even worse (I have other NICs installed),
the machines are working as GW
I can't really tell if the performance is impaired by mpsafenet=0, because the
box is mostly busy doing userland stuff. Typical traffic looks like this:
~> netstat -w 1
            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
      1186     0      97134       1302     0     276430     0
      1206     0      97484       1382     0     264315     0
      1193     0      97048       1366     0     278901     0
      1198     0      98251       1403     0     273428     0
      1205     0      99283       1393     0     270364     0
      1162     0      94746       1376     0     265909     0
      1162     0      93011       1420     0     258514     0
      1187     0      94366       1467     0     263162     0
      1178     0      93441       1441     0     248875     0
      1176     0      93116       1484     0     266285     0
      1146     0      91615       1424     0     256180     0
      1222     0      96597       1560     0     432862     0
      1222     0      93796       1591     0     444466     0
This is all UDP. The traffic generates around 2000 interrupts/sec on sk.
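As a rough cross-check of the figures above, the per-second byte counters from `netstat -w 1` can be converted to Mbit/s. The following is only an illustrative sketch: the helper name `bytes_per_sec_to_mbit` is made up for this example, and it assumes 1 Mbit = 1,000,000 bits and that the counters are plain bytes per second.

```python
# Illustrative helper (not part of the original message): convert the
# per-second byte counters printed by `netstat -w 1` into Mbit/s.

def bytes_per_sec_to_mbit(byte_count):
    """Convert bytes per second to Mbit/s (assuming 1 Mbit = 1,000,000 bits)."""
    return byte_count * 8 / 1_000_000

# First sample row of the netstat output:
# packets in, bytes in, packets out, bytes out
pkts_in, bytes_in, pkts_out, bytes_out = 1186, 97134, 1302, 276430

mbit_in = bytes_per_sec_to_mbit(bytes_in)
mbit_out = bytes_per_sec_to_mbit(bytes_out)

# Average packet size in each direction, in bytes
avg_in = bytes_in / pkts_in
avg_out = bytes_out / pkts_out

print(f"in:    {mbit_in:.2f} Mbit/s, avg packet {avg_in:.0f} bytes")
print(f"out:   {mbit_out:.2f} Mbit/s, avg packet {avg_out:.0f} bytes")
print(f"total: {mbit_in + mbit_out:.2f} Mbit/s")
```

For that first row this gives about 3 Mbit/s for both directions combined, made up of small packets, which fits the traffic estimate given earlier in the message.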
until January the machines didn't crash, there were only timeouts and rx/tx
stops
I used Pyun's driver and the timeouts went away, thanks again!
so then I got confused by some if_sk talk on stable and thought the driver
had been committed, and the boxes started crashing until I figured it out
last week and went back to Pyun's driver, and my sk problems are gone again,
the machines have been stable for 4-5 days now
I'm going to test the new driver to see if I can disable mpsafenet. To be
specific on the NIC:
skc0@pci0:10:0: class=0x020000 card=0x811a1043 chip=0x432011ab rev=0x13 hdr=0x00
    vendor   = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
    device   = '88E8001/8003/8010 Gigabit Ethernet Controller with Integrated PHY (copper)'
    class    = network
    subclass = ethernet
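
For anyone comparing hardware, the `chip=` and `card=` values can be split into their 16-bit halves. This is a minimal sketch, assuming the usual pciconf layout in which the low 16 bits hold the vendor ID and the high 16 bits the device ID; `split_pci_id` is a hypothetical helper name used only for illustration.

```python
# Illustrative sketch: split a 32-bit pciconf ID field into its two halves.
# Assumes low 16 bits = vendor ID, high 16 bits = device ID.

def split_pci_id(value):
    """Return (vendor_id, device_id) from a 32-bit pciconf ID field."""
    vendor_id = value & 0xFFFF
    device_id = (value >> 16) & 0xFFFF
    return vendor_id, device_id

chip_vendor, chip_device = split_pci_id(0x432011AB)   # chip= field
card_vendor, card_device = split_pci_id(0x811A1043)   # card= field

print(f"chip: vendor=0x{chip_vendor:04x} device=0x{chip_device:04x}")
print(f"card: subvendor=0x{card_vendor:04x} subdevice=0x{card_device:04x}")
```

Under that assumption the chip decodes to vendor 0x11ab and device 0x4320, which lines up with the Marvell vendor string in the output.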
Pieter de Goeje
_______________________________________________
freebsd-stable@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@xxxxxxxxxxx"