Re: em(4) receive part wedging randomly at moderate load

From: Chris Howells (howells_at_kde.org)
Date: 09/28/05

    To: freebsd-net@freebsd.org
    Date: Wed, 28 Sep 2005 18:52:48 +0100
    
    
    

    On Monday 26 September 2005 15:29, Gleb Smirnoff wrote:
    > during last month we are experiencing a nasty problem with em(4)
    > driver. Several times a day the receive path of the driver wedges
    > for a minute or two. During wedge the transmit part works with
    > no problems. The latter fact makes this problem very nasty, because
    > the problematic router can't be backed up with help of CARP.

    This sounds very much like the problem I've been having. It affects two
    machines, one runs 5.4-STABLE and one runs 4.11-STABLE. Both are Duron 1800s
    based on the Asus A7V8X motherboard.
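
    One way to confirm that a stall matches this pattern (receive stops while
    transmit keeps going) is to sample the interface packet counters
    periodically and look for intervals where the input count stays flat while
    the output count still rises. The following is only a sketch: the Sample
    type, the find_rx_wedges helper, and the 30-second threshold are all
    hypothetical, and it assumes the counters are collected elsewhere (for
    example from the Ipkts/Opkts columns of netstat -I em0).

```python
from dataclasses import dataclass


@dataclass
class Sample:
    t: float    # sample time in seconds
    ipkts: int  # cumulative received packets at time t
    opkts: int  # cumulative transmitted packets at time t


def find_rx_wedges(samples, min_duration=30.0):
    """Return (start, end) spans where ipkts stayed flat while opkts advanced.

    min_duration is an arbitrary illustrative threshold, not a value taken
    from the driver or from this thread.
    """
    wedges = []
    start = None      # time at which the current receive stall began
    end = None        # time of the latest sample inside the stall
    tx_moved = False  # whether transmit advanced during the stall
    for prev, cur in zip(samples, samples[1:]):
        if cur.ipkts == prev.ipkts:
            if start is None:
                start = prev.t
                tx_moved = False
            tx_moved = tx_moved or cur.opkts > prev.opkts
            end = cur.t
        else:
            if start is not None and tx_moved and end - start >= min_duration:
                wedges.append((start, end))
            start = None
    # Close a stall that is still in progress at the last sample.
    if start is not None and tx_moved and end - start >= min_duration:
        wedges.append((start, end))
    return wedges
```

    A fully idle interface (both counters flat) is deliberately not reported,
    since that looks the same as a quiet network rather than a wedged
    receive path.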

    The card in the 4.11 machine is:

    em0: <Intel(R) PRO/1000 Network Connection, Version - 2.1.7> port
    0xb000-0xb03f mem 0xf4800000-0xf481ffff,0xf5000000-0xf501ffff irq 11 at
    device 13.0 on pci0

    em0@pci0:13:0: class=0x020000 card=0x002e8086 chip=0x100e8086 rev=0x02
    hdr=0x00
        vendor = 'Intel Corporation'
        device = '82540EM Gigabit Ethernet Controller'
        class = network
        subclass = ethernet

    The card in the 5.4 machine is:
    em0: <Intel(R) PRO/1000 Network Connection, Version - 2.1.7> port
    0x6400-0x643f mem 0xf0000000-0xf001ffff irq 3 at device 19.0 on pci0

    em0@pci0:19:0: class=0x020000 card=0x10028086 chip=0x10268086 rev=0x04
    hdr=0x00
        vendor = 'Intel Corporation'
        device = '82545GM Gigabit Ethernet Controller'
        class = network
        subclass = ethernet

    > The box is serving 8 - 15 kpps, 70 - 100 MBps. It runs stateful pf(4)
    > firewall, with 50k - 80k states. The IP fastforwarding is enabled. The
    > average state insert/removal ratio is 300 states per second, however
    > sometimes several thousands of states can be removed in one pass. The
    > state removal locks the network code for quite a long time, so I guess
    > that wedge happens exactly when a lot of states are removed. The NIC
    > interrupts aren't serviced for some time and it wedges.

    It happens for me with no pf, serving a single client over Samba at a much
    lower load -- only a few tens of KB a second.

    > The NIC is plugged in Cisco Catalyst 6509 gigabit ethernet port. No
    > errors are counted on switch port.

    Mine is a simple unmanaged SMC 5 port GigE switch.

    > To workaround the problem, I have made the following patch:

    Interesting, I'll give that a go....

    <snip>

    > I am asking developers, who work in Intel, to pay attention to this
    > problem.

    Have you tried the em driver directly from Intel? It can be found on the Intel
    web site. A few people on freebsd-stable are claiming that it works
    perfectly.

    I have noticed that having something like this in sysctl.conf helps to reduce
    the frequency of it happening:

    kern.ipc.somaxconn=1024
    net.inet.udp.recvspace=65536
    net.inet.tcp.sendspace=65536
    net.inet.tcp.recvspace=65536

    Though sadly it still does happen...
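
    To try the same values on a running system before committing them to
    sysctl.conf, the lines can be turned into sysctl(8) invocations. This is a
    minimal sketch only: parse_sysctl_conf and as_sysctl_commands are
    hypothetical helper names, and the script just prints the commands rather
    than running them.

```python
def parse_sysctl_conf(text):
    """Parse sysctl.conf-style text into an ordered list of (name, value)."""
    settings = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a name=value line: {raw!r}")
        settings.append((name.strip(), value.strip()))
    return settings


def as_sysctl_commands(settings):
    """Render each setting as a 'sysctl name=value' command line."""
    return [f"sysctl {name}={value}" for name, value in settings]


# The four values quoted in the message above.
TUNABLES = """\
kern.ipc.somaxconn=1024
net.inet.udp.recvspace=65536
net.inet.tcp.sendspace=65536
net.inet.tcp.recvspace=65536
"""

if __name__ == "__main__":
    for cmd in as_sysctl_commands(parse_sysctl_conf(TUNABLES)):
        print(cmd)
```

    Values set this way do not survive a reboot, so anything that turns out to
    help still needs to go into sysctl.conf.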

    -- 
    Cheers, Chris Howells -- chris@chrishowells.co.uk, howells@kde.org
    Web: http://www.chrishowells.co.uk, PGP ID: 0x33795A2C
    KDE/Qt/C++/PHP Developer: http://www.kde.org
    
    


