Re: ste(4) NIC's RX ring head may get ahead of the driver [PATCH]

From: Doug Ambrisko (ambrisko_at_ambrisko.com)
Date: 03/30/04

  • Next message: Bjoern A. Zeeb: "Re: IPSec troubles"
    To: Ruslan Ermilov <ru@FreeBSD.org>
    Date: Tue, 30 Mar 2004 08:23:04 -0800 (PST)
    
    

    Ruslan Ermilov writes:
    | To make a long story short, under a heavy RX load, the ste(4) NIC's
    | RX ring head may get ahead of what the driver thinks, bringing all sorts
    | of havoc like stuck traffic, disordered packets, etc. The NIC never
    | gets out of this state, and the only workaround is to reset the chip,
    | and so we did for some time (by adding the IFF_LINK2 handler to call
    | the driver's watchdog function).

    We never experienced this in our testing or in production. We were
    using the 4-port card. Maybe this bug is in another variant of the
    chip.

    | We've adopted the approach used by dc(4) and xl(4), but instead of
    | seeing if we need to re-synchronize the head _after_ receiving (like
    | dc(4) and xl(4) drivers do), we do it at the beginning of ste_rxeof().
    | As the statistics show, the number of resyncs needed is smaller by a
    | factor of 3 or more in this case, because often the RxDMAComplete
    | interrupt is generated when the RX ring is completely empty(!), and as
    | the NIC continues to do DMA and fill the RX ring while we're still
    | servicing the RxDMAComplete interrupt, we did more resyncs than was
    | actually necessary.

    Sounds good.
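
    The resync idea discussed above can be sketched in user space. This is
    only a simulation under assumed names (rx_ring, rx_resync, rx_eof,
    RX_DONE and the ring size are all hypothetical); it is not the code from
    the patch or from if_ste.c. Before servicing the ring, the driver checks
    whether the descriptor at its head is done; if it is not, it scans
    forward for the first completed descriptor and moves its head there.

```c
#include <assert.h>
#include <stddef.h>

#define RX_RING_CNT 8    /* hypothetical ring size for this sketch */
#define RX_DONE     0x1u /* hypothetical "DMA complete" status bit */

struct rx_desc {
    unsigned status;     /* set by the (simulated) NIC when a frame lands */
};

struct rx_ring {
    struct rx_desc desc[RX_RING_CNT];
    int head;            /* slot the driver expects to service next */
};

/*
 * If the slot at the driver's head is not done, scan forward for a
 * completed slot and jump to it. Returns 1 if the head was moved,
 * 0 if the driver was already in sync or the ring is empty.
 */
static int rx_resync(struct rx_ring *r)
{
    int i, pos;

    assert(r != NULL);
    if (r->desc[r->head].status & RX_DONE)
        return 0;                       /* already in sync */
    for (i = 1; i < RX_RING_CNT; i++) {
        pos = (r->head + i) % RX_RING_CNT;
        if (r->desc[pos].status & RX_DONE) {
            r->head = pos;              /* catch up with the chip */
            return 1;
        }
    }
    return 0;                           /* ring really is empty */
}

/*
 * Service completed slots, resyncing first (at the beginning, as the
 * quoted text describes) rather than after receiving.
 * Returns the number of frames consumed.
 */
static int rx_eof(struct rx_ring *r)
{
    int n = 0;

    rx_resync(r);
    while (r->desc[r->head].status & RX_DONE) {
        r->desc[r->head].status = 0;    /* hand the slot back */
        r->head = (r->head + 1) % RX_RING_CNT;
        n++;
    }
    return n;
}
```

    With slots 3 and 4 completed and the head still at 0, rx_resync() moves
    the head to 3 and rx_eof() then services two frames.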
     
    | Also, we were able to further reduce the number of resyncs by setting
    | the RxDMAPollPeriod to a higher value. 320ns looked like overkill
    | here, and I'm not sure why you have chosen it in the first place,
    | when adding polling support for RX in the driver. Also, we believe
    | that this setting may be responsible for what you referred to as:

    I'm not sure.
     
    | > This card still has seemingly unfixable issues under heavy RX load in
    | > which the card takes over the PCI bus.
    |
    | in the commit log for revision 1.33 of if_ste.c.
    |
    | Attached is the patch (for RELENG_4) we're currently using, and are
    | quite happy with. If anyone is using ste(4) NICs and is experiencing
    | similar problems, I'd be glad to hear the reports about this patch.

    Sounds good. However, it won't fix the core problem that I reported.
    D-Link's solution was to EOL the 4-port card because of this problem.
    You can see it in their Linux and Windows drivers. The easiest
    way to see it is to send traffic into all 4 ports of the 4-port card.
    You will see only one port with activity, and then it switches to
    another; the card does not multiplex the traffic. Another thing I found
    that would lead to a panic was that if you reset the chip while traffic
    is being sent into the card, the reset will return but the card still
    takes RX packets and DMAs them into memory. Since we have released
    the memory for the card, it would then splat bits over something else.
    It was a while before I figured out this cause of panics :-(
    I don't see how your change will fix that.

    I no longer have access to the HW or the test environment I used
    since I've changed jobs. I have no objection to your change and it
    sounds good. I don't think it will solve the problem I saw.

    While you are in this driver, can you convert it to Mike Silby's generic
    de-fragger? To test it, do something like:
            dd if=/kernel bs=1 | ssh <something> "cat > /tmp/kernel"
    I originally "stole" the code to do this from fxp(4), which was before
    Mike did the generic de-fragger.
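
    The dd test above feeds ssh one byte at a time, which presumably tends
    to produce packets built from long chains of tiny buffers, more
    fragments than a TX descriptor can take. A generic defragger handles
    that by copying the chain into fewer, larger buffers. A rough user-space
    analogue, assuming a hypothetical struct frag chain and a made-up limit
    TX_MAXFRAGS (this is neither the kernel mbuf API nor the ste(4) code):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TX_MAXFRAGS 8    /* hypothetical per-descriptor fragment limit */

struct frag {
    struct frag *next;
    size_t len;
    unsigned char *data;
};

/* Number of fragments in the chain. */
static size_t chain_count(const struct frag *f)
{
    size_t n = 0;

    for (; f != NULL; f = f->next)
        n++;
    return n;
}

/* Total payload bytes in the chain. */
static size_t chain_len(const struct frag *f)
{
    size_t total = 0;

    for (; f != NULL; f = f->next)
        total += f->len;
    return total;
}

/* A TX path would only pay for the copy when the chain is too long. */
static int chain_needs_defrag(const struct frag *f)
{
    return chain_count(f) > (size_t)TX_MAXFRAGS;
}

/*
 * Copy the whole chain into one newly allocated fragment.
 * The original chain is left untouched; the caller owns both.
 * Returns NULL if allocation fails.
 */
static struct frag *chain_defrag(const struct frag *f)
{
    size_t total = chain_len(f), off = 0;
    struct frag *n = malloc(sizeof(*n));

    if (n == NULL)
        return NULL;
    n->data = malloc(total ? total : 1);
    if (n->data == NULL) {
        free(n);
        return NULL;
    }
    for (; f != NULL; f = f->next) {
        memcpy(n->data + off, f->data, f->len);
        off += f->len;
    }
    assert(off == total);               /* every byte was copied */
    n->next = NULL;
    n->len = total;
    return n;
}
```

    A chain of ten one-byte fragments exceeds the assumed limit of eight;
    after chain_defrag() the same ten bytes sit in a single fragment.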

    Thanks,

    Doug A.
    _______________________________________________
    freebsd-net@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-net
    To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

