Re: em interrupt storm

From: Michael Vince (mv_at_roq.com)
Date: 11/24/05

    Date: Thu, 24 Nov 2005 12:40:24 +1100
    To: Kris Kennaway <kris@obsecurity.org>
    
    

    Kris Kennaway wrote:

    >On Tue, Nov 22, 2005 at 08:54:49PM -0800, John Polstra wrote:
    >
    >
    >>On 23-Nov-2005 Kris Kennaway wrote:
    >>
    >>
    >>>I am seeing the em driver undergoing an interrupt storm whenever the
    >>>amr driver receives interrupts. In this case I was running newfs on
    >>>the amr array and em0 was not in use:
    >>>
    >>> 28 root 1 -68 -187 0K 8K CPU1 1 0:32 53.98% irq16: em0
    >>> 36 root 1 -64 -183 0K 8K RUN 1 0:37 27.75% irq24: amr0
    >>>
    >>># vmstat -i
    >>>interrupt              total   rate
    >>>irq1: atkbd0               2      0
    >>>irq4: sio0               199      1
    >>>irq6: fdc0                32      0
    >>>irq13: npx0                1      0
    >>>irq14: ata0               47      0
    >>>irq15: ata1              931      5
    >>>irq16: em0           6321801  37187
    >>>irq24: amr0            28023    164
    >>>cpu0: timer           337533   1985
    >>>cpu1: timer           337285   1984
    >>>Total                7025854  41328
    >>>
    >>>When newfs finished (i.e. amr was idle), em0 stopped storming.
    >>>
    >>>MPTable: <INTEL SE7520BD22 >
    >>>
    >>>
    >>This is the dreaded interrupt aliasing problem that several of us have
    >>experienced with this chipset. High-numbered interrupts alias down to
    >>interrupts in the range 16..19 (or maybe 16..23), a multiple of 8 less
    >>than the original interrupt.
    >>
    >>Nobody knows what causes it, and nobody knows how to fix it.
    >>
    >>
    >
    >This would be good to document somewhere so that people either don't
    >accidentally buy this hardware, or know what to expect when they run
    >it.
    >
    >Kris
    >
    >
    This is Intel's latest server chipset design, and Dell is putting that
    chipset in all their servers.
    Luckily I haven't seen the problem on any of my Dell servers (as
    long as I am looking at this right).
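
To make "a multiple of 8 less than the original interrupt" concrete, here is a minimal sketch of that arithmetic, going only by John's description above. The function name and the 16..23 default window are hypothetical choices for illustration, not anything taken from the kernel or the em/amr drivers.

```python
def alias_candidates(irq, low=16, high=23):
    """Return the IRQs in [low, high] that lie a positive multiple of 8 below irq.

    Assumption: this encodes only the rule of thumb quoted in this thread.
    Pass high=19 to model the narrower 16..19 window that was also mentioned.
    """
    return [t for t in range(low, high + 1) if irq > t and (irq - t) % 8 == 0]


# irq24 (amr0 in Kris's report) lands on irq16, which is where em0 sits there.
print(alias_candidates(24))
```

A listed candidate only says which low-numbered line would be worth watching; it does not by itself show that aliasing is happening.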

    This server has been running for a long time.
    vmstat -i
    interrupt              total   rate
    irq1: atkbd0               6      0
    irq4: sio0             23433      0
    irq6: fdc0                10      0
    irq8: rtc         2631238611    128
    irq13: npx0                1      0
    irq14: ata0               99      0
    irq16: uhci0      1507608958     73
    irq18: uhci2        42005524      2
    irq19: uhci1               3      0
    irq23: atapci0           151      0
    irq46: amr0         41344088      2
    irq64: em0        1513106157     73
    irq0: clk         2055605782     99
    Total             7790932823    379

    This one just transferred over 8 gigs of data in 77 seconds with around
    1000 simultaneous TCP connections under a load of 35. Both seem OK.
    vmstat -i
    interrupt              total   rate
    irq4: sio0               315      0
    irq13: npx0                1      0
    irq14: ata0               47      0
    irq16: uhci0         2894669      2
    irq18: uhci2          977413      0
    irq23: ehci0               3      0
    irq46: amr0           883138      0
    irq64: em0           2890414      2
    cpu0: timer       2763566717   1999
    cpu3: timer       2763797300   1999
    cpu1: timer       2763551479   1999
    cpu2: timer       2763797870   1999
    Total            11062359366   8004
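
For anyone who wants to check their own `vmstat -i` output the same way, below is a rough sketch that parses the listing and reports pairs of devices whose IRQ numbers fit the aliasing pattern described earlier in the thread. It assumes the line layout seen in this thread (`irqN: device total rate`); the function names, the regular expression, and the 16..23 window are hypothetical. The sample input is the listing from Kris's report.

```python
import re

# Sample input: the vmstat -i listing quoted from Kris's original report.
SAMPLE = """\
interrupt total rate
irq1: atkbd0 2 0
irq4: sio0 199 1
irq6: fdc0 32 0
irq13: npx0 1 0
irq14: ata0 47 0
irq15: ata1 931 5
irq16: em0 6321801 37187
irq24: amr0 28023 164
cpu0: timer 337533 1985
cpu1: timer 337285 1984
Total 7025854 41328
"""

# Matches only "irqN: device total rate" rows; header, timer and Total rows are skipped.
IRQ_LINE = re.compile(r"^irq(\d+):\s+(\S+)\s+(\d+)\s+(\d+)$")


def parse_vmstat_i(text):
    """Map IRQ number -> (device, total, rate) for every irq row in the text."""
    rows = {}
    for line in text.splitlines():
        m = IRQ_LINE.match(line.strip())
        if m:
            irq, dev, total, rate = m.groups()
            rows[int(irq)] = (dev, int(total), int(rate))
    return rows


def suspect_pairs(rows, low=16, high=23):
    """List (src_irq, src_dev, dst_irq, dst_dev) where src sits a multiple of 8 above dst."""
    pairs = []
    for src, (src_dev, _, _) in sorted(rows.items()):
        for dst, (dst_dev, _, _) in sorted(rows.items()):
            if low <= dst <= high and src > dst and (src - dst) % 8 == 0:
                pairs.append((src, src_dev, dst, dst_dev))
    return pairs


rows = parse_vmstat_i(SAMPLE)
for src, src_dev, dst, dst_dev in suspect_pairs(rows):
    print(f"irq{src} ({src_dev}) could alias onto irq{dst} ({dst_dev}), "
          f"rate {rows[dst][2]}/s")
```

A reported pair is only a hint about where to look. Whether the low-numbered device is actually storming still has to be judged from its rate while the high-numbered device is busy.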

    Mike

    _______________________________________________
    freebsd-net@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-net
    To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

