Re: 5.3-RELEASE: WARNING - WRITE_DMA interrupt timout

From: Zoltan Frombach (tssajo_at_hotmail.com)
Date: 11/19/04

    To: Søren Schmidt <sos@DeepCore.dk>, "Poul-Henning Kamp" <phk@phk.freebsd.dk>, "Garance A Drosihn" <drosih@rpi.edu>
    Date: Fri, 19 Nov 2004 01:52:27 -0800
    
    

    My problem is not related to a SATA controller. I use the onboard UDMA133
    controller (pretty rare) with a Maxtor UDMA133 drive. It is a new ABIT
    motherboard that uses a SiS chipset. The hard drive is not new, but previously
    I used it in UDMA100 mode only, with another motherboard. See:

    atapci0: <SiS 964 UDMA133 controller> port
    0x4000-0x400f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 2.5 on pci0
    ata0: channel #0 on atapci0
    ata1: channel #1 on atapci0
    ad0: 78167MB <Maxtor 6Y080L0/YAR41VW0> [158816/16/63] at ata0-master UDMA133

    Everything works pretty well on this server, except that these WRITE_DMA
    warning messages worry me. However, I have not been getting many of
    them lately, and none since I installed Soren's patch a few hours ago.

    I also figured out why my system became so unresponsive at times. I host
    about 150 domains on this server, with email and everything. I use qmail as
    the MTA, and by default it accepts all email for all hosted domains, even
    when the mail is addressed to a nonexistent user. It will try to bounce
    those messages, but only later in the process. IMO, that is a very poor
    design choice in qmail, an otherwise pretty powerful email program. I also
    use qmail-scanner with clamav and spamassassin. The qmail-scanner program and
    spamassassin are written in Perl. So every single message that qmail accepts
    goes through qmail-scanner (and therefore through clamav and
    spamassassin as well), even the ones that are addressed to nonexistent
    users... Some of the hosted domains at times get hit really hard with
    spam, and around those times the server becomes very unresponsive.
    That is not surprising: according to my maillog, from time to time some
    spammer sends literally hundreds of junk messages to nonexistent users, all
    within a few seconds. Right then the server slows to a crawl. Last
    time, I couldn't access any hosted web sites via HTTP or FTP for minutes.
    It took me about 3 minutes to get in via SSH because of the
    slowness. Finally I was able to see the reason: all those Perl processes
    scanning the junk mail... The server became the victim of a DoS attack caused
    by excessive spam. So I believe that this was the reason for the
    unresponsiveness. And it could be the reason why I received those WRITE_DMA
    warnings at those times! I'm not 100% sure about it, but I think it
    is possible.

    I'm going to apply a patch to qmail in a few days. It makes qmail
    reject messages sent to nonexistent users immediately, so they won't need to
    get scanned. This way, I believe, I will greatly reduce the load caused by
    this flood of junk mail. Then hopefully these WRITE_DMA warning messages will
    be gone, too... We'll see.
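
To make the expected load difference concrete, here is a rough Python sketch. It is not qmail code and not the actual patch; the recipient table, the addresses, and both function names are hypothetical. It only counts how many messages would reach the scanner when everything is accepted versus when unknown recipients are rejected up front.

```python
# Hypothetical illustration only: not qmail code and not the actual patch.
# VALID_RECIPIENTS stands in for whatever user table the MTA would consult.
VALID_RECIPIENTS = {"alice@example.com", "bob@example.com"}


def scans_accept_all(recipients):
    """Accept-everything behavior: every message is handed to the scanner."""
    return len(recipients)


def scans_reject_unknown(recipients, valid=VALID_RECIPIENTS):
    """Reject unknown recipients first: only mail for known users is scanned."""
    return sum(1 for r in recipients if r.lower() in valid)


if __name__ == "__main__":
    # A burst of 300 junk messages to made-up addresses, plus 2 real ones.
    burst = [f"user{i}@example.com" for i in range(300)]
    burst += ["alice@example.com", "bob@example.com"]
    print("scanner runs, accept-all:     ", scans_accept_all(burst))
    print("scanner runs, reject-unknown: ", scans_reject_unknown(burst))
```

Each scanner run here stands for one qmail-scanner invocation (with clamav and spamassassin behind it), so the gap between the two counts is roughly the work the patch should save during a spam burst.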

    Zoltan

    > At 7:33 PM -0800 11/18/04, Zoltan Frombach wrote:
    >>For your information, I applied this patch just now to my kernel.
    >>Sorry about the delay! I will send an update in a few days once I
    >>see if those DMA_WRITE warnings are still happening or not.
    >
    > For those who may have missed my other message, it looks like all
    > of my problems were related to a PCI-based SATA controller which
    > was added by the store that built my machine. This card was added
    > even though I had selected a motherboard with on-board SATA.
    >
    > The problem controller was a: <SiI 3112 SATA150 controller>
    > and it has been causing me enough problems that I couldn't get
    > through a buildworld to even try the suggested patch.
    >
    > I have now switched to the on-board: <VIA 6420 SATA150 controller>
    > and so far I have not seen any more of these WRITE_DMA messages.
    > None. And I have been pounding the disk pretty hard with a
    > variety of work for a few hours now. So, now there is no point
    > in me adding the patch, because I no longer see the message!
    >
    > It would still be nice if FreeBSD would react better to whatever
    > problems this card causes. I still have this stupid card, and I
    > would be happy to mail it off to anyone who might want to debug the
    > problems with it. And if we *can't* fix it, then maybe we should
    > just remove support for it. I have had to rebuild my freebsd
    > partitions several times now due to these problems, and certainly
    > that wasn't much fun. Although I guess my problems might also be
    > partially due to the Western Digital drive I was using, when it is
    > used in combination with this card.
    >
    > --
    > Garance Alistair Drosehn = gad@gilead.netel.rpi.edu
    > Senior Systems Programmer or gad@freebsd.org
    > Rensselaer Polytechnic Institute or drosih@rpi.edu
    _______________________________________________
    freebsd-current@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-current
    To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"

