Re: aac0: COMMAND 0xffffffffxxxxxxxx TIMEOUT AFTER xx SECONDS
- From: Scott Long <scottl@xxxxxxxxxx>
- Date: Fri, 09 Jun 2006 14:18:30 -0600
Chris Hedley wrote:
On Fri, 9 Jun 2006, Doug White wrote:
On Fri, 9 Jun 2006, Chris Hedley wrote:
I've been receiving this message quite a lot lately if I put my Adaptec 2410SA aac controller under really heavy load. A quick look at the archives suggests that it used to be a problem a couple of years ago, but was apparently fixed. Personally I've had no bother with it until a few months ago when I upgraded my version of -CURRENT, at which point it started misbehaving.
I assume you've checked cabling and termination? Frequently, driver updates can improve performance which means less tolerance for marginal configurations.
The 2410SA uses SATA discs (I was trying to get SCSI performance on the cheap, ever the optimist!) so I'm assuming that the cables are okay. At least there's no user-breakable termination settings for me to worry about...
I'm also wondering if I might not be better off actually replacing the card with something better, or at least something better suited to FreeBSD: with the discs' and controller's write-caching turned off, the 2410SA is s-l-o-w, about 6MB/s for contiguous writes to an array (either RAID-5 or RAID-10) (benchmarked using the admittedly somewhat crude "dd various block sizes to/from a /dev entry" technique), although reads are acceptable at ~50-60MB/s, if not especially earth-shattering. Any suggestions (for something inexpensive! If money were no object I'd've gone for a SCSI-only system), or might I just as well stick with the 2410SA?
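For anyone wanting to repeat the crude "dd with various block sizes" measurement described above, a rough Python equivalent might look like the sketch below. The function name `write_throughput` and its parameters are illustrative assumptions, not anything from the thread. It also writes to an ordinary file path rather than a raw /dev entry, so the numbers will include filesystem effects that a raw-device test would not.

```python
import os
import time


def write_throughput(path: str, block_size: int, total_bytes: int) -> float:
    """Write roughly total_bytes to path in block_size chunks; return MB/s.

    Hypothetical helper: a stand-in for timing `dd bs=<block_size>`.
    """
    block = b"\0" * block_size
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        while written < total_bytes:
            # os.write returns the number of bytes actually written
            written += os.write(fd, block)
        # Force the data out of the OS cache so the timing covers the device
        os.fsync(fd)
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.close(fd)
    return written / elapsed / 1e6
```

Calling it in a loop over block sizes such as 4096, 65536 and 1048576 gives a comparison similar in spirit to repeating dd with different bs= values.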
6MB/s sounds like you aren't getting any help from the card's write cache; it's having to do stripe reads to recalculate parity instead of doing full stripe writes. Many cards disable write-back cache if the battery module isn't present -- make sure you have one and it's working. /dev accesses also use physio so you don't get any benefit from write combining in the filesystem layer.
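The cost difference between the two write paths can be sketched with plain XOR parity. This is a simplified model of RAID-5, not the 2410SA's actual firmware logic, and the function names are made up for illustration. A small write has to read the old data and old parity before writing both back (four I/Os for one block of payload), while a full-stripe write computes parity from the new data alone and needs no reads.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def small_write_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Read-modify-write: new parity = old parity XOR old data XOR new data."""
    return xor_blocks(old_parity, old_data, new_data)


def ios_per_block_small_write() -> int:
    """Read old data, read old parity, write new data, write new parity."""
    return 4


def ios_per_block_full_stripe(n_disks: int) -> float:
    """n_disks writes (data plus parity) carry n_disks - 1 blocks of payload."""
    return n_disks / (n_disks - 1)
```

On a four-disc RAID-5 set this model gives 4 I/Os per data block for small writes against about 1.33 for full-stripe writes, before counting the latency of waiting on the reads.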
I've deliberately turned off write-caching because the 2410SA doesn't support battery-backed memory. I'm not sure if it's really necessary to disable it, but having experienced the odd disc crash in the past I've become a little paranoid about my data...
What the battery gives you is consistency of the parity data in the case of power loss. You can have a situation where a block is being modified, and thus the parity also needs to be modified. If the block gets written but not the parity, or the parity gets written but not the block, the stripe will be inconsistent. You won't see this until you have a drive failure and are trying to do a rebuild from the parity. By that point it's too late: you'll have silent data corruption due to the inconsistency. For RAID-0, the battery is pointless, and for RAID-1, the battery is nearly pointless; the mirror members will either agree or not, and if they disagree the worst that will happen is that you'll get old data. This is no different than if the OS crashes without flushing out all buffers. Old data is much easier to recover from than corrupt data, which is what you get if the parity is inconsistent.
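The failure mode described above can be reproduced in a few lines using a toy three-data-disc stripe with XOR parity. The block contents and function names here are invented for illustration; real controllers work on much larger blocks, but the arithmetic is the same.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_data: list, parity: bytes) -> bytes:
    """Reconstruct a lost block from the surviving data blocks plus parity."""
    return xor_blocks(parity, *surviving_data)


def interrupted_write_demo() -> tuple:
    """Return (original d0, rebuilt d0) after an interrupted update to d1."""
    d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
    parity = xor_blocks(d0, d1, d2)

    # Power fails after the new d1 reaches disc but before parity is updated.
    d1 = b"bbbb"
    stale_parity = parity

    # Later the disc holding d0 dies and is rebuilt from what is left.
    rebuilt_d0 = rebuild_missing([d1, d2], stale_parity)
    return d0, rebuilt_d0
```

Note that the block that comes back wrong is d0, which was never being written at the time of the power loss. Nothing flags the mismatch, which is why the corruption is silent.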
Also, in general, hardware RAID beats PCI RAID, hands down.
In my case, software raid beats it too! I have my "fast discs" attached to an old 3960 controller and mirror them with gmirror, and the write performance is an order of magnitude better than the 2410SA, which tells me that something somewhere must be wrong. I know I shouldn't really expect SCSI performance from SATA discs, but this seems a bit much to me (I also have write-caching turned off on my SCSI discs, but I have enabled tagged queueing). I'm still slightly uncomfortable with the idea of software RAID, but it hasn't lost anything yet, in spite of a few "unplanned outages".
Software RAID will almost always be faster for trivial tasks than PCI RAID. What PCI RAID gives you is task offloading from the CPU, and protection while the OS is not running. If your CPU is sitting idle most of the time, then software RAID often is a win.
That said, the design of a PCI RAID controller plays a huge role in how it performs. Let's just say that the 2410 design is, um, "low end". There are other cards out there from several vendors, especially the newer generation ones that use PCI-Express and PCI-X, that perform a whole heck of a lot better. I have several cards that beat software RAID by a wide margin, but they are also expensive.
Scott
_______________________________________________
freebsd-current@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@xxxxxxxxxxx"