Re: RFC: ATA to CAM integration patch (and gjournaled previous nodes)



Juergen Lock wrote:
On Sat, Jul 25, 2009 at 10:19:10PM +0300, Alexander Motin wrote:
Juergen Lock wrote:
On Mon, Jul 06, 2009 at 11:16:46PM +0200, Juergen Lock wrote:
I tried this on the box with that optical drive that head no
longer likes (fails to be probed and generates an irq storm, see
http://docs.freebsd.org/cgi/mid.cgi?20090628101656.GA38983
), and with ahci.ko loaded by loader.conf I got timeouts followed by
a panic:
http://people.freebsd.org/~nox/cam-ata.20090704-panic1.jpg
http://people.freebsd.org/~nox/cam-ata.20090704-panic2.jpg
[...]
Ok I managed to dig myself out of this mess by connecting the problem
drive to a jmicron pcie card that fell into my hands yesterday; I updated
the test install to head from today and started reinstalling ports (bc of
the shlib bumps) and testing the new hplip port on head (seems to work
no worse than on 7), when suddenly ahci got problems: it printed endless
retrying messages with the box' disk access led on solid, causing processes
to get stuck. I was still able to switch to a console and enter ddb,
but dumping (call doadump) failed and I didn't know what to look for
otherwise, so I'm afraid I can't give more info about this hang. :(
Anyway, could this be caused by ncq? I have disabled ahci.ko for now,
we'll see if this `fixes' it...
Difficult to say without seeing those messages. An NCQ error can actually lead to a series of up to 32 retries: if several commands were running when the error happened, all the other commands are aborted and retried after the error recovery process completes.
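
To illustrate why a single NCQ error can produce a burst of retries, here is a minimal sketch in C. It is not the ahci(4) code: `struct port_state`, `count_tags()` and `ncq_error_abort_all()` are hypothetical names, and the per-port bitmask of in-flight tags is an assumption made for illustration. How the failing command itself is handled is not described in this thread, so the sketch only records its tag.

```c
#include <stdint.h>

/* Hypothetical per-port bookkeeping; bit n set means tag n is in use. */
struct port_state {
    uint32_t active_tags;  /* commands issued and not yet completed */
    uint32_t retry_tags;   /* commands waiting to be reissued */
    int      failed_tag;   /* tag of the command that reported the error */
};

/* Count the set bits in a tag mask (portable, no compiler builtins). */
static int count_tags(uint32_t mask)
{
    int n = 0;
    while (mask != 0) {
        mask &= mask - 1;  /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Assumed behaviour, following the description above: when one queued
 * command fails, every other in-flight command is aborted and queued
 * for retry once error recovery has completed.
 * Returns the number of commands that will be retried.
 */
static int ncq_error_abort_all(struct port_state *p, int failed_tag)
{
    uint32_t others = p->active_tags & ~((uint32_t)1 << failed_tag);

    p->failed_tag = failed_tag;
    p->retry_tags |= others;
    p->active_tags = 0;    /* nothing is in flight during recovery */
    return count_tags(others);
}
```

With a full queue of 32 commands in flight, one error in this sketch requeues the 31 other commands, which is why a single bad command can look like a long run of retries.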

Ah so the recovery could take several minutes? Maybe I didn't wait
long enough then...

That depends on the number of errors. It would have to be an incredibly bad case, I think.

I haven't experimented with really broken drives, but artificially generated NCQ errors were handled properly in my tests.

OK I guess I should take a photo next time it happens... Btw, can the
max # of `tags' be lowered with ncq too in case a drive can't handle
too many? I think it's `camcontrol tags' for scsi...

To allow some simplifications, the current implementation supports NCQ in an all-or-none fashion. If a drive reports queue support for fewer than 32 commands, NCQ will not be used for it. This is not currently controllable via `camcontrol tags`, due to the major differences between the SATA NCQ and SCSI TCQ operating principles.
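
The all-or-none rule can be sketched roughly as follows. This is an illustration only, not the actual driver code: the function names are hypothetical, and validity checks on the IDENTIFY data are omitted. It assumes the standard ATA IDENTIFY DEVICE layout, where word 75 bits 4:0 hold the maximum queue depth minus one and word 76 bit 8 reports NCQ support.

```c
#include <stdbool.h>
#include <stdint.h>

#define NCQ_FULL_DEPTH 32u  /* depth required by the all-or-none rule */

/* Queue depth from IDENTIFY word 75, bits 4:0 (stored as depth - 1). */
static unsigned identify_queue_depth(const uint16_t *id)
{
    return (unsigned)(id[75] & 0x1f) + 1u;
}

/* NCQ support flag from IDENTIFY word 76, bit 8. */
static bool identify_supports_ncq(const uint16_t *id)
{
    return (id[76] & (1u << 8)) != 0;
}

/*
 * Hypothetical decision helper: use NCQ only if the drive supports it
 * and reports the full 32-command queue; otherwise do not use NCQ.
 */
static bool use_ncq(const uint16_t *id)
{
    return identify_supports_ncq(id) &&
        identify_queue_depth(id) >= NCQ_FULL_DEPTH;
}
```

Under this rule a drive that advertises NCQ with a 16-command queue would simply run without NCQ, rather than with a reduced number of tags.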

Here is the dmesg with ahci and the jmicron:

atapci0: <JMicron JMB363 SATA300 controller> port 0xbf00-0xbf07,0xbe00-0xbe03,0xbd00-0xbd07,0xbc00-0xbc03,0xbb00-0xbb0f mem 0xfd8fe000-0xfd8fffff irq 17 at device 0.0 on pci2
atapci0: Reserved 0x10 bytes for rid 0x20 type 4 at 0xbb00
ioapic0: routing intpin 17 (PCI IRQ 17) to lapic 0 vector 49
atapci0: [MPSAFE]
atapci0: [ITHREAD]
atapci0: Reserved 0x2000 bytes for rid 0x24 type 3 at 0xfd8fe000
atapci0: AHCI called from vendor specific driver
atapci0: AHCI v1.00 controller with 2 3Gbps ports, PM supported
atapci0: Caps: 64bit NCQ ALP AL CLO 3Gbps PM PMD SSC PSC 32cmd 2ports
ata2: <ATA channel 0> on atapci0
ata2: AHCI reset...
ata2: hardware reset ...
ata2: SATA connect timeout status=00000000
ata2: AHCI reset done: phy reset found no device
ata2: [MPSAFE]
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci0
ata3: AHCI reset...
ata3: hardware reset ...
ata3: SATA connect time=0ms status=00000113
ata3: ready wait time=11ms
ata3: software reset port 15...
ata3: ready wait time=0ms
ata3: SIGNATURE: eb140101
ata3: AHCI reset done: devices=00010000
ata3: [MPSAFE]
ata3: [ITHREAD]
ata4: <ATA channel 2> on atapci0
atapci0: Reserved 0x8 bytes for rid 0x10 type 4 at 0xbf00
atapci0: Reserved 0x4 bytes for rid 0x14 type 4 at 0xbe00
ata4: reset tp1 mask=03 ostat0=60 ostat1=70
ata4: stat0=0x20 err=0x20 lsb=0x20 msb=0x20
ata4: stat1=0x30 err=0x30 lsb=0x30 msb=0x30
ata4: reset tp2 stat0=20 stat1=30 devices=0x0
ata4: [MPSAFE]
ata4: [ITHREAD]
As I can see here, your JMicron is configured for combined mode, not plain AHCI, so it was handled by ata(4), not by ahci(4).

Ah that can be configured? Anyway there's only an optical drive on
it atm so it's probably not _that_ important. :)

On my system it can be configured via BIOS settings.

--
Alexander Motin