Re: [mfi] command timeouts
- From: "Bjoern A. Zeeb" <bzeeb-lists@xxxxxxxxxxxxxxxxxx>
- Date: Mon, 19 Feb 2007 13:55:47 +0000 (UTC)
On Mon, 19 Feb 2007, Bjoern A. Zeeb wrote:
Hi,
I am testing mfi on a Dell 2950 with 6 physical disks (PD) and 2 logical
disks (LD): the 1st LD is RAID1, the 2nd LD is RAID5, plus 1 hot spare.
(The somewhat sucky) megacli "works".
Most commands to gather information work fine, as does pulling out
disks hard. However, setting a disk offline or running some other commands
hangs 'something', which might be the controller.
For example:
foo# megacli -PDOffline -PhysDrv'[1:3]' -a0
EnclId-1 SlotId-3 state changed to OffLine.
foo# foo# ls -l
<hangs forever>
It's not only this process that hangs, but every process doing disk I/O.
On the serial console I get:
...
mfi0: COMMAND 0xffffffff80c3c040 TIMEOUT AFTER 732 SECONDS
mfi0: COMMAND 0xffffffff80c3b8d0 TIMEOUT AFTER 732 SECONDS
mfi0: COMMAND 0xffffffff80c3cb68 TIMEOUT AFTER 732 SECONDS
mfi0: COMMAND 0xffffffff80c3bd98 TIMEOUT AFTER 732 SECONDS
mfi0: COMMAND 0xffffffff80c3bc88 TIMEOUT AFTER 732 SECONDS
mfi0: COMMAND 0xffffffff80c3cbf0 TIMEOUT AFTER 732 SECONDS
mfi0: COMMAND 0xffffffff80c3cc78 TIMEOUT AFTER 732 SECONDS
mfi0: COMMAND 0xffffffff80c3cf20 TIMEOUT AFTER 732 SECONDS
mfi0: COMMAND 0xffffffff80c3cd88 TIMEOUT AFTER 732 SECONDS
mfi0: COMMAND 0xffffffff80c3cfa8 TIMEOUT AFTER 732 SECONDS
mfi0: COMMAND 0xffffffff80c3d828 TIMEOUT AFTER 684 SECONDS
mfi0: COMMAND 0xffffffff80c3db58 TIMEOUT AFTER 679 SECONDS
mfi0: COMMAND 0xffffffff80c3de88 TIMEOUT AFTER 44 SECONDS
mfi0: COMMAND 0xffffffff80c3c728 TIMEOUT AFTER 763 SECONDS
mfi0: COMMAND 0xffffffff80c3c040 TIMEOUT AFTER 763 SECONDS
mfi0: COMMAND 0xffffffff80c3b8d0 TIMEOUT AFTER 763 SECONDS
mfi0: COMMAND 0xffffffff80c3cb68 TIMEOUT AFTER 763 SECONDS
mfi0: COMMAND 0xffffffff80c3bd98 TIMEOUT AFTER 763 SECONDS
mfi0: COMMAND 0xffffffff80c3bc88 TIMEOUT AFTER 763 SECONDS
mfi0: COMMAND 0xffffffff80c3cbf0 TIMEOUT AFTER 763 SECONDS
mfi0: COMMAND 0xffffffff80c3cc78 TIMEOUT AFTER 763 SECONDS
mfi0: COMMAND 0xffffffff80c3cf20 TIMEOUT AFTER 763 SECONDS
mfi0: COMMAND 0xffffffff80c3cd88 TIMEOUT AFTER 763 SECONDS
mfi0: COMMAND 0xffffffff80c3cfa8 TIMEOUT AFTER 763 SECONDS
mfi0: COMMAND 0xffffffff80c3d828 TIMEOUT AFTER 715 SECONDS
mfi0: COMMAND 0xffffffff80c3db58 TIMEOUT AFTER 710 SECONDS
mfi0: COMMAND 0xffffffff80c3de88 TIMEOUT AFTER 75 SECONDS
mfi0: COMMAND 0xffffffff80c3c728 TIMEOUT AFTER 793 SECONDS
mfi0: COMMAND 0xffffffff80c3c040 TIMEOUT AFTER 794 SECONDS
mfi0: COMMAND 0xffffffff80c3b8d0 TIMEOUT AFTER 794 SECONDS
mfi0: COMMAND 0xffffffff80c3cb68 TIMEOUT AFTER 794 SECONDS
mfi0: COMMAND 0xffffffff80c3bd98 TIMEOUT AFTER 794 SECONDS
mfi0: COMMAND 0xffffffff80c3bc88 TIMEOUT AFTER 794 SECONDS
mfi0: COMMAND 0xffffffff80c3cbf0 TIMEOUT AFTER 794 SECONDS
mfi0: COMMAND 0xffffffff80c3cc78 TIMEOUT AFTER 794 SECONDS
mfi0: COMMAND 0xffffffff80c3cf20 TIMEOUT AFTER 794 SECONDS
mfi0: COMMAND 0xffffffff80c3cd88 TIMEOUT AFTER 794 SECONDS
mfi0: COMMAND 0xffffffff80c3cfa8 TIMEOUT AFTER 794 SECONDS
mfi0: COMMAND 0xffffffff80c3d828 TIMEOUT AFTER 746 SECONDS
mfi0: COMMAND 0xffffffff80c3db58 TIMEOUT AFTER 741 SECONDS
mfi0: COMMAND 0xffffffff80c3de88 TIMEOUT AFTER 106 SECONDS
mfi0: COMMAND 0xffffffff80c3c728 TIMEOUT AFTER 824 SECONDS
...
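
Reading the output above, the same command addresses come back in every batch with ages roughly 30 seconds higher, so it looks like the same stuck commands are being re-reported rather than new ones timing out. A minimal Python sketch of that check follows. It is not part of the driver or of megacli, the function names are made up for illustration, and the sample lines are a subset copied from the console log:

```python
import re
from collections import defaultdict

# Matches lines such as:
#   mfi0: COMMAND 0xffffffff80c3c040 TIMEOUT AFTER 732 SECONDS
LINE_RE = re.compile(r"^(\w+): COMMAND (0x[0-9a-f]+) TIMEOUT AFTER (\d+) SECONDS$")


def group_timeouts(log_text):
    """Map each command address to its reported ages, in log order."""
    ages = defaultdict(list)
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            ages[m.group(2)].append(int(m.group(3)))
    return dict(ages)


def report_intervals(ages):
    """For each command, the seconds elapsed between successive reports."""
    return {addr: [b - a for a, b in zip(v, v[1:])] for addr, v in ages.items()}


# Subset of the console output quoted in this mail (hypothetical selection).
SAMPLE = """\
mfi0: COMMAND 0xffffffff80c3c040 TIMEOUT AFTER 732 SECONDS
mfi0: COMMAND 0xffffffff80c3de88 TIMEOUT AFTER 44 SECONDS
mfi0: COMMAND 0xffffffff80c3c728 TIMEOUT AFTER 763 SECONDS
mfi0: COMMAND 0xffffffff80c3c040 TIMEOUT AFTER 763 SECONDS
mfi0: COMMAND 0xffffffff80c3de88 TIMEOUT AFTER 75 SECONDS
mfi0: COMMAND 0xffffffff80c3c728 TIMEOUT AFTER 793 SECONDS
mfi0: COMMAND 0xffffffff80c3c040 TIMEOUT AFTER 794 SECONDS
mfi0: COMMAND 0xffffffff80c3de88 TIMEOUT AFTER 106 SECONDS
mfi0: COMMAND 0xffffffff80c3c728 TIMEOUT AFTER 824 SECONDS
"""

if __name__ == "__main__":
    for addr, deltas in report_intervals(group_timeouts(SAMPLE)).items():
        print(addr, deltas)
```

If a command ever completed, its address would drop out of later batches. Here every address keeps reappearing with a steadily growing age.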
I can still break to ddb. Without disk I/O, the only
possible thing I can really do is type reset.
I'll build a debugging kernel so I can do show alllocks, etc.,
but if someone with more experience with this driver/hw could
contact me, I can run further tests.
This time with the debugging kernel:
foo# megacli -PDOffline -PhysDrv'[1:3]' -a0
EnclId-1 SlotId-3 state changed to OffLine.
foo# foo# foo# foo#
I was able to hit <enter> multiple times (so it was still alive at that
point), but then ...
command 0xffffffff80c40000 not in queue, flags = 0x20, bit = 0x80
panic: command not in queue
cpuid = 2
Uptime: 1m17s
Physical memory: 4084 MB
Dumping 199 MB: 184 168 152 136 120 104 88 72 56 40 24 8
Dump complete
telnet> send brk
KDB: enter: Line break on console
[thread pid 15 tid 100009 ]
Stopped at kdb_enter+0x2f: nop
db> where
Tracing pid 15 tid 100009 td 0xffffff012f5c4000
kdb_enter() at kdb_enter+0x2f
siointr1() at siointr1+0x400
siointr() at siointr+0x2e
intr_execute_handlers() at intr_execute_handlers+0x124
Xapic_isr1() at Xapic_isr1+0x7f
--- interrupt, rip = 0xffffffff803c9787, rsp = 0xffffffffac06eb30, rbp = 0xffffffffac06eb60 ---
_mtx_lock_sleep() at _mtx_lock_sleep+0x137
_mtx_lock_flags() at _mtx_lock_flags+0xe1
mfi_timeout() at mfi_timeout+0x32
softclock() at softclock+0x1c8
ithread_loop() at ithread_loop+0xfe
fork_exit() at fork_exit+0xaa
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffffac06ed40, rbp = 0 ---
db> show alllocks
Process 24 (irq78: mfi0) thread 0xffffff012f5c5000 (100020)
exclusive sleep mutex MFI I/O lock r = 0 (0xffffff012f5cc630) locked @ /u1/src/HEAD/sys/dev/mfi/mfi.c:775
After the reboot, the command does not appear to have been
executed, as the disk still seems to be online (at least
it was the last time).
--
Bjoern A. Zeeb bzeeb at Zabbadoz dot NeT
_______________________________________________
freebsd-current@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@xxxxxxxxxxx"