Re: iSCSI disconnects dilema
- From: Wilko Bulte <wb@xxxxxxxxxxxxxxxxx>
- Date: Fri, 12 Jan 2007 20:55:50 +0100
On Fri, Jan 12, 2007 at 09:31:04PM +0200, Danny Braniss wrote:
On Tue, Jan 09, 2007 at 09:06:46AM +0200, Danny Braniss wrote:
Hi,
While I think I have almost solved the problem of network disconnects,
a major problem dawned on me:
When a 'local' disk crashes, the kernel will probably hang/panic/crash.
If I don't try to recover, then there is no change in the above scenario.
If I try to recover, then the client does not know that it should
umount/fsck/mount.
All this seems familiar: when someone removes a floppy/disk-on-key while
it's mounted, we can always say "you shouldn't have done that!". But with
a network connection, it can happen very often - rebooting the target, a
network hiccup, etc.
So, any ideas?
In my opinion it should be done this way:
You have a queue of I/O requests. You send them to the other end and wait
for confirmation. Until confirmation is received, you keep the requests
queued. If the other end dies, you try to reconnect (until some timeout
expires, the processes which sent those requests will just wait). If you
reconnect successfully, you resend the unconfirmed requests; if you can't
reconnect, you just pass the errors up.
This is what I did in ggate and it seems to work.
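
The scheme described above can be sketched roughly as follows. This is only a minimal, single-threaded illustration, not the ggate or iSCSI initiator code: the names RequestQueue, FlakyTransport and TransportDown, the tag numbering, and the timeout parameters are all hypothetical, and it assumes that resending a request the other end has already seen is harmless (as with a block write to a fixed offset).

```python
import time
from collections import OrderedDict


class TransportDown(Exception):
    """Raised by the (hypothetical) transport when the link is lost."""


class FlakyTransport:
    """Stand-in for a real connection: fails a fixed number of sends."""

    def __init__(self, failures):
        self.failures = failures
        self.delivered = []

    def send(self, tag, request):
        if self.failures > 0:
            self.failures -= 1
            raise TransportDown()
        self.delivered.append((tag, request))

    def reconnect(self):
        # A real transport would re-establish the session here.
        pass


class RequestQueue:
    """Keeps every request queued until the other end confirms it."""

    def __init__(self, transport, reconnect_timeout=30.0,
                 retry_interval=1.0, sleep=time.sleep):
        self.transport = transport
        self.reconnect_timeout = reconnect_timeout
        self.retry_interval = retry_interval
        self.sleep = sleep
        self.pending = OrderedDict()  # tag -> request, in submission order
        self.next_tag = 0

    def submit(self, request):
        tag = self.next_tag
        self.next_tag += 1
        self.pending[tag] = request  # stays queued until ack() is called
        try:
            self.transport.send(tag, request)
        except TransportDown:
            self.recover()  # the caller simply waits while we retry
        return tag

    def ack(self, tag):
        # Confirmation received: only now is the request forgotten.
        self.pending.pop(tag, None)

    def recover(self):
        deadline = time.monotonic() + self.reconnect_timeout
        while time.monotonic() < deadline:
            try:
                self.transport.reconnect()
                # Resend everything that was never confirmed.
                for tag, request in list(self.pending.items()):
                    self.transport.send(tag, request)
                return
            except TransportDown:
                self.sleep(self.retry_interval)
        # Reconnect failed: pass the errors up for all unconfirmed requests.
        failed = list(self.pending)
        self.pending.clear()
        raise OSError("connection lost; unconfirmed request tags: %r" % failed)
```

With a transport that drops one send, submit() reconnects and resends transparently; with a transport that never comes back, the OSError is what the waiting processes would eventually see. Note that this only covers requests that were never confirmed; it cannot help with a request that was confirmed but not actually completed by the other end.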
That is basically what I'm doing - unacked requests get requeued.
The problem I fear (and maybe I'm paranoid :-):
Paranoia is a Good Thing(TM) in data storage land :-)
Assume the following scenario: the client (initiator) sends a write command,
the target acks it, then the target crashes. If the write was never completed,
the initiator goes on as if nothing ever happened.
Yes, but what can the initiator do about that? I mean, it does not have any
visibility of what the target has (or has not) done with the data.
This is roughly the same as a RAID box accepting a write into a writeback cache
and ACK-ing to the host. You can only assume that the RAID box's cache
will get flushed to the spindles properly. All the usual horror scenarios
with a broken battery backup of the cache, a power failure, etc. apply here.
Wilko
--
Wilko Bulte wilko@xxxxxxxxxxx