Re: iSCSI disconnects dilema



Wilko Bulte wrote:
On Fri, Jan 12, 2007 at 09:31:04PM +0200, Danny Braniss wrote..

On Tue, Jan 09, 2007 at 09:06:46AM +0200, Danny Braniss wrote:
Hi,
While I think I have almost solved the problem of network disconnects,
it dawned on me that there is a major problem:
when a 'local' disk crashes, the kernel will probably hang/panic/crash.
If I don't try to recover, then there is no change in the above scenario.
If I try to recover, then the client does not know that it should
umount/fsck/mount.
This all seems familiar: when someone removes a floppy/disk-on-key while
it's mounted, we can always say "you shouldn't have done that!". With
a network connection, though, it can happen very often: rebooting the
target, a network hiccup, etc.
So, any ideas?
In my opinion it should be done this way:

You have a queue of I/O requests. You send them to the other end and wait
for confirmation. Until confirmation is received, you keep the requests
queued. If the other end dies, you try to reconnect; until some timeout
expires, the processes which sent those requests will just wait. If you
reconnect successfully, you resend the unconfirmed requests. If you can't
reconnect, you just pass the errors up.

This is what I did in ggate and it seems to work.
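
The queue-and-resend scheme described above could be sketched roughly as
follows. This is only an illustration in Python, not the ggate or iSCSI
initiator code; the class name PendingQueue, its methods, and the send
callback are all hypothetical.

```python
class PendingQueue:
    """Keep every request queued until the other end confirms it."""

    def __init__(self, send):
        # send(tag, payload) is a hypothetical transport callback that
        # raises ConnectionError when the link is down.
        self.send = send
        self.pending = {}      # tag -> payload, in submission order
        self.next_tag = 0

    def submit(self, payload):
        tag = self.next_tag
        self.next_tag += 1
        self.pending[tag] = payload      # queue first, then try to send
        try:
            self.send(tag, payload)
        except ConnectionError:
            pass                         # stays queued for a later resend
        return tag

    def confirm(self, tag):
        # The other end confirmed this request, so it can be dropped.
        self.pending.pop(tag, None)

    def resend_unconfirmed(self):
        # Call after a successful reconnect.
        for tag, payload in list(self.pending.items()):
            self.send(tag, payload)

    def fail_all(self):
        # Call when the reconnect timeout expires: hand the tags back so
        # the caller can pass an error up for each of them.
        failed = list(self.pending)
        self.pending.clear()
        return failed
```

Queueing before sending means a request that hits a dead link is treated
the same way as one that was sent but never confirmed.
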
That is basically what I'm doing: unacked requests get requeued.
The problem I fear (and maybe I'm paranoid :-):

Paranoia is a Good Thing(TM) in data storage land :-)

Assume the following scenario: the client (initiator) sends a write command,
the target acks it, and then it crashes. If the write was never completed,
the initiator goes on as if nothing ever happened.

Yes, but what can the initiator do about that? I mean, it does not have any
visibility of what the target has (or has not) done with the data.

This is roughly the same as a RAID box accepting a write into a writeback cache
and ACK-ing to the host. You can only assume that the RAID box's cache
will get flushed to the spindles properly. All the usual horror scenarios
with a broken battery backup of the cache and a power failure etc. apply here.
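
A toy model of that situation, purely as an illustration: the target
acknowledges a write as soon as it is in a volatile cache, and a crash
before the flush silently loses it. WriteBackTarget and its methods are
hypothetical names, not part of any real target implementation.

```python
class WriteBackTarget:
    """Toy target with a volatile write-back cache in front of the media."""

    def __init__(self):
        self.cache = {}    # volatile: contents are lost on a crash
        self.media = {}    # stable storage ("the spindles")

    def write(self, lba, data):
        self.cache[lba] = data
        return "ACK"       # acknowledged before the data reaches the media

    def flush(self):
        # Commit everything that is cached to stable storage.
        self.media.update(self.cache)
        self.cache.clear()

    def crash(self):
        # Power failure with no working battery backup.
        self.cache.clear()

    def read(self, lba):
        # Serve from the cache first, then from the media.
        return self.cache.get(lba, self.media.get(lba))
```

From the initiator's side both writes look identical: each one got its
ACK, and nothing in the reply says whether the data was flushed.
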

Wilko


I forget, does iSCSI have a concept of a flush_cache command, or the
equivalent of what parallel SCSI does with ordered tags?

Not really, or I can't find it. iSCSI is mainly an envelope for
SCSI commands, so whatever CAM does, it will pass it on.
There are some management commands, so the target can tell the initiator
that it's going down, for example (and what should the driver
do in such a case in FreeBSD?)

If so, then
that's how your app or OS knows that the transaction got committed to
stable storage. It's been long assumed in the external storage world
that you are at the mercy of the external storage cache, so the problem
that Danny is referring to is nothing new. The real question is how
to implement the equivalent mechanism that iSCSI provides in a way that
the OS/app can make use of it. For example, CAM issues an ordered tag
periodically to flush the disk cache to stable storage.
Nice (or wishful thinking :-); the SCSI part of iSCSI is/can be
software/virtual.
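
From the application side, the usual way to ask for data to be committed
is fsync(2). The sketch below uses Python's os.fsync(); durable_write is
a hypothetical helper name. Whether the request actually turns into a
cache flush on the device depends on the filesystem, the driver and the
target, so this shows the request, not a guarantee.

```python
import os

def durable_write(path, data):
    """Write data to path and ask the OS to push it to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        while written < len(data):
            # os.write may write fewer bytes than requested
            written += os.write(fd, data[written:])
        # Blocks until the OS reports the file data as written out;
        # the device's own cache is outside the application's view.
        os.fsync(fd)
    finally:
        os.close(fd)
```

An error raised by os.fsync() is the only place where the application
would learn that the data did not make it.
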

Most storage
drivers, including CAM, will issue some sort of a flush_cache command to
the controller and media during system shutdown.

This took me a long time to fix! The userland program got killed at shutdown,
the link was lost, and so there was no way to flush buffers. Fixed by calling
fget(...) too.

I guess I can summarize: (and use the 3 monkey law :-)
1- assume the target is 'well behaved' and will flush cache.
2- there is - currently - no way to tell the OS that not all
seems to be as expected.
3- keep quiet and hope for the best.
danny


_______________________________________________
freebsd-hackers@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@xxxxxxxxxxx"


