Re: umount -f implementation





On Mon, 29 Jun 2009, Attilio Rao wrote:

2009/6/29 Rick Macklem <rmacklem@xxxxxxxxxxx>:
I just noticed that when I do the following:
- start a large write to an NFS mounted fs
- network partition the server (unplug a net cable)
- do a "umount -f <mntpoint>" on the machine

that it gets stuck trying to write dirty blocks to the server.

I had, in the past, assumed that a "umount -f" of an NFS mount would be
used to get rid of an NFS mount on an unresponsive server and that loss
of "writes in progress" would be expected to happen.

Does that sound correct? (In other words, am I seeing a bug or a feature?)

While that should be true in principle (immediate shutdown of the fs
operations and unmounting of the partition), it is impossible to make it
completely non-sleeping, so even umount -f can sleep or be delayed for
some time (example: vflush).
Currently, umount -f is one of the most complicated things to handle in
our VFS because it requires that vnodes can be reclaimed at any moment,
which adds complexity and the possibility of races.

Yes, agreed. And I like to leave that stuff to more clever chaps than I:-)

What's the fix for your problem?

Well, when I tested it I found that it got stuck in two places, both
calls to VFS_SYNC(). The first was a
    sync();
right at the beginning of umount.c.
- All I did for that one is move it to after the code that handles
  option processing and change it to
    if ((fflag & MNT_FORCE) == 0)
        sync();
  so that it isn't done for the "-f" case. (I believe the sync(); call
  at the beginning of umount is only a performance optimization, so I
  don't think not doing it for "-f" should break anything.)
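
A minimal user-space sketch of the reordering described above, for illustration only. It is not the actual umount.c source: SKETCH_MNT_FORCE, parse_fflag() and should_sync_first() are hypothetical names, and the real command's option handling is more involved than the simple loop assumed here.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the real MNT_FORCE flag from <sys/mount.h>. */
#define SKETCH_MNT_FORCE 0x1

/* Simplified option processing: set the force bit if "-f" is present. */
static int
parse_fflag(int argc, char *const argv[])
{
	int fflag = 0;

	for (int i = 1; i < argc; i++)
		if (strcmp(argv[i], "-f") == 0)
			fflag |= SKETCH_MNT_FORCE;
	return (fflag);
}

/*
 * Decide whether the up-front sync() should run.  This can only be
 * answered after option processing, which is why the call had to move.
 */
static bool
should_sync_first(int fflag)
{
	return ((fflag & SKETCH_MNT_FORCE) == 0);
}
```

A caller would run parse_fflag() first and invoke sync() only when should_sync_first() returns true, so a forced unmount never blocks on the early flush.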

- The second happened just before the VFS_UNMOUNT() call in the
  umount(2) system call. The code looks like:
    if (((mp->mnt_flag & MNT_RDONLY) ||
        (error = VFS_SYNC(mp, MNT_WAIT)) == 0) || (flags & MNT_FORCE) != 0)
- Although it was tempting to reverse the order of VFS_SYNC() and the
  test for MNT_FORCE, I thought that might have a negative impact on
  other file systems, since it would skip the VFS_SYNC() for every
  forced unmount, so...

- Instead, I just put a check for MNTK_UNMOUNTF at the beginning of
nfs_sync(), so that it returns EBUSY for this case instead of getting
stuck trying to flush().
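
The interaction between the unmount condition and the early return in nfs_sync() can be modelled in user space roughly as follows. This is a sketch under stated assumptions, not kernel code: every sk_/SK_ name is hypothetical, the flag values are arbitrary, and it assumes the kernel has already set MNTK_UNMOUNTF on the mount by the time the sync is attempted for a forced unmount.

```c
#include <errno.h>

/* Hypothetical stand-ins for the kernel flags; values are arbitrary. */
#define SK_MNT_RDONLY    0x1
#define SK_MNT_FORCE     0x2
#define SK_MNTK_UNMOUNTF 0x1
#define SK_MNT_WAIT      1

struct sk_mount {
	int mnt_flag;         /* models mp->mnt_flag */
	int mnt_kern_flag;    /* models mp->mnt_kern_flag */
	int server_down;      /* nonzero: a real flush would hang */
	int sync_calls;       /* how often the sync routine was entered */
	int would_have_hung;  /* how often a flush would have blocked */
};

/* Model of nfs_sync() with the proposed early return. */
static int
sk_nfs_sync(struct sk_mount *mp, int waitfor)
{
	(void)waitfor;
	mp->sync_calls++;
	if (mp->mnt_kern_flag & SK_MNTK_UNMOUNTF)
		return (EBUSY);         /* forced unmount: do not flush */
	if (mp->server_down)
		mp->would_have_hung++;  /* the real code would block here */
	return (0);
}

/*
 * Same shape as the condition quoted above: the sync is attempted before
 * the force flag is tested, so it must return for the || to reach
 * the SK_MNT_FORCE check.
 */
static int
sk_unmount_proceeds(struct sk_mount *mp, int flags, int *errorp)
{
	*errorp = 0;
	return (((mp->mnt_flag & SK_MNT_RDONLY) ||
	    (*errorp = sk_nfs_sync(mp, SK_MNT_WAIT)) == 0) ||
	    (flags & SK_MNT_FORCE) != 0);
}
```

In this model a forced unmount of a mount with an unreachable server still enters the sync routine once, gets EBUSY back immediately, and proceeds because the force test is the last operand of the condition.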

Assuming that I'm right w.r.t. the "sync();" at the beginning of umount.c,
these changes simply ensure that the umount command thread makes it as far
as VFS_UNMOUNT()->nfs_unmount(), so that the forced dismount proceeds. The
forced dismount kills RPCs in progress before doing the vflush() and, since
no new RPCs can be done once MNTK_UNMOUNTF is set (it is checked at the
beginning of a request), the vflush() won't actually flush anything to the
server.
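
Why the vflush() ends up writing nothing can be illustrated with a small model of the request-time check. This is only a sketch: the rpc_/RPC_ names are hypothetical, the flush loop is a simplification of what vflush() does, and EIO is an arbitrary choice for this example rather than the error the real NFS client returns.

```c
#include <errno.h>

/* Hypothetical stand-in for MNTK_UNMOUNTF; the value is arbitrary. */
#define RPC_MNTK_UNMOUNTF 0x1

struct rpc_mount {
	int kern_flag;  /* models mp->mnt_kern_flag */
	int rpcs_sent;  /* RPCs that actually went to the server */
};

/* Model of the check made at the beginning of every request. */
static int
rpc_request(struct rpc_mount *nmp)
{
	if (nmp->kern_flag & RPC_MNTK_UNMOUNTF)
		return (EIO);   /* assumed error code, see lead-in */
	nmp->rpcs_sent++;
	return (0);
}

/*
 * Simplified flush: try one write RPC per dirty buffer and count the
 * buffers whose data could not be sent and is therefore thrown away.
 */
static int
rpc_flush_dirty(struct rpc_mount *nmp, int ndirty)
{
	int discarded = 0;

	for (int i = 0; i < ndirty; i++)
		if (rpc_request(nmp) != 0)
			discarded++;
	return (discarded);
}
```

With the flag set, every dirty buffer is counted as discarded and no RPC is sent; with the flag clear, all of them are written.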

As such, "umount -f" is pretty well guaranteed to throw away the dirty
buffers. I believe this is correct behaviour, but it would mean that a
user/sysadmin that uses "umount -f" for cases where the server is still
functioning, but slow, will lose data when they probably don't expect to.

Does this help? rick
ps: During simple testing, it has worked ok. It waits about 1 minute for
the RPC threads to shut down, but the "umount -f" does complete after
that happens. If the consensus seems to be that patching this is a
good idea, I'll get some more testing done.



