Re: NFS client/buffer cache deadlock

From: Jilles Tjoelker (jilles_at_stack.nl)
Date: 04/20/05

  • Next message: Brian Fundakowski Feldman: "Re: NFS client/buffer cache deadlock"
    Date: Wed, 20 Apr 2005 19:12:20 +0200
    To: Brian Fundakowski Feldman <green@freebsd.org>
    
    

    On Wed, Apr 20, 2005 at 11:52:33AM -0400, Brian Fundakowski Feldman wrote:
    > On Wed, Apr 20, 2005 at 05:35:28PM +0200, Marc Olzheim wrote:
    > > On Wed, Apr 20, 2005 at 11:20:38AM -0400, Brian Fundakowski Feldman wrote:
    > > > > Btw.: I'm not sure write(), writev() and pwrite() are allowed to do
    > > > > short writes on regular files... ?

    > > > Our manpage is incorrect; POSIX states that they are (see earlier
    > > > e-mail). There really is no alternative -- we simply can't build
    > > > an NFS transaction larger than our buffer cache can accommodate.
    > > > Note that short writes won't happen for normal buffer sizes, only
    > > > excessively large ones. I really don't believe that writev() is meant
    > > > to be used so that you can write gigantic data structures in a single
    > > > transaction...

    It is ok to return partial success if the first chunk of a large write
    succeeded and a later chunk failed persistently, but not merely because
    the write cannot be performed as a single NFS transaction.

    > > Ah, I was reading the SUSv2 page:

    > > http://www.opengroup.org/onlinepubs/009695399/functions/write.html

    > > instead of the POSIX version.

    > > But from neither of those can I deduce that it can return
    > > with result < nbyte without that being a permanent condition.
    > > What phrase makes you conclude that it can?

    > This specific issue is not clear-cut; the best thing to do lies somewhere
    > within the range of these scenarios:

    > "If a write() requests that more bytes be written than there is room
    > for (for example, [XSI] [Option Start] the process' file size limit
    > or [Option End] the physical end of a medium), only as many bytes as
    > there is room for shall be written. For example, suppose there is
    > space for 20 bytes more in a file before reaching a limit. A write of
    > 512 bytes will return 20. The next write of a non-zero number of bytes
    > would give a failure return (except as noted below)."

    This only applies to permanent conditions.

    > "When attempting to write to a file descriptor (other than a pipe or
    > FIFO) that supports non-blocking writes and cannot accept the data
    > immediately:

    > * If the O_NONBLOCK flag is clear, write() shall block the calling
    > thread until the data can be accepted.

    > * If the O_NONBLOCK flag is set, write() shall not block the
    > thread. If some data can be written without blocking the thread,
    > write() shall write what it can and return the number of bytes
    > written. Otherwise, it shall return -1 and set errno to [EAGAIN]."

    I think regular files do not support non-blocking writes, even if they
    are on NFS; in any case, O_NONBLOCK is disabled by default.
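
    The default state of the flag can be checked with a small sketch like
    the one below. It is written in Python for brevity and only inspects a
    freshly created local temporary file; the helper names is_nonblocking()
    and demo_default_flag() are made up for illustration, and the sketch
    says nothing about how an NFS client behaves.

```python
import fcntl
import os
import tempfile


def is_nonblocking(fd):
    """Return True if O_NONBLOCK is set in the file status flags of fd."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    return bool(flags & os.O_NONBLOCK)


def demo_default_flag():
    """Open a fresh regular file and report whether O_NONBLOCK is set."""
    # mkstemp() opens the file read/write without requesting O_NONBLOCK.
    fd, path = tempfile.mkstemp()
    try:
        return is_nonblocking(fd)
    finally:
        os.close(fd)
        os.unlink(path)
```

    A program would have to request O_NONBLOCK explicitly, at open() time
    or through fcntl(F_SETFL), before the second quoted scenario could
    apply to it at all.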

    > "[ENOBUFS] Insufficient resources were available in the system to
    > perform the operation."

    > I think the first is more useful behavior than the last. Supporting it
    > should be exactly the same as supporting what happens if the actual
    > filesystem fills up. In this case, the filesystem is being requested to
    > write more "than there is room for."

    The filesystem filling up is a totally different case, because attempting
    the rest of the write is futile then.

    In a lot of code, a short write() is treated as a (fairly) persistent
    error.
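
    For contrast, code that wants to tolerate short writes has to loop
    until everything is written. A minimal sketch of such a loop follows,
    in Python for brevity; write_all() and demo() are hypothetical names
    chosen for this illustration, not part of any library, and the sketch
    assumes a blocking descriptor.

```python
import os
import tempfile


def write_all(fd, data):
    """Write all of data to fd, continuing after any short write."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        # os.write() may accept fewer bytes than requested (a short write);
        # it returns the count actually written.
        written = os.write(fd, view[total:])
        if written == 0:
            # No progress at all: give up instead of spinning forever.
            raise OSError("write() made no progress")
        total += written
    return total


def demo():
    """Write 1 MiB to a temporary file and return (bytes written, file size)."""
    payload = b"x" * (1 << 20)
    fd, path = tempfile.mkstemp()
    try:
        n = write_all(fd, payload)
        size = os.fstat(fd).st_size
    finally:
        os.close(fd)
        os.unlink(path)
    return n, size
```

    Code that instead calls write() once and treats any result smaller
    than the request as a failure would break if short writes started
    appearing for transient reasons.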

    -- 
    Jilles Tjoelker
    _______________________________________________
    freebsd-current@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-current
    To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
    
