Re: process stuck in nfsfsync state

From: Robert Watson (rwatson_at_freebsd.org)
Date: 10/25/04

  • Next message: Robert Watson: "Re: Where to get 5.3 Stable"
    Date: Mon, 25 Oct 2004 13:25:49 +0100 (BST)
    To: Joan Picanyol <lists-freebsd-stable@biaix.org>
    
    

    On Mon, 25 Oct 2004, Joan Picanyol wrote:

    > > Is there a response to the request? If not, that might suggest the
    > > server is wedged, not the client. If you are willing to share the results
    > > of a tcpdump -s 1500 -w <whatever> output from a few seconds during the
    > > wedge, that would be very useful.
    >
    > Available at http://biaix.org/pk/debug/nfs/. These are from just after
    > logging in to GNOME until gconfd-2 goes to nfsfsync and the "nfs server
    > not responding" messages start appearing.

    Comparing the client and server traces, it looks like fragments of the
    client-generated writes are being lost. For example, frame 4175 in the
    client trace begins a fragmented NFSv3 write over UDP. The write carries
    8192 bytes of data, and the resulting datagram (8344 bytes including the
    UDP and RPC headers, going by the offsets below) is broken down into six
    IP fragments:

    Frame  IP offset  Length  Arrived?
    4175           0    1480  Yes
    4176        1480    1480  Yes
    4177        2960    1480  Yes
    4178        4440    1480  Yes
    4179        5920    1480  No
    4180        7400     944  Yes
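
    The offsets and lengths in that table are what you would expect from
    ordinary IP fragmentation. As a rough sketch, assuming a 1500-byte MTU
    and a 20-byte IP header with no options (both are assumptions that
    happen to match the 1480-byte fragments in the trace), the split can be
    reproduced like this. The function name is made up for illustration.

```python
def ip_fragments(payload_len, mtu=1500, ip_header_len=20):
    """Return (offset, length) pairs for fragmenting an IP payload."""
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because the IP fragment offset field counts 8-byte units.
    max_data = (mtu - ip_header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload_len:
        length = min(max_data, payload_len - offset)
        fragments.append((offset, length))
        offset += length
    return fragments


# 7400 + 944 = 8344 bytes of IP payload, taken from the last row of the table.
for offset, length in ip_fragments(8344):
    print(offset, length)
```

    Losing any one of those (offset, length) pairs is enough to make the
    whole datagram unusable on the receiving side.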

    Without the missing fragments, the datagrams (and hence the RPCs) can't
    be reassembled, and with 6-fragment datagrams, even a fairly low loss
    probability for individual packets adds up (or multiplies up!). So the
    question is: where are your fragments going?
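
    To put rough numbers on the "multiplies up" point: if each fragment is
    lost independently with probability p (independence is an assumption
    for the sake of the arithmetic, not something the traces establish), a
    datagram of n fragments survives only if all n arrive, so it is lost
    with probability 1 - (1 - p)^n. A small sketch:

```python
def datagram_loss(p_fragment, n_fragments=6):
    """Probability that at least one of n_fragments is lost.

    Assumes each fragment is dropped independently with probability
    p_fragment; real loss is often bursty, so treat this as a rough guide.
    """
    return 1.0 - (1.0 - p_fragment) ** n_fragments


for p in (0.001, 0.01, 0.05):
    print(f"per-fragment loss {p:.3f} -> 6-fragment datagram loss {datagram_loss(p):.4f}")
```

    For small p the result is close to n * p, so a six-fragment write is
    roughly six times as likely to be lost as a single-packet one.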

    Since the fragments all ended up in the BPF trace on the client, we know
    that sufficient mbufs could be allocated on that side to build not only
    the datagram but also the fragment stream, and to insert it into the
    interface queue without an overflow. They could still have been dropped
    at a low level in the client's driver, though. Since they don't appear,
    even corrupted, in the server trace, we know they either didn't reach the
    server or were dropped very early in processing in the server's driver:
    a drop in the IP stack would occur after the packet was submitted to
    BPF, so it would still show up in the trace. So if possible, I would try
    some of the following:

    - Substituting a different switch or hub between the two systems, and
      looking for possible chronic sources of packet loss between them.

    - If possible, getting a trace of the packets on an intermediate node to
      see whether the packets were really sent or not. Maybe on a monitor
      port on the switch, or by inserting a bridging node. My suspicion is
      either that the sender is dropping them at a low level in the driver,
      perhaps due to a resource leak, or that they're dropped on the way
      through an intermediate node. Maybe something is particularly sensitive
      to the rapid sequential send of the 6 fragments.

    - Perhaps instrumenting the device drivers on the sender and recipient to
      look for possible areas where packet drops are being triggered.

    - I think someone already suggested disabling hardware checksumming, but
      if you haven't tried that, it would be worth trying it.

    - It would be useful to see if less complicated NFS meta-transactions than
      "Start GTK" can trigger the problem. For example, try doing a large dd
      to a file on NFS, varying the blocksize to see if you can find useful
      thresholds that trigger the problem. I see a lot of successful 512-byte
      writes in the trace, but the larger 8192-byte writes seem to have
      problems.
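
    The blocksize sweep in the last item can be scripted if that is easier
    than running dd by hand. The following is only a sketch: the function
    names, the NFS_TEST_DIR environment variable, and the list of block
    sizes are all made up for illustration, and it falls back to the local
    temporary directory if the variable is unset. Note that the size of the
    writes on the wire also depends on the mount's write size, so the
    application block size is only an upper-level knob.

```python
import os
import tempfile
import time


def timed_write(path, block_size, total_bytes):
    """Write total_bytes of zeros to path in block_size chunks; return seconds."""
    block = b"\0" * block_size
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        while written < total_bytes:
            # Trim the final chunk so exactly total_bytes are written.
            chunk = block[: total_bytes - written]
            written += os.write(fd, chunk)
        os.fsync(fd)  # push the data to the server before the timer stops
    finally:
        os.close(fd)
    return time.monotonic() - start


def sweep(directory, block_sizes=(512, 1024, 2048, 4096, 8192), total_bytes=1 << 20):
    """Time one write per block size; return {block_size: seconds}."""
    results = {}
    for bs in block_sizes:
        path = os.path.join(directory, f"ddtest.{bs}")
        try:
            results[bs] = timed_write(path, bs, total_bytes)
        finally:
            if os.path.exists(path):
                os.remove(path)
    return results


# NFS_TEST_DIR is a hypothetical variable: point it at the NFS-mounted directory.
target = os.environ.get("NFS_TEST_DIR", tempfile.gettempdir())
for bs, secs in sweep(target).items():
    print(f"bs={bs:5d}  {secs:.3f}s")
```

    If the problem is present, a write at the triggering size would be
    expected to stall in fsync rather than return a time, so run it in a
    terminal you can afford to lose.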

    Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
    robert@fledge.watson.org      Principal Research Scientist, McAfee Research

    _______________________________________________
    freebsd-stable@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-stable
    To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

