Re: panic: m_copym, length > size of mbuf chain

From: Robert Watson (rwatson_at_freebsd.org)
Date: 07/10/04

  • Next message: Brian Fundakowski Feldman: "Re: quick interactivity? question regarding -current"
    Date: Sat, 10 Jul 2004 10:27:23 -0400 (EDT)
    To: Daniel Lang <dl@leo.org>
    
    

    On Sat, 10 Jul 2004, Daniel Lang wrote:

    > So I come back to the issue. As I already wrote, I guess I can rule out
    > hardware problems now. I did a very thorough test with the Dell
    > diagnosis utilities which showed no problems.

    Thanks!

    > Also, after John's patch I did not see any WITNESS related problems (so
    > far) again. But I had the m_copy panic again (see subject). This time I
    > did file a PR and did some more detailed gdb analysis. It is all
    > documented at:
    >
    > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/68889
    >
    > I am puzzled, because the stack frame on entering m_copym has 0x0 as
    > first argument (m), however in the previous frame when m_copy() is
    > called, the struct mbuf* argument is valid.
    >
    > Ok, I just realized that there is a difference m_copy() and m_copym()
    > are apparently different functions. Is this a makro/#define discrepancy
    > it seems that that m_copym() is the function which is called in this
    > line of code.
    >
    > Ah, I found it:
    >
    > sys/mbuf.h:#define m_copy(m, o, l) m_copym((m), (o), (l), M_DONTWAIT)
    >
    > so, the puzzle remains, since the arguments passed are kept, except that
    > M_DONTWAIT flag is added.
    >
    > Is this a trashed stack?

    Possibly, but notice that the m_copym() function modifies its copy of 'm'
    in the stack as part of its work -- it uses 'm' to iterate the mbuf chain
    passed in in order to move to the necessary starting offset for the copy.
    Note that the requested offset ('off0') is 737, and the requested 'len' is
    at least 1222, so the loop starting at line 369 will walk until it either
    gets far enough or the "offset > size" assertion triggers:

            while (off > 0) {
                    KASSERT(m != NULL, ("m_copym, offset > size of mbuf chain"));
                    if (off < m->m_len)
                            break;
                    off -= m->m_len;
                    m = m->m_next;
            }

    Since that assertion didn't trigger, we can assume m_copym() successfully
    walked at least 'off0' (737) bytes. The problem appears to be that it ran
    out of mbufs in which to find data to copy, as it hit the end of the chain
    (m == NULL):

            while (len > 0) {
                    if (m == NULL) {
                            KASSERT(len == M_COPYALL,
                                ("m_copym, length > size of mbuf chain"));
                            break;
                    }

    So the initial conclusion is that the caller requested that more data be
    copied from the chain than is actually present in the chain. This
    suggests a bug in socket buffer management or the TCP code. It's
    interesting to note that the socket buffer believes it contains less than
    the requested length -- 'so_snd.sb_mbcnt' is 1536, which is arguably less
    than 737 + 1222 (although we don't know, I think, if it's iterated or not
    and therefore decreased the value of 'len'). Could you print the value of
    'top' in the m_copym() stack? That will tell us if it's on the first mbuf
    or not.

    It sounds like the socket buffer state may be inconsistent with the TCP
    PCB state, or that the expectations in tcp_offset() are wrong. I've CC'd
    Paul because he's had his hands in the new SACK code that was merged, and
    it has its hands in that bit of the output code. Here are some things you
    might want to try:

    (1) Try running with TCP SACK disabled. Set the
        'net.inet.tcp.sack.enable' sysctl to 0 to try this.

    (2) Try adding some assertions just before the copy to m_copy() in
        tcp_output(). I'd suggest something like the following:

    Index: tcp_output.c
    ===================================================================
    RCS file: /home/ncvs/src/sys/netinet/tcp_output.c,v
    retrieving revision 1.95
    diff -u -r1.95 tcp_output.c
    --- tcp_output.c 23 Jun 2004 21:04:37 -0000 1.95
    +++ tcp_output.c 10 Jul 2004 14:12:29 -0000
    @@ -740,6 +740,15 @@
     #endif
                     m->m_data += max_linkhdr;
                     m->m_len = hdrlen;
    + if (off + len > so->so_snd.sb_cc) {
    + printf("tcp_output: not enough data to copy\n");
    + printf("off=%d\n", off);
    + printf("len=%ld\n", len);
    + printf("so->so_snd.sb_cc=%d\n", so->so_snd.sb_cc);
    + printf("m_length(so->so_snd.sb_mb, NULL) == %d\n",
    + m_length(so->so_snd.sb_mb, NULL));
    + panic("Down she goes...");
    + }
                     if (len <= MHLEN - hdrlen - max_linkhdr) {
                             m_copydata(so->so_snd.sb_mb, off, (int) len,
                                 mtod(m, caddr_t) + hdrlen);

        These printfs are oriented at making sure the socket buffer state is
        consistent, and don't attempt to dump the TCP state used to construct
        'len' or 'off'. If the buffer is consistent (i.e., sb_cc ==
        m_length()), then the next logical thing to do is figure out how
        (len + off) became greater than the available buffer space. If you
        look up a bit higher in tcp_output(), you'll see the values being set,
        trimmed, etc, based on various values in the tcpcb.

    (3) If the socket buffer is inconsistent -- i.e., the length of the buffer
        in the socket buffer meta-data doesn't match the actual length of the
        mbufs in the buffer -- try compiling in "options SOCKBUF_DEBUG", which
        turns on additional socket buffer diagnostics.

    Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
    robert@fledge.watson.org Principal Research Scientist, McAfee Research

    _______________________________________________
    freebsd-current@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-current
    To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"


  • Next message: Brian Fundakowski Feldman: "Re: quick interactivity? question regarding -current"

    Relevant Pages

    • Re: panic: m_copym, length > size of mbuf chain
      ... > copied from the chain than is actually present in the chain. ... > suggests a bug in socket buffer management or the TCP code. ... > It sounds like the socket buffer state may be inconsistent with the TCP ... I'm very suspicious of the SACK code. ...
      (freebsd-current)
    • Re: using ipfw seems to interfere with socket communication
      ... to break the code's assumption that TCP over the loopback interface is instantaneous. ... If my fading memory of TCP/IP Illustrated Vol 2 serves me right, that was actually the case at least back then: the sendto system call would push the data all the way down to lo0, which would immediately pass it back up until it ended up in the receiving socket buffer. ...
      (freebsd-net)
    • Re: Much improved sosend_*() functions
      ... For datagrams the uio must not contain more data than we have space left in the socket buffer. ... For TCP this is of no consequence either as TCP doesn't care about it. ... Also, 'space' is not just about PRUS_MORETOCOME, it's also about the loop terminating at the right time. ... I'll try to give your revised patch a closer reading this weekend. ...
      (freebsd-current)
    • Re: Much improved sosend_*() functions
      ... For datagrams the uio must not contain more data than we have space left in the socket buffer. ... For TCP this is of no consequence either as TCP doesn't care about it. ... Also, 'space' is not just about PRUS_MORETOCOME, it's also about the loop terminating at the right time. ... I'll try to give your revised patch a closer reading this weekend. ...
      (freebsd-net)
    • Re: Much improved sendfile(2) kernel implementation
      ... A> as many pages into mbufs as it can up to the free send socket buffer space. ... I see that mbuf allocations are done in different way depending on the SS_NBIO ... It isn't related to blocking on memory, ... sendfile() has two flags which control its blocking behavior. ...
      (freebsd-current)