Re: m_copy & if_simloop



Yar Tikhiy wrote:
Hi folks,

A friend user reported to me that rwhod wouldn't work in CURRENT
due to broken outgoing packets. Here's an example:

16:15:28.212810 IP truncated-ip - 6865 bytes missing! (tos 0x0, ttl 64, id 28554, offset 0, flags [none], proto: UDP (17), length: 7169, bad cksum 11c (->c64b)!) 10.10.10.4.513 > 10.10.10.255.513: UDP, length 276
0x0000: 4500 1c01 6f8a 0000 4011 011c 0a0a 0a04 E...o...@.......
^^^^ ^^^^ broken fields
0x0010: 0a0a 0aff 0201 0201 011c 0000 0101 0000 ................
0x0020: 4565 9ef0 0000 0000 6467 0000 0000 0000 Ee......dg......
0x0030: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0040: 0000 0000 0000 0000 0000 001a 0000 000e ................
0x0050: 0000 0005 4564 a5e7 7474 7976 3000 0000 ....Ed..ttyv0...
0x0060: 726f 6f74 0000 0000 4565 9d4a 0000 01a6 root....Ee.J....
0x0070: 7474 7976 3100 0000 726f 6f74 0000 0000 ttyv1...root....
0x0080: 4565 9d4d 0000 000c 7474 7976 3200 0000 Ee.M....ttyv2...
0x0090: 726f 6f74 0000 0000 4565 9d4f 0000 0099 root....Ee.O....
0x00a0: 7474 7976 3300 0000 726f 6f74 0000 0000 ttyv3...root....
0x00b0: 4565 9d52 0000 019e 7474 7976 3400 0000 Ee.R....ttyv4...
0x00c0: 726f 6f74 0000 0000 4565 9d54 0000 019c root....Ee.T....
0x00d0: 7474 7976 3500 0000 726f 6f74 0000 0000 ttyv5...root....
0x00e0: 4565 9d59 0000 0198 7474 7976 3600 0000 Ee.Y....ttyv6...
0x00f0: 726f 6f74 0000 0000 4565 9d5b 0000 0195 root....Ee.[....
0x0100: 7474 7976 3700 0000 726f 6f74 0000 0000 ttyv7...root....
0x0110: 4565 9d5e 0000 0000 7474 7970 3100 0000 Ee.^....ttyp1...
0x0120: 7961 7200 0000 0000 4565 8361 0000 04b2 yar.....Ee.a....

BTW, the problem manifests itself only if the packet is longer than
256 bytes.

The problem seems to stem from the following. In ether_output(),
the broadcast packet is copied and looped back up the stack, now
via if_simloop(). The copy has been made with m_copy() since 4.4BSD.
I can't tell about the old days, but today m_copy() alias m_copym()
will copy mbufs but not mbuf clusters, which results in an effectively
read-only copy. Such a copy must not be passed up the stack because
the stack is free to change it and thus destroy the original. For
a long time, enough leading bytes were in plain mbuf(s) for the bug
to stay unnoticed. However, the pattern changed in CURRENT some
day and -- here we are.

The problem can be cured by using m_dup() in place of m_copy()
(verified).

Is my analysis correct?

Sounds likely. The read-only'ness definitely.


If so, here's an idea of a general fix. Several source files do
the following:

struct mbuf *mcopy = m_copy(m, 0, M_COPYALL);
/* some even don't check mcopy for NULL here! */
if_simloop(ifp, mcopy, family, hdrlen);

It's common code, so just a flag to if_simloop() cound be introduced
meaning "m_dup() the packet properly". E.g.:

if_simloop(ifp, m, family, hdrlen, M_DUP);

In STABLE, M_COPYALL can be added to hdrlen instead to preserve the
ABI. M_COPYALL is defined as 1000000000 now, which allows for the
trick.

Comments? Thanks!

What you suggest seems ok. You might also look at m_unshare which was
added for similar purpose but is not exactly what you want. It may be
possible to combine m_copy+m_unshare code (not calls) to create the new
mbuf chain more efficiently.

Sam
_______________________________________________
freebsd-net@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe@xxxxxxxxxxx"



Relevant Pages

  • FreeBSD 5.3 Bridge performance take II
    ... I have just finished some profiling and analysis of the FREEBSD_5_BP code ... factor seems to be simply the number of mutexes (11 per packet ... uma_zalloc_arg (per CPU lock) ... Implement device level caching for the UMA mbuf zones. ...
    (freebsd-current)
  • FreeBSD 5.3 Bridge performance take II
    ... I have just finished some profiling and analysis of the FREEBSD_5_BP code ... factor seems to be simply the number of mutexes (11 per packet ... uma_zalloc_arg (per CPU lock) ... Implement device level caching for the UMA mbuf zones. ...
    (freebsd-current)
  • unbreaking igmpproxy
    ... I decided to go deeper into igmpproxy: ... building IGMP packet with datalen=0 ... IGMP header) or the mbuf has negative length (I didn't check if the mbuf ...
    (freebsd-net)
  • Re: Bug in OHCI2 driver for Windows CE 5.0?
    ... CE's USB stack is not derived from any of the desktop Windows stacks; ... A short packet (i.e., a packet smaller ... I'm in a middle of porting USB device-driver from Windows NT/2k/XP... ... A low-speed device (Contains couple of firmware versions) ...
    (microsoft.public.windowsce.platbuilder)
  • Re: Could this be implemented with an NDIS or TDI driver?
    ... > stack is that you see reassembled data. ... TDI is same on all Windows NT versions. ... > be able to keep packet to packet state that lets you ... There is no need in certifying the networking stack plugin with WHQL. ...
    (microsoft.public.development.device.drivers)