Re: kern/60889 - zero IP id change issues in 5.2RC2

From: Andre Oppermann (andre_at_freebsd.org)
Date: 01/08/04

  • Next message: Christophe Prevotaux: "MPD and RADIUS"
    Date: Thu, 08 Jan 2004 00:34:57 +0100
    To: rw@codeburst.co.uk, freebsd-net@freebsd.org
    
    

    Andre Oppermann wrote:
    >
    > Richard,
    >
    > I've attached a patch that fixes the problem with FIN/ACK and one more
    > case which got it wrong. I also fixed the host byte order problem in
    > ip_output.c for the normal ip_id incrementor.
    >
    > Some comments and questions:
    >
    > 1. Do you think it is neccessary to do a htons() on the randomized
    > ip_id too? I'd say yes if there is a case where it has to
    > monotonically increase afterwards. Does it?
    >
    > 2. I have a Win2k machine but have check out how I can get tcp header
    > compression to work with my Cisco AS5300 (if it doesn't do that by
    > default). Will I see the problem when I do a download from a FreeBSD
    > 5.2RC2 machine or do I have to use the Windoze as router sending
    > packets upwards?`
    >
    > 3. There are indeed devices clearing the DF bit. For example Cisco
    > is recommending this in it's trouble shooting section for broadband
    > access via DSL/L2TP where the MTU is lower than 1500 because of the
    > tunnelling overhead. Make a route-map to unset the DF bit and apply
    > it to the incoming interface.

    4. After reading the pf.conf man page from OpenBSD (where it talks about
       scrubbing IP packets) I don't think it's a good idea to set the ip_id
       to zero in the DF case. It seems to cause various kinds of problems
       when for some reason DF is removed along the path (which it shouldn't
       but whatever). I think it is clearly better to put a ip_id into every
       packet no matter what. Although the ip_randomid() function doesn't
       look really cheap to run... but then security comes at a cost.

    5. Random ip_id is already a kernel option but it is *not* enabled by
       default in 5.2RC2 GENERIC or -current. On the other hand the code
       *does* zero the ip_id when DF is set in any case which is troublesome
       as we just found out.

    Updated fix attached. Same tab-brokenness due to c&p.

    -- 
    Andre
    Index: ip_output.c
    ===================================================================
    RCS file: /home/ncvs/src/sys/netinet/ip_output.c,v
    retrieving revision 1.203
    diff -u -p -r1.203 ip_output.c
    --- ip_output.c 20 Nov 2003 20:07:38 -0000      1.203
    +++ ip_output.c 7 Jan 2004 23:31:56 -0000
    @@ -229,10 +229,10 @@ ip_output(struct mbuf *m0, struct mbuf *
            /*
             * Fill in IP header.  If we are not allowing fragmentation,
    -        * then the ip_id field is meaningless, so send it as zero
    -        * to reduce information leakage.  Otherwise, if we are not
    -        * randomizing ip_id, then don't bother to convert it to network
    -        * byte order -- it's just a nonce.  Note that a 16-bit counter
    +        * then the ip_id field is meaningless, but we don't set it
    +        * to zero.  Doing so causes various problems when devices along
    +        * the path (routers, load balancers, firewalls, etc.) illegally
    +        * disable DF on our packet.  Note that a 16-bit counter
             * will wrap around in less than 10 seconds at 100 Mbit/s on a
             * medium with MTU 1500.  See Steven M. Bellovin, "A Technique
             * for Counting NATted Hosts", Proc. IMW'02, available at
    @@ -241,17 +241,11 @@ ip_output(struct mbuf *m0, struct mbuf *
            if ((flags & (IP_FORWARDING|IP_RAWOUTPUT)) == 0) {
                    ip->ip_v = IPVERSION;
                    ip->ip_hl = hlen >> 2;
    -               if ((ip->ip_off & IP_DF) == 0) {
    -                       ip->ip_off = 0;
     #ifdef RANDOM_IP_ID
    -                       ip->ip_id = ip_randomid();
    +               ip->ip_id = ip_randomid();
     #else
    -                       ip->ip_id = ip_id++;
    +               ip->ip_id = htons(ip_id++);
     #endif
    -               } else {
    -                       ip->ip_off = IP_DF;
    -                       ip->ip_id = 0;
    -               }
                    ipstat.ips_localout++;
            } else {
                    hlen = ip->ip_hl << 2;
    Index: tcp_subr.c
    ===================================================================
    RCS file: /home/ncvs/src/sys/netinet/tcp_subr.c,v
    retrieving revision 1.172
    diff -u -p -r1.172 tcp_subr.c
    --- tcp_subr.c  6 Jan 2004 23:29:46 -0000       1.172
    +++ tcp_subr.c  7 Jan 2004 23:31:57 -0000
    @@ -459,6 +459,8 @@ tcp_respond(tp, ipgen, th, m, ack, seq,
            tlen += sizeof (struct tcpiphdr);
            ip->ip_len = tlen;
            ip->ip_ttl = ip_defttl;
    +       if (path_mtu_discovery)
    +               ip->ip_off |= IP_DF;
           }
            m->m_len = tlen;
            m->m_pkthdr.len = tlen;
    @@ -1733,6 +1735,8 @@ tcp_twrespond(struct tcptw *tw, struct s
                    m->m_pkthdr.csum_flags = CSUM_TCP;
                    m->m_pkthdr.csum_data = offsetof(struct tcphdr, th_sum);
                    ip->ip_len = m->m_pkthdr.len;
    +               if (path_mtu_discovery)
    +                       ip->ip_off |= IP_DF;
                    error = ip_output(m, inp->inp_options, NULL,
                        (tw->tw_so_options & SO_DONTROUTE), NULL, inp);
            }
    > If these questions are answered I can prepare the final patch and
    > commit it pending review and re@ approval.
    > 
    > --
    > Andre
    > 
    > (Note: tabs converted to whitespace because of c&p)
    > 
    > Index: ip_output.c
    > ===================================================================
    > RCS file: /home/ncvs/src/sys/netinet/ip_output.c,v
    > retrieving revision 1.203
    > diff -u -p -r1.203 ip_output.c
    > --- ip_output.c 20 Nov 2003 20:07:38 -0000      1.203
    > +++ ip_output.c 7 Jan 2004 22:43:54 -0000
    > @@ -246,7 +246,7 @@ ip_output(struct mbuf *m0, struct mbuf *
    >  #ifdef RANDOM_IP_ID
    >                         ip->ip_id = ip_randomid();
    >  #else
    > -                       ip->ip_id = ip_id++;
    > +                       ip->ip_id = htons(ip_id++);
    >  #endif
    >                 } else {
    >                         ip->ip_off = IP_DF;
    > Index: tcp_subr.c
    > ===================================================================
    > RCS file: /home/ncvs/src/sys/netinet/tcp_subr.c,v
    > retrieving revision 1.172
    > diff -u -p -r1.172 tcp_subr.c
    > --- tcp_subr.c  6 Jan 2004 23:29:46 -0000       1.172
    > +++ tcp_subr.c  7 Jan 2004 22:43:55 -0000
    > @@ -459,6 +459,8 @@ tcp_respond(tp, ipgen, th, m, ack, seq,
    >         tlen += sizeof (struct tcpiphdr);
    >         ip->ip_len = tlen;
    >         ip->ip_ttl = ip_defttl;
    > +       if (path_mtu_discovery)
    > +               ip->ip_off |= IP_DF;
    >        }
    >         m->m_len = tlen;
    >         m->m_pkthdr.len = tlen;
    > @@ -1733,6 +1735,8 @@ tcp_twrespond(struct tcptw *tw, struct s
    >                 m->m_pkthdr.csum_flags = CSUM_TCP;
    >                 m->m_pkthdr.csum_data = offsetof(struct tcphdr, th_sum);
    >                 ip->ip_len = m->m_pkthdr.len;
    > +               if (path_mtu_discovery)
    > +                       ip->ip_off |= IP_DF;
    >                 error = ip_output(m, inp->inp_options, NULL,
    >                     (tw->tw_so_options & SO_DONTROUTE), NULL, inp);
    >         }
    > 
    > Richard Wendland wrote:
    > >
    > > I've been asked (for freebsd-bugs) to open a discussion about PR
    > > kern/60889 on freebsd-net, to decide if a recent change to IP should be
    > > reversed before 5.2-RELEASE (scheduled this month) to give more time
    > > for some serious issues and risks my PR has raised to be considered
    > > and tested.  My proposal is to revert to 5.1 behaviour for now.
    > >
    > > The recent change is to emit a zero fragmentation id when DF is set, with
    > > the objective of improving privacy from external id sequence observation,
    > > an issue raised by Steve Bellovin's paper in IMW'02.
    > >
    > > I've identified 4 problems with this change:
    > >
    > > 1) Even with path_mtu_discovery set, for some reason TCP emits FIN-ACK
    > >    without DF.  This causes two problems for this change:
    > >
    > >    a) Currently the change doesn't really meet its objective for TCP,
    > >       it just means FIN-ACK must be observed.
    > >
    > >    b) Because now just one id is consumed per TCP connection on the
    > >       FIN-ACK, for most/many systems id becomes a close approximate
    > >       count of the number of TCP connections it has made, which external
    > >       observers can see, not possible before this change.  To me this
    > >       seems more of a privacy issue for all FreeBSD users than the NAT
    > >       issue in Steve Bellovin's paper this change seeks to solve.
    > >
    > > 2) Linux made exactly this change some time ago, around Linux 2.4.4,
    > >    but was forced to back-out the change because (I think) of practical
    > >    connectivity issues related to the comment in include/net/ip.h
    > >    ip_select_ident() where it now implements an incrementing ip_id for DF:
    > >
    > >      /* This is only to work around buggy Windows95/2000
    > >       * VJ compression implementations.  If the ID field
    > >       * does not change, they drop every other packet in
    > >       * a TCP stream using header compression.
    > >       */
    > >
    > >    I'm not aware that anyone checked that this buggy Windows95/2000 VJ
    > >    compression problem is no longer an issue in practice.  I doubt that
    > >    the large web hosters who use FreeBSD would be best-pleased if they
    > >    ran into this for even just a few users - especially as there is no
    > >    config option to disable this change.
    > >
    > > 3) This change causes ip_id for non-DF to be output in native
    > >    byte order in ip_output.c.  Unfortunately ip_id is still output in
    > >    Network Byte Order in ip_mroute.c and raw_ip.c, so this change risks
    > >    little-endian machines emitting the same fragmentation id at about
    > >    the same time from these different modules, rather than the usual
    > >    64k cycle; creating a small but real risk of re-assembly errors.
    > >    [This isn't in my PR, I've only just noticed it.]
    > >
    > > I also have a suspicion that some middle-boxes (like HTTP load-balancers)
    > > may clear DF without setting a fresh IP id - clearing DF would save them
    > > the bother of routing ICMP fragmentation needed back to the source server.
    > > If this is so this is another problem this change could show up which
    > > may cause re-assembly errors.  I know the bug is elsewhere, but it would
    > > still become a practical problem for some FreeBSD users.
    > >
    > > So before going with this change I think four things need to be done:
    > >
    > > 1) TCP changed so FIN-ACK goes out with DF if path_mtu_discovery set.
    > >
    > > 2) Tests with Windows95/2000 TCP VJ compression (RFC1144) run.
    > >
    > > 3) ip_id should be emitted in the same byte order everywhere.
    > >
    > > 4) The change made a config option, so sites can disable it should
    > >    they run into problems, just as RANDOM_IP_ID is an option.
    > >
    > > This all seems too much to do for 5.2-RELEASE, and as I think the problems
    > > and risks sufficiently serious, I propose reversing the change until this
    > > can be done.  We need a quick decision on this to get it into 5.2-RELEASE.
    > >
    > > The PR has some more detail and tcpdump output demonstrating the
    > > issue:
    > >
    > >     http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/60889
    > >
    > > The change to be reversed can be seen at:
    > >
    > >     http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/ip_output.c#rev1.189
    > >     http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/ip_output.c.diff?r1=1.188&r2=1.189
    > >
    > > NB If anyone can easily run tests with Windows95/2000 TCP VJ compression that
    > > would be great (using tcpdump to see if there is abnormal retransmission).
    > >
    > > NB2 If anyone knows why FIN-ACK goes out without DF that would be helpful.
    > > I've quickly looked at the source, and I can't see why.  It's emitted
    > > as TCP moves from FIN_WAIT_n state to TIME_WAIT probably at line 3091 of
    > > tcp_input.c with a call to tcp_output().  tcp_output() always appears to
    > > set IP_DF at line 998 of tcp_output.c, if path_mtu_discovery is enabled.
    > > A puzzle.
    > >
    > > Thanks to Boris Staeblow <bs@dva.in-berlin.de> and Tim Rylance
    > > <tkr@puffball.demon.co.uk> for highlighting this change and helping me
    > > diagnose the issues.
    > >
    > >         Richard
    > > --
    > > Richard Wendland                                richard@wendland.org.uk
    > > _______________________________________________
    > > freebsd-net@freebsd.org mailing list
    > > http://lists.freebsd.org/mailman/listinfo/freebsd-net
    > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
    > _______________________________________________
    > freebsd-net@freebsd.org mailing list
    > http://lists.freebsd.org/mailman/listinfo/freebsd-net
    > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
    _______________________________________________
    freebsd-net@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-net
    To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
    

  • Next message: Christophe Prevotaux: "MPD and RADIUS"