Re: Network Stack Locking

From: Matthew Dillon (dillon_at_apollo.backplane.com)
Date: 05/25/04

  • Next message: Robert Watson: "Re: Network Stack Locking"
    Date: Mon, 24 May 2004 20:39:21 -0700 (PDT)
    To: Robert Watson <rwatson@FreeBSD.org>
    
    

    :On Mon, 24 May 2004, Eivind Eklund wrote:
    :
    :> On Fri, May 21, 2004 at 01:23:51PM -0400, Robert Watson wrote:
    :> > The other concern I have is whether the message queues get deep or not:
    :> > many of the benefits of message queues come when the queues allow
    :> > coalescing of context switches to process multiple packets. If you're
    :> > paying a context switch per packet passing through the stack each time you
    :> > cross a boundary, there's a non-trivial operational cost to that.
    :>
    :> I don't know what Matt has done here, but at least with the design we
    :> used for G2 (a private DFly-like project run by John Dyson, me, and a
    :> few other people who may or may not want to remain anonymous), this
    :> should not be an issue. We used thread context passing with an API that
    :> contained putmsg_and_terminate() and message ports that automatically
    :> could spawn new handler threads. Effectively, a message-related context
    :> switch turned into "assemble everything I care about in a small package,
    :> reset the stack pointer, and go". The expectation was that this should
    :> end up with less overhead than function calls, as we could drop the call
    :> frames for "higher levels in the chain". We never got to the point
    :> where we could measure if it worked out that way in practice, though.
    :
    :Sounds a lot like a lot of the Mach IPC optimizations, including their use
    :of continuations during IPC to avoid a full context switch.
    :
    :Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
    :robert@fledge.watson.org Senior Research Scientist, McAfee Research

        Well, I like the performance aspects of a continuation mechanism, but
        I really dislike the memory overhead. Even a minimal stack is
        expensive when you multiply it by potentially hundreds of thousands
        of 'blocking' entities such as PCBs, e.g. a TCP output stream.
        Because of this, the overhead and cache pollution generated by the
        continuation mechanism increase as system load increases rather
        than decrease.
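
        A rough sketch of that multiplication, in Python. The 4 KiB of
        saved state per blocked entity and the entity counts are assumed
        figures chosen only for illustration; they are not measurements
        of any real continuation implementation.

```python
def blocked_state_bytes(entities, bytes_per_entity):
    """Total memory pinned by per-entity saved state (stack or continuation)."""
    return entities * bytes_per_entity


# Assumed figures, for illustration only: 4 KiB of saved state per entity.
ASSUMED_BYTES_PER_ENTITY = 4096

for entities in (1_000, 100_000, 500_000):
    total = blocked_state_bytes(entities, ASSUMED_BYTES_PER_ENTITY)
    mib = total / (1024 * 1024)
    print(f"{entities:>7} blocked entities -> {mib:8.1f} MiB")
```

        The total grows linearly with the number of blocked entities,
        which is the sense in which the cost rises with load.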

        Deep message queues aren't necessarily a problem and, in fact, having
        one or two dozen messages backed up in a protocol thread's message
        port is actually good because the thread can then process all the
        messages in a tight loop (cpu and cache locality of reference). If
        designed properly, this directly mitigates the cost of a thread switch
        as system load increases. So message queueing has the opposite effect...
        per-unit handling overhead *decreases* as system load increases.
        (Also, DragonFly's thread scheduler is a much lighter weight mechanism
        than what you have in FBsd-4 or FBsd-5).
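
        The tight loop described above might be sketched like this, in
        Python. MsgPort, put() and run_once() are hypothetical names
        invented for the illustration; this is a toy model of batching,
        not DragonFly's message port API.

```python
from collections import deque


class MsgPort:
    """Hypothetical message port: a FIFO plus wakeup/handled counters."""

    def __init__(self):
        self.queue = deque()
        self.wakeups = 0   # stands in for context switches into the thread
        self.handled = 0

    def put(self, msg):
        self.queue.append(msg)

    def run_once(self, handler):
        """One wakeup of the protocol thread: drain everything queued."""
        if not self.queue:
            return 0
        self.wakeups += 1          # one switch pays for the whole batch
        count = 0
        while self.queue:          # tight loop, no switch per message
            handler(self.queue.popleft())
            count += 1
        self.handled += count
        return count


# Two dozen messages back up, then a single wakeup processes all of them.
port = MsgPort()
for seq in range(24):
    port.put(seq)
seen = []
drained = port.run_once(seen.append)
print(f"wakeups={port.wakeups} drained={drained}")
```

        The deeper the queue at wakeup, the more messages share the cost
        of that one switch.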

        E.g., let's say you have a context switch overhead of 1uS and a message
        processing overhead of 100ns:
            
            light load:  100 messages/sec, 1 message in queue at
                         context switch:
                         1*100ns+1uS = 1.1uS/msg

            medium load: 1000 messages/sec, average 10 messages in queue at
                         context switch:
                         (10*100ns+1uS)/10 = 2uS/10 = 200ns/msg

            heavy load:  10000 messages/sec, average 100 messages in queue at
                         context switch:
                         (100*100ns+1uS)/100 = 11uS/100 = 110ns/msg
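
        The same arithmetic as a small Python sketch. The function name
        and its parameters are made up for the illustration; the 1uS and
        100ns figures are the ones assumed in the example above.

```python
def per_message_cost_ns(batch_size, switch_ns=1000.0, msg_ns=100.0):
    """Average cost per message when one context switch is amortized
    over batch_size messages processed back to back."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return (switch_ns + batch_size * msg_ns) / batch_size


for label, batch in (("light", 1), ("medium", 10), ("heavy", 100)):
    cost = per_message_cost_ns(batch)
    print(f"{label:>6} load: {batch:3d} msgs/switch -> {cost:.0f} ns/msg")
```

        As the batch grows, the per-message cost approaches the bare
        100ns processing cost, since the 1uS switch is spread ever
        thinner.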

        The reason a deep message queue is not a problem vs other mechanisms
        is simple... a message represents a unit of work. The work must be
        done regardless, and on the cpu it was told to be done on, no matter
        whether you use a message or a continuation or some other mechanism.
        In other words, a deep message queue is actually an effect of the
        problem, not a cause of that problem. Solving the problem (if it
        actually is a problem) does not involve dealing with the deep message
        queue, it involves dealing with the set of circumstances that are
        causing that deep message queue to occur.

        Now, certainly end-to-end latency is an issue. But when one is talking
        about context switching one is talking about nanoseconds and microseconds.
        Turn-around latency just isn't an issue most of the time, and in those
        extremely rare cases where it might be, one does the turn-around in the
        driver interrupt anyway.

                                            -Matt
                                            Matthew Dillon
                                            <dillon@backplane.com>
    _______________________________________________
    freebsd-arch@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-arch
    To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"

