Re: Running the network stack without Giant -- change in default coming

From: Richard Coleman (rcoleman_at_criticalmagic.com)
Date: 08/24/04

    Date: Tue, 24 Aug 2004 11:41:03 -0400
    To: Robert Watson <rwatson@FreeBSD.org>
    
    

    Very very cool. It's exciting to see many of the long term
    FreeBSD projects coming together like this.

    Richard Coleman
    rcoleman@criticalmagic.com

    Robert Watson wrote:
    > For some time, one of the major goals of the FreeBSD Project has been
    > to allow the network stack to run in parallel on multiple processors
    > at a time. Per my July 19, 2004 post to the freebsd-current mailing
    > list, much of this support has now been merged to the FreeBSD
    > 5-CURRENT branch (and now 6-CURRENT), with the intent of shipping
    > this support in 5.3. And, per that post, it's now possible to run
    > large parts of the network stack in this manner through the use of a
    > system tunable at boot, debug.mpsafenet. This can result in a variety
    > of performance benefits, especially on SMP, by improving concurrency
    > and reducing latency. While it presents a "first cut" locking
    > strategy, these benefits are still pretty tangible, and the resulting
    > system is an excellent starting architecture for a broad range of
    > performance work.
    >
    > Right now, that tunable "debug.mpsafenet" defaults to off (0) in the
    > 5-CURRENT and 6-CURRENT branches. However, this will shortly change
    > in 6-CURRENT to on (1), as most commonly exercised parts of the
    > network stack are now ready for testing in this environment. Some
    > caveats before I go into the details as to how to determine whether
    > this is right for you:
    >
    > - While we've been doing pretty heavy testing in MPSAFE
    > configurations, the nature of multiprocessor development and adapting
    > code for MP safety means that it's unlikely this will "just work" for
    > every last person who tries it. However, it appears to work well in
    > a broad variety of environments and with fairly strenuous testing.
    >
    > - We've focussed primarily on getting mainstream network
    > configurations to run without Giant: this means that less mainstream
    > subsystems (parts of IPv6, some netgraph nodes, IPX, etc) are
    > currently unsafe without the Giant lock turned on. Less mainstream
    > network devices, even if the device drivers are not able to run
    > without the Giant lock, are able to operate without Giant over the
    > remainder of the stack due to compatibility code. This code comes
    > with a performance penalty beyond just running with the Giant lock,
    > so there is a strong motivation to complete locking for these
    > straggling drivers.
    >
    > - You may run into hard to diagnose problems. We'd like to try to
    > diagnose them anyway, but if you start to experience new problems,
    > you'll want to go read the Handbook chapter on preparing kernel bug
    > reports and diagnosing problems. You'll also want to be prepared to
    > run the system with INVARIANTS and WITNESS turned on. The first step
    > in debugging will be to try running with Giant turned back on by
    > changing the debug.mpsafenet flag and seeing if the problem can be
    > reproduced. Details below.
    >
    > - Not all workloads will experience a performance benefit -- some,
    > for various reasons, will get worse. However, several interesting
    > performance loads get measurably better. If you don't see an
    > improvement, or you see things get worse, please don't be surprised
    > -- you may want to look at some of the suggestions I make below on
    > ways to make the results more predictable. Generally, you shouldn't
    > see substantial performance degradation, if any, but it can't be
    > ruled out, especially due to outstanding scheduler issues that are
    > being worked on.
    >
    > - We can and will destroy your data. We don't mean to, because we
    > like your data (and you!), and we try not to, but this is, after all,
    > operating system development, and comes with risks.
    >
    > With this in mind, now is a good time to increase exposure for these
    > changes, because they will become the default in the near future.
    >
    > Here's some technical information on how to get started:
    >
    > (1) Determine if all of the stack components you will operate with
    > are MPsafe. For common configurations, answering the following
    > questions will help you decide this:
    >
    > - Are you actively using IPv6, IPX, ATM, or KAME IPSEC? If you
    > answered yes to any of these questions, it is not yet safe for you to
    > run without Giant. Note that most use of IPv6 is safe, but there are
    > some areas (multicast) that are not entirely safe yet.
    >
    > - Are you using Netgraph? If yes, it may be that you are not yet
    > able to run without Giant. The framework and many nodes are MPSAFE,
    > but some remain that are not. It is worth giving it a try, but you
    > may experience panics, etc, especially in MP configurations.
    >
    > - Are you using SLIP or kernel PPP (not to be confused with user ppp,
    > which is what most FreeBSD users use with modems)? If so, there are
    > experimental patches to make SLIP safe, but out of the box you may
    > see lock assertion failures. We are working to resolve this issue.
    >
    > - Are you using any physical network interfaces other than the
    > following: ath, bge, dc, em, ep, fxp, rl, sis, xl, wi? If so, you
    > may see a performance drop.
    >
    > NOTE: Do you maintain a network interface driver? Is it not on this
    > list? Shame on you! Or maybe shame on me for not listing it, even
    > though it should work. Drop me a private e-mail with any questions
    > or comments. Please update the busdma driver status web page with
    > your driver's status.
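
A rough way to apply the driver check from step (1): the sketch below strips the unit number from each interface name and reports any whose driver is not among the ten named above. The function name and the MPSAFE_DRIVERS variable are made up for illustration, and the list reflects only what the post states as of August 2004.

```shell
#!/bin/sh
# Drivers the post names as able to run without Giant (August 2004).
MPSAFE_DRIVERS="ath bge dc em ep fxp rl sis xl wi"

# Print each interface whose driver is NOT in the list above.
# Arguments are interface names such as em0 or fxp1.
non_mpsafe_ifaces() {
    for ifname in "$@"; do
        # Drop the trailing unit number: em0 -> em, fxp1 -> fxp.
        driver=$(echo "$ifname" | sed 's/[0-9]*$//')
        case " $MPSAFE_DRIVERS " in
            *" $driver "*) ;;          # listed: nothing to report
            *) echo "$ifname" ;;       # not listed: may see a performance drop
        esac
    done
}

# Possible use on a FreeBSD box (left commented out; pseudo-interfaces
# other than loopback would be reported too, so read the output with care):
# non_mpsafe_ifaces $(ifconfig -l | tr ' ' '\n' | grep -v '^lo')
```

An empty result suggests every interface passed in uses a listed driver; anything printed is only a hint to look closer, not a verdict on the driver.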
    >
    > (2) If you are comfortable that you are using an MPSAFE-supported
    > configuration, then you can use the following tunable in loader.conf
    > to disable the Giant lock over the network stack on your system:
    >
    > debug.mpsafenet="1"
    >
    > Note that this is a boot-time only flag; you can inspect the setting
    > with a sysctl, but it cannot currently be changed at runtime. You
    > will need to reboot for the change to take effect.
    >
    > Once the default has changed, it will be necessary to explicitly
    > disable Giant-free networking if that is the desired operating mode.
    > Specifically, you will need to place the following in loader.conf to
    > get that mode of operation:
    >
    > debug.mpsafenet="0"
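
Since the default is about to flip, it may help to confirm what a given loader.conf actually requests. This is a minimal sketch, assuming the usual /boot/loader.conf location and the name="value" form shown above; mpsafenet_setting is a hypothetical helper, not an existing tool.

```shell
#!/bin/sh
# Print the last debug.mpsafenet value (0 or 1) assigned in a
# loader.conf-style file, or "unset" if the file has no such line
# (in which case the kernel default applies).
mpsafenet_setting() {
    conf=${1:-/boot/loader.conf}
    # Match uncommented lines like: debug.mpsafenet="1"
    # The last assignment in the file is the one reported.
    value=$(sed -n 's/^[[:space:]]*debug\.mpsafenet[[:space:]]*=[[:space:]]*"\{0,1\}\([01]\)"\{0,1\}.*$/\1/p' "$conf" 2>/dev/null | tail -n 1)
    echo "${value:-unset}"
}
```

Calling mpsafenet_setting with no argument reads /boot/loader.conf. This only reports what the file asks for; the value in effect after boot can be inspected with the debug.mpsafenet sysctl, as the post notes.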
    >
    > Some notes:
    >
    > On SMP-centric performance measurements, such as local UNIX domain
    > socket use by MySQL on MP systems, I've observed 30%-40% performance
    > improvements by disabling Giant (some details below). My recommended
    > configuration for testing out the impact of disabling Giant on MP
    > systems is:
    >
    > - Running with adaptive mutexes (now the default) and with
    > ADAPTIVE_GIANT (also now the default) appears to make a big
    > difference.
    >
    > - Try disabling HTT. In my workloads, which tend to pound the
    > kernel, HTT appears to hurt quite a bit. Obviously, the
    > effectiveness of HTT depends on the instruction mix, so this may not
    > be for you. Builds, for example, may benefit.
    >
    > - Pick one of ULE and 4BSD, and then try the other. I found 4BSD
    > helped a lot for MySQL, but I've seen other benchmarks with quite
    > different results.
    >
    > - For stability purposes with MySQL, I currently have to disable
    > PREEMPTION (currently the default), as the MySQL benchmarks I use are
    > pretty thread-centric and trigger preemption-related bugs with the
    > kernel threading bits. Recent work-arounds committed should resolve
    > this but I have not yet run stability tests.
    >
    > - If you want to measure performance, make sure to disable
    > INVARIANTS, INVARIANTS_SUPPORT, WITNESS, etc. Also, confirm that the
    > userland malloc debugging features are disabled, as they add cost to
    > each free() operation. I believe we now have a handbook with a
    > variety of recommendations on performance measurement, such as
    > disabling various daemons (such as dhclient, etc). For latency
    > measurements, PREEMPTION is generally desired, subject to stability.
    >
    > - To increase parallelism, especially for inbound packet paths on
    > multiple interfaces, set the sysctl/tunable net.isr.enable=1, which
    > enables direct dispatch in network interface ithreads, rather than
    > deferring to the netisr thread. If each interface is assigned a
    > different ithread, their inbound processing paths can run in
    > parallel, as well as with loopback traffic running in the global
    > netisr thread. We have additional work to do here in terms of
    > increasing the chances of parallel dispatch, etc, and it could be
    > that in some environments this is not a useful setting. I'd be
    > interested in learning about the environments where a negative
    > performance impact is measured.
    >
    > Some notes on bug reporting:
    >
    > - Make sure to identify that you are running with debug.mpsafenet on.
    > If the problem is reproducible, make sure to indicate if it goes
    > away or persists when you disable debug.mpsafenet. This will help to
    > distinguish network stack problems which are (and are not) a result
    > of this work.
    >
    > - If you appear to be experiencing a hang/deadlock, please try
    > running with WITNESS. I'd actually like to see most people running
    > with WITNESS for a bit to shake out lock order issues, as I've
    > introduced a lot of orders. If experiencing lock order reversals,
    > please include the full console warning including stack trace and any
    > warning messages prior to the trace identifying locks, etc. If
    > dropped to DDB, "show locks" is useful.
    >
    > - INVARIANTS is also considered good. Even if you aren't running with
    > WITNESS, do run with INVARIANTS. Note that there is a measurable
    > performance hit for doing so.
    >
    > - If you experience a hang, see if you can get into DDB -- if you are
    > having problems getting in using a console break, try a serial
    > console. When debugging, at minimum capture DDB 'ps' output, along
    > with traces of interesting processes. Typically interesting will be
    > processes that appear to be involved in the hang, etc. Obviously,
    > this requires some intuition about what causes the hang and I can't
    > offer hard and fast rules here. NMI, SW_WATCHDOG, and MP_WATCHDOG
    > can all increase the chances of getting to DDB even in hard hangs.
    >
    > - Experimenting with debug.mpsafenet=1 and UP is also interesting,
    > not just SMP. With PREEMPTION turned on, it may result in lower
    > latency and/or lower throughput. Or not. Regardless, it's
    > interesting -- you don't have to have SMP to give it a spin.
    >
    > FYI, while results can and will vary, I was pleased to observe moving
    > from a UP->MP speedup of 1.07 on a dual-processor box to a speedup of
    > 1.42 with the supersmack benchmark using 11 workers and 1000 select
    > transactions with MySQL. For reference, that was with the 4BSD
    > scheduler and adaptive mutexes. For loopback netperf with TCP and
    > UDP, I observed no change in performance (well, 1% better for UDP RR,
    > but basically no change). Note that the MySQL benchmark here is
    > basically a UNIX domain socket IPC test, and so real world databases
    > will give pretty different results since they won't be pure IPC. The
    > results appear to be very sensitive to the choice of scheduler, and
    > for a variety of reasons I've preferred 4BSD during recent testing
    > (not least, better results in terms of throughput).
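
To relate the two speedup figures above to the 30%-40% range mentioned earlier, here is a small worked calculation. It assumes the UP baseline is the same in both measurements, which the post does not state outright; relative_gain is a name invented for this sketch.

```shell
#!/bin/sh
# Relative gain, in percent, when a UP->MP speedup rises from $1 to $2.
# With a shared UP baseline, MP throughput scales by the ratio $2 / $1.
relative_gain() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after / before - 1) * 100 }'
}

relative_gain 1.07 1.42    # prints 32.7
```

Under that assumption, going from 1.07 to 1.42 is roughly a 33% improvement in MP throughput for this benchmark.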
    >
    > There are a lot of people who have been working on this for quite
    > some time -- I can't thank them all here, but I will point at the
    > netperf web page as a place to look for ongoing patches, change logs,
    > and some credits:
    >
    > http://www.watson.org/~robert/freebsd/netperf/
    >
    > The hard work and contributions of these many developers over several
    > years are finally coming to fruition! I try to update the page about
    > once a week or so as I drop new patch sets. There's also an
    > RSS feed on the change log, which is fairly technical but might be
    > interesting to some readers.
    >
    > Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
    > robert@fledge.watson.org      Principal Research Scientist, McAfee Research
    >
    > _______________________________________________
    > freebsd-current@freebsd.org mailing list
    > http://lists.freebsd.org/mailman/listinfo/freebsd-current
    > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"


