Re: macro benchmark for mutex locks needed.

From: Robert Watson (rwatson_at_freebsd.org)
Date: 11/23/04

    Date: Tue, 23 Nov 2004 22:48:07 +0000 (GMT)
    To: Stephan Uphoff <ups@tree.com>
    
    

    On Tue, 23 Nov 2004, Stephan Uphoff wrote:

    > On Tue, 2004-11-23 at 11:32, Phil Brennan wrote:
    > > Could you post up some of your ideas to achieve these speedups? I'm
    > > fascinated by this area, because it is such a crucial one if freebsd
    > > is to perform well after all the work in unwinding giant.
    >
    > Mostly boring stuff like making sure that important mutexes live in
    > their own cache line to avoid false sharing and tweaking some code to
    > avoid unnecessary invalidation of cache lines. There are also some
    > architecture specific assembly tweaks that I like to try. Maybe a few
    > hacks for dynamic run time patching to allow processor specific and
    > SMP/UP optimizations on a GENERIC kernel. Replacing cli/sti with a
    > spl() style interrupt enabler/disabler for i386 is also something I
    > would like to test to speed up spin locks. Restoring single thread
    > wakeup for sleep mutexes is also on the list. Once I start digging I
    > will probably find more things to try.

    If you want an excellent candidate for cache line contention foo, you
    might take a glance at the uma_pcpu_mtx array in UMA.

    This may well be obsoleted by my changes to UMA to use critical sections
    instead of mutexes here, but it would be very interesting to see what
    happens here since it's an example of high probability simultaneous
    access/low probability contention mutexes that are packed tightly. The
    impact on performance, if significant, would be measurable using a broad
    range of benchmarks. Some other interesting candidates might be:

    - Mutex pool mutexes in kern_mtxpool.c.
    - The sockbuf send/receive mutexes in struct socket, and in fact the
      struct sockbufs themselves.
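
To make the "packed tightly" concern concrete, here is a minimal sketch that counts how many locks in a packed array land on the same cache line as a given lock. It assumes a 64-byte cache line and uses a hypothetical 16-byte `toy_mtx` as a stand-in; it is not the real `struct mtx` and not the actual layout of `uma_pcpu_mtx`.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE	64	/* assumption: common, but not universal */
#define NLOCKS		8

/* Hypothetical stand-in for a small lock; NOT the real struct mtx. */
struct toy_mtx {
	uint64_t	owner;
	uint32_t	flags;
	uint32_t	recurse;
};

static_assert(sizeof(struct toy_mtx) == 16, "toy lock should be 16 bytes");

/* Packed array, aligned so that element 0 starts a cache line. */
static alignas(CACHE_LINE_SIZE) struct toy_mtx packed_locks[NLOCKS];

/* Return nonzero if both addresses fall within the same cache line. */
static int
same_cache_line(const void *a, const void *b)
{
	return ((uintptr_t)a / CACHE_LINE_SIZE ==
	    (uintptr_t)b / CACHE_LINE_SIZE);
}

/* Count the other locks that share a cache line with packed_locks[idx]. */
static size_t
locks_sharing_line_with(size_t idx)
{
	size_t i, n = 0;

	for (i = 0; i < NLOCKS; i++) {
		if (i != idx &&
		    same_cache_line(&packed_locks[idx], &packed_locks[i]))
			n++;
	}
	return (n);
}
```

Under these assumptions four locks fit in each line, so a CPU taking lock 0 invalidates the line holding locks 1 through 3 even though nobody contends for lock 0 itself. That is the false sharing being discussed.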

    We might also want to investigate a struct mtx_with_pad that includes the
    necessary padding, to be used for static mutex structures that are
    probably getting packed with the oddest stuff, e.g., sigio_mtx, devmtx,
    mac_policy_mtx, malloc_mtx, lockbuilder_pool, tid_lock, callout_lock,
    cache_lock.
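
A minimal sketch of what a padded mutex wrapper along these lines might look like, assuming a 64-byte cache line and C11 `alignas`. The `toy_mtx` type and the two static instances are hypothetical placeholders; the real `struct mtx` and the real static mutexes named above are not reproduced here.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE	64	/* assumption: common, but not universal */

/* Hypothetical stand-in for a small lock; NOT the real struct mtx. */
struct toy_mtx {
	uint64_t	owner;
	uint32_t	flags;
	uint32_t	recurse;
};

/*
 * Aligning the member forces the wrapper to start on a cache line
 * boundary, and the compiler rounds sizeof() up to a multiple of the
 * alignment, which supplies the trailing padding.
 */
struct mtx_with_pad {
	alignas(CACHE_LINE_SIZE) struct toy_mtx	mtx;
};

static_assert(sizeof(struct mtx_with_pad) % CACHE_LINE_SIZE == 0,
    "padded mutex must occupy whole cache lines");

/* Hypothetical static locks; each gets a cache line to itself. */
static struct mtx_with_pad example_a_mtx;
static struct mtx_with_pad example_b_mtx;

/* Return nonzero if both addresses fall within the same cache line. */
static int
share_cache_line(const void *a, const void *b)
{
	return ((uintptr_t)a / CACHE_LINE_SIZE ==
	    (uintptr_t)b / CACHE_LINE_SIZE);
}

/* Return nonzero if the two example locks are on different lines. */
static int
example_locks_isolated(void)
{
	return (!share_cache_line(&example_a_mtx, &example_b_mtx));
}
```

The cost is memory: each 16-byte lock now occupies 64 bytes. That is probably acceptable for a handful of static mutexes, but less obviously so for large arrays or per-object locks.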

    Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
    robert@fledge.watson.org      Principal Research Scientist, McAfee Research

    _______________________________________________
    freebsd-arch@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-arch
    To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"

