Re: Performance issue

From: Scott Long (scottl_at_samsco.org)
Date: 05/10/05

  • Next message: Bakul Shah: "Regression testing (was Re: Performance issue)"
    Date: Tue, 10 May 2005 07:51:49 -0600
    To: noackjr@alumni.rice.edu
    
    

    Jonathan Noack wrote:
    > On 5/9/2005 12:31 PM, Pete French wrote:
    >
    >>> 5.3 ships with SMP turned on, which makes lock operations rather
    >>> expensive on single-processor machines. 4.x does not have SMP
    >>> turned on by default. Would you be able to re-run your test with
    >>> SMP turned off?
    >>
    >>
    >>
    >> I just ran a test here with SMP turned off on 5.4-RC4 (GENERIC). I got the
    >> following result:
    >>
    >> 67.52 real 41.13 user 26.16 sys
    >> 7034 involuntary context switches
    >>
    >> i.e. it still has system time as a huge proportion of the total compared
    >> to the 4.11 kernel. Interestingly, after reading Holger Kipp's results
    >> I tried it on a genuine multi-processor box with SMP enabled running 5.3.
    >> He got a very small percentage of the time in sys (3.51 out of 81.90) but
    >> I got:
    >> 255.30 real 160.20 user 88.50 sys
    >>
    >> Once again a far higher proportion of the time spent in sys than you would
    >> expect.
    >
    >
    > I ran into a similar issue when attempting to thread a card game solver
    > program I wrote. Performance in early versions was horrific and I
    > noticed tons of context switches. I resolved the issue by allocating
    > pools of memory beforehand. This seems to point the finger to malloc
    > and context switch overhead.
    >
    > In any case, I believe this is related to threading. Check your results
    > with libthr instead. The following are on my 2.53 GHz P4 which is
    > running CURRENT from last night (with INVARIANTS on).

    I have benchmarks from other programs that show that process scope
    threads in libpthread are extremely slow, while system scope threads
    are much, much faster. The new libthr is even faster, but I'd consider
    it very experimental at this time. It is possible to build a version
    of libpthread that uses only system scope threads; look in
    /usr/src/lib/libpthread/Makefile for a comment block that talks about
    it.
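
    For a program that cannot wait for a rebuilt library, contention scope
    can also be requested per thread through the standard POSIX attribute
    call pthread_attr_setscope(). The following is only a minimal sketch:
    the names worker() and run_system_scope_thread() are made up for
    illustration, and whether the request is honored depends on the
    threading library in use.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker: increments the int it is handed. */
static void *worker(void *arg)
{
    int *value = arg;
    *value += 1;
    return NULL;
}

/*
 * Hypothetical helper: run worker() in a thread created with
 * PTHREAD_SCOPE_SYSTEM and wait for it to finish.
 * Returns 0 on success or a pthread error number on failure.
 */
int run_system_scope_thread(int *value)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Ask for a system scope thread instead of the default scope. */
    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc != 0) {
        pthread_attr_destroy(&attr);
        return rc;
    }

    rc = pthread_create(&tid, &attr, worker, value);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    return pthread_join(tid, NULL);
}
```

    Swapping PTHREAD_SCOPE_SYSTEM for PTHREAD_SCOPE_PROCESS in the same
    helper gives a simple way to time the two scopes against each other
    with the same workload.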

    David Xu is actively working on libthr and I'm hoping that it matures
    over the next few months and becomes a viable alternative. However,
    I'd also like to see libpthread get fixed. The performance problems
    point to the UTS being extremely inefficient, and there are reports
    that it does at least one syscall for every thread switch. Since the
    whole point of KSE/SA is to avoid syscalls on thread switches, the
    problem might be an obvious one, though the solution might not be. The
    kernel side of KSE also does a malloc on every upcall, which again is
    highly inefficient for process scope threads. The solution here seems
    to be fairly simple; it just needs someone to sit down for a few hours
    and work on it.

    Scott

    _______________________________________________
    freebsd-performance@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-performance
    To unsubscribe, send any mail to "freebsd-performance-unsubscribe@freebsd.org"