Re: [PATCH] Mantaining turnstile aligned to 128 bytes in i386 CPUs



2007/1/16, John Baldwin <jhb@xxxxxxxxxxx>:
On Tuesday 16 January 2007 15:36, Attilio Rao wrote:
> 2007/1/16, John Baldwin <jhb@xxxxxxxxxxx>:
> > On Tuesday 16 January 2007 11:51, Attilio Rao wrote:
> > > 2006/7/28, Attilio Rao <attilio@xxxxxxxxxxx>:
> > > >
> > > > After some thinking, I think it's better to use init/fini methods
> > > > (since they hide the sizeof(struct turnstile) with the size parameter).
> > > >
> > > > Feedback and comments are welcome:
> > > > http://users.gufi.org/~rookie/works/patches/uma_sync_init.diff
> > >
> > > [CC'ed all the interested people]
> > >
> > > Although a long time has passed, I ran some benchmarks based on the
> > > ebizzy tool.
> > > This program claims to reproduce the behaviour of a real httpd server
> > > and is used in the Linux world for benchmarks, AFAIK.
> > > I think the results of the comparison for this patch are very
> > > interesting, and I think it is worth a commit :)
> > > I think the results could be even better on a Xeon machine (I had no
> > > chance to reproduce this on one of those).
> > > (The results considered were measured after a few warm-up runs,
> > > in order to minimize caching differences).
> > >
> > > The patch:
> > > http://users.gufi.org/~rookie/works/patches/ts-sq/ts-sq.diff
> >
> > Looks good. Some minor nits are that in subr_turnstile.c in the comment I
> > would say "a turnstile is allocated" rather than "a turnstile is got from a
> > specific UMA zone" as it reads a little bit clearer. Also, I would
> > say "Allocate a" rather than "Get a" for the two _alloc() functions. Also,
> > why not just use UMA_ALIGN_CACHE and make UMA_ALIGN_CACHE (128 - 1) on i386
> > and amd64 rather than adding a new UMA_ALIGN_SYNC?
>
> I was thinking that this way anyone who wants to change the
> synchronizing primitive boundary to an appropriate value can do it.
> I just used UMA_ALIGN_CACHE as the default value because I don't know the
> best boundary (for synchronizing primitives) for other arches.

Is there a good reason to not cache-align synch primitives? That is, why
would an arch not use cache-align? Also, is there a reason to not update
UMA_ALIGN_CACHE on x86?

Because the cache line size varies between different CPU models of the
same family. For example, L1, L2, L3 cache lines are 64 bytes wide on
P4 and Xeon.
L1 and L2 cache lines are 32 bytes wide on the other CPUs (P3, P2, etc.)
and in particular they have nothing to do with the trace cache line size,
which is what takes advantage of this code.
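
To make the (128 - 1) value discussed above a bit more concrete, here is a
minimal userland sketch in C. It assumes the alignment constant is an
"alignment minus one" mask, as the (128 - 1) form suggests. The names
SKETCH_ALIGN_SYNC, sketch_align_up() and sketch_same_block() are hypothetical
and are not taken from the patch or from UMA itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant for this sketch: mask form, alignment minus one. */
#define SKETCH_ALIGN_SYNC	(128 - 1)

/* Round addr up to the next boundary described by an (alignment - 1) mask. */
static uintptr_t
sketch_align_up(uintptr_t addr, uintptr_t mask)
{
	/* The mask form only makes sense for power-of-two alignments. */
	assert(((mask + 1) & mask) == 0);
	return ((addr + mask) & ~mask);
}

/* Return nonzero if two addresses fall inside the same aligned block. */
static int
sketch_same_block(uintptr_t a, uintptr_t b, uintptr_t mask)
{
	return ((a & ~mask) == (b & ~mask));
}
```

With a mask of (64 - 1), addresses 64 and 127 share a block; so do they with
(128 - 1), but there 0 and 127 share one as well. That is the practical
difference between the two boundaries: two objects that each start on a
128-byte boundary and are no larger than 128 bytes never share a 128-byte
block, whatever the line size of the CPU underneath.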

Attilio


--
Peace can only be achieved by understanding - A. Einstein
_______________________________________________
freebsd-arch@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@xxxxxxxxxxx"


