Re: [PATCH] Maintaining turnstile aligned to 128 bytes in i386 CPUs
- From: "Attilio Rao" <attilio@xxxxxxxxxxx>
- Date: Fri, 28 Jul 2006 19:04:22 +0200
2006/7/26, Attilio Rao <attilio@xxxxxxxxxxx>:
2006/7/25, Attilio Rao <attilio@xxxxxxxxxxx>:
> 2006/7/25, John Baldwin <jhb@xxxxxxxxxxx>:
> > On Tuesday 25 July 2006 11:14, Attilio Rao wrote:
> > > 2006/7/25, Attilio Rao <attilio@xxxxxxxxxxx>:
> > > > Hi,
> > > > Intel documentation points out that having a 128-byte-aligned
> > > > synchronization primitive (which fits in a cache line) will minimize
> > > > traffic on the cache bus, so this patch implements that alignment
> > > > for turnstiles on i386.
> > > >
> > > > Any comments or feedback?
> > >
> > > Oh, sorry, I forgot the diff.
> > >
> > > Attilio
> >
> > I think a better approach would be to stick turnstiles (and sleepqueues) in a
> > UMA zone and specify cache-size alignment to the zone. However, turnstiles
> > aren't really synchronization primitives in that you don't spin on a variable
> > inside the structure, and I think it's the spinning and avoiding bouncing
> > cache lines around that Intel's documentation is really about. In that case,
> > the things you want aligned are things like mutexes, rwlocks, etc.
>
> Well, I think that this refers in particular to the latter issue
> you mentioned.
> Spinning is not really a cache bus issue (it is more a matter of
> datapath latency).
> From this point of view, turnstiles (like sleepqueues) are passed around
> between CPUs more than a mutex/rwlock (or a cv), I guess, so I was thinking
> that it's better to optimize the turnstile than the actual synchronization
> primitive itself.
This is a patch which lets turnstiles/sleepqueues use a UMA zone.
I've tried it in my 6.1R branch and it works fine, so this HEAD
version might be all right (I haven't tried it yet, so please test):
http://users.gufi.org/~rookie/works/patches/uma_sync.diff
Obviously, it sets the default alignment for i386 to 128 bytes.
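For readers without the diff at hand, the alignment idea can be sketched in
userland C. This is only an illustration and is not code from the patch:
struct fake_turnstile, SYNC_ALIGN and the helper names are made up here, and
C11 aligned_alloc() stands in for the kernel allocator (the patch gets its
alignment through the UMA zone instead).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed alignment: the 128-byte figure discussed in this thread. */
#define SYNC_ALIGN 128

/* Hypothetical stand-in; the real struct turnstile lives in the kernel. */
struct fake_turnstile {
	int	 ft_lock;
	void	*ft_owner;
};

/* Round size up to a multiple of SYNC_ALIGN, as aligned_alloc() expects. */
static size_t
sync_roundup(size_t size)
{
	return ((size + SYNC_ALIGN - 1) & ~(size_t)(SYNC_ALIGN - 1));
}

/* Allocate one zeroed object whose address is a multiple of SYNC_ALIGN. */
static struct fake_turnstile *
fake_turnstile_alloc(void)
{
	struct fake_turnstile *ts;

	ts = aligned_alloc(SYNC_ALIGN, sync_roundup(sizeof(*ts)));
	if (ts != NULL)
		memset(ts, 0, sizeof(*ts));
	return (ts);
}

/* Return non-zero when p sits on a SYNC_ALIGN boundary. */
static int
is_sync_aligned(const void *p)
{
	return (((uintptr_t)p & (SYNC_ALIGN - 1)) == 0);
}
```

Rounding the size up means each object also occupies a whole number of
128-byte blocks, so two objects never share one; the cost is the padding
wasted per object.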
Any comments, feedback, or ideas are welcome.
Attilio
PS: I know that I could simplify the *_alloc() and *_free() routines
by implementing init/fini, but it is simpler and more optimized to
keep things as they are.
After some more thought, I think it's better to use the init/fini methods
(since they hide the sizeof(struct turnstile) behind the size parameter).
Feedback and comments are welcome:
http://users.gufi.org/~rookie/works/patches/uma_sync_init.diff
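The point about the size parameter can be illustrated with a small userland
mock-up. Everything in it is hypothetical (toy_zone, the callback types,
toy_turnstile); it is loosely modeled on an init/fini callback pair and is
neither the UMA API nor the code in the diff. Note also that this toy runs
init on every allocation, which a caching zone allocator need not do.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback types: the zone passes the item size in. */
typedef int	(*item_init_fn)(void *mem, int size);
typedef void	(*item_fini_fn)(void *mem, int size);

/* Toy zone: only it knows the item size; the callbacks get it as 'size'. */
struct toy_zone {
	int		tz_size;
	item_init_fn	tz_init;
	item_fini_fn	tz_fini;
};

/* Hypothetical stand-in for the real kernel structure. */
struct toy_turnstile {
	int	tt_ready;
	int	tt_waiters;
};

/* init: uses the size handed in instead of spelling out sizeof(). */
static int
toy_turnstile_init(void *mem, int size)
{
	struct toy_turnstile *ts = mem;

	memset(mem, 0, (size_t)size);
	ts->tt_ready = 1;
	return (0);
}

/* fini: scrub the item, again relying only on the size parameter. */
static void
toy_turnstile_fini(void *mem, int size)
{
	memset(mem, 0, (size_t)size);
}

/* The size is stated once, where the zone is described. */
static const struct toy_zone toy_turnstile_zone = {
	(int)sizeof(struct toy_turnstile),
	toy_turnstile_init,
	toy_turnstile_fini
};

/* Allocate an item and run init on it; return NULL on any failure. */
static void *
toy_zone_alloc(const struct toy_zone *z)
{
	void *mem;

	mem = malloc((size_t)z->tz_size);
	if (mem != NULL && z->tz_init(mem, z->tz_size) != 0) {
		free(mem);
		mem = NULL;
	}
	return (mem);
}

/* Run fini on the item, then release its memory. */
static void
toy_zone_free(const struct toy_zone *z, void *mem)
{
	if (mem == NULL)
		return;
	z->tz_fini(mem, z->tz_size);
	free(mem);
}
```

With this shape the alloc and free wrappers shrink to plain zone calls, and
sizeof(struct toy_turnstile) appears only in the zone description.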
Thanks,
Attilio
--
Peace can only be achieved by understanding - A. Einstein
_______________________________________________
freebsd-arch@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@xxxxxxxxxxx"