Re: [PATCH] Maintaining turnstile aligned to 128 bytes in i386 CPUs



2007/1/17, Ivan Voras <ivoras@xxxxxx>:
> Kip Macy wrote:
> > On 1/16/07, Ivan Voras <ivoras@xxxxxx> wrote:
> >> But it does seem to hurt the performance a bit - maybe it's time to add
> >> another CPU option like I586_CPU and I686_CPU?
> >
> > Unless there is a compelling reason not to do so, I think that that
> > would be a good idea.
>
> Maybe even someone finds a way to get optimized versions of memcpy in
> the kernel :)
>
> I was thinking: AFAIK the only major stopper is context saving of the
> various "auxiliary" registers - FPU, MMX, SSE, right? But is it an
> all-or-nothing situation? I.e. does it make sense (can it be done?) to
> just elect to save the MMX context? (AFAIK they are different registers
> than SSE, but overlay FPU registers?) The idea is to save something
> smaller than the full set.

When I implemented FPU copy in the kernel I thought about this a lot,
and I think it is possible, at least with some restrictions.
For example, for an xmm copy you would only need to save the contents
of 8 registers, but you have to ensure that no pending FPU exception
will break your kernel, so you should either preserve a clean copy of
the FPU state or treat the corner cases you can hit.
For xmm, after some very productive discussions with bde@, we arrived
at the conclusion that it should be pretty safe to just have a 16-byte
aligned buffer for saving the registers (this way you can use 8 movdqa
instructions to save them), but I never got around to playing with it.
(My implementation should also deal with the problem of pinning the
scheduler, in order to avoid reading the wrong per-CPU data.)
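
To make the xmm idea above concrete, here is a rough userland sketch of
the 16-byte aligned save area and the 8 movdqa stores. It is not the
code from my patch: the names xmm_save_area, xmm_save and xmm_restore
are made up for illustration, and it ignores everything a real kernel
version would also need (pending FPU exceptions, CR0.TS, pinning).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define XMM_NREGS 8    /* xmm0..xmm7, the set visible on i386 */
#define XMM_REGSZ 16   /* each xmm register is 128 bits wide */

/* Hypothetical save area; movdqa faults unless it is 16-byte aligned. */
struct xmm_save_area {
	_Alignas(16) uint8_t reg[XMM_NREGS][XMM_REGSZ];
};

_Static_assert(sizeof(struct xmm_save_area) == XMM_NREGS * XMM_REGSZ,
    "save area must be exactly 8 * 16 bytes");

/* Store xmm0..xmm7 into the save area with 8 aligned stores. */
static void
xmm_save(struct xmm_save_area *sa)
{
	assert(((uintptr_t)sa & 15) == 0);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
	__asm__ __volatile__(
	    "movdqa %%xmm0,   0(%0)\n\t"
	    "movdqa %%xmm1,  16(%0)\n\t"
	    "movdqa %%xmm2,  32(%0)\n\t"
	    "movdqa %%xmm3,  48(%0)\n\t"
	    "movdqa %%xmm4,  64(%0)\n\t"
	    "movdqa %%xmm5,  80(%0)\n\t"
	    "movdqa %%xmm6,  96(%0)\n\t"
	    "movdqa %%xmm7, 112(%0)"
	    : : "r" (sa) : "memory");
#else
	/* Assumption: on non-x86 builds just clear the buffer. */
	memset(sa, 0, sizeof(*sa));
#endif
}

/* Reload xmm0..xmm7 from the save area with 8 aligned loads. */
static void
xmm_restore(const struct xmm_save_area *sa)
{
	assert(((uintptr_t)sa & 15) == 0);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
	__asm__ __volatile__(
	    "movdqa   0(%0), %%xmm0\n\t"
	    "movdqa  16(%0), %%xmm1\n\t"
	    "movdqa  32(%0), %%xmm2\n\t"
	    "movdqa  48(%0), %%xmm3\n\t"
	    "movdqa  64(%0), %%xmm4\n\t"
	    "movdqa  80(%0), %%xmm5\n\t"
	    "movdqa  96(%0), %%xmm6\n\t"
	    "movdqa 112(%0), %%xmm7"
	    : : "r" (sa)
	    : "memory", "xmm0", "xmm1", "xmm2", "xmm3",
	      "xmm4", "xmm5", "xmm6", "xmm7");
#else
	(void)sa;
#endif
}
```

A copy routine would call xmm_save() on entry, use the xmm registers
for the bulk copy, then call xmm_restore() before returning. The whole
area is 128 bytes, which is much smaller than a full FPU state save.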

Attilio


--
Peace can only be achieved by understanding - A. Einstein


