Re: [PATCH] Mantaining turnstile aligned to 128 bytes in i386 CPUs
- From: Ricardo Nabinger Sanchez <rnsanchez@xxxxxxxxx>
- Date: Wed, 17 Jan 2007 13:41:00 -0200
On Wed, 17 Jan 2007 15:50:41 +1100 (EST)
Bruce Evans <bde@xxxxxxxxxxx> wrote:
> AXP: (my 5 year old system with a newer CPU): movq through MMX is 60%
> faster than movsl for cached moves, but movdqa through XMM is only 4%
> faster. movnt with block prefetch is 155% faster than movsl with no
> prefetch, and 73% faster with no prefetch for both.
> A64 in 32-bit mode: in between P4 and AXP (closer to AXP). movsl doesn't
> lose by so much, and prefetchnta actually works so block prefetch is
> not needed and there is a better chance of prefetching helping more
> than benchmarks.
This PDF is somewhat dated, but perhaps some of it still applies today:
http://cdrom.amd.com/devconn/events/AMD_block_prefetch_paper.pdf
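
For readers unfamiliar with the terms, here is a rough userland sketch of
what I understand "movnt with block prefetch" to mean: first touch one byte
per cache line with ordinary loads so that a whole block is pulled into the
cache, then copy that block with non-temporal stores. This is not the kernel
code or the code from the paper. The name copy_block_prefetch and the
constants BLOCK_BYTES and LINE_BYTES are made up for illustration, and the
4096-byte block and 64-byte line sizes are assumptions that would need
tuning per CPU.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>	/* _mm_load_si128, _mm_stream_si128, _mm_sfence */
#endif

#define BLOCK_BYTES	4096	/* assumed block size, not a measured value */
#define LINE_BYTES	64	/* assumed cache line size */

/*
 * Hypothetical helper: copy n bytes from src to dst (non-overlapping).
 * Uses the block prefetch + non-temporal store path only when SSE2 is
 * available and both pointers are 16-byte aligned; everything else,
 * including the tail, goes through memcpy().
 */
static void
copy_block_prefetch(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

#if defined(__SSE2__)
	if ((((uintptr_t)d | (uintptr_t)s) & 15) == 0) {
		while (n >= BLOCK_BYTES) {
			volatile unsigned char sink;
			size_t off;

			/* Pass 1: one real load per line pulls the block in. */
			for (off = 0; off < BLOCK_BYTES; off += LINE_BYTES)
				sink = s[off];
			(void)sink;

			/* Pass 2: copy the cached block with movntdq stores. */
			for (off = 0; off < BLOCK_BYTES; off += 16) {
				__m128i v =
				    _mm_load_si128((const __m128i *)(s + off));
				_mm_stream_si128((__m128i *)(d + off), v);
			}
			d += BLOCK_BYTES;
			s += BLOCK_BYTES;
			n -= BLOCK_BYTES;
		}
		/* Make the non-temporal stores visible before returning. */
		_mm_sfence();
	}
#endif
	/* Tail, unaligned buffers, or no SSE2: plain copy. */
	memcpy(d, s, n);
}
```

On a CPU where prefetchnta works as advertised (the A64 case above), pass 1
could presumably be replaced by prefetch instructions issued inside the copy
loop. I have not benchmarked this sketch.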
--
Ricardo Nabinger Sanchez <rnsanchez@{gmail.com,wait4.org}>
Powered by FreeBSD
"Left to themselves, things tend to go from bad to worse."