Re: [PATCH] Mantaining turnstile aligned to 128 bytes in i386 CPUs



On Jan 18, 2007, at 1:57 PM, Maxim Sobolev wrote:
> Heh, it's so complex, so machine-dependent....

Well, yes. :)

> That makes me wonder why we still don't have three simple-to-use instructions that do the equivalent of memmove(), memcpy() and memset() entirely in hardware, in the best possible way with respect to block size, alignment, caches, chipset, you-name-it? Virtually every program would benefit from such instructions.

Unfortunately, there are simply different tradeoffs between copying mechanisms, depending on whether you want to use the L1/L2 caches or avoid thrashing them, whether the data is cache-aligned, and so forth. The CPU can't infer what you want to happen; you have to tell it. I find it interesting that some architectures (PA-RISC, SPARC) do allow for simple instructions with cache-control hinting:

http://gcc.gnu.org/projects/prefetch.html
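
As a rough illustration of what "telling it" can look like from C: GCC exposes such hints through __builtin_prefetch(addr, rw, locality), where rw is 0 for a read and locality 0 means the data need not stay in cache after use. The sketch below is only an assumption-laden example, not anything from the patch under discussion. The names copy_with_hint, CACHE_LINE, and PREFETCH_AHEAD are hypothetical, and the 64-byte line and 256-byte distance are guesses that would need tuning per machine. On targets without a prefetch instruction the builtin compiles to nothing.

```c
#include <stddef.h>

/* Hypothetical tuning values; real code would pick these per CPU. */
#define CACHE_LINE     64   /* assumed cache line size in bytes */
#define PREFETCH_AHEAD 256  /* assumed prefetch distance in bytes */

/* Use the GCC builtin where available, otherwise make the hint a no-op. */
#if defined(__GNUC__)
#define PREFETCH_SRC_ONCE(p) __builtin_prefetch((p), 0, 0) /* read, no temporal locality */
#else
#define PREFETCH_SRC_ONCE(p) ((void)0)
#endif

/*
 * Byte-wise copy that hints the CPU about upcoming source data.
 * The hint is purely advisory: the copy is correct with or without it.
 */
void copy_with_hint(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < len; i++) {
        /* Once per assumed cache line, ask for data PREFETCH_AHEAD bytes
         * further on, staying inside the source buffer. */
        if ((i % CACHE_LINE) == 0 && len - i > PREFETCH_AHEAD)
            PREFETCH_SRC_ONCE(s + i + PREFETCH_AHEAD);
        d[i] = s[i];
    }
}
```

Whether a hint like this helps or hurts depends on exactly the tradeoffs above, so it would have to be measured on each target rather than assumed.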

--
-Chuck

_______________________________________________
freebsd-arch@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@xxxxxxxxxxx"


