Re: [PATCH] Maintaining turnstile aligned to 128 bytes in i386 CPUs



Bruce Evans wrote:
> On Wed, 17 Jan 2007, Matthew Dillon wrote:
>> * No extraneous memory writes, no uncached extraneous memory reads.
>>   If you do any writes to memory other than to the copy destination
>>   in your copy loop you screw up the CPU's write fifo and destroy
>>   performance.
>>
>>   Systems are so sensitive to this that it is even better to spend the
>>   time linearly mapping large copy spaces into KVM and do a single
>>   block copy than to have an inner per-PAGE loop.
>
> I haven't tried this, but have seen and partly worked around
> sensitivity to linear KVA maps not being physically (non)linear
> enough.  Some CPUs and/or memory systems are remarkably sensitive to
> bank interleave.  FreeBSD's page coloring doesn't know anything about
> banks, and accidentally starts up with perfect miscoloring for banks.
> This can make a difference of 30% for bzero bandwidth in benchmarks
> (not so much for bcopy bandwidth, and an insignificant amount for
> normal use).  After the system warms up, the coloring becomes random
> with respect to banks, and random coloring works much better than
> perfect miscoloring.
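
To make "perfect miscoloring for banks" concrete, here is a toy model in C.
It is only a sketch: the bank-selection rule (EX_BANK_SHIFT, EX_NBANKS and
bank_of) is an assumption made up for illustration, not how any particular
memory controller or the FreeBSD VM system works. It counts how many banks a
run of physical pages touches for a given page stride.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SHIFT 12u  /* 4 KB pages */
#define EX_BANK_SHIFT 13u  /* ASSUMPTION: banks interleave on 8 KB units */
#define EX_NBANKS     4u   /* ASSUMPTION: four banks */

/* Hypothetical bank-selection rule for the toy model. */
static unsigned
bank_of(uint64_t paddr)
{
	return ((unsigned)((paddr >> EX_BANK_SHIFT) % EX_NBANKS));
}

/*
 * Histogram the banks hit by npages physical pages that start at page
 * number `first` and are `stride` pages apart.  Returns the number of
 * distinct banks touched.
 */
static unsigned
banks_touched(uint64_t first, uint64_t stride, unsigned npages,
    unsigned hist[EX_NBANKS])
{
	unsigned i, used = 0;

	assert(npages > 0);
	for (i = 0; i < EX_NBANKS; i++)
		hist[i] = 0;
	for (i = 0; i < npages; i++) {
		uint64_t paddr = (first + (uint64_t)i * stride) << EX_PAGE_SHIFT;
		hist[bank_of(paddr)]++;
	}
	for (i = 0; i < EX_NBANKS; i++)
		if (hist[i] != 0)
			used++;
	return (used);
}
```

Under these assumed parameters, a stride of 8 pages puts every page in the
same bank (the worst case), while consecutive pages spread evenly over all
four banks.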

About page coloring: Don't amd64 CPUs have virtually indexed, physically
tagged caches? If so, wouldn't it make sense to turn off page coloring,
since it's useless for virtually indexed caches (and probably hurts things
a bit)?
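
For reference, the usual arithmetic behind page coloring, as a sketch in C.
It assumes a physically indexed cache; the number of colors is the size of
one cache way divided by the page size, so a cache whose way fits within a
page has only one color. The names (EX_PAGE_SIZE, page_colors, color_of) are
illustrative and not taken from the FreeBSD sources, and no cache geometry
here is claimed for any specific CPU.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SIZE 4096u  /* 4 KB pages */

/*
 * Number of page colors for a physically indexed cache: bytes per way
 * divided by the page size, with a minimum of one color.
 */
static unsigned
page_colors(unsigned cache_bytes, unsigned ways)
{
	unsigned way_bytes;

	assert(ways > 0 && cache_bytes % ways == 0);
	way_bytes = cache_bytes / ways;
	return (way_bytes > EX_PAGE_SIZE ? way_bytes / EX_PAGE_SIZE : 1u);
}

/* Color of the physical page containing paddr. */
static unsigned
color_of(uint64_t paddr, unsigned ncolors)
{
	assert(ncolors > 0);
	return ((unsigned)((paddr / EX_PAGE_SIZE) % ncolors));
}
```

For example, a hypothetical 1 MB, 16-way cache has 64 KB ways and therefore
16 colors, while a 32 KB, 8-way cache has 4 KB ways and a single color, so
coloring makes no difference to it.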

-- Suleiman
_______________________________________________
freebsd-arch@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@xxxxxxxxxxx"


