Re: Multicore Is Bad News For Supercomputers



On Wed, 03 Dec 2008 14:10:39 -0800, Keith Parris <keithparris_nospam@xxxxxxxxx> wrote:

Main, Kerry wrote:
Re: new NUMA .. yeah, I know it is much different than the original
NUMA designs from the early wildfire days, but if the OS and App are
not NUMA aware, then for Supercomputer performance, it may make a
difference for those who want to take advantage of every cycle.

When the EV7-based systems (e.g. GS-1280) were tested with and without the NUMA code enabled, there wasn't any advantage to running the NUMA code, so for those models the NUMA code within VMS is disabled by default. I've only heard of one customer in the world who found a benefit in re-enabling the NUMA code on the GS-1280.

Where does that code reside? I suspect it may be more useful on Itanium,
which has a less efficient instruction set requiring higher bandwidth.


Since memory/cache is local to each CPU, local cache references will
be faster than going over the interconnect (granted, it is faster
than the older buses) to a remote CPU, not finding it in cache and
then going to main memory (or disk).

I detect a misconception here. There is no "main memory"; there are just portions of memory attached, along with CPUs, at nodes in the interconnect fabric. If the memory address a CPU references does not correspond to local memory, the memory reference goes across the interconnect between nodes via high-speed router hardware at each node and is accessed remotely. In an EV7 (or QuickPath) configuration, having the memory controller and high-speed, low-latency links between nodes integrated right with the CPU and memory results in very fast access to memory, whether it's in local memory or a hop or two away through the interconnect fabric.
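To make the local-versus-remote distinction concrete, here is a rough Python sketch of how an address maps to a home node and how the hop count adds to latency. It is only a toy model: the 2D torus layout, the function names, and the latency figures (80 ns local, 40 ns per hop) are assumptions chosen for illustration, not measured EV7 or QuickPath numbers.

```python
def home_node(addr, mem_per_node):
    """Node whose local memory holds this address (contiguous split assumed)."""
    return addr // mem_per_node


def torus_hops(a, b, width, height):
    """Minimum hops between nodes a and b on a width x height 2D torus."""
    ax, ay = a % width, a // width
    bx, by = b % width, b // width
    dx, dy = abs(ax - bx), abs(ay - by)
    # Links wrap around, so going the "other way" may be shorter.
    return min(dx, width - dx) + min(dy, height - dy)


def access_latency_ns(cpu_node, addr, mem_per_node, width, height,
                      local_ns=80, per_hop_ns=40):
    """Illustrative latency: local access cost plus a fixed cost per hop.

    local_ns and per_hop_ns are made-up placeholder values.
    """
    hops = torus_hops(cpu_node, home_node(addr, mem_per_node), width, height)
    return local_ns + hops * per_hop_ns


GIB = 1 << 30
# CPU on node 0 touching its own memory: zero hops.
local = access_latency_ns(0, 0, GIB, 4, 4)
# CPU on node 0 touching memory homed on node 3: one hop via the wrap-around link.
remote = access_latency_ns(0, 3 * GIB + 5, GIB, 4, 4)
```

In this model a remote reference is never a "miss to main memory"; it is the same kind of memory access with a few extra hops added, which is the point being made above.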

It also gets away from the bottleneck of a single shared bus, since data transfers can be active on many interconnect links at once.
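A small sketch of why that matters, again as a toy model with assumed simplifications: every transfer takes one time unit, each transfer is between adjacent nodes, and each pair of nodes has its own link that carries one transfer at a time. The function names are hypothetical.

```python
from collections import Counter


def shared_bus_time(transfers):
    """On a single shared bus, every transfer waits its turn."""
    return len(transfers)


def fabric_time(transfers):
    """With point-to-point links, only transfers on the same link serialize."""
    if not transfers:
        return 0
    # Treat (a, b) and (b, a) as the same physical link.
    per_link = Counter(frozenset(t) for t in transfers)
    return max(per_link.values())


# Four transfers: two contend for the 0<->1 link, the others use separate links.
transfers = [(0, 1), (2, 3), (1, 0), (4, 5)]
bus = shared_bus_time(transfers)
fabric = fabric_time(transfers)
```

Under these assumptions the bus needs one time unit per transfer, while the fabric is limited only by its busiest link.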



--
PL/I for OpenVMS
www.kednos.com