Re: OT: Cache memory

JF Mezei <jfmezei.spamnot@xxxxxxxxxxxxx> wrote:
Figured someone here would know. Besides, it's Friday and cold here.

Modern CPUs have built-in super-fast cache memory of a few megabytes
(many megabytes in the case of Itanium).

Is it correct to state that cache memory has its own addressing space
that starts at 0 ? Or is it part of a sequential scheme where it starts
at 0 and RAM starts after the end of cache ?

Usually one shouldn't think about cache addressing at all.

Cache is normally thought of as being addressed through an associative,
or content-addressable, memory. Each cache row has, stored separately,
the address for the data in that row. When a memory cycle occurs,
the cache compares the address to all the row addresses it has stored.
If one matches, it is a cache hit, the data is retrieved and sent
to the processor. If not, it is a cache miss, the request goes to
the next lower level in hierarchy (which may be another cache).
That is called a fully associative cache.
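A toy sketch of that lookup, with a Python dict standing in for the
parallel hardware comparison (all sizes and names here are assumed
for illustration, not taken from any real design):

```python
# Fully associative lookup: every stored line's tag is compared
# against the incoming address. A dict models the parallel compare;
# 16-byte lines are assumed.
LINE_SIZE = 16

def lookup(lines, addr):
    tag = addr // LINE_SIZE       # line-aligned address acts as the tag
    if tag in lines:              # hardware checks all stored tags at once
        return ('hit', lines[tag])
    return ('miss', None)         # request goes to the next level down
```

In real hardware all tag comparisons happen simultaneously, which is
what makes large fully associative caches expensive to build.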

As associative memory is hard to build, especially as it gets
larger, there is an easier short cut, called the direct mapped
cache. Certain address bits are used to select a cache row, and
only that row is considered. It either has or does not have the
appropriate data, as determined by checking all the address bits.
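A sketch of how a direct mapped cache splits an address (sizes assumed
for illustration: 16-byte lines and 256 rows, so 4 offset bits and
8 index bits):

```python
# Split an address into tag / index / offset for a direct mapped
# cache with 16-byte lines and 256 rows (assumed sizes).
OFFSET_BITS = 4
INDEX_BITS = 8

def split(addr):
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

# Addresses 4096 bytes apart get the same index, so they evict each
# other: the power-of-two stride problem mentioned below.
```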

In between, there is the n-way set associative cache, which takes some
address bits to select which of a small number of associative
memory cells (cache rows) need to be checked. This helps avoid
the problem that direct mapped cache has when going through
memory in certain power-of-two steps.
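A sketch of a 2-way set associative lookup (again, all parameters are
assumed for illustration: 16-byte lines, 4 sets, 2 ways per set):

```python
# 2-way set associative lookup: the index selects one set, and only
# the two ways in that set are compared associatively.
OFFSET_BITS = 4   # 16-byte lines (assumed)
SET_BITS = 2      # 4 sets (assumed)

def lookup(cache_sets, addr):
    index = (addr >> OFFSET_BITS) & ((1 << SET_BITS) - 1)
    tag = addr >> (OFFSET_BITS + SET_BITS)
    for way_tag, data in cache_sets[index]:   # check only this set's ways
        if way_tag == tag:
            return ('hit', data)
    return ('miss', None)

# Two addresses 64 bytes apart land in the same set, but can coexist
# in its two ways, unlike in a direct mapped cache.
sets = [[] for _ in range(4)]
sets[0] = [(0, 'A'), (1, 'B')]   # both map to index 0
```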

If you google (or wikipedia) for fully associative or direct mapped
cache, you should find enough hits.

I assume it has a lookup table where it first checks to see if the
desired memory contents are already in cache. Is the mapping between
cache and RAM done at physical addresses or virtual memory addresses?

I believe some do it one way, some the other. It partly depends on
the history of the architecture. If a cache is added onto an existing
system, it usually must work at the physical address level, as that is
what comes out of the processor. If done as part of the processor
design, it can be done at either level. There are complications:
different tasks may be using the same virtual address (in different
address spaces) at the same time. That can cause cache thrashing,
though it should not otherwise cause problems. If an address space
indicator can also be used as part of the cache key, then that
problem goes away. Some systems also allow different virtual
addresses to map to the same physical address (aliasing), which a
virtually addressed cache has to detect.
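The address-space-indicator idea can be shown in a toy form (the
numbers and names here are made up for illustration): key the cache
on (address space id, virtual address) instead of the address alone.

```python
# Two tasks use the same virtual address 0x1000, but different
# address space ids, so their cache entries never collide.
cache = {}
cache[(1, 0x1000)] = 'task A data'   # task A, ASID 1
cache[(2, 0x1000)] = 'task B data'   # task B, ASID 2, same address
```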

Do architectures such as IA64, Power or even X86 provide any
instructions to access/manipulate cache ?

Traditionally, cache was transparent to the instruction set,
though that isn't quite true in many later ones. I don't know
specifically about those.

For instance, could a compiler generate code which uses the cache as
memory for temporary values that need not be written to RAM ?

As others mentioned, that happens automatically in a write-back
cache, which doesn't write to the next level until needed.
Write-through always writes to the next level.
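A minimal sketch of the write-back behavior (class and method names
are mine, for illustration): stores only mark the line dirty, and the
next level down sees the data only when the line is evicted.

```python
# Write-back cache line: intermediate values never reach the next
# level; only the value present at eviction is written back.
class Line:
    def __init__(self):
        self.data, self.dirty = None, False

    def store(self, value):
        self.data, self.dirty = value, True   # no write to next level yet

    def evict(self, next_level, addr):
        if self.dirty:
            next_level[addr] = self.data      # write-back happens here
            self.dirty = False
```

A write-through line would instead update next_level on every store,
which is why temporaries cost memory traffic there.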

Is it possible that one day, CPUs will not need registers because
they'll just do the math/logic operations directly from cache memory?

As someone mentioned, that could be how the PDP-10 worked. But the
main reason for registers is that it allows addressing with fewer bits.
For the PDP-10, the first 16 locations in virtual address space
map to the 16 registers. One implementation is that they really
are just memory locations. Another is that the first 16 memory
locations are cached, when accessed as either memory or registers.

Otherwise, there is stack addressing on stack architectures,
which are often optimized by caching some number of entries at
the top of the stack.

I have wondered if any processors with built-in cache could
run without any external memory, as long as the addresses stayed
within the cache.

-- glen