Re: race on multi-processor solaris

From: Anton Rang (rang_at_visi.com)
Date: 12/04/03


To: Joe Seigh <jseigh_01@xemaps.com>
Date: 04 Dec 2003 10:45:40 -0600

Joe Seigh <jseigh_01@xemaps.com> writes:
> [...] So this is where per-CPU locking (if I understand
> it correctly) or what I would call asymetric locking comes in. That's
> where P is not the same for all cpus, in fact zero for most of them.
> Basically, you limit the number of processors which share certain locks/data.
> Like the scheduler.

That's not quite right. Read Jeff Bonwick's paper on the slab allocator.

To briefly summarize its per-CPU locking approach...

When allocating a block of memory, it doesn't matter where it comes from.
To reduce lock contention, assume that each running thread can tell which
CPU it is currently executing on. Give each CPU a pool of free memory
which it "owns." Now memory allocation can look roughly like:

  cpu <- get_my_cpu()
  lock(freelock[cpu])
  if memory is available in pool[cpu],
    allocate it.
  otherwise,
    lock(globalfreelock)
    get a big chunk of memory for my pool
    unlock(globalfreelock)
    add memory to pool[cpu] and allocate the piece I need
  unlock(freelock[cpu])

Now, freelock[cpu] is acquired and released each time we need to
get memory. However, contention on this lock is only possible if this
thread is preempted or migrated to another CPU somewhere between
get_my_cpu() and the unlock, and another thread on the original CPU
then enters this code. This is extremely unlikely in the common case,
since we're not doing any blocking operations (in userland, we could
take a page fault, but that's still unlikely to slow us much -- and the
fault would block any other threads that needed to execute this code
anyway, so it doesn't matter that we hold the lock). So this code has
locks, but acquiring them almost never blocks. Other threads executing
concurrently will use their own CPU's pool.

In the case where the CPU's pool is empty, we acquire a global lock, but
hold it only long enough to move a chunk of memory from the global pool
into our own. This is rare enough that there's very little contention on it.
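
For anyone who wants something compilable, the pseudocode above can be
sketched in C with pthreads roughly as follows. This is only an
illustration under stated assumptions, not the code from Bonwick's paper
or from Solaris: the fixed-size arena, the NCPU/REFILL/NBLOCKS constants,
pools_init(), pool_alloc(), and the thread-local current_cpu_stub that
stands in for a real "which CPU am I on" call are all made up for the
example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Illustrative sizes only; none of these come from the original post. */
enum { NCPU = 4, REFILL = 8, NBLOCKS = 64 };

/* A free block doubles as a list node while it sits in a pool. */
typedef union block {
    union block *next;
    char payload[64];
} block_t;

static block_t arena[NBLOCKS];          /* backing store for all blocks   */
static block_t *global_free;            /* global pool (singly linked)    */
static block_t *pool[NCPU];             /* one free list per CPU          */
static pthread_mutex_t globalfreelock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t freelock[NCPU];  /* one lock per CPU pool          */

/* Hypothetical stand-in for a real CPU-id query: each thread simply
 * records which pool it should use. */
static _Thread_local int current_cpu_stub;
static int get_my_cpu(void) { return current_cpu_stub; }

static void pools_init(void)
{
    global_free = NULL;
    for (int i = 0; i < NCPU; i++) {
        pthread_mutex_init(&freelock[i], NULL);
        pool[i] = NULL;
    }
    for (int i = 0; i < NBLOCKS; i++) {  /* everything starts out global */
        arena[i].next = global_free;
        global_free = &arena[i];
    }
}

/* Returns one block, or NULL if both this CPU's pool and the global
 * pool are empty. */
static void *pool_alloc(void)
{
    int cpu = get_my_cpu();
    assert(cpu >= 0 && cpu < NCPU);

    pthread_mutex_lock(&freelock[cpu]);
    if (pool[cpu] == NULL) {
        /* Rare path: refill this CPU's pool from the global pool. */
        pthread_mutex_lock(&globalfreelock);
        for (int i = 0; i < REFILL && global_free != NULL; i++) {
            block_t *b = global_free;
            global_free = b->next;
            b->next = pool[cpu];
            pool[cpu] = b;
        }
        pthread_mutex_unlock(&globalfreelock);
    }
    block_t *b = pool[cpu];              /* common path: pop from own pool */
    if (b != NULL)
        pool[cpu] = b->next;
    pthread_mutex_unlock(&freelock[cpu]);
    return b;
}
```

Call pools_init() once before any thread calls pool_alloc(). In real
code get_my_cpu() would wrap whatever the platform provides (for
example sched_getcpu() on Linux with glibc, or getcpuid() on Solaris).
Note that the answer can be stale by the time the lock is taken, which
is exactly why the per-CPU lock is still needed: it keeps the code
correct in the rare case where the thread has moved.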

-- Anton


