Re: Linux compatible setaffinity.



On Thu, 20 Dec 2007, Andre Oppermann wrote:

Jeff Roberson wrote:
I have implemented a Linux-compatible sched_setaffinity() call which is somewhat crippled. It allows a userspace process to supply a bitmask of the processors it may run on. I have copied the Linux interface so that it should be API compatible, because I believe it is a sensible interface and they beat us to it by 3 years.
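
For readers unfamiliar with the interface being copied, here is a minimal sketch of the Linux/glibc call as it exists on Linux. It is not taken from the FreeBSD patch discussed here, and the helper names pin_self_to_cpu() and allowed_cpu_count() are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: restrict the calling thread to one CPU.
 * Returns 0 on success, -1 on error (errno is set by the kernel). */
static int pin_self_to_cpu(int cpu)
{
    cpu_set_t mask;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&mask);        /* start with an empty bitmask */
    CPU_SET(cpu, &mask);    /* set the bit for the one CPU we want */

    /* pid 0 means "the calling thread" */
    return sched_setaffinity(0, sizeof(mask), &mask);
}

/* Hypothetical helper: number of CPUs the calling thread may run on,
 * or -1 if the mask cannot be read. */
static int allowed_cpu_count(void)
{
    cpu_set_t mask;

    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    return CPU_COUNT(&mask);
}
```

The caller has to know which CPU numbers are meaningful on the machine, which is the point the reply below picks up on.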

The Linux (and Solaris) style setaffinity is rather low level and
any user of it has to make many assumptions based on incomplete
knowledge of the underlying hardware and its architecture (buses,
caches, latency between cores, etc).

In practical use I'd rather have a function to bind myself to the
current CPU or CPU number X, and then to specify that new threads
or forked processes should emerge on another, but not this CPU.
Pepper that with a few hints like latency and cache affinity (important
or not important) that the kernel can act on appropriately, and it becomes
much more powerful and simpler to use. Taking it even further, an
application may want to specify that it would like to run on a number
X of cores that are close together (latency/cache), be permanently
bound to them, and repel any other such requests. This way I can
run my database server on socket 1 cores 1-4, and the webserver on
socket 2 cores 5-8 more or less automagically. sched_setaffinity
requires a lot of operator involvement and architecture knowledge
to make that happen.
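
As a rough illustration of the first idea (bind to the current CPU, keep children off it), the following sketch layers two helpers on top of the existing Linux calls. The names bind_to_current_cpu() and mask_excluding() are hypothetical, nothing like them exists in either kernel, and the latency and cache hints described above have no equivalent here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: fill *out with every CPU the caller may
 * currently use except `cpu`. Returns the number of CPUs left,
 * or -1 on error. */
static int mask_excluding(int cpu, cpu_set_t *out)
{
    assert(out != NULL);
    if (sched_getaffinity(0, sizeof(*out), out) != 0)
        return -1;
    CPU_CLR(cpu, out);
    return CPU_COUNT(out);
}

/* Hypothetical helper: pin the calling thread to whichever CPU it
 * is running on right now. Returns that CPU number, or -1 on error. */
static int bind_to_current_cpu(void)
{
    cpu_set_t mask;
    int cpu = sched_getcpu();

    if (cpu < 0)
        return -1;
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    return cpu;
}
```

The exclusion mask has to be computed before binding, since afterwards the caller's own mask holds a single CPU. A child would then apply that mask to itself with sched_setaffinity().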

Not that I'm against a Linux compatible sched_setaffinity(), it's
just not as practical to use as other constructs.

Food for thought.


Well, my hope is that the kernel scheduler has all of the required information about the processor to make these kinds of decisions for the general case. Right now we need better topology information in the kernel, but I think userspace only uses setaffinity in very special cases. I'd hate for it to become the norm for applications to start looking at CPU topology and making decisions based on that.

Not that I would argue if someone were to implement this. I just want us to get it right often enough in the scheduler that it's not necessary.

The uses for setaffinity that I have seen so far have been very special purpose, or quite often just spawn one thread per CPU and pin it in place for various purposes.
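
The one-thread-per-CPU pattern could look roughly like the sketch below, which uses the glibc extension pthread_attr_setaffinity_np() on Linux. The struct worker type and run_one_worker_per_cpu() are illustrative names, and the worker body only records where it ran.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

struct worker {
    pthread_t tid;
    int cpu;     /* CPU this worker was pinned to */
    int ran_on;  /* CPU the worker observed itself running on */
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;
    w->ran_on = sched_getcpu();
    return NULL;
}

/* Start one thread per CPU in the caller's affinity mask, each pinned
 * to its own CPU, then wait for all of them. Returns how many workers
 * ran on their assigned CPU, or -1 if the mask cannot be read. */
static int run_one_worker_per_cpu(struct worker *workers, int max_workers)
{
    cpu_set_t allowed;
    int n = 0, ok = 0;

    assert(workers != NULL && max_workers > 0);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE && n < max_workers; cpu++) {
        cpu_set_t one;
        pthread_attr_t attr;
        int rc;

        if (!CPU_ISSET(cpu, &allowed))
            continue;
        CPU_ZERO(&one);
        CPU_SET(cpu, &one);

        /* Set the affinity in the attributes so the thread starts pinned. */
        pthread_attr_init(&attr);
        pthread_attr_setaffinity_np(&attr, sizeof(one), &one);
        workers[n].cpu = cpu;
        workers[n].ran_on = -1;
        rc = pthread_create(&workers[n].tid, &attr, worker_main, &workers[n]);
        pthread_attr_destroy(&attr);
        if (rc != 0)
            break;          /* stop spawning, still join what started */
        n++;
    }

    for (int i = 0; i < n; i++) {
        pthread_join(workers[i].tid, NULL);
        if (workers[i].ran_on == workers[i].cpu)
            ok++;
    }
    return ok;
}
```

Iterating over the caller's own affinity mask, rather than over 0..ncpus-1, keeps the sketch working when the process has already been restricted to a subset of CPUs.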

Jeff


--
Andre

_______________________________________________
freebsd-arch@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@xxxxxxxxxxx"


