Re: rusage breakdown and cpu limits.



On Wed, 30 May 2007, Bruce Evans wrote:

On Tue, 29 May 2007, Jeff Roberson wrote:

I'm working with Attilio to break down rusage further to be per-thread in places where it is protected by the global scheduler lock. To support this, I am interested in moving the rlimit cpulimit check into userret(), or perhaps ast(). Is there any reason why we need to check this on every context switch? Any objections to moving it? Eventually it will require a different lock from the one we obtain to call mi_switch().

Checking it on every userret() would be a good pessimization, and
scheduling ast()'s to check it occasionally wouldn't be far behind.
Syscalls occur many times more often than context switches.

Yes, you're right. I think I'll make a callout that will check once per second.


Locking is needed for the updating the runtime in mi_switch, so it
seems best to check for the runtime becoming too large there while
holding the same lock. This lock is currently sched_lock and it covers
the following other fields:

I think you missed the point of my threadlock patch. There will no longer be a global scheduler lock. Because of this, multiple threads in a process may context switch at the same time. The list below will be protected by a thread lock or a per-process spinlock depending on the use.


Fields in in the proc struct (or attached to it, not counting the
thread struct):

p_stats->p_ru.ru_nvcsw
p_stats->p_ru.ru_nivcsw
p_stats->p_ru.rux_runtime (this)
p_stats->p_ru.rux_uticks
p_stats->p_ru.rux_iticks
p_stats->p_ru.rux_sticks
p_cpulimit (less volatile than the others)
p_sflag (needs sched_lock more than the others?)
p_flag (KSE only, invalid? -- needs PROC lock)

Related fields in the pcpu:

switchtime
switchticks

Related fields in the thread struct:

td_uticks
td_sticks
td_iticks
td_flags

Unrelated fields in the thread struct:

td_critnest (in a KASSERT() only)
td_owepreempt (in a KASSERT() only)
td_generation
td_sched (not yet proerly locked)
td_priority (this and following only in CTR*())
td_inhibitors
td_lockname
td_wmesg

I don't see why you emphasized the cpulimit check, since it is the
least delicate part of the rusage maintainance done by mi_check(). I
don't see how the locking requirements can be reduced significantly
by keeping the rusage in the thread struct more of the time (than the
tick counts already are) and only accumulating it into the proc struct
when needed. There must be a context switch of `switchtime' here, and
this seems to require sched locking. It doesn't help to accumulate
the runtime into the thread struct instead of into the rusage -- sched
locking still seems to be required to switch the runtime. The core of
this switch is:

If we move to a per-thread rusage the only time we have to aggregate and deal with it on a per-process basis is at thread or process exit, and when we check the process runtime to check resource limits. That's why I'm interested in this case. The others are more straightforward. This one must move out of mi_switch() however, or be adjusted with atomics or similar.

Jeff


new_switchtime = cpu_ticks();
delta_runtime = new_switchtime - PCPU_GET(switchtime);
PCPU_SET(switchtime, new_switchtime);

and this must be done without letting any other thread run on the CPU.
This seems to require sched_lock. After determining delta_runtime,
we can add it into either thread data for accumulation later or to
more global data (currently rusage). This part doesn't need such
strong locking. We currently add it to rusage since we already hold
the lock for this, and then checking the limit here also has low cost.

Note that in the above code, keeping `switchtime' in the pcpu is an
optimization (for space and perhaps for clarity). It could be kept
in the thread that owns the CPU. Then its locking requirements might
be clearer -- it must be switched at the same time as td_oncpu. The
latter clearly requires sched_lock. The code for the latter has moved
from mi_switch() sched_switch(). Maybe you emphasized the cpulimit
check because this point wasn't clear -- things that strictly required
sched_lock were supposed to have all been moved to sched_switch()?

p_sflag and td_flag are only needed if we detect that the cpulimit has
been exceeded (to schedule an AST). Moving the limit check to an AST
would still require sched locking for setting td_flag.

The cpulimit has a low resolution, so it doesn't need to be checked
very often. The 4BSD scheduler could easily check it every second or
so in the main loop of the scheduler.

Accesses in the CTR*()s and KASSERT()s complicate the analysis and removing
locks -- you don't want to have to acquire a lock just to debug, and the
debugging code also gets in the way of understanding the locking, not to
mention the function.

Bruce

_______________________________________________
freebsd-arch@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@xxxxxxxxxxx"



Relevant Pages

  • Re: rusage breakdown and cpu limits.
    ... places where it is protected by the global scheduler lock. ... Is there any reason why we need to check this on every context switch? ... Eventually it will require a different lock from the one we obtain to call mi_switch. ... Unrelated fields in the thread struct: ...
    (freebsd-arch)
  • Re: [PATCH] [13/16] HWPOISON: The high level memory error handler in the VM v5
    ... Why not have this in rmap.c and not export the locking? ... + * Collect processes when the error hit an anonymous page. ... anon vma lock and outside inode mmap lock. ... * (code doesn't rely on that order, so you could switch it around) ...
    (Linux-Kernel)
  • Re: 1997 Cabrio- reset alarm?
    ... The way that the Volkswagen central locking system works is that each ... When you lock the drivers door or press lock on your key fob, ... computer checks to see if your doors are all shut. ... If one of your doors is not closed or there is a faulty switch the ...
    (rec.autos.makers.vw.watercooled)
  • Re: using clustered index to optimize inserts ...
    ... I will try to explain locking in terms of Sybase docs... ... Allpages Locking: Allpages locking locks both data pages and index ... the data page is locked with an exclusive lock. ... Clustered Index: The datarows will be arranged as per the clustered ...
    (comp.databases.sybase)
  • Re: [PATCH 17/18] fs: icache remove inode_lock
    ... If you understand inode locking today, ... can understand the inode scaling series quite easily. ... filesystems to lock down the object without taking a global ... Per-zone is problematic. ...
    (Linux-Kernel)