Re: rusage breakdown and cpu limits.
- From: Jeff Roberson <jroberson@xxxxxxxxxxxxxx>
- Date: Tue, 29 May 2007 13:25:51 -0700 (PDT)
On Wed, 30 May 2007, Bruce Evans wrote:
On Tue, 29 May 2007, Jeff Roberson wrote:
I'm working with Attilio to break down rusage further to be per-thread in places where it is protected by the global scheduler lock. To support this, I am interested in moving the rlimit cpulimit check into userret(), or perhaps ast(). Is there any reason why we need to check this on every context switch? Any objections to moving it? Eventually it will require a different lock from the one we obtain to call mi_switch().
Checking it on every userret() would be a good pessimization, and
scheduling ast()'s to check it occasionally wouldn't be far behind.
Syscalls occur many times more often than context switches.
Yes, you're right. I think I'll make a callout that will check once per second.
Locking is needed for the updating the runtime in mi_switch, so it
seems best to check for the runtime becoming too large there while
holding the same lock. This lock is currently sched_lock and it covers
the following other fields:
I think you missed the point of my threadlock patch. There will no longer be a global scheduler lock. Because of this, multiple threads in a process may context switch at the same time. The list below will be protected by a thread lock or a per-process spinlock depending on the use.
Fields in in the proc struct (or attached to it, not counting the
p_cpulimit (less volatile than the others)
p_sflag (needs sched_lock more than the others?)
p_flag (KSE only, invalid? -- needs PROC lock)
Related fields in the pcpu:
Related fields in the thread struct:
Unrelated fields in the thread struct:
td_critnest (in a KASSERT() only)
td_owepreempt (in a KASSERT() only)
td_sched (not yet proerly locked)
td_priority (this and following only in CTR*())
I don't see why you emphasized the cpulimit check, since it is the
least delicate part of the rusage maintainance done by mi_check(). I
don't see how the locking requirements can be reduced significantly
by keeping the rusage in the thread struct more of the time (than the
tick counts already are) and only accumulating it into the proc struct
when needed. There must be a context switch of `switchtime' here, and
this seems to require sched locking. It doesn't help to accumulate
the runtime into the thread struct instead of into the rusage -- sched
locking still seems to be required to switch the runtime. The core of
this switch is:
If we move to a per-thread rusage the only time we have to aggregate and deal with it on a per-process basis is at thread or process exit, and when we check the process runtime to check resource limits. That's why I'm interested in this case. The others are more straightforward. This one must move out of mi_switch() however, or be adjusted with atomics or similar.
new_switchtime = cpu_ticks();
delta_runtime = new_switchtime - PCPU_GET(switchtime);
and this must be done without letting any other thread run on the CPU.
This seems to require sched_lock. After determining delta_runtime,
we can add it into either thread data for accumulation later or to
more global data (currently rusage). This part doesn't need such
strong locking. We currently add it to rusage since we already hold
the lock for this, and then checking the limit here also has low cost.
Note that in the above code, keeping `switchtime' in the pcpu is an
optimization (for space and perhaps for clarity). It could be kept
in the thread that owns the CPU. Then its locking requirements might
be clearer -- it must be switched at the same time as td_oncpu. The
latter clearly requires sched_lock. The code for the latter has moved
from mi_switch() sched_switch(). Maybe you emphasized the cpulimit
check because this point wasn't clear -- things that strictly required
sched_lock were supposed to have all been moved to sched_switch()?
p_sflag and td_flag are only needed if we detect that the cpulimit has
been exceeded (to schedule an AST). Moving the limit check to an AST
would still require sched locking for setting td_flag.
The cpulimit has a low resolution, so it doesn't need to be checked
very often. The 4BSD scheduler could easily check it every second or
so in the main loop of the scheduler.
Accesses in the CTR*()s and KASSERT()s complicate the analysis and removing
locks -- you don't want to have to acquire a lock just to debug, and the
debugging code also gets in the way of understanding the locking, not to
mention the function.
freebsd-arch@xxxxxxxxxxx mailing list
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@xxxxxxxxxxx"
- Prev by Date: Re: rusage breakdown and cpu limits.
- Next by Date: Re: rusage breakdown and cpu limits.
- Previous by thread: Re: rusage breakdown and cpu limits.