Re: New "timeout" api, to replace callout



On Friday 28 December 2007 05:30:14 am Robert Watson wrote:

On Fri, 28 Dec 2007, Hans Petter Selasky wrote:

The reason you need to do a drain is to allow for safe destroying of the
lock. Specifically, drivers tend to do this:

FOO_LOCK(sc);
...
callout_stop(...);
FOO_UNLOCK(sc);
...
callout_drain(...);
...
mtx_destroy(&sc->foo_mtx);

If you don't have the drain and softclock is trying to acquire the
backing
mutex while you have it held (before the callout_stop) then Bad Things
can
happen if you don't do the drain. Having the lock just "give up" doesn't
work either because if the memory containing the lock is free'd and
reinitialized such that it looks enough like a valid lock then softclock
(or its equivalent) will still try to obtain it. Also, you need to do a
drain so it is safe to free the callout structure to prevent it from
being
recycled and having weird races where it gets recycled and rescheduled
but
the timer code thinks it has a pending stop for that pointer and so it
aborts the wrong instance of the timer, etc.

I completely agree to what John Baldwin is writing. You need two
stop-functions:

xxx_stop which is non-blocking and xxx_drain which can block i.e. sleep

BTW: The USB code in P4 uses the same semantics, due to the same reasons:

usbd_transfer_stop() and usbd_transfer_drain()

The only difference is that I pass an error code to the callback which
might
happen after that usbd_transfer_stop is called.

I think that xxx_stop() and xxx_drain() is a generic approach that should
be
applied to all callback systems. Whenever you have a callback you need to
be
able to stop it and drain it.

I think the argument that Poul-Henning is making is not that you don't need
something that behaves "like drain", but rather, we're like the wait for
drain
to be a short, mutex-length wait rather than a long, msleep-length wait.
Remember that the bodies of callouts are expected to run in a very short
period of time in order to not stall the timer system, in fact, in such a
way
that a mutex could be held over the entirely timeout call. Given that this
is
the case, one might reasonably expect callout_stop() to perform the drain
rather than having a separate call. Such a model would be very advantageous
in TCP, where rather than having to defer GC'ing the inpcb/tcpcb to a GC
worker thread, which we don't do but should, we could use the stop call
safely
and eliminate a whole class of races from the stack.

The problem is if softclock() (or similar replacement in future) was preempted
and hasn't run yet but has already gotten to the point that callout_drain()
has to block, then sitting in a spin loop waiting for softclock() to
acknowledge the stop isn't very optimal. The amount of time you are asleep
in this case is actually very small, and you probably won't even block the
vast majority of the time if you follow the 'lock / stop / unlock / drain'
model. You only sleep when you lose the race and softclock() has chosen to
run your callout and is waiting for the driver/client/whatever's lock.

--
John Baldwin
_______________________________________________
freebsd-arch@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@xxxxxxxxxxx"



Relevant Pages

  • Re: New "timeout" api, to replace callout
    ... -> The timer facility should provide an atomic stop/remove call ... do a 'drain' where the callback may be run anyway. ... We hold the lock the callback would have to obtain. ... What you describe here is the equivalent of callout_ ...
    (freebsd-arch)
  • Re: New "timeout" api, to replace callout
    ... -> The timer facility should provide an atomic stop/remove call ... do a 'drain' where the callback may be run anyway. ... We hold the lock the callback would have to obtain. ... What you describe here is the equivalent of callout_ ...
    (freebsd-arch)
  • Re: New "timeout" api, to replace callout
    ... -> The timer facility should provide an atomic stop/remove call ... do a 'drain' where the callback may be run anyway. ... We hold the lock the callback would have to obtain. ...
    (freebsd-arch)
  • TNC On Tour 2005 (2) The Witham Navigable Drains LONGISH (XP)
    ... off down Antons Gowt lock onto the Drains. ... junction below, so turned left along Newham Drain. ... Newham Drain at B1184 Bridge. ...
    (uk.rec.waterways)
  • Re: em0 panic: mutex em0 not owned
    ... this is a race access to callout_stopin em driver. ... basically em_stopboth wants to stop core callout and tx channel ... callout but it only holds core lock. ...
    (freebsd-current)