Re: New "timeout" api, to replace callout



On Friday 28 December 2007, John Baldwin wrote:
On Sunday 02 December 2007 07:53:18 am Andre Oppermann wrote:
Poul-Henning Kamp wrote:
In message <4752998A.9030007@xxxxxxxxxxx>, Andre Oppermann writes:
o TCP puts the timer into an allocated structure and upon close of
the session it has to be deallocated including stopping of all
currently running timers.
[...]
-> The timer facility should provide an atomic stop/remove call
that prevent any further callbacks upon return. It should not
do a 'drain' where the callback may be run anyway.
Note: We hold the lock the callback would have to obtain.

It is my intent, that the implementation behind the new API will
only ever grab the specified lock when it calls the timeout function.

This is the same for the current one and pretty much a given.

When you do a timeout_disable() or timeout_cleanup() you will be
sleeping on a mutex internal to the implementation, if the timeout
is currently executing.

This is the problematic part. We can't sleep in TCP when cleaning up
the timer. We're not always called from userland but from interrupt
context. And when calling the cleanup we currently hold the lock the
callout wants to obtain. We can't drop it either as the race would
be back again. What you describe here is the equivalent of callout_
drain(). This is unfortunately unworkable in TCP's context. The
callout has to go away even if it is already pending and waiting on
the lock. Maybe that can only be solved by a flag in the lock saying
"give up and go away".

The reason you need to do a drain is to allow for safe destroying of the
lock. Specifically, drivers tend to do this:

FOO_LOCK(sc);
...
callout_stop(...);
FOO_UNLOCK(sc);
...
callout_drain(...);
...
mtx_destroy(&sc->foo_mtx);

If you don't have the drain and softclock is trying to acquire the backing
mutex while you have it held (before the callout_stop) then Bad Things can
happen if you don't do the drain. Having the lock just "give up" doesn't
work either because if the memory containing the lock is free'd and
reinitialized such that it looks enough like a valid lock then softclock
(or its equivalent) will still try to obtain it. Also, you need to do a
drain so it is safe to free the callout structure to prevent it from being
recycled and having weird races where it gets recycled and rescheduled but
the timer code thinks it has a pending stop for that pointer and so it
aborts the wrong instance of the timer, etc.

Hi,

I completely agree to what John Baldwin is writing. You need two
stop-functions:

xxx_stop which is non-blocking and
xxx_drain which can block i.e. sleep

BTW: The USB code in P4 uses the same semantics, due to the same reasons:

usbd_transfer_stop() and usbd_transfer_drain()

The only difference is that I pass an error code to the callback which might
happen after that usbd_transfer_stop is called.

I think that xxx_stop() and xxx_drain() is a generic approach that should be
applied to all callback systems. Whenever you have a callback you need to be
able to stop it and drain it.

--HPS
_______________________________________________
freebsd-arch@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@xxxxxxxxxxx"



Relevant Pages

  • Re: New "timeout" api, to replace callout
    ... -> The timer facility should provide an atomic stop/remove call ... do a 'drain' where the callback may be run anyway. ... We hold the lock the callback would have to obtain. ... What you describe here is the equivalent of callout_ ...
    (freebsd-arch)
  • Re: New "timeout" api, to replace callout
    ... -> The timer facility should provide an atomic stop/remove call ... do a 'drain' where the callback may be run anyway. ... We hold the lock the callback would have to obtain. ... What you describe here is the equivalent of callout_ ...
    (freebsd-arch)
  • Re: New "timeout" api, to replace callout
    ... -> The timer facility should provide an atomic stop/remove call ... We hold the lock the callback would have to obtain. ... What you describe here is the equivalent of callout_ ...
    (freebsd-arch)
  • Re: Resolving race conditions with callbacks and cancellation?
    ... acquiring the lock and checking for a 'canceled' state should the 'real' ... you're holding the while calling the callback. ... to modify the object (e.g. re-arm the timer, stop a repeating timer, ... Read (in IOLib): ...
    (comp.programming.threads)
  • Re: New "timeout" api, to replace callout
    ... If you don't have the drain and softclock is trying to acquire the backing mutex while you have it held then Bad Things can happen if you don't do the drain. ... Having the lock just "give up" doesn't work either because if the memory containing the lock is free'd and reinitialized such that it looks enough like a valid lock then softclock will still try to obtain it. ... Also, you need to do a drain so it is safe to free the callout structure to prevent it from being recycled and having weird races where it gets recycled and rescheduled but the timer code thinks it has a pending stop for that pointer and so it aborts the wrong instance of the timer, etc. ... The only difference is that I pass an error code to the callback which might happen after that usbd_transfer_stop is called. ...
    (freebsd-arch)