Re: Attention pf/ipfw users with uid/gid/jail rules (Re: Reminder: NET_NEEDS_GIANT, debug.mpsafenet going away in 7.0)




On Fri, 27 Jul 2007, Max Laier wrote:

does "problem" include a LOR message, or only a deadlock?
I've seen plenty of the first, but not the second.

Various users have reported definite deadlocks relating to uid/gid
^------^ ^------^
firewall rules in the past.

I don't think the strong wording is true. I have seen a few reports of deadlocks in the past where debug.mpsafenet "fixed" the issue, but none of the reporters were able to provide enough debugging details to actually identify the culprit.

There was a period when there were definitely deadlocks being reported, but I believe quite a few cases were resolved by Christian's work to push the inpcb down the transmit path in more cases, allowing firewalls to avoid acquiring pcbinfo/pcb locks, which in turn avoided a lock order reversal between the pcbinfo and pcb locks (pcb locks are held over transmit from the IP layer, and the lock order is pcbinfo -> pcb). So to expand on Max's thoughts -- the worry here is that we've corrected the deadlock but not the witness warnings associated with a non-deadlockable reversal. It is reasonable for Witness to get upset about lock order, but current reasoning is that because this lock is only acquired for reading when used in combination with other locks, there isn't deadlock potential, since the read acquire will be non-blocking when involved with those locks, preventing a cycle of waiting. It's good reasoning, and possibly correct :-).

Robert N M Watson
Computer Laboratory
University of Cambridge


Also note that a lot has changed since the early reports. What WITNESS is
warning about now is something like:

rlock(&lock1);
mtx_lock(&lock2);
mtx_unlock(&lock2);
runlock(&lock1);

vs.

mtx_lock(&lock2);
rlock(&lock1);
runlock(&lock1);
mtx_unlock(&lock2);

It's obvious that this can't cause a deadlock unless there is a third
codepath that does either:

wlock(&lock1);
mtx_lock(&lock2);
mtx_unlock(&lock2);
wunlock(&lock1);

or

mtx_lock(&lock2);
wlock(&lock1);
wunlock(&lock1);
mtx_unlock(&lock2);

I have an idea how to teach WITNESS about this, but so far it's an awful
hack.
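
To make the rule above concrete, here is a toy sketch of the bookkeeping it implies. This is not WITNESS code and not the hack mentioned above; it is a hypothetical model of one read/write lock paired with one mutex, and every name in it (pair_history, record_path, order_reversed, may_deadlock) is invented for illustration. It reports a reversal as dangerous only once some path has also taken the read/write lock in write mode together with the mutex.

```c
#include <stdbool.h>
#include <stddef.h>

/* How a code path took the read/write lock ("lock1"). */
enum rw_mode { RW_READ, RW_WRITE };

/* In which order a code path took lock1 and the mutex ("lock2"). */
enum pair_order { RW_THEN_MTX, MTX_THEN_RW };

/* Everything this toy model remembers about the lock1/lock2 pair. */
struct pair_history {
	bool seen_rw_then_mtx;		/* some path: lock1, then lock2 */
	bool seen_mtx_then_rw;		/* some path: lock2, then lock1 */
	bool seen_write_with_mtx;	/* some path write-locked lock1 */
};

/* Record one code path that holds both locks at the same time. */
static void
record_path(struct pair_history *h, enum pair_order o, enum rw_mode m)
{
	if (o == RW_THEN_MTX)
		h->seen_rw_then_mtx = true;
	else
		h->seen_mtx_then_rw = true;
	if (m == RW_WRITE)
		h->seen_write_with_mtx = true;
}

/* A plain order check: both orders have been observed. */
static bool
order_reversed(const struct pair_history *h)
{
	return (h->seen_rw_then_mtx && h->seen_mtx_then_rw);
}

/*
 * The refined check: a reversal only counts as a potential deadlock
 * once a write acquire of lock1 has been seen in combination with
 * lock2, in either order.
 */
static bool
may_deadlock(const struct pair_history *h)
{
	return (order_reversed(h) && h->seen_write_with_mtx);
}
```

Feeding it the two read-side paths gives order_reversed() true but may_deadlock() false; recording either of the write-side paths afterwards flips may_deadlock() to true. A real checker would have to track this per lock pair across the whole lock order graph, which this sketch does not attempt.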

--
/"\ Best regards, | mlaier@xxxxxxxxxxx
\ / Max Laier | ICQ #67774661
X http://pf4freebsd.love2party.net/ | mlaier@EFnet
/ \ ASCII Ribbon Campaign | Against HTML Mail and News

_______________________________________________
freebsd-arch@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@xxxxxxxxxxx"


