kqueue giant-locking (&kq_Giant, locking)

From: Brian Fundakowski Feldman (green_at_FreeBSD.org)
Date: 04/17/04

    To: arch@FreeBSD.org
    Date: Fri, 16 Apr 2004 22:12:42 -0400
    
    

    I believe I have come up with a good solution to the kqueue woes in 5.X, and
    I'd like to get some feedback on work that so far lets me (on uniprocessor,
    at least) run make -j8 buildworld, with USE_KQUEUE in make(1), with no ill
    effect :) The locking thus far is one global kqueue lock, and I firmly
    believe we should use MUTEX_PROFILING to determine whether it needs to be
    made any finer-grained at this point.

    There are several major differences so far (of course, fixing that
    stack-paged-out-kernel-crash-bug is one of them) and several major
    things still to be fixed.

    1. The recursion has been removed from kqueue. This means kqueues cannot be
       added to other kqueues for EVFILT_READ -- yes, that ability has been
       around since r1.1 of kern_event.c, but it is utterly pointless and, as
       a look at my previous patch will show, severely complicates many things.
       Of course, I'm sure someone will notice and complain, but there isn't
       any documentation that suggests you should kevent() another kqueue().
    2. Because of this, KNOTE() can't end up calling another KNOTE() unless
       the consumer does something stupid (call KNOTE() from filter::event()).
    3. Kqueue does the locking for you when it comes to the non-object lists.
       All of the filter::attach() and filter::detach() routines need to lock
       their object lists, but they don't touch kqueue or knote other than
       setting their own knote's fields. Both of those routines are called
       without any locks held on kqueue's part.
    4. The filter::event() routines are called with internal kqueue locking
       held. You can lock anything else you need to, but you may not sleep;
       it is essentially like an interrupt handler. You must not call into
       KNOTE() with locks held, but you should reference your object. I've
       fixed what appears to be the most egregious offender, sys_pipe.c.
    5. If KNOTE() as an interrupt does not work for you, you may call KNOTE()
       with any locks you like except the ones it uses internally (mainly
       filedesc and file), but the only information you can give your
       filter::event() is the hint argument.
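
As a rough illustration of rules 3 and 4 (a userland model, not code from the patch), the sketch below calls attach and detach with no kqueue lock held and calls the event routine under a single global lock. Every name here (kq_giant, kq_register, kq_deregister, knote_fire, the demo_* filter, and the kq_giant_held flag) is an assumption made up for the example; the flag is only meaningful in a single-threaded demonstration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the one global kqueue lock. */
static pthread_mutex_t kq_giant = PTHREAD_MUTEX_INITIALIZER;
static int kq_giant_held;       /* demo-only flag, written under kq_giant */

struct knote;
struct filterops {
    int  (*f_attach)(struct knote *);
    void (*f_detach)(struct knote *);
    int  (*f_event)(struct knote *, long hint);
};
struct knote {
    const struct filterops *kn_fop;
    void *kn_obj;               /* object the filter watches */
    int   kn_active;            /* set when f_event reports the event */
};

/* Rule 3: attach and detach run with no kqueue lock held. */
static int kq_register(struct knote *kn)
{
    assert(!kq_giant_held);
    return kn->kn_fop->f_attach(kn);
}

static void kq_deregister(struct knote *kn)
{
    assert(!kq_giant_held);
    kn->kn_fop->f_detach(kn);
}

/* Rule 4: f_event runs with the kqueue lock held and must not sleep. */
static void knote_fire(struct knote *kn, long hint)
{
    pthread_mutex_lock(&kq_giant);
    kq_giant_held = 1;
    if (kn->kn_fop->f_event(kn, hint))
        kn->kn_active = 1;
    kq_giant_held = 0;
    pthread_mutex_unlock(&kq_giant);
}

/* Demo filter that records whether the lock was held in each callback. */
static int seen_attach = -1, seen_detach = -1, seen_event = -1;

static int demo_attach(struct knote *kn)
{
    (void)kn;
    seen_attach = kq_giant_held;
    return 0;
}

static void demo_detach(struct knote *kn)
{
    (void)kn;
    seen_detach = kq_giant_held;
}

static int demo_event(struct knote *kn, long hint)
{
    (void)kn;
    (void)hint;
    seen_event = kq_giant_held;
    return 1;                   /* always report the event in this demo */
}

static const struct filterops demo_filtops = {
    demo_attach, demo_detach, demo_event
};
```

Registering a knote with demo_filtops, firing it once, and deregistering it leaves seen_attach and seen_detach at 0 and seen_event at 1, which is the calling convention the list describes.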

    Examples of #4 are bpf and pipe; they do not need to pass any information
    in the filter::event() hint. Like every handler that works on the object
    instead of on hints, they verify for certain whether or not the KNOTE()
    should actually have fired and ignore false alarms.
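
A minimal sketch of such an object-checking handler, assuming a made-up pipe-like object: the handler ignores the hint entirely and decides from the object's current state whether the event really applies. The demo_pipe and demo_knote structures and the demo_filt_read name are hypothetical and are not the sys_pipe.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pipe-like object; field names are illustrative only. */
struct demo_pipe {
    size_t bytes_buffered;      /* data waiting to be read */
    int    closed;              /* writer side has gone away */
};

struct demo_knote {
    struct demo_pipe *kn_obj;   /* object this knote watches */
    long              kn_data;  /* reported to the consumer */
    int               kn_eof;   /* set once the object is closed */
};

/*
 * Event handler that works on the object, not on the hint.
 * Returns nonzero only when the object is really readable (or closed),
 * so a KNOTE() that fired for no good reason is simply ignored.
 */
static int demo_filt_read(struct demo_knote *kn, long hint)
{
    struct demo_pipe *p;

    (void)hint;                 /* no information is passed in the hint */
    assert(kn != NULL && kn->kn_obj != NULL);
    p = kn->kn_obj;

    kn->kn_data = (long)p->bytes_buffered;
    if (p->closed) {
        kn->kn_eof = 1;
        return 1;
    }
    return kn->kn_data > 0;
}
```

With an empty, open object the handler returns 0 however often it is called; it returns nonzero only once data is buffered or the object is closed.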

    The biggest example of #5 is process events. There are many different
    process-type locks that may be held when KNOTE() is called, but the
    implementation of filter::event() is mostly correct in locking nothing.
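
For contrast, a hint-only handler might look like the sketch below: it never touches the process, takes no locks, and works purely from the hint and the knote's own fields. The DEMO_NOTE_* values, demo_pknote, and demo_filt_proc are invented for illustration and are not the real NOTE_* constants or the kern_event.c implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event bits; NOT the real NOTE_* values. */
#define DEMO_NOTE_EXIT 0x1u
#define DEMO_NOTE_FORK 0x2u
#define DEMO_NOTE_EXEC 0x4u

struct demo_pknote {
    unsigned kn_sfflags;        /* events the consumer asked for */
    unsigned kn_fflags;         /* events that have occurred so far */
};

/*
 * Hint-driven handler: the caller of KNOTE() may hold any process-type
 * lock, so the handler locks nothing and reads only the hint.
 */
static int demo_filt_proc(struct demo_pknote *kn, long hint)
{
    unsigned event;

    assert(kn != NULL);
    event = (unsigned)hint;

    /* Record only the events the consumer registered interest in. */
    kn->kn_fflags |= kn->kn_sfflags & event;
    return kn->kn_fflags != 0;
}
```

A knote registered for exit and fork ignores an exec hint and becomes active on a fork hint, without the handler ever looking at the process itself.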

    In kern_fork.c, KNOTE() is called outside of the proc lock (p1->p_klist not
    locked as it should be) because it has to be special-cased somehow.
    This is the most disgusting thing EVAR.
    (NB: See http://green.homeunix.org/~green/kqueue-locking.1.patch for that.)

    Current patch at: http://green.homeunix.org/~green/kqueue-giant-locking.0.patch

    -- 
    Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
      <> green@FreeBSD.org                               \  The Power to Serve! \
     Opinions expressed are my own.                       \,,,,,,,,,,,,,,,,,,,,,,\
    _______________________________________________
    freebsd-arch@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-arch
    To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
    
