Re: LOR with netisr changes



Max Laier wrote:
On Monday 04 December 2006 17:49, John Baldwin wrote:
On Sunday 03 December 2006 10:49, Munehiro Matsuda wrote:
From: John Baldwin <jhb-at-freebsd.org>
Date: Wed, 29 Nov 2006 12:39:56 -0500

::On Tuesday 28 November 2006 14:41, Sam Leffler wrote:
::> Robert Watson wrote:
::> > On Wed, 29 Nov 2006, Munehiro Matsuda wrote:
::> >> JFYI, I got following LOR started after the netisr changes:
::> >
::> > In general, device driver locks should not be held over entry
::> > to the network stack. However, things get a bit tricky in the
::> > 802.11 code due to lock sharing, I believe, so it could be a
::> > bit more tricky to fix that.
::>
::> It's just a bug in the driver. Driver locks should be dropped
::> when packets get passed up the stack. This issue was pointed out
::> before iwi ever was committed but since nothing immediately comes
::> back via the bridge (or similar) it's been ignored.
::
::How about this:

<snip>

::(I'm unsure if the net82011 layer protects 'ic' or if the driver is
:: supposed to do that.)

That's the main problem of net80211 as far as I understand it. The 'ic'
stores a lot of state and configuration and it is unclear how it is
protected. Most of the time, it seems to be the driver's job. However,
there are entry points in net80211 that change the 'ic' underneath the
driver without giving them a chance to protect it. IIRC, there has been
the idea to expose a driver mutex to the net80211 layer in order to take
care of this - not sure what has become of that idea, though.

Hey, speak for yourself :) The ic is mostly treated as an extension of
the softc and is intended to be covered by the driver's softc lock.
There are exceptions (e.g. the node table) but that's the main intent.
The complications come from the layering of net80211 over the driver but
because net80211 has no access to the driver's softc lock confusion
abounds and you get problems like calls into ieee80211_ioctl wanting to
release the lock to do a copyin but being unable to.

As you note I've proposed exposing the driver softc lock but quite a
long time but gotten zero feedback so ignored it. I also find it hard
to spend a lot of time on code that's been heavily rewritten and that
has mostly different issues (i.e. vap's have very different locking
requirements though some of the same issues remain).

In this case the upcall must not hold the driver softc lock
regardless--it's got nothing to do with net80211, you'll hit LOR's when
you got through other mid-layer code.


Sorry for the late reply, and thanks for the patch.
I needed to tweak the patch to make it compile, but seems to work
fine. Modified patch attached.

From: "Bjoern A. Zeeb" <bzeeb-lists-at-lists.zabbadoz.net>
Date: Sat, 2 Dec 2006 12:25:28 +0000 (UTC)

::ok, whoever is going to take care of this and perhaps commit the
:: fix let me know. I assigned it LOR ID 194 on "The LOR page":
:: http://sources.zabbadoz.net/freebsd/lor.html#194

LOR seems to be gone with the patch, so it could be removed when
committed.

Thanks,
Haro

--- if_iwi.c.org 9 Nov 2006 16:05:43 -0000
+++ if_iwi.c 3 Dec 2006 14:17:45 -0000
@@ -1230,6 +1230,7 @@
struct mbuf *mnew, *m;
struct ieee80211_node *ni;
int type, error, framelen;
+ IWI_LOCK_DECL;
Hmm, why this? If it's for spl's, note that the rest of the patch is
specific to 5.x and up, shouldn't really be dropping spl in 4.x. Also,
even if it did, it would pass a garbage value to splx(), so it would
really break things badly.

No, this is a hack to work around a recursion in net80211. There are a
couple of places where the driver calls back into net80211 with the lock
held (to protect the 'ic', see above). net80211 then might call back
into the driver, which means we would have either have to use a recursive
lock (which can't be passed to msleep) or avoid the recursion at leat for
the lock (which is what IWI_LOCK_DECL does). I know it's ugly and wrong,
but unless there is a propper sollution as to how the net80211 data
structures are protected in various call paths, it's not something that
can be fixed.

See above. This is not about recursive locks or net80211; this is about
how to handle middle layer code in the network stack. With direct
dispatch the rx path is directly coupled to the tx path for traffic and
locking must compensate or there must be some other mechanism to
decouple the context.


framelen = le16toh(frame->len);
if (framelen < IEEE80211_MIN_LEN || framelen > MCLBYTES) {
@@ -1310,6 +1311,7 @@

bpf_mtap2(sc->sc_drvbpf, tap, sc->sc_rxtap_len, m);
}
+ IWI_UNLOCK(sc);

ni = ieee80211_find_rxnode(ic, mtod(m, struct ieee80211_frame_min
*));

@@ -1319,6 +1321,7 @@
/* node is no longer needed */
ieee80211_free_node(ni);

+ IWI_LOCK(sc);
if (sc->sc_softled) {
/*
* Blink for any data frame. Otherwise do a

=--------------------------------------------------------------------
---------- _ _ Munehiro (haro) Matsuda
-|- /_\ |_|_| Internet Solution Dept., KGT Inc.
/|\ |_| |_|_| 2-8-8 Shinjuku Shinjuku-ku Tokyo 160-0022, Japan
Tel: +81-3-3225-0767 Fax: +81-3-3225-0740


_______________________________________________
freebsd-current@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@xxxxxxxxxxx"



Relevant Pages

  • Re: OS Question
    ... driver, should not distress you. ... I've made the unwarranted assumption that having multiple threads is ... To protect data in user mode ... Permanence (if apps all crash the lock state is retained because ...
    (microsoft.public.development.device.drivers)
  • Re: [PATCH 3/3] VFIO V4: VFIO driver: Non-privileged user level PCI drivers
    ... * into DMA'ble space using the IOMMU ... Noone else seems to take a task lock for this sort of thing. ... This driver seems to lock all DMA memory, ... * it under the terms of the GNU General Public License as published by ...
    (Linux-Kernel)
  • Re: 6.0-BETA2: taskqueue_drain for if_xl.c:2796
    ... let's go back to the detach case where you actually need a drain. ... guarantee that that won't be forever since it can ensure the task handler ... with the lock held the entire time and doesn't drop it. ... Also, in general the entry points to the driver (ifnet functions, ifmedia ...
    (freebsd-current)
  • Re: [PATCH] Remove process freezer from suspend to RAM pathway
    ... My question referred to drivers trying to bind or unbind a device ... The thread trying to bind is holding a lock which is ... Spinning in the driver with the lock not held is impossible, ... would be easiest to jump directly into the freezer! ...
    (Linux-Kernel)
  • Re: [patch 00/15] clocksource / timekeeping rework V4 (resend V3 + bug fix)
    ... which lock already depends on the new lock. ... now at 1915353302473 nsecs ... # You can enable one or both FireWire driver stacks. ...
    (Linux-Kernel)