Re: Deadlock in the routing code
- From: Maxime Henrion <mux@xxxxxxxxxxx>
- Date: Mon, 17 Dec 2007 11:10:09 +0100
Julian Elischer wrote:
Gleb Smirnoff wrote:
On Thu, Dec 13, 2007 at 10:33:25AM -0800, Julian Elischer wrote:
J> Maxime Henrion wrote:
J> > Replying to myself on this one, sorry about that.
J> > I said in my previous mail that I didn't know yet which thread was
J> > holding the lock of the rtentry that the routed process is dealing
J> > with in rt_setgate(), and I have just been able to verify that it is
J> > held by the swi1: net thread.
J> > So, in a nutshell:
J> > - The routed process does its business on the routing socket, which
J> >   ends up calling rt_setgate(). While in rt_setgate() it drops the
J> >   lock on its rtentry in order to call rtalloc1(). At this point, the
J> >   routed process holds the gateway route (rtalloc1() returns it
J> >   locked), and it now tries to re-lock the original rtentry.
J> > - At the same time, the swi net thread calls arpresolve(), which ends
J> >   up calling rt_check(). Then rt_check() locks the rtentry, and tries
J> >   to lock the gateway route.
J> > A classical case of deadlock with mutexes because of different locking
J> > order. Now, it's not obvious to me how to fix it :-).
J>
J> On failure to re-lock, the routed call to rt_setgate should completely
J> abort and restart from scratch, releasing all locks it has on the way
J> out.
> > Do you suggest mtx_trylock?
> I think that would be the cleanest way.
So, here's what I've got. I have yet to test it at all; I hope to be
able to do so today or tomorrow. Any input appreciated.
Cheers,
Maxime
diff -Nru /sys/net/route.c net/route.c
--- /sys/net/route.c Tue Oct 30 19:07:54 2007
+++ net/route.c Mon Dec 17 11:05:56 2007
@@ -996,6 +996,7 @@
struct radix_node_head *rnh = rt_tables[dst->sa_family];
int dlen = SA_SIZE(dst), glen = SA_SIZE(gate);
+again:
RT_LOCK_ASSERT(rt);
/*
@@ -1029,7 +1030,16 @@
RT_REMREF(rt);
return (EADDRINUSE); /* failure */
}
- RT_LOCK(rt);
+ /*
+ * Try to reacquire the lock on rt, and if it fails,
+ * clean state and restart from scratch.
+ */
+ ok = RT_TRYLOCK(rt);
+ if (!ok) {
+ RTFREE_LOCKED(gwrt);
+ RT_LOCK(rt);
+ goto again;
+ }
/*
* If there is already a gwroute, then drop it. If we
* are asked to replace route with itself, then do
diff -Nru /sys/net/route.h net/route.h
--- /sys/net/route.h Tue Apr 4 22:07:23 2006
+++ net/route.h Fri Dec 14 11:47:48 2007
@@ -289,6 +289,7 @@
#define RT_LOCK_INIT(_rt) \
mtx_init(&(_rt)->rt_mtx, "rtentry", NULL, MTX_DEF | MTX_DUPOK)
#define RT_LOCK(_rt) mtx_lock(&(_rt)->rt_mtx)
+#define RT_TRYLOCK(_rt) mtx_trylock(&(_rt)->rt_mtx)
#define RT_UNLOCK(_rt) mtx_unlock(&(_rt)->rt_mtx)
#define RT_LOCK_DESTROY(_rt) mtx_destroy(&(_rt)->rt_mtx)
#define RT_LOCK_ASSERT(_rt) mtx_assert(&(_rt)->rt_mtx, MA_OWNED)
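
The back-off in the patch above can be tried outside the kernel. What
follows is a minimal userland sketch using POSIX threads, not the kernel
code: struct entry, set_gate(), check() and run_demo() are hypothetical
names chosen for illustration, and pthread_mutex_trylock() stands in for
mtx_trylock(). set_gate() takes the locks in the rt_setgate() order
(gateway first, then the original entry) and check() takes them in the
rt_check() order (entry first, then gateway).

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for struct rtentry: a mutex plus a counter. */
struct entry {
    pthread_mutex_t mtx;
    int updates;
};

/*
 * Shaped like the patched rt_setgate(): entered with rt locked,
 * returns with rt locked. Returns how many times it had to restart.
 */
static int
set_gate(struct entry *rt, struct entry *gw)
{
    int restarts = 0;

again:
    pthread_mutex_unlock(&rt->mtx);     /* drop rt before the "lookup" */
    pthread_mutex_lock(&gw->mtx);       /* the lookup returns gw locked */

    if (pthread_mutex_trylock(&rt->mtx) != 0) {
        /* rt is held by someone who may want gw: back off and retry. */
        pthread_mutex_unlock(&gw->mtx);
        pthread_mutex_lock(&rt->mtx);
        restarts++;
        goto again;
    }
    rt->updates++;                      /* both locks held here */
    pthread_mutex_unlock(&gw->mtx);
    return (restarts);
}

/* Shaped like rt_check(): lock rt first, then the gateway. */
static void
check(struct entry *rt, struct entry *gw)
{
    pthread_mutex_lock(&rt->mtx);
    pthread_mutex_lock(&gw->mtx);
    gw->updates++;
    pthread_mutex_unlock(&gw->mtx);
    pthread_mutex_unlock(&rt->mtx);
}

struct pair {
    struct entry *rt, *gw;
    int iters;
};

static void *
setter(void *arg)
{
    struct pair *p = arg;

    for (int i = 0; i < p->iters; i++) {
        pthread_mutex_lock(&p->rt->mtx);
        set_gate(p->rt, p->gw);
        pthread_mutex_unlock(&p->rt->mtx);
    }
    return (NULL);
}

static void *
checker(void *arg)
{
    struct pair *p = arg;

    for (int i = 0; i < p->iters; i++)
        check(p->rt, p->gw);
    return (NULL);
}

/* Run both lock orders concurrently; returns the total update count. */
static int
run_demo(int iters)
{
    struct entry rt = { .updates = 0 }, gw = { .updates = 0 };
    struct pair p = { &rt, &gw, iters };
    pthread_t a, b;
    int rc;

    pthread_mutex_init(&rt.mtx, NULL);
    pthread_mutex_init(&gw.mtx, NULL);
    rc = pthread_create(&a, NULL, setter, &p);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, checker, &p);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_mutex_destroy(&rt.mtx);
    pthread_mutex_destroy(&gw.mtx);
    return (rt.updates + gw.updates);
}
```

Calling run_demo(n) from a small main() should return 2 * n instead of
hanging. In this sketch set_gate() never blocks on one mutex while
holding the other, which is what breaks the lock-order cycle. Replacing
the trylock branch with a plain pthread_mutex_lock() brings back the
opposite-order acquisition described earlier in the thread.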
_______________________________________________
freebsd-net@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscribe@xxxxxxxxxxx"