Re: panic in rt_check_fib()



On Sat, 13 Sep 2008 23:28:51 -0700, Julian Elischer <julian@xxxxxxxxxxxx> wrote:
To recap on this, I rewrote this function a couple of weeks ago because I
couldn't keep track of what was going on, and I thought it might
have some bad edge cases. A couple of days later Giorgos contacted me
saying that he had a fairly reproducible situation
where this was triggered, and it appeared to be an edge case in
this function that allowed it to try to lock the same lock twice.

I immediately thought "ah-hah! I may have a solution to this",
and gave him a copy of my new function, and indeed it DOES fix that
panic. However, after deleting and recreating interfaces a few hundred
times without crashing in rt_check_fib(), it then fails somewhere else
(actually it leaks some resources and eventually networking stops).

I'm not convinced that is a problem with the new or old rt_check() but
it did stop me from just committing the new code.

In rereading the way the function (did and still does) work, it
occurred to me that there was a large flaw in the way it worked.

It dropped the lock on one route while it went off and did something
else that might block. On returning, it blindly re-grabbed that lock,
completely ignoring the fact that the route might not even be valid any
more (or any of several other things that may have changed while
it was away, maybe sleeping).

The code Giorgos is referring to is a patch I suggested to him to
fix this oversight, and not the one that I originally tested and
had suggested to fix the edge case.

I do however ask that some other people look at this patch!

Exactly. Thanks for summarizing this so well :)

I have started a kernel with your latest patch (from the quoted message
above), and I can't panic my kernel with the script that did it in a
semi-reliable manner before:

% root@kobe:/root# while true ; do \
% sh home.sh > /dev/null 2>&1 ; \
% vmstat -z | sed -n -e 1p -e /rt/p ; \
% sleep 1 ; \
% done
% ITEM SIZE LIMIT USED FREE REQUESTS FAILURES
% rtentry: 120, 0, 19, 77, 43, 0
% ITEM SIZE LIMIT USED FREE REQUESTS FAILURES
% rtentry: 120, 0, 20, 76, 47, 0
% ITEM SIZE LIMIT USED FREE REQUESTS FAILURES
% rtentry: 120, 0, 21, 75, 51, 0
% ITEM SIZE LIMIT USED FREE REQUESTS FAILURES
% rtentry: 120, 0, 23, 73, 55, 0
% ITEM SIZE LIMIT USED FREE REQUESTS FAILURES
% rtentry: 120, 0, 24, 72, 59, 0
% ITEM SIZE LIMIT USED FREE REQUESTS FAILURES
% rtentry: 120, 0, 25, 71, 62, 0
% ITEM SIZE LIMIT USED FREE REQUESTS FAILURES
% rtentry: 120, 0, 26, 70, 65, 0
% ITEM SIZE LIMIT USED FREE REQUESTS FAILURES
% rtentry: 120, 0, 27, 69, 69, 0
% ITEM SIZE LIMIT USED FREE REQUESTS FAILURES
% rtentry: 120, 0, 29, 67, 73, 0
% ITEM SIZE LIMIT USED FREE REQUESTS FAILURES
% rtentry: 120, 0, 30, 66, 76, 0
% ^C
% root@kobe:/root# sh home.sh

rtentries seem to be going up every time I cycle through the script,
which essentially brings down both wireless and wired interfaces and
then brings up the wired interface of my laptop. The core of the script
is currently:

# network interface options
export ifconfig_re0="inet 192.168.1.10/24"
export defaultrouter='192.168.1.1'

echo '## Stopping network interfaces.'
/etc/rc.d/netif stop re0 && ifconfig re0 delete
/etc/rc.d/netif stop iwn0 && ifconfig iwn0 delete

echo '## Bringing up network interface.'
/etc/rc.d/netif start re0

echo "## Reloading firewall rules."
/etc/rc.d/pf reload

# The default route may be pointing to another interface. Find out
# the IP address of the default gateway, delete it and point to the
# default gateway configured as ${defaultrouter}.
if [ -n "${defaultrouter}" ]; then
    echo '## Setting default router.'
    _oldrouter=`netstat -rn | grep default | awk '{print $2}'`
    if [ -n "${_oldrouter}" ]; then
        route delete default "${_oldrouter}"
        unset _oldrouter
    fi
    route add default "$defaultrouter"
fi

With your version of rt_check_fib() I have no panics so far. This
doesn't mean we don't have a bug elsewhere, or that it will not panic
tomorrow, but it's nice that things seem a bit more stable now. The old
version of rt_check_fib() used to panic about one third of the time I
ran my 'home.sh' script...

Now an interesting question is: Is it `normal' that the USED rtentry
objects keep going up at every interface restart and are (at least at
first glance) not reclaimed as fast as they are acquired?
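
One rough way to put a number on that growth is to sample the USED column
before and after a single run of the script. The sketch below is only an
illustration: the helper names rtentry_used and rtentry_delta are made up
for this example, and it assumes the zone line has the
"rtentry: SIZE, LIMIT, USED, ..." layout shown in the output above.

```shell
# Hypothetical helpers (not part of home.sh or the base system).

# Read `vmstat -z`-style text on stdin and print the USED value of the
# rtentry zone. Fields are split on ':' and ',', so USED is field 4;
# adding 0 strips the surrounding whitespace.
rtentry_used() {
    awk -F'[:,]' '$1 == "rtentry" { print $4 + 0 }'
}

# Print how much USED changed between two samples (before, after).
rtentry_delta() {
    echo $(( $2 - $1 ))
}
```

It could be used along the lines of
"before=$(vmstat -z | rtentry_used); sh home.sh; after=$(vmstat -z | rtentry_used); rtentry_delta $before $after".
A delta that stays positive run after run would suggest entries are not
being reclaimed, although it cannot say where they are being held.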

_______________________________________________
freebsd-current@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@xxxxxxxxxxx"


