Re: Frequent network access freeze (in 7.0)
- From: Unga <unga888@xxxxxxxxx>
- Date: Tue, 26 Feb 2008 01:57:53 -0800 (PST)
--- Robert Watson <rwatson@xxxxxxxxxxx> wrote:
On Wed, 20 Feb 2008, Unga wrote:
I'm running 7.0-PRERELEASE (RC2, dated 15/02/2008), compiled from sources on
an i386 machine (512MB RAM, 3.0GHz, tx0: <SMCEtherPower II 10/100>).
Network access freezes very frequently. Cannot ping any IP address. The
only way to get networking working again is to reboot.
I'm having this problem on 7.0 ever since I tried it from BETA4. I have
also reported it to this list before, but sadly nobody was interested in it.
If somebody is interested in looking into this problem, I can furnish
more detail and participate in testing.
This sort of problem frequently turns out to be a bug in a device driver or a
problem with interrupt probing/configuration, so my first guess would be a
problem with the if_tx driver. The usual starting diagnostics when ping fails
are to try to use tcpdump to determine whether it's receive or transmit
failing (or both). Quiet the network between two endpoints as much as you can
so you can avoid noise from making the dumps more complex, and dump arp and
icmp at both endpoints. Now try to ping from each end point to the other.
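
As a rough sketch of the capture step described above: the helper below only
prints a tcpdump command line for one endpoint, to be run as root on each host
before the pings are started. The dump_cmd name is hypothetical; the interface
tx0 and the address 192.168.1.1 come from this thread, and the filter limits
the dump to ARP plus ICMP to or from the peer.

```shell
#!/bin/sh
# Hypothetical helper: print the tcpdump command for one endpoint.
#   -n  do not resolve addresses to names (avoids extra DNS traffic)
#   -e  print link-level (Ethernet) headers, handy when reading ARP
#   -i  interface to capture on
# $1 = local interface, $2 = address of the other endpoint
dump_cmd() {
    iface=$1
    peer=$2
    printf 'tcpdump -n -e -i %s "arp or (icmp and host %s)"\n' "$iface" "$peer"
}
```

For example, `dump_cmd tx0 192.168.1.1` gives the command for the FreeBSD
host; the other endpoint would use its own interface name and 192.168.1.20.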
One potential source of confusion is that ping requires ARP to work, and ARP
can be a slightly confusing protocol as it usually resolves actively (query,
response) but sometimes it receives passive updates or extends existing
entries.
What you want to look for is a packet sent by one side that isn't received by
the other. You might find, for example, that your host receives packets fine,
but the packets it transmits are never received. This would be indicative of a
driver bug in which it fails to properly handle (for example) transmit queues
filling, and might only trigger under very high load. Or, you might find that
your host never receives anything the other side transmits, but can send fine.
This might be indicative of a driver bug involving the receive code, or a
problem with how interrupts are being handled more generally.
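
One possible way to spot a packet that one side sent and the other never saw,
assuming the tcpdump output of each host was saved to a text file. The
function names are made up for illustration, and the sed pattern assumes
tcpdump's usual "ICMP echo request, id N, seq N, length N" line format.

```shell
#!/bin/sh
# Print the sequence numbers of ICMP echo requests found in a saved
# tcpdump text log, one per line, sorted and de-duplicated.
echo_seqs() {
    sed -n 's/.*ICMP echo request,.* seq \([0-9][0-9]*\),.*/\1/p' "$1" | sort -u
}

# Print the sequence numbers present in the sender's log ($1) but
# absent from the receiver's log ($2).
missing_on_receiver() {
    sent=$(mktemp) && rcvd=$(mktemp) || return 1
    echo_seqs "$1" > "$sent"
    echo_seqs "$2" > "$rcvd"
    comm -23 "$sent" "$rcvd"    # lines unique to the first file
    rm -f "$sent" "$rcvd"
}
```

Running it in both directions (swapping the two logs) would show whether the
loss is on transmit, on receive, or both.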
It looks like the last non-routine maintenance to the driver was done by
Maxime in about 2003; the more recent changes have all been updates to
newbus/busdma infrastructure, ifnet changes, locking changes, etc. I've CC'd
him as it sounds like he may have hardware... My advice would be to do the
above tests and see if you can narrow down whether it's transmit, receive, or
both failing.
Here are the details when net access is working and when it is not:
When net access working
-----------------------
$ ifconfig
tx0: flags=108843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,NEEDSGIANT> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 00:e1:20:34:bb:36
        inet 192.168.1.20 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (10baseT/UTP)
        status: active
plip0: flags=108810<POINTOPOINT,SIMPLEX,MULTICAST,NEEDSGIANT> metric 0 mtu 1500
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet6 ::1 prefixlen 128
        inet 127.0.0.1 netmask 0xff000000
$ netstat -r
Routing tables
Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.1.1        UGS         0     1090    tx0
localhost          localhost          UH          0      186    lo0
192.168.1.0        link#1             UC          0        0    tx0
192.168.1.1        00:91:d2:4c:54:f8  UHLW        2        0    tx0    892
Internet6:
Destination        Gateway            Flags      Netif Expire
localhost          localhost          UHL          lo0
fe80::%lo0         fe80::1%lo0        U            lo0
fe80::1%lo0        link#3             UHL          lo0
ff01:3::           fe80::1%lo0        UC           lo0
ff02::%lo0         fe80::1%lo0        UC           lo0
When net access NOT working
---------------------------
$ ifconfig
tx0: flags=108843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,NEEDSGIANT> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 00:e1:20:34:bb:36
        inet 192.168.1.20 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet autoselect (10baseT/UTP)
        status: active
plip0: flags=108810<POINTOPOINT,SIMPLEX,MULTICAST,NEEDSGIANT> metric 0 mtu 1500
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet6 ::1 prefixlen 128
        inet 127.0.0.1 netmask 0xff000000
$ netstat -r
Routing tables
Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.1.1        UGS         0     3338    tx0
localhost          localhost          UH          0      204    lo0
192.168.1.0        link#1             UC          0        0    tx0
192.168.1.1        00:91:d2:4c:54:f8  UHLW        2       28    tx0    997
192.168.1.2        link#1             UHLW        1        1    tx0
Internet6:
Destination        Gateway            Flags      Netif Expire
localhost          localhost          UHL          lo0
fe80::%lo0         fe80::1%lo0        U            lo0
fe80::1%lo0        link#3             UHL          lo0
ff01:3::           fe80::1%lo0        UC           lo0
ff02::%lo0         fe80::1%lo0        UC           lo0
tcpdump -i tx0 -v
NOTE: While pinging 192.168.1.1, tcpdump shows no output at all.
ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
^C
--- 192.168.1.1 ping statistics ---
58 packets transmitted, 0 packets received, 100.0% packet loss
/var/log/messages:
Feb 26 15:26:14 blacktower kernel: tx0: ERROR! Can't stop Rx DMA
Feb 26 15:26:14 blacktower kernel: tx0: promiscuous mode enabled
Note: These two messages keep repeating in /var/log/messages.
/var/log/messages at the time of sending this email:
Feb 26 17:32:17 blacktower kernel: tx0: link state changed to DOWN
Feb 26 17:36:25 blacktower kernel: tx0: link state changed to UP
Feb 26 17:36:30 blacktower kernel: tx0: link state changed to DOWN
Feb 26 17:37:07 blacktower kernel: tx0: link state changed to UP
Feb 26 17:37:14 blacktower kernel: tx0: link state changed to DOWN
Feb 26 17:37:22 blacktower kernel: tx0: link state changed to UP
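
To see how often each tx0 message recurs, something like the following could
tally them. This is only a sketch: the tally_tx0 name is made up for
illustration, and it assumes each kernel message occupies a single line in
the log file, in the usual "... kernel: tx0: <message>" form.

```shell
#!/bin/sh
# Count tx0 kernel messages by message text in a syslog-format file.
# Usage: tally_tx0 /var/log/messages
tally_tx0() {
    awk '
        /kernel: tx0: / {
            sub(/.*kernel: tx0: /, "")   # keep only the message text
            count[$0]++
        }
        END {
            for (msg in count) printf "%d %s\n", count[msg], msg
        }
    ' "$1" | sort -k2                    # order the output by message text
}
```

Each output line is a count followed by the message, which makes it easier to
report how many link flaps and Rx DMA errors occurred before the freeze.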
After a reboot, net access starts working again.
Please let me know what other information is required.
Kind regards
Unga
_______________________________________________
freebsd-current@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@xxxxxxxxxxx"