Re: hostap recently broken

From: Michal Mertl (mime_at_traveller.cz)
Date: 07/27/05

    To: Sam Leffler <sam@errno.com>
    Date: Wed, 27 Jul 2005 09:53:57 +0200
    
    

    Sam Leffler wrote:
    > Michal Mertl wrote:
    > > I think I found what change causes the problem I experience. See below.
    > >
    > > Michal Mertl wrote:
    > >
    > >>I'm sorry I forgot to answer one of Sam's questions.
    > >>
    > >>Michal Mertl wrote:
    > >>
    > >>>Sam Leffler wrote on Tue, 26 Jul 2005 at 09:29 -0700:
    > >>>
    > >>>>Michal Mertl wrote:
    > >>>>
    > >>>>>Sam Leffler wrote:
    > >>>>>
    > >>>>>
    > >>>>>>Michal Mertl wrote:
    > >>>>>>
    > >>>>>>
    > >>>>>>>Hello,
    > >>>>>>>
    > >>>>>>>I've just found out that something very recently broke hostap on FreeBSD
    > >>>>>>>CURRENT. The client associates and gets the MAC address of the AP. When
    > >>>>>>>I run tcpdump on the AP I see the pings from the client getting in but
    > >>>>>>>the AP doesn't reply. The ARP protocol works but nothing else does.
    > >>>>>>>
    > >>>>>>>Source checked on 2005-07-22 16:00 UTC works fine.
    > >>>>>>>
    > >>>>>>>The AP card is atheros but just reverting the last changes to the driver
    > >>>>>>>doesn't help.
    > >>>>>>
    > >>>>>>I just tried with CURRENT (from last night). 5212 card setup with TKIP
    > >>>>>>for PTK and GTK. ap operating in 11g. Powerbook running Tiger
    > >>>>>>associated and operated fine. 29Mb/s for upstream tcp netperf (sta and
    > >>>>>>ap in close proximity--rssi 41).
    > >>>>>>
    > >>>>>>I appreciate you testing stuff but please try to diagnose your problems
    > >>>>>>a bit harder and then provide more useful info like the h/w revs and the
    > >>>>>>exact steps you use to setup a non-working system.
    > >>>>>
    > >>>>>
    > >>>>>Sorry, I had the exact same HW setup as before which I described in my
    > >>>>>email about the problem with bridging.
    > >>>>>
    > >>>>>I've got several Atheros 5212 cards (mac 5.9 phy 4.3 radio 3.6) and also
    > >>>>>IPW notebook all running CURRENT, the notebook and the client several
    > >>>>>days old (from before 2005-07-22 16:00 UTC).
    > >>>>>
    > >>>>>The most basic setup - 'ifconfig ath0 192.168.0.1 mediaopt hostap ssid
    > >>>>>aaa' on the AP and 'ifconfig ath0 192.168.0.2 ssid aaa' on the client -
    > >>>>>worked like a charm before that date and not after. With the newer kernel
    > >>>>>on the AP the cards associate and, as I've just found, the stations can
    > >>>>>communicate with each other through the AP. Ping to the AP doesn't work
    > >>>>>even when I get the MAC address of the AP via ARP. Adhoc connection works.
    > >>>>
    > >>>>I am unclear still on what happens. I believe you are saying:
    > >>>>
    > >>>>ping 192.168.0.1
    > >>>>
    > >>>>from the station to the ap fails. If so what does 80211stats show on
    > >>>>the ap when this happens (do relevant error stats go up)? If you do
    > >>>
    > >>> ./80211stats -a
    > >>>00:0b:6b:35:dc:d4:
    > >>> rx_mgmt 1
    > >>> tx_data 107 tx_bytes 9788
    > >>>
    > >>>00:0b:6b:35:dc:f0:
    > >>> rx_data 107 rx_mgmt 1 rx_bytes 10430
    > >>> tx_data 6 tx_mgmt 2 tx_bytes 36
    > >>> tx_assoc 1 tx_auth 1
    > >>>
    > >>>
    > >>>./athstats
    > >>>8 tx management frames
    > >>>3 tx frames discarded prior to association
    > >>>93 tx failed 'cuz too many retries
    > >>>930 long on-chip tx retries
    > >>>1 tx frames with no ack marked
    > >>>8148 beacons transmitted
    > >>>27 periodic calibrations
    > >>>834 rate control checks
    > >>>rssi of last ack: 48
    > >>>avg recv rssi: 49
    > >>>1 switched default/rx antenna
    > >>>Antenna profile:
    > >>>[1] tx 8 rx 97
    > >>>[2] tx 1 rx 0
    > >>>
    > >>>
    > >>>These are shortly after reboot after several minutes of inactivity and
    > >>>now ping running 150 sec.
    > >>>
    > >>>After some 20 secs:
    > >>>
    > >>>./athstats
    > >>>8 tx management frames
    > >>>3 tx frames discarded prior to association
    > >>>181 tx failed 'cuz too many retries
    > >>>1810 long on-chip tx retries
    > >>>1 tx frames with no ack marked
    > >>>9021 beacons transmitted
    > >>>30 periodic calibrations
    > >>>923 rate control checks
    > >>>rssi of last ack: 48
    > >>>avg recv rssi: 44
    > >>>1 switched default/rx antenna
    > >>>Antenna profile:
    > >>>[1] tx 8 rx 185
    > >>>[2] tx 1 rx 0
    > >>>
    > >>>./80211stats -a
    > >>>00:0b:6b:35:dc:d4:
    > >>> rx_mgmt 1
    > >>> tx_data 183 tx_bytes 16780
    > >>>
    > >>>00:0b:6b:35:dc:f0:
    > >>> rx_data 183 rx_mgmt 1 rx_bytes 17878
    > >>> tx_data 6 tx_mgmt 2 tx_bytes 36
    > >>> tx_assoc 1 tx_auth 1
    > >>>
    > >>>
    > >>>
    > >>>>80211debug +input
    > >>>
    > >>>
    > >>>>on the ap do you get any log msgs about discarded frames?
    > >>>
    > >>>Nothing is displayed.
    > >>>
    > >>>
    > >>>>You also seem to say the sta resolves the ip w/ arp. Is the same true
    > >>>>for the ap (i.e. that it resolves the ip address of the sta)? I'm
    > >>>>assuming you are NOT running firewall rules, do not have crypto set up, and
    > >>>>have not fiddled with parameters like apbridge (you didn't provide
    > >>>>ifconfig output for each side).
    > >>
    > >>I forgot to answer the question about ARP:
    > >>
    > >>The STA gets the MAC address of the AP via ARP but the AP most often
    > >>doesn't. AP gets it only when both it and the STA don't have the ARP
    > >>record and STA initiates ping. When I delete the ARP entry on the AP
    > >>afterwards, it won't recreate it no matter what direction I ping.
    > >>
    > >>When doing tcpdump on the STA I see the arp who-has coming in and reply
    > >>coming out. When I configure a static ARP entry on the AP I still can't
    > >>communicate. When I ping from AP to STA I see both echo and echo-reply
    > >>in tcpdump on the STA but the reply doesn't make it to the AP or
    > >>something.
    > >>
    > >>I see the echo replies even in tcpdump on the AP:
    > >>
    > >>21:07:31.589408 44us DA:00:0b:6b:35:dc:f0 BSSID:00:0b:6b:35:dc:d4
    > >>SA:00:0b:6b:35:dc:d4 LLC, dsap SNAP (0xaa), ssap SNAP (0xaa), cmd 0x03:
    > >>oui Ethernet (0x000000), ethertype IPv4 (0x0800): (tos 0x0, ttl 64, id
    > >>15394, offset 0, flags [none], proto: ICMP (1), length: 84) 192.168.0.1 >
    > >>192.168.0.2: ICMP echo request, id 65028, seq 0, length 64
    > >>
    > >>21:07:31.589801 44us BSSID:00:0b:6b:35:dc:d4 SA:00:0b:6b:35:dc:f0
    > >>DA:00:0b:6b:35:dc:d4 LLC, dsap SNAP (0xaa), ssap SNAP (0xaa), cmd 0x03:
    > >>oui Ethernet (0x000000), ethertype IPv4 (0x0800): (tos 0x0, ttl 64, id
    > >>1528, offset 0, flags [none], proto: ICMP (1), length: 84) 192.168.0.2 >
    > >>192.168.0.1: ICMP echo reply, id 65028, seq 0, length 64
    > >>
    > >>21:07:31.589813 60us DA:00:0b:6b:35:dc:d4 BSSID:00:0b:6b:35:dc:d4
    > >>SA:00:0b:6b:35:dc:f0 LLC, dsap SNAP (0xaa), ssap SNAP (0xaa), cmd 0x03:
    > >>oui Ethernet (0x000000), ethertype IPv4 (0x0800): (tos 0x0, ttl 64, id
    > >>1528, offset 0, flags [none], proto: ICMP (1), length: 84) 192.168.0.2 >
    > >>192.168.0.1: ICMP echo reply, id 65028, seq 0, length 64
    > >>
    > >>
    > >>From reading this I got puzzled - why are there multiple packets with
    > >>the reply? When I disable the apbridge with 'ifconfig ath0 -apbridge'
    > >>everything works!
    > >>
    > >>I hope this helps.
    > >>
    > >
    > >
    > > It helped me I guess :-).
    > >
    > > Rev. 1.67 of src/sys/net80211/ieee80211_input.c moved several lines of
    > > code almost verbatim from the body of ieee80211_input() to a new
    > > function. The only difference I see is a change of one check.
    > >
    > > The old code "if (ni1->ni_associd != 0) {" was replaced by "if
    > > (ieee80211_node_is_authorized(ni1)) {".
    > >
    > > The called function is this:
    > >
    > > ieee80211_node_is_authorized(const struct ieee80211_node *ni)
    > > {
    > >         return (ni->ni_flags & IEEE80211_NODE_AUTH);
    > > }
    > >
    > > The code in question is only called when the interface is in apbridge
    > > mode and that's why I was able to locate the problem rather easily. The
    > > state of apbridge setting is only checked at one place.
    > >
    > > I don't know what the correct fix is, whether the old code should be
    > > restored here or something else.
    > >
    > > Changing the line back to its pre-1.67 contents definitely fixes the
    > > problem for me.
    > ...
    >
    > The change to validate the station is authorized is correct; this was a
    > longstanding bugfix I'd been meaning to pull into cvs. The issue was
    > that you cannot bridge directly to the bss node as traffic to it must
    > take the normal input path. I've committed a change that I believe
    > corrects the problem. Thank you.
    >
    > Sam

    Thank you so much.

    I haven't tested the change (rev. 1.76 ieee80211_input.c) yet but I
    think the fix looks correct. I'll inform you if it doesn't work for me.

    Michal
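
The difference between the two checks quoted above can be sketched in a small standalone C model. This is only an illustration, not the net80211 code: `struct model_node`, `MODEL_NODE_AUTH` and the three `bridge_*` functions are hypothetical names, and the model assumes that the AP's own bss node has an association ID of 0 while still carrying the authorized flag. The third function shows one possible guard that matches Sam's description ("you cannot bridge directly to the bss node"); it is not necessarily the change that was committed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins; NOT the real net80211 definitions. */
#define MODEL_NODE_AUTH 0x0001 /* plays the role of IEEE80211_NODE_AUTH */

struct model_node {
	uint16_t associd; /* like ni_associd: nonzero once a station associates */
	uint16_t flags;   /* like ni_flags */
};

/* Pre-1.67 style test: bridge only to nodes with an association ID. */
bool
bridge_old(const struct model_node *ni)
{
	assert(ni != NULL);
	return ni->associd != 0;
}

/* Rev. 1.67 style test: bridge to any node marked authorized. */
bool
bridge_rev167(const struct model_node *ni)
{
	assert(ni != NULL);
	return (ni->flags & MODEL_NODE_AUTH) != 0;
}

/*
 * Guarded test (assumed shape of a fix): the node must be authorized
 * and must not be the AP's own bss node, so that traffic addressed to
 * the AP itself stays on the normal input path.
 */
bool
bridge_guarded(const struct model_node *ni, const struct model_node *bss)
{
	assert(ni != NULL && bss != NULL);
	return ni != bss && (ni->flags & MODEL_NODE_AUTH) != 0;
}
```

Under the stated assumption, a bss node modelled as `{ 0, MODEL_NODE_AUTH }` is rejected by `bridge_old`, accepted by `bridge_rev167` (so frames for the AP would be bridged back out instead of delivered locally), and rejected again by `bridge_guarded`. An associated, authorized station passes all three tests.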

    _______________________________________________
    freebsd-current@freebsd.org mailing list
    http://lists.freebsd.org/mailman/listinfo/freebsd-current
    To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"

