
Devices unable to connect to wifi network

Our wifi network consists of:

  • vWLC (upgrading from 8.6.101.0 to 8.10.130.0 did not solve the problem)
  • 31 APs (7 x AIR-AP2802I-E-K9, 24 x AIR-AP1815I-E-K9)
  • FlexConnect mode (I don't think local switching makes a difference either way)

To avoid disturbing the settings on the existing WLANs, I created an additional simple test WLAN SSID with WPA2 PSK (see attachment).

 

Problem.

Several different devices (laptops, smartphones) that I checked myself are unable to connect to the PSK-protected network through some APs, but connect to open SSIDs with no problem.

At first I assumed the problem was related to the AP model, AIR-AP2802I-E-K9: I tried connecting through 5 different APs of that model and got 1 success and 4 failures. I have seen no problems with the AP1815.

 

WLC debug client shows that the problem occurs at the EAPOL stage:

Starting key exchange to mobile 64:6e:69:aa:bb:bd, data packets will be dropped
Sending EAPOL-Key Message to mobile 64:6e:69:aa:bb:bd state INITPMK (message 1), replay counter 00.00.00.00.00.00.00.00
Allocating EAP Pkt for retransmission to mobile 64:6e:69:aa:bb:bd
802.1x 'timeoutEvt' Timer expired for station 64:6e:69:aa:bb:bd and for message = M2
Retransmit 1 of EAPOL-Key M1 (length 99) for mobile 64:6e:69:aa:bb:bd
802.1x 'timeoutEvt' Timer expired for station 64:6e:69:aa:bb:bd and for message = M2
Retransmit 2 of EAPOL-Key M1 (length 99) for mobile 64:6e:69:aa:bb:bd
802.1x 'timeoutEvt' Timer expired for station 64:6e:69:aa:bb:bd and for message = M2
Retransmit 3 of EAPOL-Key M1 (length 99) for mobile 64:6e:69:aa:bb:bd
802.1x 'timeoutEvt' Timer expired for station 64:6e:69:aa:bb:bd and for message = M2
Retransmit failure for EAPOL-Key M1 to mobile 64:6e:69:aa:bb:bd, retransmit count 4, mscb deauth count 0
Resetting MSCB PMK Cache Entry @index 0 for station 64:6e:69:aa:bb:bd
Removing BSSID b4:de:31:d7:91:21 from PMKID cache of station 64:6e:69:aa:bb:bd
Setting active key cache index 0 ---> 8
4way handshake timeout, send deauth and cleanup the mscb
Setting active key cache index 8 ---> 8
Deleting the PMK cache when de-authenticating the client.
Global PMK Cache deletion failed.
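
For anyone who wants to check a longer debug file for the same pattern, here is a minimal sketch that counts the EAPOL-Key retransmits per client and reports which message finally failed. The function name `summarize_handshake` and the regular expressions are my own assumptions, written only against the line formats shown in the excerpt above; other WLC releases may word these lines differently.

```python
import re
from collections import defaultdict

# Patterns are assumptions based only on the debug excerpt in this post.
RETRANSMIT = re.compile(
    r"Retransmit (\d+) of EAPOL-Key (M\d) .* for mobile ([0-9a-f:]{17})"
)
FAILURE = re.compile(
    r"Retransmit failure for EAPOL-Key (M\d) to mobile ([0-9a-f:]{17})"
)


def summarize_handshake(lines):
    """Return {mac: {"retransmits": n, "failed_message": "M1" or None}}."""
    summary = defaultdict(lambda: {"retransmits": 0, "failed_message": None})
    for line in lines:
        match = RETRANSMIT.search(line)
        if match:
            mac = match.group(3)
            # Keep the highest retransmit number seen for this client.
            summary[mac]["retransmits"] = max(
                summary[mac]["retransmits"], int(match.group(1))
            )
            continue
        match = FAILURE.search(line)
        if match:
            summary[match.group(2)]["failed_message"] = match.group(1)
    return dict(summary)


# Lines copied from the debug output above.
sample = [
    "Retransmit 1 of EAPOL-Key M1 (length 99) for mobile 64:6e:69:aa:bb:bd",
    "Retransmit 2 of EAPOL-Key M1 (length 99) for mobile 64:6e:69:aa:bb:bd",
    "Retransmit 3 of EAPOL-Key M1 (length 99) for mobile 64:6e:69:aa:bb:bd",
    "Retransmit failure for EAPOL-Key M1 to mobile 64:6e:69:aa:bb:bd, "
    "retransmit count 4, mscb deauth count 0",
]
result = summarize_handshake(sample)
print(result)
```

A client that never answers M1 shows up with `failed_message` set to `M1`, which is the case described here: the WLC never receives M2.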

I tried to capture some details from the perspective of nearby APs (attached 2ap_test pcap) and of the client device itself (attached filtered.txt).

In the pcap dump there are no association response or EAPOL frames from the AP (though that may be because the capturing APs are too far away).

From the client device logs I conclude that the device associated but failed authentication.

 

The end device is a Lenovo laptop with a QCA9377 wifi chip and the latest drivers. However, a driver problem does not seem to be the cause, because the device connects to a nearby AP1815 with no problem.

 

 

 

There may be some small inconsistencies between the contents of the attached files, because the dumps were taken at different times during troubleshooting.

 

 

12 Replies

Scott Fella
Hall of Fame
Clean up your SSID: use only WPA2 with AES, and select only PSK. That might be the issue.
-Scott
*** Please rate helpful posts ***

marce1000
VIP

 

 - For your convenience I have analyzed wlc_debug.txt with https://cway.cisco.com/wireless-debug-analyzer/ . The (sample) result is shown below. You may want to run it again yourself, as the forum usually wraps output. Also play with the output options shown after the MAC address, which can provide more or less info depending on which flags are set:

 

Connection 1 of 1

Time / Task / Translated

Dec 01 18:49:07.887 *apfMsConnTask_5 Client made new Association to AP/BSSID BSSID b4:de:31:d7:91:21 AP servernaya
Dec 01 18:49:07.887 *apfMsConnTask_5 The WLC/AP has found from client association request Information Element that claims PMKID Caching support
Dec 01 18:49:07.887 *apfMsConnTask_5 Client has successfully cleared AP association phase
Dec 01 18:49:07.887 *apfMsConnTask_5 Client is entering PSK Dot1x or WEP authentication phase
Dec 01 18:49:07.887 *apfMsConnTask_5 WLC/AP is sending an Association Response to the client with status code 0 = Successful association
Dec 01 18:49:07.909 *Dot1x_NW_MsgTask_5 4-Way PTK Handshake, Sending M1
Dec 01 18:49:13.109 *osapiBsnTimer 4-Way PTK Handshake, Client did not respond with M2
Dec 01 18:49:13.109 *Dot1x_NW_MsgTask_5 4-Way PTK Handshake, Retransmitting M1 retry #1
Dec 01 18:49:18.209 *osapiBsnTimer 4-Way PTK Handshake, Client did not respond with M2
Dec 01 18:49:18.209 *Dot1x_NW_MsgTask_5 4-Way PTK Handshake, Retransmitting M1 retry #2
Dec 01 18:49:23.325 *osapiBsnTimer 4-Way PTK Handshake, Client did not respond with M2
Dec 01 18:49:23.325 *Dot1x_NW_MsgTask_5 4-Way PTK Handshake, Retransmitting M1 retry #3
Dec 01 18:49:28.433 *osapiBsnTimer 4-Way PTK Handshake, Client did not respond with M2
Dec 01 18:49:28.433 *Dot1x_NW_MsgTask_5 Client has been deauthenticated
Dec 01 18:49:28.433 *Dot1x_NW_MsgTask_5 Client expiration timer code set for 10 seconds. The reason: Roaming failed due to WLAN security policy mismatch between controllers (configuration error). It can also be used to report EAPoL retry errors, and GTK rotation failure (in 8.5)
Dec 01 18:49:38.641 *apfReceiveTask Client session has timed out
Dec 01 18:49:38.641 *apfReceiveTask Client expiration timer code set for 10 seconds. The reason: Client was marked for deletion, and it was on associated, power save or blacklist state. Other message would provide reason for delete
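
To make the timing in this output easier to see, here is a small sketch that computes the gap between each M1 transmission and the next timer event. The timestamps are copied from the analyzer lines above; the helper name `intervals_seconds` is hypothetical and the script is only an illustration of the arithmetic.

```python
from datetime import datetime

# Timestamps taken from the analyzer output above:
# first M1, three retransmits, then the final timeout and deauthentication.
events = [
    "18:49:07.909",  # Sending M1
    "18:49:13.109",  # Retransmitting M1 retry #1
    "18:49:18.209",  # Retransmitting M1 retry #2
    "18:49:23.325",  # Retransmitting M1 retry #3
    "18:49:28.433",  # Client has been deauthenticated
]


def intervals_seconds(timestamps):
    """Return the gaps in seconds between consecutive HH:MM:SS.mmm stamps."""
    parsed = [datetime.strptime(ts, "%H:%M:%S.%f") for ts in timestamps]
    return [
        round((later - earlier).total_seconds(), 3)
        for earlier, later in zip(parsed, parsed[1:])
    ]


gaps = intervals_seconds(events)
print(gaps)
```

Each gap comes out a little over 5 seconds, so the WLC waits about 5 seconds for M2 after every M1 before retrying, and gives up roughly 20 seconds after the first M1.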



Yep, I did that already. Besides the 4-way handshake errors we see an error about 'roaming', which also shows up in the end device logs.

But what should I do about this?

Do not use WPA3. You should see an option of WPA+WPA2.
-Scott

OK, if I choose WPA+WPA2, click Apply, and then refresh the WLAN configuration page, it shows WPA2+WPA3.

So if I tick only the WPA2 policy checkbox, layer 2 security is saved as WPA2+WPA3, but if I tick both the WPA and WPA2 policy checkboxes, it is saved as WPA+WPA2. Either way, I still have the same problem.

I'll try 8.10.142.0 as suggested below.

I cleared all known SSIDs from the laptop and recreated the 'test' SSID on the WLC.

But nothing changed...

 

zzz.PNG

I would try 8.10.142.0 to see if that resolves the problem. 

It fixes https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvu65125 "Some clients cannot connect to WPA2+WPA3 WLAN"

 

And please *check* that your APs have been reloaded and are running the same code version as the WLC (don't trust the output shown on the WLC). https://bst.cloudapps.cisco.com/bugsearch/bug/CSCve14291 does not list 8.6.101.0 as affected, but that was a very short-lived release so it might be affected anyway, in which case the APs could still be running 8.6.101.0.

Upgraded to 8.10.142.0 but still the same problem.

I checked one AP with a strong signal to the device. It has been upgraded to 8.10.142.0 too.

 

So I guess it's time to collect debugs and packet captures and open a TAC case.

 

I found something interesting. If I enable the 'FlexConnect Local Auth' checkbox in the WLAN settings, the laptop starts connecting to the network. If I disable it, the WLAN goes back to not working. It seems that unchecking 'FlexConnect Local Auth' does not make the AP 2802 forward authentication frames to the WLC; the AP appears to keep snooping on the authentication process instead, which would explain why I don't see EAP frames in the air.

I have not found any documentation saying that the 2802 does not support central authentication.