
Devices unable to connect to wifi network

Our wifi network consists of:

  • vWLC (upgrade from 8.6.101.0 to 8.10.130.0 did not solve the problem)
  • 31 x AP (7 x AIR-AP2802I-E-K9, 24 x AIR-AP1815I-E-K9)
  • FlexConnect mode (as far as I can tell, enabling or disabling local switching makes no difference)

To avoid disturbing the settings of the existing WLANs, I created an additional simple test WLAN/SSID with WPA2 PSK (see attachment).

 

Problem.

Several different devices (laptops, smartphones) that I have tested myself cannot connect to the PSK-protected network through some APs, yet they connect to open SSIDs without any problem.

At first I assumed the problem was related to the AP model, AIR-AP2802I-E-K9: I tried to connect through 5 different APs of that model and got 1 success and 4 failures. I have seen no problems with the AP1815.

 

The WLC client debug shows that the problem occurs at the EAPOL stage:

Starting key exchange to mobile 64:6e:69:aa:bb:bd, data packets will be dropped
Sending EAPOL-Key Message to mobile 64:6e:69:aa:bb:bd state INITPMK (message 1), replay counter 00.00.00.00.00.00.00.00
Allocating EAP Pkt for retransmission to mobile 64:6e:69:aa:bb:bd
802.1x 'timeoutEvt' Timer expired for station 64:6e:69:aa:bb:bd and for message = M2
Retransmit 1 of EAPOL-Key M1 (length 99) for mobile 64:6e:69:aa:bb:bd
802.1x 'timeoutEvt' Timer expired for station 64:6e:69:aa:bb:bd and for message = M2
Retransmit 2 of EAPOL-Key M1 (length 99) for mobile 64:6e:69:aa:bb:bd
802.1x 'timeoutEvt' Timer expired for station 64:6e:69:aa:bb:bd and for message = M2
Retransmit 3 of EAPOL-Key M1 (length 99) for mobile 64:6e:69:aa:bb:bd
802.1x 'timeoutEvt' Timer expired for station 64:6e:69:aa:bb:bd and for message = M2
Retransmit failure for EAPOL-Key M1 to mobile 64:6e:69:aa:bb:bd, retransmit count 4, mscb deauth count 0
Resetting MSCB PMK Cache Entry @index 0 for station 64:6e:69:aa:bb:bd
Removing BSSID b4:de:31:d7:91:21 from PMKID cache of station 64:6e:69:aa:bb:bd
Setting active key cache index 0 ---> 8
4way handshake timeout, send deauth and cleanup the mscb
Setting active key cache index 8 ---> 8
Deleting the PMK cache when de-authenticating the client.
Global PMK Cache deletion failed.
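
The debug above shows M1 being retransmitted three times with no M2 ever arriving. For a longer debug file, the same conclusion can be pulled out with a minimal Python sketch like the one below. The function name `summarize_handshake` and the regular expressions are illustrative assumptions built only from the log lines quoted here; other AireOS releases may word these messages differently.

```python
import re

# Patterns derived from the debug lines quoted in this post (assumption:
# the wording is the same in the file being analyzed).
RETRANSMIT = re.compile(r"Retransmit (\d+) of EAPOL-Key (M\d)")
FAILURE = re.compile(
    r"Retransmit failure for EAPOL-Key (M\d) to mobile ([0-9a-f:]{17}), "
    r"retransmit count (\d+)"
)

# Lines copied from the debug output above.
SAMPLE = [
    "Sending EAPOL-Key Message to mobile 64:6e:69:aa:bb:bd state INITPMK (message 1), replay counter 00.00.00.00.00.00.00.00",
    "Retransmit 1 of EAPOL-Key M1 (length 99) for mobile 64:6e:69:aa:bb:bd",
    "Retransmit 2 of EAPOL-Key M1 (length 99) for mobile 64:6e:69:aa:bb:bd",
    "Retransmit 3 of EAPOL-Key M1 (length 99) for mobile 64:6e:69:aa:bb:bd",
    "Retransmit failure for EAPOL-Key M1 to mobile 64:6e:69:aa:bb:bd, retransmit count 4, mscb deauth count 0",
]


def summarize_handshake(lines):
    """Return details of the first EAPOL retransmit failure, or None."""
    last_retry = 0
    for line in lines:
        retry = RETRANSMIT.search(line)
        if retry:
            # Remember the highest retry number seen before the failure.
            last_retry = max(last_retry, int(retry.group(1)))
        failure = FAILURE.search(line)
        if failure:
            return {
                "stalled_message": failure.group(1),
                "client": failure.group(2),
                "retries_seen": last_retry,
                "retransmit_count": int(failure.group(3)),
            }
    return None


if __name__ == "__main__":
    print(summarize_handshake(SAMPLE))
```

A result with `stalled_message` equal to `M1` means the controller never received M2 from the client, which matches what the debug shows here.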

I tried to capture some details from the perspective of nearby APs (attached 2ap_test pcap) and of the client device itself (attached filtered.txt).

The pcap dump contains no association response and no EAPOL frames from the AP (although that may be because the capturing APs are too far away).

From the client device logs I conclude that the device associated but then failed authentication.

 

The end device is a Lenovo laptop with a QCA9377 Wi-Fi chip and the latest drivers. However, a driver problem does not seem to be the cause here, because the device connects to a nearby AP1815 without any problem.

 

 

 

There may be some minor inconsistencies between the attached files, because the dumps were taken at different times during troubleshooting.

 

 

13 Replies

Scott Fella
Hall of Fame
Clean up your SSID: use only WPA2 with AES, and select only PSK. That might be the issue.
-Scott
*** Please rate helpful posts ***

Mark Elsen
Hall of Fame

 

 - For your convenience I have analyzed wlc_debug.txt with https://cway.cisco.com/wireless-debug-analyzer/ . The (sample) result is shown below. You may want to run it yourself, as the forum usually wraps the output. Also play with the output options shown after the MAC address, which can provide more or less info depending on which flags are set:

 

Connection 1 of 1


Time | Task | Translated

Dec 01 18:49:07.887 *apfMsConnTask_5 Client made new Association to AP/BSSID BSSID b4:de:31:d7:91:21 AP servernaya
Dec 01 18:49:07.887 *apfMsConnTask_5 The WLC/AP has found from client association request Information Element that claims PMKID Caching support
Dec 01 18:49:07.887 *apfMsConnTask_5 Client has successfully cleared AP association phase
Dec 01 18:49:07.887 *apfMsConnTask_5 Client is entering PSK Dot1x or WEP authentication phase
Dec 01 18:49:07.887 *apfMsConnTask_5 WLC/AP is sending an Association Response to the client with status code 0 = Successful association
Dec 01 18:49:07.909 *Dot1x_NW_MsgTask_5 4-Way PTK Handshake, Sending M1
Dec 01 18:49:13.109 *osapiBsnTimer 4-Way PTK Handshake, Client did not respond with M2
Dec 01 18:49:13.109 *Dot1x_NW_MsgTask_5 4-Way PTK Handshake, Retransmitting M1 retry #1
Dec 01 18:49:18.209 *osapiBsnTimer 4-Way PTK Handshake, Client did not respond with M2
Dec 01 18:49:18.209 *Dot1x_NW_MsgTask_5 4-Way PTK Handshake, Retransmitting M1 retry #2
Dec 01 18:49:23.325 *osapiBsnTimer 4-Way PTK Handshake, Client did not respond with M2
Dec 01 18:49:23.325 *Dot1x_NW_MsgTask_5 4-Way PTK Handshake, Retransmitting M1 retry #3
Dec 01 18:49:28.433 *osapiBsnTimer 4-Way PTK Handshake, Client did not respond with M2
Dec 01 18:49:28.433 *Dot1x_NW_MsgTask_5 Client has been deauthenticated
Dec 01 18:49:28.433 *Dot1x_NW_MsgTask_5 Client expiration timer code set for 10 seconds. The reason: Roaming failed due to WLAN security policy mismatch between controllers (configuration error). It can also be used to report EAPoL retry errors, and GTK rotation failure (in 8.5)
Dec 01 18:49:38.641 *apfReceiveTask Client session has timed out
Dec 01 18:49:38.641 *apfReceiveTask Client expiration timer code set for 10 seconds. The reason: Client was marked for deletion, and it was on associated, power save or blacklist state. Other message would provide reason for delete


-- Let everything happen to you  
       Beauty and terror
      Just keep going    
       No feeling is final
Reiner Maria Rilke (1899)

Yes, I did that already. Besides the 4-way handshake errors, we see an error about 'roaming', which also shows up in the end device logs.

But what should I do with this?

Do not use WPA3. You should see an option of WPA+WPA2.
-Scott
*** Please rate helpful posts ***

OK, if I choose WPA+WPA2, click Apply and then refresh the WLAN configuration page, it shows WPA2+WPA3.

So if I tick only the WPA2 policy checkbox, Layer 2 security is saved as WPA2+WPA3, but if I tick both the WPA and WPA2 policy checkboxes, Layer 2 security is saved as WPA+WPA2. Either way, I still have the same problem.

I'll try 8.10.142.0 as suggested below.

I cleared all known SSIDs from the laptop and recreated the 'test' SSID on the WLC.

But nothing changed...

 

zzz.PNG

I would try 8.10.142.0 to see if that resolves the problem. 

It fixes https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvu65125 "Some clients cannot connect to WPA2+WPA3 WLAN"

 

And please *check* that your APs have been reloaded and are running the same code version as the WLC (don't trust the output shown on the WLC). https://bst.cloudapps.cisco.com/bugsearch/bug/CSCve14291 does not list 8.6.101.0 as affected, but that was a very short-lived release, so it might be affected anyway, in which case the APs could still be running 8.6.101.0.
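
With 31 APs, a small script can help with that comparison once the version has been read from each AP directly. This is only a sketch under assumptions: the inventory dictionary is hypothetical sample data (only the AP name "servernaya" and the version numbers come from this thread), and `mismatched_aps` is an illustrative helper, not a Cisco tool.

```python
def parse_version(text):
    """Turn a dotted version such as '8.10.142.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def mismatched_aps(wlc_version, ap_versions):
    """Return the sorted names of APs whose version differs from the WLC's."""
    wanted = parse_version(wlc_version)
    return sorted(
        name
        for name, version in ap_versions.items()
        if parse_version(version) != wanted
    )


if __name__ == "__main__":
    # Hypothetical inventory: versions as collected by hand from each AP.
    inventory = {
        "servernaya": "8.10.142.0",
        "example-ap-2": "8.6.101.0",  # made-up AP still on the old release
    }
    for name in mismatched_aps("8.10.142.0", inventory):
        print(f"{name} is not running the WLC version")
```

Comparing parsed tuples rather than raw strings avoids false mismatches caused by stray whitespace in the collected values.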

Upgraded to 8.10.142.0, but still the same problem.

I checked one AP that has a strong signal to the device. It has been upgraded to 8.10.142.0 too.

 

So I guess it's time to collect debugs and packet captures and open a TAC case

 

I found something interesting. If I enable the 'FlexConnect Local Auth' checkbox in the WLAN settings, the laptop starts connecting to the network. If I disable the option, the WLAN returns to its inoperative state. It seems that unchecking 'FlexConnect Local Auth' does not make the AP 2802 forward authentication frames to the WLC; it appears to keep intercepting the authentication process itself, which would explain why I don't see EAP frames in the air.

I haven't found any information saying that the 2802 does not support central authentication.

Hello!

This also worked for me with PSK authentication (WPA2 + AES). Devices do not authenticate when the "FlexConnect Local Auth" option is disabled. Once it was enabled, it worked immediately.

WLC 5508 - 8.5.171.0

AP 1800 in FlexConnect

Devices: Any

That's exactly what @Alexander Proskurnin said above but in both cases you're using old software (8.10.142.0 & 8.5.171.0).  If neither of you opened a TAC case so that TAC could open a bug for it then it may not have been fixed.  Either way you should be using the latest software as per links below (at the moment that's 8.5.182.12 and 8.10.196.0) and if the problem is still seen - then open a TAC case.  If not already fixed then it will not be fixed at all in AireOS which is now end of life.

[Edited to update to latest code versions but always refer to TAC recommended link below]

jamesbos96602
Level 1

This is a certificate problem on the controller. These devices don't use passwords to authenticate but use certificates that can't be replaced. From everything I've seen here, the final step is authentication through certificates, and if they are out of date nothing in the system will work. Replacing the software is the only fix, and sorry to say, getting help from Cisco is a lot harder than most people think.

What I would do, if it were me, is clear the config and rebuild it. If you still have the same problems, it's an IOS error and nothing can be done to fix it, short of getting Cisco to provide two or three updates. The big question is whether you can reconfigure the unit and whether you can update it; failing that, you need someone who can do it. The software is still needed regardless, and it depends on when it was last updated: if that was more than 3 years ago, this is the problem.

I know because I helped Cisco fix two devices that had this problem, but my fix was much harder. I had to wipe all data, even from flash; the software was 10 years out of date and all the settings were incompatible with the new software. So we started with the router: I got it updated and configured the system myself. Then we worked on the controller card; they had to update 3 pieces of software on it to get it to work. The agreement was that they get all the logs and I get all the updates, and they could hardly say no, as I'm the only one who can build a system from the ground floor up: strip out all the commands and software, get it to boot and reconfigure it, without Cisco. I wasn't even trained to work on this stuff, and most of it is easy to me; some of it is beyond what I can understand. But Cisco got it all working.

Now I'm helping another business where none of the software is working, none of it. They have to get the coding right on one part so that the coding on the second part can be fixed, which will mean a firmware update; that is the only fix. I just sent all the logs they need to find out why the system is going wrong. It won't work under HTTP now; there's always something. I've also got an old Cisco AIR-WLC2125-K9 and I'm in the same position: I badly need the software to fix it and Cisco won't give it up, as if Cisco would do anything without getting something in return.
