
Client Devices disconnecting across the enterprise on WLCs with ISE

jippsgk
Level 1

Hi,

Client devices connect using WPA2 with a pre-shared key (PSK).
Cisco ISE
Cisco 9300s, 9400s, etc.

The Wi-Fi client, a NAT'd Wi-Fi bridge, disconnects every 30 minutes.

The kernel log on the wireless endpoint is full of entries like these:

[83604.313128] br-lan: port 1(wlan0) entered blocking state
[83604.313141] br-lan: port 1(wlan0) entered forwarding state
[85367.284954] wlan0: deauthenticated from xx:xx:xx:xx:xx:xx (Reason: 16=GROUP_KEY_HANDSHAKE_TIMEOUT)
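As a quick sanity check on the "every 30 minutes" observation, here is a minimal Python sketch that measures how long the link stayed up between the "entered forwarding state" line and the deauth line in the kernel log above. The function name `session_lengths` and the regular expression are illustrative assumptions, not part of any tool mentioned in this thread.

```python
import re

# Kernel log lines copied from the post above.
KERNEL_LOG = """\
[83604.313128] br-lan: port 1(wlan0) entered blocking state
[83604.313141] br-lan: port 1(wlan0) entered forwarding state
[85367.284954] wlan0: deauthenticated from xx:xx:xx:xx:xx:xx (Reason: 16=GROUP_KEY_HANDSHAKE_TIMEOUT)
"""

# "[seconds.microseconds] message" as printed by the kernel ring buffer.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def session_lengths(log_text):
    """Return the seconds between each 'forwarding state' line and the next deauth."""
    lengths = []
    up_since = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp, message = float(match.group(1)), match.group(2)
        if "entered forwarding state" in message:
            up_since = stamp
        elif "deauthenticated" in message and up_since is not None:
            lengths.append(stamp - up_since)
            up_since = None
    return lengths


lengths = session_lengths(KERNEL_LOG)
for seconds in lengths:
    print(f"link stayed up {seconds:.1f} s ({seconds / 60:.1f} min)")
```

Run against a longer capture of the same log, this would show whether the disconnects really are evenly spaced.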

Controller Client Logs:

In the debug, we see the controller send M5 4 times. Here are the 3 retries:
*osapiBsnTimer: Oct 17 14:22:49.092: xx:xx:xx:xx:xx:xx 802.1x 'timeoutEvt' Timer expired for station xx:xx:xx:xx:xx:xx and for message = M5
*Dot1x_NW_MsgTask_3: Oct 17 14:22:49.092: xx:xx:xx:xx:xx:xx key Desc Version FT - 0
*Dot1x_NW_MsgTask_3: Oct 17 14:22:49.092: xx:xx:xx:xx:xx:xx Retransmit 1 of EAPOL-Key M5 (length 131) for mobile xx:xx:xx:xx:xx:xx

*osapiBsnTimer: Oct 17 14:22:50.108: xx:xx:xx:xx:xx:xx 802.1x 'timeoutEvt' Timer expired for station xx:xx:xx:xx:xx:xx and for message = M5
*Dot1x_NW_MsgTask_3: Oct 17 14:22:50.108: xx:xx:xx:xx:xx:xx key Desc Version FT - 0
*Dot1x_NW_MsgTask_3: Oct 17 14:22:50.108: xx:xx:xx:xx:xx:xx Retransmit 2 of EAPOL-Key M5 (length 131) for mobile xx:xx:xx:xx:xx:xx

*osapiBsnTimer: Oct 17 14:22:51.116: xx:xx:xx:xx:xx:xx 802.1x 'timeoutEvt' Timer expired for station xx:xx:xx:xx:xx:xx and for message = M5
*Dot1x

The controller should then deauthenticate the client, because the client never acknowledges M5 with an M6 back to the AP/WLC. That happens here:
*Dot1x_NW_MsgTask_3: Oct 17 14:22:51.116: xx:xx:xx:xx:xx:xx Retransmit failure for EAPOL-Key M5 to mobile xx:xx:xx:xx:xx:xx, retransmit count 3, mscb deauth count 0
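To compare the actual retry spacing with the configured timers, the following Python sketch pulls the timestamps out of the retransmit and failure lines quoted above and prints the gap between them. The helper names `timestamps` and `gaps_seconds` are assumptions made for illustration; only the log lines come from the debug output.

```python
import re
from datetime import datetime

# Retransmit and failure lines copied from the controller debug above.
DEBUG_LINES = [
    "*Dot1x_NW_MsgTask_3: Oct 17 14:22:49.092: xx:xx:xx:xx:xx:xx Retransmit 1 of EAPOL-Key M5 (length 131) for mobile xx:xx:xx:xx:xx:xx",
    "*Dot1x_NW_MsgTask_3: Oct 17 14:22:50.108: xx:xx:xx:xx:xx:xx Retransmit 2 of EAPOL-Key M5 (length 131) for mobile xx:xx:xx:xx:xx:xx",
    "*Dot1x_NW_MsgTask_3: Oct 17 14:22:51.116: xx:xx:xx:xx:xx:xx Retransmit failure for EAPOL-Key M5 to mobile xx:xx:xx:xx:xx:xx, retransmit count 3, mscb deauth count 0",
]

# Matches the "HH:MM:SS.mmm: " part of each debug line.
STAMP_RE = re.compile(r"(\d{2}:\d{2}:\d{2}\.\d{3}): ")


def timestamps(lines):
    """Extract the time of day from each debug line that has one."""
    found = []
    for line in lines:
        match = STAMP_RE.search(line)
        if match:
            found.append(datetime.strptime(match.group(1), "%H:%M:%S.%f"))
    return found


def gaps_seconds(stamps):
    """Seconds elapsed between consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


gaps = gaps_seconds(timestamps(DEBUG_LINES))
for number, gap in enumerate(gaps, start=1):
    print(f"gap {number}: {gap:.3f} s")
```

In this excerpt the retries are roughly one second apart, which is worth comparing against the EAPOL-Key Timeout value shown in the controller config.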

The next authentication completes without any issues. After the deauth, the client triggers a new association and authenticates successfully. I can see M1 through M4, then the RUN state:
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.757: xx:xx:xx:xx:xx:xx Starting key exchange to mobile xx:xx:xx:xx:xx:xx, data packets will be dropped
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.757: xx:xx:xx:xx:xx:xx Sending EAPOL-Key Message to mobile xx:xx:xx:xx:xx:xx state INITPMK (message 1), replay counter 00.00.00.00.00.00.00.00

*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.757: xx:xx:xx:xx:xx:xx Allocating EAP Pkt for retransmission to mobile xx:xx:xx:xx:xx:xx
*dot1xSocketTask: Oct 17 14:22:54.767: xx:xx:xx:xx:xx:xx validating eapol pkt: key version = 2
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.767: xx:xx:xx:xx:xx:xx Received EAPOL-Key from mobile xx:xx:xx:xx:xx:xx
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.767: xx:xx:xx:xx:xx:xx Ignoring invalid EAPOL version (1) in EAPOL-key message from mobile xx:xx:xx:xx:xx:xx
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.767: xx:xx:xx:xx:xx:xx key Desc Version FT - 0
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.767: xx:xx:xx:xx:xx:xx Received EAPOL-key in PTK_START state (message 2) from mobile xx:xx:xx:xx:xx:xx
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.767: xx:xx:xx:xx:xx:xx Encryption Policy: 4, PTK Key Length: 48
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.767: xx:xx:xx:xx:xx:xx Successfully computed PTK from PMK!!!
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.767: xx:xx:xx:xx:xx:xx Received valid MIC in EAPOL Key Message M2!!!!!
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.767: xx:xx:xx:xx:xx:xx Compare RSN IE in association and EAPOL-M2 frame(Skip pmkIdLen:0,and grpMgmtCipherLen:0)
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.767: xx:xx:xx:xx:xx:xx Dumping RSNIE received in Association request
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.767: 00000000: 30 14 01 00 00 0f ac 04 01 00 00 0f ac 04 01 00 0...............
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.767: 00000010: 00 0f ac 02 00 00 ......
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.767: xx:xx:xx:xx:xx:xx Dumping RSNIE received in EAPOL M2 :
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.767: 00000000: 01 00 00 0f ac 04 01 00 00 0f ac 04 01 00 00 0f ................
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.768: 00000010: ac 02 00 00 ....
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.768: xx:xx:xx:xx:xx:xx Stopping retransmission timer for mobile xx:xx:xx:xx:xx:xx
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.768: xx:xx:xx:xx:xx:xx key Desc Version FT - 0
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.768: xx:xx:xx:xx:xx:xx key Desc Version FT - 0
*Dot1x_NW_MsgTask_3: Oct 17 14:22:54.768: xx:xx:xx:xx:xx:xx Sending EAPOL-Key Message to mobile xx:xx:xx:xx:xx:xx state PTKINITNEGOTIATING (message 3), replay counter 00.00.00.00.00.00.00.01
*dot1xSocketTask: Oct 17 14:22:55.270: xx:xx:xx:xx:xx:xx validating eapol pkt: key version = 2
*Dot1x_NW_MsgTask_3: Oct 17 14:22:55.270: xx:xx:xx:xx:xx:xx Received EAPOL-Key from mobile xx:xx:xx:xx:xx:xx
*Dot1x_NW_MsgTask_3: Oct 17 14:22:55.270: xx:xx:xx:xx:xx:xx Ignoring invalid EAPOL version (1) in EAPOL-key message from mobile xx:xx:xx:xx:xx:xx
*Dot1x_NW_MsgTask_3: Oct 17 14:22:55.270: xx:xx:xx:xx:xx:xx key Desc Version FT - 0
*Dot1x_NW_MsgTask_3: Oct 17 14:22:55.270: xx:xx:xx:xx:xx:xx Received EAPOL-key in PTKINITNEGOTIATING state (message 4) from mobile xx:xx:xx:xx:xx:xx
*Dot1x_NW_MsgTask_3: Oct 17 14:22:55.270: xx:xx:xx:xx:xx:xx Stopping retransmission timer for mobile xx:xx:xx:xx:xx:xx
*Dot1x_NW_MsgTask_3: Oct 17 14:22:55.270: xx:xx:xx:xx:xx:xx Freeing EAP Retransmit Bufer for mobile xx:xx:xx:xx:xx:xx
*Dot1x_NW_MsgTask_3: Oct 17 14:22:55.270: xx:xx:xx:xx:xx:xx apfMs1xStateInc

 

EAP Controller Config:

EAP-Identity-Request Timeout (seconds)........... 30
EAP-Identity-Request Max Retries................. 2
EAP Key-Index for Dynamic WEP.................... 0
EAP Max-Login Ignore Identity Response........... enable
EAP-Request Timeout (seconds).................... 30
EAP-Request Max Retries.......................... 2
EAPOL-Key Timeout (milliseconds)................. 400
EAPOL-Key Max Retries............................ 4
EAP-Broadcast Key Interval....................... 3600
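If you need to compare these timers across several controllers, a small parser can help. The Python sketch below turns the dotted "name....... value" rows above into a dictionary. The function name `parse_dotted_rows` and the regular expression are illustrative assumptions about the layout as pasted here, not a Cisco-provided format specification.

```python
import re

# Output copied from the controller config shown above.
SHOW_ADVANCED_EAP = """\
EAP-Identity-Request Timeout (seconds)........... 30
EAP-Identity-Request Max Retries................. 2
EAP Key-Index for Dynamic WEP.................... 0
EAP Max-Login Ignore Identity Response........... enable
EAP-Request Timeout (seconds).................... 30
EAP-Request Max Retries.......................... 2
EAPOL-Key Timeout (milliseconds)................. 400
EAPOL-Key Max Retries............................ 4
EAP-Broadcast Key Interval....................... 3600
"""

# A setting name, a run of at least two dots, then a single value token.
ROW_RE = re.compile(r"^(?P<name>.+?)\.{2,}\s*(?P<value>\S+)\s*$")


def parse_dotted_rows(text):
    """Map each setting name to its value, converting plain digits to int."""
    settings = {}
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything that is not a dotted row
        value = match.group("value")
        settings[match.group("name").strip()] = int(value) if value.isdigit() else value
    return settings


settings = parse_dotted_rows(SHOW_ADVANCED_EAP)
print("Group key rotation interval (s):", settings["EAP-Broadcast Key Interval"])
print("EAPOL-Key timeout (ms):", settings["EAPOL-Key Timeout (milliseconds)"])
print("EAPOL-Key max retries:", settings["EAPOL-Key Max Retries"])
```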

3 Replies

jonathga94
Level 1

When a wireless client goes through the 4-way handshake, two encryption keys are established. One encrypts unicast frames, and the other encrypts broadcast/multicast frames. The second one is the group temporal key (GTK), and it is the same for all clients connected to the same AP. Because it is shared, an eavesdropper could eventually crack it, so to protect it the key is rotated every hour by default.

What's happening in your logs is that the AP asks the client to update the group encryption key (M5) but never receives a reply (M6) from the client. As a result, the AP kicks the client off the network, and the client has to reconnect and start a new session, which also refreshes the group key, just in an unfriendly way. So this looks like an issue on the client side.
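The rekey behaviour described above can be sketched as a tiny model. This is a simplified illustration in Python, not the AireOS implementation: the function `group_key_update`, its parameters, and the result strings are all hypothetical, and the retry count of 3 simply mirrors the "retransmit count 3" seen in the debug output.

```python
def group_key_update(client_acks, max_retries):
    """Model one group key rotation.

    client_acks is a hypothetical callable: given the transmission index
    (0 for the first M5), it returns True if the client answers with M6.
    """
    for attempt in range(max_retries + 1):  # first M5 plus the retries
        if client_acks(attempt):
            return ("rekeyed", attempt + 1)
    # No M6 after the last retry: the AP gives up and deauthenticates.
    return ("deauthenticated", max_retries + 1)


# A healthy client answers the first M5.
healthy = group_key_update(lambda attempt: True, max_retries=3)

# A client that misses the first M5 but answers the first retry.
slow = group_key_update(lambda attempt: attempt >= 1, max_retries=3)

# A client that never answers, as in the debug output in this thread.
silent = group_key_update(lambda attempt: False, max_retries=3)

print(healthy, slow, silent)
```

The silent case ends in a deauth after four transmissions of M5, which matches the pattern in the controller debug.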

There are a few things you can try. First, update the wireless drivers on your device, which might fix the problem. If not, you can increase the broadcast key rotation interval on your WLC. That makes rekeys less frequent, so the client has fewer chances to be disconnected for not responding. However, it also means each group key stays in use longer and has a higher chance of being cracked.

To change the broadcast key interval, use the command "config advanced eap bcast-key-interval <seconds>".
To review the configuration change, use the command "show advanced eap".

Thank you for the helpful insight!

I wanted to provide an update. We built a new Cisco AireOS lab network to test this. The same issue occurred on both AireOS 8.2 and 8.5 (WLC 2504, 3702s). We ran a Wireshark capture directly on the device's interface. Cisco AireOS was blocking the device.
 
Because this is a lab, we did not connect it to a RADIUS server or anything else. In the lab, AireOS was still blocking the device. We added the device's MAC address to MAC filtering under Security --> AAA --> MAC Filtering (Cisco ACS List).
 
Then we went back to the WLAN we had configured and, under Advanced --> FlexConnect, turned on FlexConnect Local Switching, FlexConnect Local Auth, and Learn Client IP Address.
 
Everything worked. The client has not disconnected once since then, and there have been no GTK timeouts. We tested this with a 120-second EAP-Broadcast Key Interval. If we turn FlexConnect Local Auth off, the GTK timeouts start again.
 
Why could this be happening in the first place?!?
 
 Thank you again!