05-24-2024 10:53 AM
Dear all
For a few weeks I have been trying to identify a mobility issue. I have a local environment with good coverage and many types of devices; however, the corporate Honeywell handhelds (HH), managed through the SOTI platform, get stuck during roaming, and the only way to recover them is to reboot. I understand that my problem description points to the HH or SOTI, but in every log and every radioactive trace the output is the same. It seems like the AP cannot "transfer" the information to the next one. Other devices can recover by themselves, but the HH cannot.
I have attached a trace with the client log.
I would really appreciate any suggestions or similar experiences.
Debbie.
07-17-2024 12:49 PM
Well, I came back just to share the solution. I rebuilt the whole WLAN and tested each object that the production one uses. It seems (I don't know why) that the local policy was corrupted; after applying the local policy in the test network and in the production network, the problem was resolved. This post was very useful for troubleshooting and identifying that the problem was in the WLC.
Thanks for your indirect support @Scott Fella
https://community.cisco.com/t5/wireless/802-1x-authentication-and-roaming-issues/td-p/2167420
05-24-2024 06:50 PM
What is the firmware of the controller?
What is the model of the APs involved?
What are the uptimes of the WLC and the APs?
05-26-2024 12:17 PM
Hello Leo,
We upgraded to 17.9.4a last December 26, and the WLC has been up since that date. The zone is covered by 3802-E-A access points using ANT-2566 antennas. The APs have been restarted several times to rule out any issue there. We also verified the state of the cabling; during that test the cable was disconnected for a few minutes.
Thanks. Debbie
05-26-2024 12:27 PM
- Have an overall checkup of the 9800 wireless controller with the CLI command show tech wireless and feed the output into Wireless Config Analyzer.
M.
05-26-2024 06:45 PM - edited 06-12-2024 04:15 PM
(A x800 coupled with 17.9.4a. What could possibly go wrong.)
Let me start with this: the 2800/3800/4800/1560/6300 have a hardware design defect. A particular wireless chip used across the entire family routinely crashes. Combined with poor coding work, this has left Cisco unable to fix it or apply a workaround since the behaviour first emerged in the early days of 8.5MR4. One of the telltale signs is that the 2800/3800/4800/1560/6300 randomly drop packets or traffic. No amount of "reboot the AP" is going to fix the issue. I have compiled a list of Bug IDs and they can be found HERE.
When the 9800 first emerged, it was hoped that new coding would help. Unfortunately, the bugs discovered in AireOS were soon updated with IOS-XE version numbers. And the list (bugs only affecting 2800/3800/4800/1560/6300 in IOS-XE) is getting longer and longer.
On the "bright side" (for lack of a better description), I have found a number of "candidate" bugs smelling very similar to what is going on now. And they are:
CSCwi18057: 4-way handshake failure, missing M3 packet.
CSCwk17514: C9800 WLC clients are unable to connect due to M3 failure
CSCwa25735: AP1832 does not forward packets to radio, adding serviceability log to track this issue
CSCwi49862: AP transmitting M1 after 3 seconds (HIDDEN)
CSCwj04146: 2800/3800/4800/1560/6300 AP not sending traffic over the air when using dot1x (HIDDEN)
CSCwh74663: 2800/3800/4800/1560/6300 not sending QoS data frames downstream
CSCwi69696: Access point experiencing random drops in traffic towards wireless clients
By the way, what is the model of the WLC? If it is a 9800-80, how many APs are there? If the WLC is a 9800-80, is there an SSID with web auth?
05-28-2024 08:42 AM
Hello Leo
The WLC is a 9800-40 with 30 3802E APs and 45 9120I APs (for the office environment) joined. There is only one webauth SSID, for guest users, and only a few clients are connected to it; the affected SSID uses FT 802.1X EAP authentication. I'm starting a test: I asked for a factory-default HH and connected it to the network without SOTI enrollment. I'll share the results soon.
Deb.
05-24-2024 10:33 PM
- I advise going to 17.9.5 and testing again.
M.
05-29-2024 12:52 PM
Well, weird news. Yesterday I ran the test with a non-enrolled handheld and the issue persists. I also walked through other areas with good coverage served by 9120I APs, with the same result.
2024/05/29 13:03:29.317547534 {wncd_x_R0-1}{1}: [sanet-shim-translate] [16275]: (ERR): 0c23.699c.6f76 Failed to sanet session handle from auth FSM context
2024/05/29 13:03:29.317547886 {wncd_x_R0-1}{1}: [client-iplearn] [16275]: (ERR): MAC: 0c23.699c.6f76 Failed to fetch client vlan
2024/05/29 13:03:29.317548276 {wncd_x_R0-1}{1}: [client-orch-sm] [16275]: (ERR): MAC: 0c23.699c.6f76 CO failed to process Handoff message
2024/05/29 13:03:29.317562338 {wncd_x_R0-1}{1}: [ewlc-infra-evq] [16275]: (ERR): 0c23.699c.6f76Failed to process: handoff, from: IP: <controller IP>, reason: MM_FAILED_TO_DECODE_MOBILITY_PAYLOAD, subtype: MM_HANDOFF_OK [v4-anchor: 0.0.0.0 , v4-foreign: 0.0.0.0 , v4-xfer: <inverted controller IP> , v6-anchor: 0000:0000:0000:0000:0000:0000:0000:0000 , v6-foreign: 0000:0000:0000:0000:0000:0000:0000:0000 , v6-xfer: 0000:0000:0000:0000:0000:0000:0000:0000 ]
Any clue?
Thanks a lot. Deb.
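For anyone sifting through a long radioactive trace for lines like the ones above, here is a minimal Python sketch that pulls out the (ERR) lines for one client and counts them per module. The line layout in the regex is an assumption inferred only from the four lines quoted in this post, not from any Cisco format specification, and the names parse_line and errors_by_module are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout, inferred only from the lines quoted in this thread:
# <date> <time> {process}{n}: [module] [pid]: (LEVEL): message
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"\{(?P<process>[^}]+)\}\{\d+\}: "
    r"\[(?P<module>[^\]]+)\] \[(?P<pid>\d+)\]: "
    r"\((?P<level>\w+)\): (?P<msg>.*)$"
)
# Client MAC in dotted form, e.g. 0c23.699c.6f76 (no trailing boundary,
# because one quoted line has the MAC glued to the next word).
MAC_RE = re.compile(r"[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4}")
REASON_RE = re.compile(r"reason: (\w+)")


def parse_line(line):
    """Return a dict of fields for one trace line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    mac = MAC_RE.search(rec["msg"])
    reason = REASON_RE.search(rec["msg"])
    rec["mac"] = mac.group(0) if mac else None
    rec["reason"] = reason.group(1) if reason else None
    return rec


def errors_by_module(lines, mac=None):
    """Count (ERR) lines per module, optionally for a single client MAC."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec is None or rec["level"] != "ERR":
            continue
        if mac is not None and rec["mac"] != mac:
            continue
        counts[rec["module"]] += 1
    return counts


if __name__ == "__main__":
    # Two of the lines quoted above, copied as-is.
    sample = [
        "2024/05/29 13:03:29.317547886 {wncd_x_R0-1}{1}: [client-iplearn] "
        "[16275]: (ERR): MAC: 0c23.699c.6f76 Failed to fetch client vlan",
        "2024/05/29 13:03:29.317548276 {wncd_x_R0-1}{1}: [client-orch-sm] "
        "[16275]: (ERR): MAC: 0c23.699c.6f76 CO failed to process Handoff message",
    ]
    for module, n in errors_by_module(sample, mac="0c23.699c.6f76").items():
        print(module, n)
```

Pointing errors_by_module at the lines of a saved trace file (for example via open(path)) should show which modules complain most for the stuck client; the reason field makes strings such as MM_FAILED_TO_DECODE_MOBILITY_PAYLOAD easy to search for.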