Very Frustrating Mobility Issue with Honeywell Handhelds

LaDebs
Level 1

Dear all,
For the past few weeks I have been trying to track down a mobility issue. I have a local environment with good coverage and many types of devices. However, the corporate Honeywell handhelds (HH), which are managed through the SOTI platform, get stuck during roaming, and the only way to recover them is to reboot. I understand that this description points to the HH or to SOTI, but every log and every radioactive trace shows the same thing: it seems like the AP cannot "transfer" the client information to the next one. Other devices recover by themselves, but the HH do not.

LaDebs_1-1716572845501.png

LaDebs_0-1716572814463.png

I have attached a trace with the client log.
I would really appreciate any suggestions or similar experiences.
Debbie.

Accepted Solutions

LaDebs
Level 1

Well, I came back just to share the solution. I rebuilt the whole WLAN and tested each object that the production one uses. It seems (I don't know why) that the local policy was corrupted; after applying the local policy in the test network and in the production network, the problem was resolved. The post linked below was very useful for troubleshooting and for identifying that the problem was in the WLC.

Thanks for your indirect support @Scott Fella 

https://community.cisco.com/t5/wireless/802-1x-authentication-and-roaming-issues/td-p/2167420


8 Replies

Leo Laohoo
Hall of Fame

What is the firmware of the controller?
What is the model of the APs involved?
What are the uptimes of the WLC and the APs?

Hello Leo,

We upgraded to 17.9.4a last December 26, and the WLC has been up since that date. The zone is covered by 3802-E-A access points using ANT-2566 antennas. The APs have been restarted several times to rule out any issues. We also verified the state of the cabling; during that test the cable was disconnected for a few minutes.

Thanks. Debbie

 

    - Do an overall checkup of the 9800 wireless controller: run the CLI command show tech wireless and feed the output into Wireless Config Analyzer.

 M.
                                                                                   



-- Each morning when I wake up and look into the mirror I always say ' Why am I so brilliant ? '
    When the mirror will then always respond to me with ' The only thing that exceeds your brilliance is your beauty! '

(A x800 coupled with 17.9.4a.  What could possibly go wrong.)

Let me start with this:  The 2800/3800/4800/1560/6300 family has a hardware design defect.  A particular wireless chip used across the entire family routinely crashes.  Combined with poor coding work, Cisco has been unable to fix this or apply a workaround since the behaviour first emerged in the early days of 8.5MR4.  One of the telltale signs is that the 2800/3800/4800/1560/6300 randomly drops packets or traffic.  No amount of "reboot the AP" is going to fix the issue.  I have compiled a list of Bug IDs and they can be found HERE.

When the 9800 first emerged, it was hoped that new coding would help.  Unfortunately, the bugs discovered in AireOS were soon updated with IOS-XE version numbers.  And the list (bugs only affecting 2800/3800/4800/1560/6300 in IOS-XE) is getting longer and longer.  

On the "bright side" (for lack of a better description), I have found a number of "candidate" bugs that smell very similar to what is going on now.  They are: 

CSCwi18057: 4-way handshake failure, missing M3 packet.
CSCwk17514:  C9800 WLC clients are unable to connect due to M3 failure
CSCwa25735: AP1832 does not forward packets to radio, adding serviceability log to track this issue
CSCwi49862: AP transmitting M1 after 3 seconds (HIDDEN)
CSCwj04146: 2800/3800/4800/1560/6300 AP not sending traffic over the air when using dot1x (HIDDEN)
CSCwh74663: 2800/3800/4800/1560/6300 not sending QoS data frames downstream
CSCwi69696: Access point experiencing random drops in traffic towards wireless clients

By the way, what is the model of the WLC?  If it is a 9800-80, how many APs are there?  If the WLC is a 9800-80, is there an SSID with web auth?

 

Hello Leo

The WLC is a 9800-40, with 30 3802E APs and 45 9120I APs (for the office environment) joined to it. There is only one webauth SSID, for guest users, and only a few clients are connected to it. The affected SSID uses FT 802.1X EAP authentication. I'm starting a test: I asked for a factory-default HH and connected it to the network without SOTI enrollment. I'll share the results soon.

Deb.

marce1000
VIP

 

         - I advise going to 17.9.5 and testing again.

 M.




LaDebs
Level 1

Well, weird news. Yesterday I ran the test using a non-enrolled handheld and the issue persists. I also walked through other areas with good coverage served by 9120i APs, with the same result.

 

LaDebs_0-1717011976855.png

2024/05/29 13:03:29.317547534 {wncd_x_R0-1}{1}: [sanet-shim-translate] [16275]: (ERR): 0c23.699c.6f76 Failed to sanet session handle from auth FSM context
2024/05/29 13:03:29.317547886 {wncd_x_R0-1}{1}: [client-iplearn] [16275]: (ERR): MAC: 0c23.699c.6f76 Failed to fetch client vlan
2024/05/29 13:03:29.317548276 {wncd_x_R0-1}{1}: [client-orch-sm] [16275]: (ERR): MAC: 0c23.699c.6f76 CO failed to process Handoff message
2024/05/29 13:03:29.317562338 {wncd_x_R0-1}{1}: [ewlc-infra-evq] [16275]: (ERR): 0c23.699c.6f76Failed to process: handoff, from: IP: <controller IP>, reason: MM_FAILED_TO_DECODE_MOBILITY_PAYLOAD, subtype: MM_HANDOFF_OK [v4-anchor: 0.0.0.0 , v4-foreign: 0.0.0.0 , v4-xfer: <inverted controller IP> , v6-anchor: 0000:0000:0000:0000:0000:0000:0000:0000 , v6-foreign: 0000:0000:0000:0000:0000:0000:0000:0000 , v6-xfer: 0000:0000:0000:0000:0000:0000:0000:0000 ]
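
For anyone sifting through a longer radioactive trace, a small filter like the one below might help pull out just the (ERR) lines for one client MAC, along with the handoff failure reason. This is only a sketch: the regular expression is inferred from the four lines above and may not match traces from other IOS-XE releases, and the names parse_trace_line and errors_for_client are made up for illustration (they are not part of any Cisco tool).

```python
import re

# Line layout inferred from the trace excerpt above (an assumption, not a documented format):
# <timestamp> {process}{n}: [module] [pid]: (LEVEL): message
TRACE_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"\{(?P<process>[^}]+)\}\{\d+\}: "
    r"\[(?P<module>[^\]]+)\] \[\d+\]: "
    r"\((?P<level>[A-Za-z]+)\): "
    r"(?P<message>.*)$"
)
# Failure reason as it appears in the handoff line, e.g. "reason: MM_FAILED_TO_..."
REASON_RE = re.compile(r"reason: (?P<reason>\w+)")


def parse_trace_line(line):
    """Return a dict of fields for one trace line, or None if it does not match."""
    match = TRACE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    reason = REASON_RE.search(record["message"])
    record["reason"] = reason.group("reason") if reason else None
    return record


def errors_for_client(lines, mac):
    """Collect the (ERR) records whose message mentions the given client MAC."""
    records = []
    for line in lines:
        record = parse_trace_line(line)
        if record and record["level"] == "ERR" and mac in record["message"]:
            records.append(record)
    return records
```

Feeding it the lines of a saved trace file together with the client MAC (for example errors_for_client(open(path), "0c23.699c.6f76"), where path is wherever the trace was saved) gives a list that can be grouped by module or by reason to see whether the same handoff failure repeats on every roam.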

Any clue?

Thanks a lot. Deb.
