04-04-2024 11:21 AM
This is more of a public service announcement about 9130AXIs on 17.12.3. I have a TAC case open on this, but it appears that https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwh49406 is a symptom of some other bug involving cleanair on 2.4GHz with 9130AXIs. On 17.12.3, 9130AXI APs (in my case specifically the 9130AXI-B) can, under some conditions, log a syslog message about excessive cleanair errors on slot 0 and then appear on the WLC with both the 2.4GHz and 5GHz cleanair sensors enabled but down (the 5GHz sensor appears to be affected too, although I'm only seeing a syslog message about slot 0 on the AP). Manually re-enabling cleanair on the 5GHz band doesn't work in my case: the output of "show ap <name> config slot 1" just shows a message about an unrecoverable cleanair sensor error, and that output actually recommends reloading the AP. APs in this state also appear to randomly disjoin from and rejoin the WLC.
My hypothesis is that CSCwh49406 was the result of a developer putting in some debugging that bypassed syslog filters and was unintentionally left in 17.12.2. CSCwh49406 is fixed in 17.12.3, but the cleanair sensor problem they were apparently trying to resolve is still there. Presumably this also affected 17.12.2, but I had cleanair disabled on 2.4GHz when I was running 17.12.2 to work around CSCwh49406.
My workaround for this issue in 17.12.3 is the same as the workaround for CSCwh49406: disable cleanair on the 2.4GHz band globally ("no ap dot11 24ghz cleanair") and reload/reboot the affected APs.
In my case I'm on 17.12.3 for the ability to use a WPA2+3 transition WLAN on the 6GHz radio.
04-04-2024 01:55 PM - edited 04-04-2024 01:57 PM
We hit this bug too. Luckily it was only a building with 75 APs, and only one AP was doing the mischief.
This bug opened up a can of worms. First, one AP was utilizing 100% of the switch port (Tx). Next, two Avigilon CCTV cameras decided to join in the act and "replayed" the CleanAir broadcast, and we had a full-blown DDoS on our hands. Finally, the workaround did not help, so we ended up taking a gamble and upgraded our controller to 17.12.3.
As soon as the APs booted into 17.12.3, the link utilization for the entire building dropped and normalized.
We are currently evaluating 17.12.3 because this bug only started about 36 hours after we upgraded to 17.12.2. The link utilization from the offending AP first climbed to 15% after 36 hours, then to 45% after another 36 hours (72 hours of uptime), and then rocketed to 100% after another 36 hours on top of that.
04-04-2024 03:03 PM - edited 04-04-2024 03:03 PM
If you have 9130s, I'd keep cleanair disabled on 2.4GHz in 17.12.3. They fixed the syslog spam issue but apparently didn't fix the underlying issue with the cleanair sensor. If you run
show ap dot11 5ghz cleanair summary | i Down
show ap dot11 24ghz cleanair summary | i Down
you might see 9130s with cleanair down on either band, and those APs will become unstable after a while. The problem seems to originate on the 2.4GHz band, but I noticed cleanair down on 5GHz as well on many APs.
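If you save the full output of both commands to text, a rough Python sketch like the one below can list which APs show up as Down on both bands. It assumes the AP name is the first whitespace-separated column on each row and that AP names contain no spaces. That layout is my assumption, not something confirmed above, so check it against your own output first. The function names are made up for illustration.

```python
def aps_with_cleanair_down(show_output):
    """Return first-column tokens from lines that contain 'Down'.

    ASSUMPTION: the AP name is the first whitespace-separated field
    of each row in the saved 'show ap dot11 ... cleanair summary' text.
    """
    names = set()
    for line in show_output.splitlines():
        # Case-sensitive substring match, like "| i Down" on the WLC.
        if "Down" in line:
            names.add(line.split()[0])
    return names


def down_on_both_bands(out_24ghz, out_5ghz):
    """Return names that appear as Down in both saved outputs."""
    return aps_with_cleanair_down(out_24ghz) & aps_with_cleanair_down(out_5ghz)
```

Pass the saved 2.4GHz output as the first argument and the 5GHz output as the second. Anything returned by down_on_both_bands is a candidate for a reload.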
If you take a support bundle from the affected APs and look at the messages file you'll see this:
kernel: [*03/25/2024 16:05:16.1518] CLEANAIR: Excessive errors on slot 0. CleanAir turned off until disabled/re-enabled
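If you have a lot of support bundles to go through, a small Python sketch like this can scan a messages file for that line. The regex is built from the single sample line above, so the timestamp layout is an assumption, and the function names are hypothetical.

```python
import re
from collections import Counter

# Pattern derived from the one log line quoted above; other releases
# may format the timestamp or message differently (assumption).
CLEANAIR_ERR = re.compile(
    r"\[\*(?P<ts>[^\]]+)\]\s+CLEANAIR: Excessive errors on slot (?P<slot>\d+)"
)


def find_cleanair_errors(lines):
    """Return (timestamp, slot) tuples for every matching line."""
    hits = []
    for line in lines:
        match = CLEANAIR_ERR.search(line)
        if match:
            hits.append((match.group("ts"), int(match.group("slot"))))
    return hits


def count_by_slot(hits):
    """Count how many matching lines were seen per radio slot."""
    return Counter(slot for _, slot in hits)


def scan_file(path):
    """Scan one extracted messages file and return its hits."""
    with open(path, errors="replace") as handle:
        return find_cleanair_errors(handle)
```

Calling scan_file on the messages file from each bundle and checking whether the result is non-empty is enough to flag an AP. count_by_slot shows whether anything other than slot 0 ever logs the error.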
The workaround is to turn it off globally on 2.4GHz and reboot all the affected 9130s.