
CSCwk48338 - AX AP doesn't accept clients

Man, there has to be another solution other than upgrading to IOS-XE 17.12.3. We can't use that release because we still have 900 2702's on our controllers. We have 1500 9130's and 2800 3802's.
I really wish Cisco would escalate this issue and get a patch put out. We are screwed if our 9130's continue to drop and no longer permit client connections. Wish we had known this was a problem before we migrated off of our old 8540's 4 months ago. Not to mention, these 9800 WLC's have crashed 4 times in 4 months, and the 8540's would go years without crashing. Not happy with Cisco right now.


Hi brannonsweet@cox.net yes 17.12.4 is affected by CSCwj93876 so it will leak memory (until it crashes) while nmsp is enabled.  So either install the SMU or use the workaround (no nmsp enable) if you are not using Spaces or anything else which requires nmsp.  I decided to use the workaround because we don't need nmsp.
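For anyone wanting to try the workaround, a minimal sketch of what it might look like on the 9800 CLI follows. Only `no nmsp enable` comes from this thread; the surrounding `configure terminal` and `end` lines are the usual IOS-XE config-mode steps, and this assumes nothing on your network (Spaces or anything else) depends on NMSP.

```
configure terminal
 no nmsp enable
end
```

Save the running config afterwards if the change should survive a reload.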

brannonsweet@cox.net, don't go blindly installing an SMU and/or APSP as a "quick fix", because either one can introduce more trouble than the one specific bug it fixes.

I agree with @Rich R; however, in my opinion DNA Spaces should be disabled regardless of whether it is being used or not. Keep it disabled until 17.12.5 comes out in 5 weeks' time, and then re-evaluate or re-enable.

CSCwk48338 is marked as Unreproducible.
That means Cisco TAC and developers have given up on it and are not even trying to fix it.

CSCwk12169 was supposed to address a similar issue but is definitely not fixed for the 9105 in 17.12.4 APSP4, as it was supposed to have been. TAC are now asking us to provide all the same troubleshooting, debug and OTA data that has already been provided, again!!! I suspect this is one of the reasons why TAC have not made 17.12.4 recommended yet. Not sure whether this bug affects 17.12.3 or not.

Leo, you mentioned a dozen bugs where 9k AP's stop passing traffic, essentially. Are all of the bugs that you are referring to on 17.x.x ios?


brannonsweet@cox.net wrote:
Leo, you mentioned a dozen bugs where 9k AP's stop passing traffic, essentially. Are all of the bugs that you are referring to on 17.x.x ios?

Yes.  

I would also like to clarify that this behaviour of "APs stop passing traffic" is not exclusive to Catalyst 9k APs.  It is very much present in the 2800/3800/4800/1560 APs as well.  

In the case of the 2800/3800/4800/1560, this bug has been present since 8.5.13X.X and has remained unfixed and will remain unfixed.  

And because the Wave 2 APs are end of software maintenance now, Cisco will stop working on any bugs that aren't fixed already. Those APs may still pick up some fixes provided for the 91xx APs where the code is common, but because those fixes are not tested on Wave 2 they could (and likely will in some cases) cause additional problems.

eglinsky2012

@Rich R- Regarding CSCwk48338, BST states that the bug is not seen in 17.12.3 or above. I can confirm we're not having it anymore after our upgrade to 17.12.4/APSP2. Or at least, we're not having those symptoms of clients not able to connect to 5 GHz. Our 9166s and 9130s are doing okay now.

However, I've not seen the CSCwk12169 issue. We have very few 9120s, but we do have a few hundred 9105s in a new residence hall, and last I checked a number of weeks ago, those seemed fine.

brannonsweet@cox.net, I would tentatively recommend 17.12.4*. Since we already established that your Wave 1 APs will work with it, it may be worth a try, especially since it should eliminate the 5 GHz issue at hand.

*However, since you also have 3800 series, beware of this bug: CSCwm31864. Do you have multiple WLCs, so that one could stay on the current code for the 3800s while another is upgraded to 17.12.4 for the 9130s?

More information on that bug on a thread I just started here: https://community.cisco.com/t5/wireless/cos-ap-memory-leak-causing-crash-cscwm31864/m-p/5224656#M277736

APSP4 was also supposed to address a number of kernel panic AP crashes we've been seeing on 17.12, but we're still seeing 1832's affected. TAC basically told us that because those APs are end of software maintenance now, they are not tested and will not be fixed on 17.12. So although Wave 2 APs are technically still supported, I would think twice about running them on it. The Wave 1 APs are probably safer because that code has barely changed for years. We're considering only migrating 91xx APs to 9800 now.

I appreciate your input on this Rich. Thank you.

So if I upgrade, I will no longer have issues with my 1566 9130's not accepting client connections, but I run the risk that my 2947 3802's could crash on a regular basis from a memory bug?

This isn't feeling reassuring. We really can't split up our schools to have the 3802's running on one WLC with one version of code, while all of our other schools with 2702's and 9130's run on a 2nd WLC with a different code. I mean it's theoretically possible, but it just doesn't seem right that we even have to look at doing this. We spend a massive amount of money on Cisco because it's supposed to be the best. 9130's and 3802's should run on the 9800 WLC's without having to play games to get different model access points to work with different code versions simultaneously.

I'm not frustrated at you, I am thankful for your input. My frustration lies with Cisco the company.

Thanks again for your time Rich.

brannonsweet@cox.net That seems to sum it up. Nothing's promised, though. You may not experience the memory issue with the 3802s, or the 5 GHz issue on the 9130s may not be fixed. Your mileage may vary, as they say. In my circumstances, at my organization, I feel we're better off on 17.12.4 with the 9100 series running well. The memory issue on the 2800s is pretty isolated, and as I shared in the other discussion I linked to previously, I have a way to find the ones experiencing the issue and reboot them, hopefully before they become problematic. So I'll limp along until a fix is available.


brannonsweet@cox.net wrote:
My frustration lies with Cisco the company.

Cisco is no longer in control of publishing "industry standard" documentation to the same quality that they used to do a decade ago.  Nor is Cisco able to publish a webpage free of "poisoned" links.  

Quality control is no longer on the table with overworked developers with tight schedules to keep and very strict KPIs.  The quality of the code has been so bad that certain bugs get triggered just by powering up the router, switch, WLC, AP, etc.  All relevant BUs know about this and their response has been a collective "meh".  

I manage more than 100 school sites and our solution is to reboot the APs every night using EnergyWise.

eglinsky2012
Long story short, you should be able to upgrade to 17.12.4 with APSP7 to both resolve your original issue and avoid the 3800 series memory leak.