Saravanan Lakshmanan
Cisco Employee

When APs flap repeatedly between two WLCs (running the same or different code), the environment is effectively uncontrolled. Every critical control-plane and data-plane queue and component is stress-tested in parallel by recurring AP code downloads, AP joins, mobility tunnels, roaming/web-auth/AAA instances, AVC/NetFlow/Bonjour, client traffic, and AP/client entries being created and deleted in the WLC's database. The ripple effect may cause an unexpected user experience until all APs settle on their expected WLC by themselves. This situation should be avoided.

How to avoid or contain continuous AP flapping between WLCs caused by a network outage or unplanned migration?
//The consequences of AP flapping are not a fault in themselves; they result from a WLC misconfiguration that can create an unintended incident. Per best practice, it is recommended to disable the AP-fallback option, so that when the primary goes down, the APs move to the secondary and stay there, irrespective of the primary's new status (up/down/not reachable).
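To illustrate why the AP-fallback setting matters when the primary keeps flapping, here is a minimal sketch. It is a toy model, not WLC behavior taken from any Cisco source: the function name `simulate_ap`, the tick-based timeline, and the assumption that the secondary is always reachable are all hypothetical.

```python
def simulate_ap(primary_up_timeline, fallback_enabled):
    """Return (history, moves) for one AP in a toy failover model.

    primary_up_timeline: list of bools, True when the primary WLC is reachable.
    Assumption: the secondary WLC is always reachable.
    """
    joined = "primary"
    history = []
    moves = 0
    for primary_up in primary_up_timeline:
        if joined == "primary" and not primary_up:
            target = "secondary"      # failover: the primary was lost
        elif joined == "secondary" and primary_up and fallback_enabled:
            target = "primary"        # fallback: the primary is back
        else:
            target = joined           # stay where we are
        if target != joined:
            moves += 1                # each move implies a re-join (and maybe a code download)
            joined = target
        history.append(joined)
    return history, moves


# The primary flaps: up, down, up, down, up, down, up.
timeline = [True, False, True, False, True, False, True]
history_on, moves_with_fallback = simulate_ap(timeline, fallback_enabled=True)
history_off, moves_without_fallback = simulate_ap(timeline, fallback_enabled=False)
print(moves_with_fallback, moves_without_fallback)  # 6 1
```

In this model the AP re-joins six times with fallback enabled and only once with it disabled, which is the containment effect described above.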

Different N+1 deployment outage/migration scenarios and how to handle them:

Issue #1: L2/L3 core/network outage.
//On the WLCs, disable AP fallback (preferably keep it disabled at all times), then globally disable the 2.4 GHz and 5 GHz networks and the WLC's data uplink to avoid a packet storm until the outage is resolved. Once restoration is confirmed, enable the WLC's data port, wait for the APs to stabilize, enable the 2.4/5 GHz networks, and monitor stability.
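The order of operations for issue #1 can be sketched as two command lists. The function names and the `data_port` parameter are hypothetical, and the command strings are the AireOS CLI forms as I understand them; treat them as assumptions and verify them against the command reference for your release. Disabling the data port from an in-band session will likely drop that session, so this assumes console or service-port access.

```python
def containment_commands(data_port):
    """Commands to run on each WLC while the network outage is ongoing."""
    return [
        "config network ap-fallback disable",          # keep APs where they land
        "config 802.11b disable network",              # 2.4 GHz off globally
        "config 802.11a disable network",              # 5 GHz off globally
        f"config port adminmode {data_port} disable",  # shut the data uplink
    ]


def restore_commands(data_port):
    """Commands to run once the outage is confirmed resolved."""
    return [
        f"config port adminmode {data_port} enable",   # bring the uplink back first
        "show ap summary",                             # repeat until the AP count is stable
        "config 802.11b enable network",
        "config 802.11a enable network",
    ]


for line in containment_commands(1) + restore_commands(1):
    print(line)
```

AP fallback is deliberately left disabled in the restore list, in line with the recommendation to keep it off.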

Issue #2: Primary controller goes down.
//With AP fallback disabled on both primary and secondary: when all APs are on one (primary) 8510 and it fails, all APs are expected to fail over to the secondary 8510 and stay there. Once the issue is remedied on the primary WLC, identify a downtime window and move the expected APs back to the primary by enabling AP fallback on the secondary WLC.

Issue #3: Unplanned/emergency/uncontrolled migration due to an incident.
//This should be avoided at all costs unless a catastrophic issue has been triggered by, or root-caused to, wireless. Follow the steps below, starting with disabling AP fallback on the primary and secondary WLC, and go from there.

Issue #4: Planned and controlled migration.

Expected planned datacenter migration process per best practice:
#It is highly recommended to follow the pre-download procedure so that the AP image is downloaded ahead of time and downtime is reduced.
#Scenario: APs are joined/load-balanced across both the primary and secondary WLC.
#Make sure the AP HA settings (WLC name and IP) are properly configured on all the APs for this to work correctly.
#Plan for a scheduled outage.
#Pick the time of day with the least load for the customer's business, or midnight, when fewer clients and less traffic are expected.
#In general, AP fallback needs to be disabled when frequent AP flaps are expected, or in an N+1 salt-and-pepper deployment where a high volume of APs is expected to flap frequently between WLCs running different code.
#Disable AP fallback on the primary and secondary WLC, if it is enabled.
#On the primary and secondary WLC, globally disable the 2.4 GHz and 5 GHz networks and disable all WLANs. This disconnects all client traffic on the WLCs.
#Disable the primary's uplink data port. The APs move to the secondary WLC, which now has all the APs and no clients.
#Enable the primary WLC's data port (the APs that left will not fall back to the primary, since AP fallback is disabled on the secondary), then upgrade the primary WLC and reboot it.
#Enable AP fallback on the secondary WLC. The intended APs should now come back, download the new code, and join the primary WLC. Wait for all the intended/expected APs to join.
#Disable AP fallback on the secondary WLC.
#Follow the same steps for the secondary/other WLC.
#Disable the secondary (intended primary) WLC's uplink data port. Its existing APs move to the other (intended secondary) WLC, download the new code, and join that WLC.
#Enable the secondary (intended primary) WLC's uplink, then upgrade the secondary WLC and reboot it (the APs that left will not fall back, since AP fallback is disabled).
#Enable AP fallback on the primary (intended secondary). The intended APs join now; wait until all APs are stable.
#Disable AP fallback on the primary (intended secondary).
#On each WLC, globally enable the 2.4 GHz and 5 GHz networks, then enable the WLANs one after another.
#Monitor for stability.
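The AP movement in the steps above can be sketched as a small simulation. This is a toy model under stated assumptions, not Cisco's implementation: the `WLC` and `AP` classes, the `settle` and `upgrade` functions, the names `wlc-a`/`wlc-b`, and the version labels `old`/`new` are all hypothetical. It assumes fallback is governed by the setting on the WLC an AP is currently joined to, and it does not model radios, WLANs, or image pre-download.

```python
from dataclasses import dataclass


@dataclass
class WLC:
    version: str
    uplink_up: bool = True
    ap_fallback: bool = False   # disabled on both WLCs before starting


@dataclass
class AP:
    primary: str     # AP HA primary WLC name
    secondary: str   # AP HA secondary WLC name
    joined: str      # WLC the AP is currently joined to
    version: str


def settle(aps, wlcs):
    """Let every AP react to the current WLC state."""
    for ap in aps:
        current = wlcs[ap.joined]
        if not current.uplink_up:
            # Failover to the other configured WLC if it is reachable.
            other = ap.secondary if ap.joined == ap.primary else ap.primary
            if wlcs[other].uplink_up:
                ap.joined = other
        elif (ap.joined != ap.primary and current.ap_fallback
              and wlcs[ap.primary].uplink_up):
            ap.joined = ap.primary          # fallback to the configured primary
        if wlcs[ap.joined].uplink_up:
            ap.version = wlcs[ap.joined].version  # AP runs its WLC's code


def upgrade(target, other, new_version, aps, wlcs):
    wlcs[target].uplink_up = False      # disable target's uplink data port
    settle(aps, wlcs)
    assert all(ap.joined == other for ap in aps)   # every AP sits on `other`
    wlcs[target].uplink_up = True       # re-enable the port, upgrade, reboot
    wlcs[target].version = new_version
    settle(aps, wlcs)
    assert all(ap.joined == other for ap in aps)   # nothing moved: fallback is off
    wlcs[other].ap_fallback = True      # release the intended APs
    settle(aps, wlcs)
    wlcs[other].ap_fallback = False     # pin everything again


wlcs = {"wlc-a": WLC("old"), "wlc-b": WLC("old")}
aps = [AP("wlc-a", "wlc-b", "wlc-a", "old"),
       AP("wlc-a", "wlc-b", "wlc-a", "old"),
       AP("wlc-b", "wlc-a", "wlc-b", "old")]

upgrade("wlc-a", "wlc-b", "new", aps, wlcs)
upgrade("wlc-b", "wlc-a", "new", aps, wlcs)
print([(ap.joined, ap.version) for ap in aps])
```

In this model each AP ends on its configured primary running the new code, and APs only move at the moments the operator chooses, which is the point of toggling AP fallback per step.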

Comments
Freerk Terpstra
Level 7

Hi Saravanan,

I totally understand that you disable the "AP-fallback" feature while doing an upgrade to steer the AP behavior. However, having this feature disabled by default "per best practice" is new to me. If you ask me, whether enabling or disabling this feature is a good choice really depends on your environment. However, if it is "best practice", maybe it is an idea to add it to this document?

Saravanan Lakshmanan
Cisco Employee

//If no issues are expected, or an issue is expected to happen only once, then the issue has no consequences irrespective of the AP-fallback config.

The scenario below can occur at any time. (Most of the time, the WLC default config is the best practice for that setting.)
//When one WLC in an N+1 salt-and-pepper deployment on 8500, 7500, or 5500 fails, whether due to a network flap, a WLC caught in a boot loop, or any incident that makes the WLC unreachable like clockwork at a certain frequency, it disrupts current production and introduces unnecessary stress conditions on the WLCs.
