Rebooting the AP after it's disconnected from WLC

sroic
Level 1

Hi, we have a central 9800 controller in AWS and APs in FlexConnect mode running in our offices. They switch traffic locally and use IPsec to reach the WLC. Our main SSID is dot1x and uses central authentication with ISE, which is also in AWS.

The issue occurs when, e.g., an ISP fails in an office and the APs get disconnected from the WLC. If the outage lasts no longer than a couple of minutes, the APs usually reconnect and everything is fine. But if it lasts longer, the APs don't reconnect, and at that point only the PSK SSIDs work; dot1x stops authenticating new users. The retransmit timers under the AP Join profile are set to the maximum (Count: 8, Interval: 5 sec); afaik they cannot be set higher. I don't understand why the AP doesn't keep looping indefinitely with the reconnection attempts, or at least offer an option to configure this.

What we are left with at this point is to:

1. manually connect to the AP via SSH and try the capwap ap restart command, which sometimes works and sometimes doesn't.

2. manually connect to the AP/switch and reboot the whole AP. This is basically what we are doing right now.

 

Now, there are options to reboot the AP from the WLC and DNAC, but of course none of these work when the AP itself is not connected to the WLC.

I'm trying to find a way to reconnect an AP to the controller after it has been disconnected for longer than e.g. 5 min. Right now the only option I see is developing a custom script that tracks WLC logs and, after some timeout, goes to the switch/AP and reboots it. That seems like overkill, hard to fine-tune and hard to administer in the future. Is there some integrated option that I'm missing? How do you guys do it, and has this been an issue in your deployments?
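
For what it's worth, the custom-script idea can stay fairly small if the decision logic is kept separate from the log polling and from the actual reboot. Below is a minimal Python sketch of only that decision step. Everything in it is hypothetical: the function name aps_needing_reboot, the 5-minute grace period, the 20-minute cooldown and the AP names are illustrative assumptions. How the last-joined times are collected from the WLC and how the AP is power-cycled are deliberately left out.

```python
from datetime import datetime, timedelta


def aps_needing_reboot(last_joined, last_rebooted, now,
                       grace=timedelta(minutes=5),
                       cooldown=timedelta(minutes=20)):
    """Return the names of APs that look stuck and are due for a reboot.

    last_joined:   dict of AP name -> datetime the AP was last seen joined
    last_rebooted: dict of AP name -> datetime of the last reboot we triggered
    grace:         how long an AP may stay disconnected before we act (assumed)
    cooldown:      minimum time between two reboots of the same AP (assumed)
    """
    due = []
    for ap, seen in last_joined.items():
        if now - seen <= grace:
            continue  # still joined, or only briefly disconnected
        rebooted = last_rebooted.get(ap)
        if rebooted is not None and now - rebooted < cooldown:
            continue  # rebooted recently; give it time to rejoin
        due.append(ap)
    return sorted(due)


# Hypothetical sample data
now = datetime(2025, 3, 12, 10, 0)
last_joined = {
    "ap-office-1": now - timedelta(minutes=1),
    "ap-office-2": now - timedelta(minutes=12),
    "ap-office-3": now - timedelta(minutes=30),
}
last_rebooted = {"ap-office-3": now - timedelta(minutes=10)}

print(aps_needing_reboot(last_joined, last_rebooted, now))  # ['ap-office-2']
```

The cooldown is there so that a long ISP outage does not turn into a reboot loop: an AP that was already power-cycled is left alone until the cooldown expires.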

6 Replies 6

Scott Fella
Hall of Fame

That should not be the case. You do have high availability defined on the access points with the hostname and IP of the controller(s) you have in AWS? With this, the AP should always try to join the controller as long as the AP has a valid DHCP address. What you can do is gather data by consoling into the AP and capturing the output. Also make sure you don't have any DHCP option or DNS entry that might be pointing to another controller, or possibly to an IP that was once used by a controller. While you are doing this, you should also open a TAC case, because it is a good idea to have someone look at how you have everything configured. The HA for the APs can be defined on each AP or in the AP join profile.

-Scott
*** Please rate helpful posts ***

sroic
Level 1

We have 2 controllers in an N+1 setup, both added as primary and secondary in the AP join profile and also configured on each AP itself. We also use DHCP option 43 to point to the primary controller IP when booting up, nothing else.
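
Since option 43 carries the controller address as a hex string, it is worth confirming that the value handed out by DHCP really decodes to the current primary controller IP. The sketch below assumes the commonly documented Cisco sub-option layout (type 0xf1, a length byte equal to 4 times the number of controllers, then the IPv4 addresses). The function names and the example addresses are hypothetical, and the helper is not part of any Cisco tool.

```python
import ipaddress


def encode_option43(controller_ips):
    """Build the option 43 hex string for a list of controller IPv4 addresses."""
    addrs = [ipaddress.IPv4Address(ip) for ip in controller_ips]
    if not addrs:
        raise ValueError("at least one controller IP is required")
    payload = b"".join(a.packed for a in addrs)
    # Assumed layout: type 0xf1, length in bytes, then the packed addresses
    return (bytes([0xF1, len(payload)]) + payload).hex()


def decode_option43(hex_value):
    """Decode an option 43 hex string back into controller IPv4 addresses."""
    # Accept plain hex as well as dotted or colon-separated notation
    raw = bytes.fromhex(hex_value.replace(".", "").replace(":", ""))
    if len(raw) < 2 or raw[0] != 0xF1:
        raise ValueError("not a type 0xf1 sub-option")
    length = raw[1]
    payload = raw[2:2 + length]
    if length == 0 or length % 4 or len(payload) != length:
        raise ValueError("length byte does not match the payload")
    return [str(ipaddress.IPv4Address(payload[i:i + 4]))
            for i in range(0, length, 4)]


# Documentation-range addresses used purely as placeholders
print(encode_option43(["192.0.2.10"]))          # f104c000020a
print(decode_option43("f108c000020ac633640a"))  # ['192.0.2.10', '198.51.100.10']
```

Decoding the value from the DHCP scope and comparing it with the controller's current address is a quick way to rule out a stale option 43 entry.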

I'm also quite sure the APs don't retry the discovery process after the timeout I mentioned above, but I will test it once again.

I'm interested in whether this same issue happens to other engineers in their deployments and how they solve it. Maybe it doesn't and we have something misconfigured, but I don't see it.

 

@sroic > ...Maybe it doesn't and we have something misconfigured but I don't see it.

It's always useful to validate the configuration of the 9800 controller in AWS using the CLI command show tech wireless and feed the output into Wireless Config Analyzer. Check out all advisories! Use show tech wireless; do not use show tech-support for Wireless Config Analyzer.

  M.



-- Each morning when I wake up and look into the mirror I always say ' Why am I so brilliant ? '
    When the mirror will then always respond to me with ' The only thing that exceeds your brilliance is your beauty! '

Rich R
VIP

The APs should keep on re-trying, no special config needed.
What version of software are you using? Refer to TAC recommended link below.

> using IPsec to reach the WLC
Can you be more specific?  What are the 2 endpoints of the IPsec tunnel?
Problem is much more likely to be the IPsec than the CAPWAP - sounds like an SA timing out.

Check the WLC config as recommended by @marce1000 but this doesn't sound like a WLC config problem to me.

sroic
Level 1

Thank you for your input. We have an IPsec tunnel between the local FortiGate firewall and a Cisco router in AWS. I will check that link for packet loss etc.

Meanwhile, if it is the case that the AP keeps trying to reconnect to the WLC indefinitely, what is the purpose of these retransmit timers:

[Screenshot: sroic_0-1741772985364.png]

 

Well, that timer is in place for a reason, but maybe you have other issues on the FW. Maybe check for stale entries, or whether the FW starts blocking the discovery requests. There are ports that need to be allowed for the timers to function as intended.

-Scott