05-07-2025 05:56 AM
Hi!
We are a small science institute with about 35-40 Cisco access points that are connected to the nearby University and its Cisco Wireless Controller (9800-40-A?). However, some of those APs lose their connection to the WLC or cannot find the WLC reliably. Sometimes they are connected and have clients on them, but the next morning the AP is flashing green/red. Sometimes a power cycle helps, more often it does not. It's not always the same set of APs having these issues.
The previous network setup was a VLAN 300 that was connected directly via the switch and dark fiber to the University. Then the University forced us to remove VLAN 300 because they wanted to get rid of that VLAN. So we set up a new VLAN 30, which is now behind our pfSense firewall (yeah, we are low on budget).
We have these types of APs:
We have no access to the controller and the guys at the University are sometimes unresponsive. There are times when 50% of the APs are offline, causing great frustration among the users about the unstable (or: not working) Wi-Fi.
The Firewall allows the necessary ports like 5246-5248, NTP, DNS, mDNS.
I'm running out of ideas and my debugging options are limited as I have no access to the controller. Currently 9 APs are "offline":
Please find the attached log file from console output of a CW9176I AP failing to find the controller. Any help/suggestions are appreciated!
Thanks!
Ingo
05-21-2025 05:14 AM
FYI, it seems that we solved our issue with the APs not being able to join the controller.
Not really sure what did the trick, but we saw some "Single:No_Traffic" states in the pfSense firewall. This led to the assumption that NAT was not working correctly: one direction worked (AP to controller), but the other did not (controller to AP).
I then added another NAT rule in front of our general rule, which does NAT for the whole 10.0.0.0/8 network to non-RFC1918 networks (i.e. the Internet). As soon as I added that extra rule for 10.10.30.0/24 to the specific destination network, the AP in question suddenly worked.
Why this did the trick for the missing 4 APs while it worked without that additional rule for all the other 34 APs: absolutely no idea! Maybe the Elders of the Internet will know, though...
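To illustrate the rule ordering described above, here is a minimal Python sketch of first-match NAT rule selection. It is only a model, not pfSense's actual implementation: the rule names, the `!rfc1918` marker, and the controller network 198.51.100.0/24 are hypothetical placeholders (the real destination network is not given in this thread). It assumes rules are evaluated top-down and the first match wins.

```python
import ipaddress

# The three RFC 1918 private ranges, listed explicitly.
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_rfc1918(addr):
    """Return True if addr lies in one of the RFC 1918 ranges."""
    return any(addr in net for net in RFC1918)


def first_matching_rule(src, dst, rules):
    """Return the name of the first rule matching src -> dst, else None."""
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    for rule in rules:
        if s not in ipaddress.ip_network(rule["source"]):
            continue
        dest = rule["destination"]
        if dest == "!rfc1918":            # hypothetical marker: any non-RFC1918 target
            if is_rfc1918(d):
                continue
        elif d not in ipaddress.ip_network(dest):
            continue
        return rule["name"]
    return None


# Hypothetical rule set: the specific AP rule sits in front of the general one.
RULES = [
    {"name": "ap-to-wlc", "source": "10.10.30.0/24",
     "destination": "198.51.100.0/24"},   # placeholder controller network
    {"name": "general-outbound", "source": "10.0.0.0/8",
     "destination": "!rfc1918"},
]

print(first_matching_rule("10.10.30.17", "198.51.100.10", RULES))  # ap-to-wlc
print(first_matching_rule("10.10.30.17", "8.8.8.8", RULES))        # general-outbound
```

The sketch only shows which rule would be selected for a given flow; it does not explain why the general rule alone was enough for most APs but not for the remaining ones.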
Anyway, thanks to all who replied and tried to help! Very much appreciated!
05-07-2025 06:09 AM
@inju hi, according to the logs, this looks network related. It seems your AP cannot discover the WLC properly.
1. Check the end-to-end network connection stability from the AP to the WLC.
2. Try configuring the WLC IP manually on the AP and see if that changes the situation.
05-07-2025 06:22 AM
Thanks @Kasun Bandara for your quick reply!
Sadly we have no access to the AP; at least the default Cisco/Cisco login does not work. I guess that once the AP has found its controller, it gets a new password from the controller, which is unknown to me.
In terms of networking we already increased the UDP timeouts for UDP First/UDP Single/UDP Multiple from 60/30/60 to 600/300/600 but without improvement. Increasing UDP timeout was a recommendation of the University controller admin, though.
However, 29 of 38 APs are indeed online, so I won't assume that it is a network problem. Basically the network is the same for all APs...
05-07-2025 06:43 AM
@inju hi, mm ok. I assumed all your APs were randomly going offline, which may be due to limited network bandwidth or instability. But if the network is stable as you say, you can rule that out.
According to the debug log, the AP has difficulty discovering and finding the WLC, so I suspect an issue in WLC discovery. If the network is stable, the next point to check is DNS, which the AP uses to discover the IP of the WLC. You can bypass this by manually setting the WLC IP on the AP.
Is this also affecting other APs at the main campus site, or only your location?
05-07-2025 06:55 AM
Regarding 5246-5248:
The UDP ports needed are
5246 and 5247.
Are you sure you allow these ports?
MHM
05-08-2025 01:04 AM
Yes, the firewall reports no blockage, so I'm pretty sure. And at least all APs are within one subnet and right now 28 of 38 APs have no issues and are working. So for me there is no general issue in networking. That doesn't mean that there might not be issues, of course... but then again it would be helpful to know what I need to look for to catch those...
05-08-2025 07:01 AM
If only the control port is open and passing traffic, this works only if the AP uses central authentication with FlexConnect data (local switching). You need both ports open when you use central switching and central authentication.
Open port 5247.
MHM
05-09-2025 01:04 AM
From within the network nmap -sU -p 5246-5248 to the controller shows:
PORT STATE SERVICE
5246/udp open|filtered capwap-control
5247/udp open|filtered capwap-data
5248/udp open|filtered caacws
05-09-2025 03:59 PM
That doesn't tell you anything useful - it just means :
Nmap does not know for sure whether the port is open or being filtered. The
UDP, IP protocol, FIN, NULL, and Xmas scans classify ports this way
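As a rough illustration of why a UDP scan often cannot tell "open" from "filtered", here is a minimal Python sketch of a single UDP probe. It is not how Nmap is implemented; the function name `probe_udp` and the one-byte payload are hypothetical. The point is that silence looks the same whether a firewall dropped the probe or the service simply chose not to answer, which is likely for a CAPWAP controller receiving a datagram that is not a valid CAPWAP message.

```python
import socket


def probe_udp(host, port, payload=b"\x00", timeout=1.0):
    """Send one UDP datagram and classify the port from what comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # connect() on a UDP socket lets the OS report ICMP errors back to us.
        sock.connect((host, port))
        try:
            sock.send(payload)
            sock.recv(2048)
            return "open"            # something answered
        except socket.timeout:
            return "open|filtered"   # silence: dropped, or service did not reply
        except ConnectionRefusedError:
            return "closed"          # ICMP port unreachable came back
```

Calling `probe_udp(wlc_ip, 5246)` with the controller address would most likely return `"open|filtered"` whether or not the firewall passes the traffic, so a packet capture on both sides of the firewall is a more telling test than a port scan.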
05-07-2025 07:10 AM
What's different about the APs that are offline vs. online? Are they a different model than the others? Are they in a different VLAN/subnet? Are they connected to a different access layer switch and/or a different distribution switch/router? Check interface statistics on all switch ports between the APs and the WLC: are there CRC/input/output errors?
05-08-2025 01:14 AM
Nope... all APs are within the same VLAN and subnet. They are of course connected to different switches. Some are connected to old WS-C2960S-48LPS-L switches (hey, don't blame me! I just took over what my predecessor in this job handed over to me! ;-)) and some to Extreme switches like the X440G2-24p-10G4.
There is no clear pattern. An AP that is running today might be offline tomorrow. Or vice versa.
The APs are powered via PoE and the switches can not provide full PoE power for all features, but again: the APs are able to run with that (without USB) until they fail eventually.
sh int | i Fast|input errors shows no interface errors, though...
05-07-2025 08:11 AM
@inju There seem to be lots of parameters beyond your control; you might be better off managing the APs yourself. You can, for instance, download and deploy the virtual 9800 controller for free with up to 50 APs:
https://software.cisco.com/download/home/286322605/type/282046477/release/Dublin-17.12.4
M.
05-07-2025 09:17 AM
Since you say that the APs are sometimes registered and sometimes drop off, I expect the config to be okay (but there is no harm in double-checking the WMI configuration of the WLC).
The logs say that the AP has sent the discovery request and is waiting for a response. What needs to be checked now is whether this discovery is reaching the controller or not, and the only way to confirm that is to have WLC access, check the join statistics, and take an RA trace.
Curious to know: when the APs drop off, and especially when they are sending Discovery Requests and not getting any response back, how does it go if you ping the WLC WMI from the core switch the AP is connected to, using VLAN 30 as the source with a repeat count of 100/1000?
As @Mark Elsen said, running a centralised deployment without having access to the WLC is not worthwhile. Better to explore the 9800-CL or CW9800M.
05-08-2025 01:26 AM
Being a Cisco guy (well, UC & Network engineer for >11 years), that would be my solution as well, but as stated in the OP we are short on budget (which is quite normal for the public sector), so having our own 9800 controller might not be the way to go over here. It is apparently totally OK that I walk through the buildings every other day and power-cycle all the non-working APs, wasting my working time for the 8 months this has now been going on. But investing some money to get a working infrastructure is a big thing that everyone (except me) wants to avoid. ¯\_(ツ)_/¯
Regarding the ping to WLC I need to investigate and come back later with the results. Thanks for the pointer...
05-07-2025 04:46 PM
Are the Cisco APs connected to Cisco Catalyst 9k switches?