05-07-2025 05:56 AM
Hi!
We are a small science institute with about 35-40 Cisco access points that are connected to the nearby University and its Cisco Wireless Controller (9800-40-A?). However, some of those APs lose their connection to the WLC or cannot find the WLC reliably. Sometimes they are connected and have clients on them, but the next morning the AP is flashing green/red. Sometimes a power cycle helps, more often it does not. It's not always the same set of APs having these issues.
The previous network setup was a VLAN 300 that was connected directly via the switch and dark fiber to the University. Then the University forced us to remove VLAN 300, because they wanted to get rid of that VLAN. So we set up a new VLAN 30, which is now behind our pfSense firewall (yeah, we are low on budget).
We have these types of APs:
We have no access to the controller and the guys at the University are sometimes unresponsive. There are times when 50% of the APs are offline, causing great frustration among the users about the unstable (or: not working) Wi-Fi.
The Firewall allows the necessary ports like 5246-5248, NTP, DNS, mDNS.
I'm running out of ideas and my debugging options are limited as I have no access to the controller. Currently 9 APs are "offline":
Please find the attached log file from console output of a CW9176I AP failing to find the controller. Any help/suggestions are appreciated!
Thanks!
Ingo
05-08-2025 01:16 AM
Nope, old stuff:
Cisco WS-C2960S-48LPS-L and Extreme X440G2-24p-10G4
05-08-2025 12:28 AM
When you had VLAN 300, did it work without any issues? Did the problems start after creating the new VLAN 30? Can you check the DHCP pool to see whether a large enough scope is configured?
05-08-2025 02:43 AM
Regarding prior use of VLAN 300: the feedback differs depending on whom you ask. The manager says that it worked fine before the migration; some users say that it didn't work back then either. As the VLAN 300 to VLAN 30 migration happened right when I started this job, I can't say much about how well it was working before the migration.
Meanwhile I have an estimate from our Account Manager and let me state it this way: having me walk around the house twice a week for 30 mins each to reset the APs is more expensive than ordering a virtual controller.
But until then I still have to deal with the APs and unresponsive admin of the controller... *sigh*
05-09-2025 01:02 AM
There is enough space in the pool. We have like 40 APs and the pool is configured to use the whole /24...
VLAN 30 was created after VLAN 300, but then again: out of 38 APs in VLAN 30, 29 are actually working as of now. 9 APs are not.
05-09-2025 05:19 AM
> There is enough space in the pool. We have like 40 APs and the pool is configured to use the whole /24...
Well it depends what DHCP server you are using ... There is a well known issue with Microsoft DHCP server that the APs will cause the server to mark addresses as "bad" and then the server runs out of usable IPs even though there should be plenty free. Cisco provided a fix for it on AireOS but have decided not to fix it on 9800. See the details at:
https://bst.cisco.com/bugsearch/bug/CSCvj14517
Although this is only known to happen with MS DHCP server it's possible that it could affect other products too.
So rather than assuming the addresses have not run out you should make sure <wink>
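One way to check this rather than assume it is to export the current leases from whatever DHCP server is in use and count what is really left. Below is a minimal Python sketch of that count. The pool `192.0.2.0/24` and the lease lists are made-up placeholder data (documentation addresses), and `free_addresses` is a hypothetical helper, not part of any DHCP product. Addresses the server has marked as bad or declined should be put into the `leased` list too, since the server will not hand them out.

```python
import ipaddress


def free_addresses(pool_cidr, leased, reserved=()):
    """Return the host addresses in pool_cidr that are neither leased nor reserved."""
    network = ipaddress.ip_network(pool_cidr)
    used = {ipaddress.ip_address(a) for a in leased}
    used |= {ipaddress.ip_address(a) for a in reserved}
    # hosts() skips the network and broadcast address of the subnet
    return [host for host in network.hosts() if host not in used]


# Placeholder data: replace with the real pool and the exported lease list,
# including any addresses the server flags as bad/declined.
leased = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
reserved = ["192.0.2.1"]  # e.g. the gateway

free = free_addresses("192.0.2.0/24", leased, reserved)
print(len(free), "addresses still free")
```

If the number of free addresses drops over time while the number of APs stays the same, that would point towards the kind of lease exhaustion described in the bug.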
05-08-2025 02:05 PM
Well your troubleshooting options are limited without access to the controller!
From the log it's clear that the AP is not getting any response to the discovery requests - presume those 4 WLC IPs it's trying are correct?
So either:
- the discovery requests do not reach the WLC
- the WLC does not respond to the requests
- the requests or responses are getting dropped by the network
Ideally you'd start with a packet capture on the WLC to make sure the requests are received and the WLC replies; then you'd at least know whether the problem is the WLC or the network.
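Without access to the WLC, a capture on your own side (for example on the firewall interface facing the University) can at least show whether discovery requests leave and whether any replies come back. Here is a small Python sketch that builds a pcap-style capture filter for that. The WLC addresses are placeholders (`192.0.2.x` documentation addresses) and `capwap_capture_filter` is a hypothetical helper name; the ports default to the CAPWAP UDP ports 5246/5247.

```python
def capwap_capture_filter(wlc_ips, ports=(5246, 5247)):
    """Build a pcap filter expression matching CAPWAP UDP traffic to/from the given WLCs."""
    if not wlc_ips:
        raise ValueError("at least one WLC address is required")
    port_expr = " or ".join(f"port {p}" for p in ports)
    host_expr = " or ".join(f"host {ip}" for ip in wlc_ips)
    return f"udp and ({port_expr}) and ({host_expr})"


# Placeholder WLC addresses: replace with the IPs the APs actually try.
capture_filter = capwap_capture_filter(["192.0.2.10", "192.0.2.11"])
print(capture_filter)
```

The resulting string can be pasted into a capture tool that accepts pcap filter syntax, such as tcpdump. Requests going out with no replies coming back would narrow it down to the path beyond your firewall or the WLC itself.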
What is the connectivity between you and the WLC? We did once have a problem with a customer using a 3rd party network service which had integrated IDS/IPS (which we only found out about after they had a similar complaint). It turned out the IDS/IPS was detecting the CAPWAP (UDP 5246/5247) as an "attack" and dropping the packets occasionally. They had to ask the network provider to exclude the CAPWAP traffic on the IDS/IPS - so check whether your "VLAN 300" has anything like that. If you see ping to the WLC working while CAPWAP isn't that would also point to something like this.
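To make the ping versus CAPWAP comparison explicit, here is a rough Python sketch of the reasoning. The function name and the wording of the results are my own; the two inputs are simply what you observe from the AP VLAN (does the WLC answer ping, and is any CAPWAP discovery response seen coming back).

```python
def classify_reachability(ping_ok, capwap_reply_seen):
    """Map two observations from the AP side to a likely area to investigate."""
    if capwap_reply_seen:
        # Discovery works, so any remaining problem is later in the join process.
        return "discovery path works; look at the join phase instead"
    if ping_ok:
        # Basic IP reachability is fine, but CAPWAP gets no answer.
        return ("CAPWAP is filtered or dropped in the path (IDS/IPS, firewall, NAT) "
                "or the WLC is not answering")
    # Neither works: could be routing, or ICMP is blocked as well.
    return "general reachability problem (routing, or ICMP is blocked too)"


print(classify_reachability(ping_ok=True, capwap_reply_seen=False))
```

This is only a first triage; it cannot tell a WLC that ignores the AP from a device in the path that drops the packets, which is why a capture on the WLC side is still needed for certainty.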
Otherwise you're really going to need the WLC owner to work with you ...
05-09-2025 07:26 AM
Hi Rich!
Well, 2 of the 4 IPs are not valid and are actively blocked in our firewall. According to the admin of the controller, these are IPs for another site of theirs and not reachable from our network, but he can't get them removed from the configuration for us. I can only accept this as a given fact.
Regarding the IPS/IDS idea: yeah, we had this idea as well. Or more exactly, the idea that something is blocking or discarding packets: because the APs are behind NAT, something like the WLC or another external firewall might be blocking or rate limiting our connections. But regarding our own IDS/IPS: neither the IPs nor the ports can be found in the IDS logs...
In the meantime the WLC admin wrote an email today saying that yesterday he updated the second controller to a current version. The controllers were initially on an older version that rejected the 9176 APs; later one controller was updated, and now the second one as well.
We'll see if this update brings improvements...