11-09-2021 06:36 AM
I have migrated from an 8510 to a new 9800-40. Recently, at approximately noon every day, some APs drop from the 9800. I used an AP template to move the access points to the new controller, but I left the old controller configured as the secondary controller. I have the old controller powered off, so the APs don't drop for long. I have a case open with TAC, but they haven't been the most responsive. I have included a few log files they requested. AP models are 3702i and 3802i.
One odd thing I have noticed in the AP config: the ones that drop have a very long Controller Association Latency time, usually 4-5 minutes. I'm not sure exactly what that means.
02-08-2022 12:48 PM
Okay... just making sure that the APs are not trying to join another controller.
02-08-2022 12:44 PM
Did you remove the native VLAN configuration under the WLC uplinks? You also need to remove any native VLAN config from the switch-side port configuration.
Also, I am curious why a /16 subnet is being used on the WMI interface. If all the APs are in one site, remember that with more than 100 APs it is recommended to have a dedicated AP management VLAN, so that the WMI interface is isolated. If the APs are at a remote site, they can register to the WLC only if they can reach the SVI for VLAN 2. So make sure that you advertise option 43 correctly.
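For reference on the option 43 point, here is a minimal Python sketch of how the value is commonly encoded for Cisco lightweight APs: a type byte of 0xf1, a length byte equal to 4 times the number of controllers, then each controller IPv4 address as 4 bytes. The function name and the example addresses are hypothetical, not taken from this thread; check the resulting string against your own DHCP server's syntax before using it.

```python
import ipaddress


def cisco_option43_hex(controller_ips):
    """Return the option 43 hex string: 0xf1, length, then packed IPv4 addresses.

    controller_ips is a list of controller management IPs (hypothetical values
    in the example below; substitute your own WMI address).
    """
    if not controller_ips:
        raise ValueError("at least one controller IP is required")
    # Each IPv4 address contributes exactly 4 bytes to the value.
    packed = b"".join(ipaddress.IPv4Address(ip).packed for ip in controller_ips)
    if len(packed) > 255:
        raise ValueError("too many controllers for a one-byte length field")
    return "f1" + format(len(packed), "02x") + packed.hex()


if __name__ == "__main__":
    print(cisco_option43_hex(["192.168.10.5"]))  # f104c0a80a05
```

With two controllers, the length byte becomes 08 and the second address is appended directly after the first.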
02-08-2022 01:02 PM
No, I haven't. Why is that bad if the device is reachable that way?
When you say site, do you mean a site tag or the physical sites the APs are at? I don't have an AP management VLAN and never have.
02-08-2022 11:16 PM
@jasonmeyer just in case: if the AP mode is FlexConnect, are you aware that it is best practice not to use site tags with more than 100 APs?
02-09-2022 07:30 AM
APs are in local mode.
02-09-2022 07:58 AM
So your APs are in local mode. Are they in the same physical location as the controller? Did you isolate whether the issue is with specific access points or with every single AP? If your APs are in the same location, have you tried placing access points on the same subnet as your controller management interface? Is there any firewall between the APs and the controller?
I want to make sure that you don't have any discovery methods that can make an AP search for an existing controller on the network. This can make troubleshooting very difficult if the AP has the information of an existing controller and decides to move. Even if the AP tries to move and falls back to the 9800, you will see the AP disassociate. Removing any HA entries on the AP and any discovery methods in DHCP, DNS, or UDP forwarding will help. Mobility groups also share that information with access points, so check whether you have created a mobility peering between your 9800 and any AireOS controller or another 9800.
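As one way to check the DNS discovery method mentioned above, here is a small Python sketch. It assumes the usual convention that APs look up the host name CISCO-CAPWAP-CONTROLLER under the domain handed out by DHCP; the function name and the `resolver` parameter are hypothetical and exist only so the lookup can be swapped out. Run it from a host that uses the same DNS servers and domain as the APs.

```python
import socket


def capwap_dns_target(domain, resolver=socket.gethostbyname):
    """Return (name, ip) for the CAPWAP discovery record, or (name, None).

    domain is the DNS domain the APs receive from DHCP (an assumption here;
    use your own). resolver is any callable that maps a host name to an IP.
    """
    name = f"CISCO-CAPWAP-CONTROLLER.{domain}"
    try:
        return name, resolver(name)
    except OSError:
        # socket.gaierror is a subclass of OSError: the record does not resolve.
        return name, None


if __name__ == "__main__":
    name, ip = capwap_dns_target("example.org")
    if ip is None:
        print(f"{name} does not resolve; DNS discovery is not in play")
    else:
        print(f"{name} resolves to {ip}; confirm this is the controller you expect")
```

If the record resolves to the old controller, that is a discovery path worth removing or updating.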
02-09-2022 08:08 AM
02-09-2022 08:39 AM
Now I recall this is a school. Why are you not using FlexConnect? Could this also be an issue with congestion over the WAN? Local mode sends all traffic back to the controller over your WAN. Even if you wanted traffic to come back to the site where the controller(s) are located, having APs in FlexConnect mode would be better. The maximum is 100 APs per FlexConnect group, but you can have multiple FlexConnect groups if you want. This might keep your APs stable, but it's really up to you. If you have more than 100 access points in a site, you can divide the site along boundaries where there is little or no roaming, for example by separating the main building from any outside buildings, gyms, etc. Keep in mind that roaming is supported within a FlexConnect group, but roaming to a different FlexConnect group will cause a re-auth.
02-09-2022 08:49 AM
02-09-2022 09:30 AM
Well, that can also be something in the WAN. With FlexConnect, your APs will stay online and are not sensitive to congestion or disruption on the WAN. You need to look at your overall traffic flow and also at how the wired infrastructure is laid out in each school. The reason I say this is that all wired traffic hits the switch, and any traffic not destined for the local site egresses that site's router toward its destination. FlexConnect follows the same concept: traffic is placed on the local switch and the infrastructure handles the routing. Of course, the WLAN has to be defined for local switching in order to drop traffic locally. As an example, maybe the building where the controller is located is the egress point for internet. The guest SSID can still be defined as local, so all guest traffic would come back to the controller.
Now back to FlexConnect and having more than 100 APs. This is something you would need to look at closely. There are places where you don't really care about roaming, such as from the basement to the 1st floor, or between the main building and any buildings or structures outside it. This is how many have worked around the 100 AP maximum. They might break up floors, for example floors 1-5 and 6-10 in a high-rise building.
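To illustrate the idea of breaking a site up along roaming boundaries, here is a rough Python sketch. It assumes you can label each AP with a zone (a floor range, an outbuilding, and so on); the function name, the zone labels, and the AP names are hypothetical. It only plans the grouping and does not touch the controller.

```python
from collections import defaultdict

MAX_APS_PER_GROUP = 100  # the limit discussed in this thread


def split_into_groups(ap_zones, limit=MAX_APS_PER_GROUP):
    """Group APs by zone, then cap each group at `limit` APs.

    ap_zones maps AP name -> zone label. APs from different zones are never
    mixed, so group boundaries follow the places where roaming is unlikely.
    """
    by_zone = defaultdict(list)
    for ap, zone in sorted(ap_zones.items()):
        by_zone[zone].append(ap)

    groups = {}
    for zone, aps in sorted(by_zone.items()):
        # A zone with more than `limit` APs is split into numbered chunks.
        for start in range(0, len(aps), limit):
            groups[f"{zone}-{start // limit + 1}"] = aps[start:start + limit]
    return groups


if __name__ == "__main__":
    inventory = {f"main-ap{i:03d}": "main" for i in range(150)}
    inventory.update({f"gym-ap{i:02d}": "gym" for i in range(20)})
    for name, members in split_into_groups(inventory).items():
        print(name, len(members))
```

A zone that still exceeds the limit gets split arbitrarily by AP name here; in practice you would pick that split by hand, for instance by floor.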
So think about the traffic flow, then think about where roaming is not possible or where it's okay to re-auth. Re-auth is not bad for open or PSK networks. With 802.1X, does it matter when folks are walking with their laptops in their bag or with the lid shut? Anyway, that would be another project. It's not a bad one to get done, but there are changes that have to be made to migrate from local to FlexConnect. It's more work on the backend, and transparent to the users when done right.
02-09-2022 11:00 AM
02-09-2022 11:12 AM
It's a better design, especially if traffic stays local to the site.
02-09-2022 01:41 PM
02-09-2022 02:03 PM
Well, the other sites can be FlexConnect and the HS can stay in local mode. That would be better.
02-10-2022 06:28 AM