12-01-2023 12:36 AM - edited 12-01-2023 01:30 AM
Hi all,
I manage a school environment with 4 locations. Until last year, each site had one Cisco 3504 and 20-30 1832 APs. Then a central vWLC was introduced to manage all 4 locations; it runs on Hyper-V at one of the 4 sites. With the move to the central controller, the WLANs were switched to FlexConnect, and the four 3504 controllers were kept at the locations as backup controllers (which would not actually be necessary with FlexConnect, but that's what we have).
Three SSIDs are configured for each location on the central controller, and the respective VLANs (which are identical at each location) are assigned via a FlexConnect Group. Authentication in every SSID is simply done with a WPA2 password.
Across the entire environment, clients now frequently fail to connect to the SSIDs despite multiple attempts. After debugging several clients on the central controller, I increasingly suspect a problem related to WPA2 authentication. If I set the authentication to open or WPA(1) as a test, there are no connection problems, and a client can switch quickly between the SSIDs without any issues. I have attached the logs from failed connection attempts. Maybe someone can help me with the problem.
att1.txt -> Android phone tries 2 or 3 times to connect to an SSID and fails. As a result, the device then switches back to a different SSID to which it was previously connected.
att2.txt -> Here the client is not able to connect at all and is rejected.
att3.txt -> Same behavior as att1.
12-01-2023 01:03 AM
- Have the attachments analyzed with: https://cway.cisco.com/tools/WirelessDebugAnalyzer/
M.
12-01-2023 01:29 AM
In att1 I see the client receive a valid DHCP address and go into RUN state:
*DHCP Socket Task: Nov 30 13:06:32.339: e4:84:d3:65:b2:0d 10.4.50.36 RUN (20) NO release MSCB
*DHCP Socket Task: Nov 30 13:06:32.339: e4:84:d3:65:b2:0d Assigning Address 10.4.50.36 to mobile
*DHCP Socket Task: Nov 30 13:06:32.339: e4:84:d3:65:b2:0d DHCP success event for client. Clearing dhcp failure count for interface management.
In att2, however, I see the client disconnected due to a change in WLAN:
*apfMsConnTask_7: Nov 30 13:18:31.754: e4:84:d3:65:b2:0d Deleting client immediately since WLAN has changed
And the same for att3 as for att2:
*apfMsConnTask_7: Nov 30 13:18:31.754: e4:84:d3:65:b2:0d Deleting client immediately since WLAN has changed
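To pull events like these out of a longer debug capture, a small script along the following lines could help. This is only a sketch: the regular expression is based purely on the line format quoted in this thread (`*<task>: <date> <time>: <client MAC> <message>`), and the names `parse_line` and `find_events` are made up for illustration, not part of any Cisco tool.

```python
import re

# Matches AireOS "debug client" lines in the form quoted in this thread:
#   *<task>: <Mon> <day> <hh:mm:ss.mmm>: <client MAC> <message>
# Assumption: other debug output may use a different layout and will be skipped.
LINE_RE = re.compile(
    r"^\*(?P<task>[^:]+): "
    r"(?P<ts>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) "
    r"(?P<msg>.*)$",
    re.IGNORECASE,
)


def parse_line(line):
    """Return a dict with task, ts, mac and msg, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def find_events(lines, needle):
    """Return the parsed lines whose message contains `needle` (case-sensitive)."""
    parsed = (parse_line(line) for line in lines)
    return [p for p in parsed if p and needle in p["msg"]]


# Two lines copied from the excerpts above.
SAMPLE = [
    "*DHCP Socket Task: Nov 30 13:06:32.339: e4:84:d3:65:b2:0d Assigning Address 10.4.50.36 to mobile",
    "*apfMsConnTask_7: Nov 30 13:18:31.754: e4:84:d3:65:b2:0d Deleting client immediately since WLAN has changed",
]

for ev in find_events(SAMPLE, "WLAN has changed"):
    print(ev["ts"], ev["mac"], ev["msg"])
```

Running the same search for other strings (for example "Assigning Address") against a full capture gives a quick timeline per client before feeding the log to the Wireless Debug Analyzer.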
12-01-2023 01:49 AM
Thanks for both answers. I had to re-upload att3; it was the wrong file. I didn't know the "Wireless Debug Analyzer" yet, only the Config Analyzer. I have now run the 3 logs through this tool, but I'm not really any wiser than before. @JPavonM : If the client ends up going into RUN state, it happens in the wrong SSID. It is supposed to connect to the SSID where it would get 10.4.55.x, and that fails. The client then switches to the SSID to which it was previously connected and receives a 10.4.50.x address.
12-01-2023 01:56 AM
In my opinion, the error always occurs in the same way. But I still don't know what the problem is.
12-01-2023 04:50 AM
>...In my opinion, the error always occurs in the same way. But I still don't know what the problem is.
- Note that for the Wireless Debug Analyzer it is advised to input a longer debug session (from a particular client) to get a better overview of what is happening.
M.
12-01-2023 02:13 AM
Check both the Policy profile forwarding VLAN and the Flex profile's vlan-2-wlan mapping.
12-01-2023 02:22 AM
What do you mean specifically by “Policy profile forwarding VLAN”?
12-01-2023 02:49 AM - edited 12-01-2023 02:51 AM
Your wlan-vlan map is associating GRS-HE-UT and GRS-BEB-UT with VLAN55, and that VLAN is not allowed on the trunk port:
*apfMsConnTask_7: Nov 30 13:02:10.865: e4:84:d3:65:b2:0d Processing assoc-req station:e4:84:d3:65:b2:0d AP:dc:f7:19:45:a0:e0-00 ssid : GRS-BEB-UT thread:dcef819d30
12-01-2023 02:54 AM
switchport trunk allowed vlan remove 1-49,51-54,56-59,61-249,251-4094
-> In the switch configuration you can see all the VLANs that are not allowed (removed). It is a Cisco Small Business switch with an unusual way of displaying the VLANs on the trunk.
12-01-2023 04:49 AM
@Heiko Kelling as previously said, you are missing VLAN55 on the trunk port of the switch, so any device connecting to those SSIDs won't join.
12-03-2023 07:42 AM - edited 12-03-2023 07:43 AM
@JPavonM you're misreading the switchport config. It's REMOVED VLANs not ALLOWED VLANS. So 55 is allowed - between 54 and 56.
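For anyone who wants to double-check the arithmetic, here is a minimal Python sketch that computes what is left on the trunk after that `remove` line. It assumes the trunk started with every VLAN from 1 to 4094 allowed before the removal was applied (I have not verified this default on that particular Small Business switch), and the function names are made up for illustration.

```python
def parse_vlan_ranges(spec):
    """Expand a list such as '1-49,51-54' into a set of VLAN IDs."""
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


def allowed_after_remove(removed_spec, first=1, last=4094):
    """Return the VLANs still allowed once `removed_spec` is taken out.

    Assumption: every VLAN in first..last was allowed before the removal.
    """
    return sorted(set(range(first, last + 1)) - parse_vlan_ranges(removed_spec))


# The list from the switchport line quoted above.
removed = "1-49,51-54,56-59,61-249,251-4094"
print(allowed_after_remove(removed))  # [50, 55, 60, 250]
```

Under that assumption VLAN 55 stays on the trunk, together with 50, 60 and 250.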
@Heiko Kelling
1. What version of software are you running on the vWLC?
2. What version of software are you running on the 3504 WLCs? (should be same version as central)
(hint: refer to TAC recommended versions link below - currently should be 8.10.190.0)
3. Do you have mobility configured between the vWLC and the site WLCs?
4. You say "WLANs were switched to FlexConnect" but it's APs not WLANs which are configured for FlexConnect. WLANs on a FlexConnect AP can be configured for central or local switching and authentication so it's the specifics of the WLAN configuration which matter.
5. Do you have Fast SSID changing enabled? Who knows why Cisco did not make this the default, but you have to enable it explicitly; otherwise any client switching between SSIDs will get access denied (your mention of switching SSIDs hints that this could be an issue).
https://www.cisco.com/c/en/us/td/docs/wireless/controller/8-10/config-guide/b_cg810/client_roaming.html#fast-ssid-changing
6. Make sure your AP groups and Flexconnect groups on the local and central WLCs are identical (same number of identically named and numbered WLANs and in the same order). There have been a number of bugs around this fixed over the years but we've still seen occasional problems if they are not the same - the AP gets "confused" sometimes.
7. Obviously you should have a separate flexconnect group for each site.
Going back to your design - if you have on-site and central WLCs why not make the on-site WLC the primary with the central WLC as backup?
12-04-2023 03:27 AM
Hello @Rich R , thank you very much for your detailed answer and your effort to help me!
to 1. Version 8.8.130.0 is currently running on the central controller and version 8.10.190.0 is running on the local controller at one of the schools. The reason I moved the access point at one school to the local controller was to be able to recreate the problem without affecting the other schools. First I had 8.8.130.0 locally and centrally and as a test I later updated to 8.10.190.0.
to 2. see answer 1.
3. No, I haven't done that at the moment because either all access points are always on the local controller or all access points are always on the central controller. There is no case where both controllers have active access points.
4. OK, sorry. You're right. The access points are all in FlexConnect mode. The SSIDs are all configured to use local switching. This is what the settings look like for each SSID.
5. I currently do not have Fast SSID change activated. However, this is the first setting I will try the next time I am at the school. I came across this again and again during my research and when evaluating my logs. In general, however, you can say that there are problems not only when switching between SSIDs, but also when devices completely reconnect without first being in one of the other SSIDs. I was still thinking about unchecking “Client Exclusion” because clients always seem to end up on the dynamic blacklist. The third point I kept finding was changing the “Management Frame Protection”. But I'm not sure if a change will make a difference. But these are all things that I would like to change and test step by step at the next on-site appointment.
6. I'm relatively sure that there are at least minor differences, but I've always been of the opinion that as soon as the APs switch to a different controller they reload all their settings...or is that not the case?
7. I've already thought about it. I'll plan that for the next appointment too.
"Going back to your design - if you have on-site and central WLCs why not make the on-site WLC the primary with the central WLC as backup?"
This was done at the request of the project performance back then. From my point of view, that makes sense, because the goal is to have central management in one place.
12-06-2023 10:38 AM
The trouble with having APs switch between WLCs on different software versions is that every time they move (even for a few minutes) they must download the software (if they don't already have it on flash) and then reload. If you're using N+1 then all your WLCs should run the same version.
Again I will say if you have N+1 then AP groups and WLANs should be identical - APs do get confused. Don't risk differences, it will catch you out eventually.
Just enable Fast SSID switching - I've never heard a reason for disabling it.
Although your WLAN is using flex local switching it's still using central auth. What authentication method are you using? You might want to consider using flex local auth too.