I have six 5508 WLCs and just upgraded one of them from v184.108.40.206 to v8.2.100; the upgrade included the field recovery image to v220.127.116.11. To describe the problem, I think a short diagram is needed (WLC1 is on v8.2.100 and WLC2 is on v18.104.22.168):
Central site: | Remote site:
AP1(2702)-----cisco2960XR----Nexus7706-----10GE Wan to remote site----cisco4500X-16----cisco2960X----AP2(2702)
1. AP1 registered to WLC1; AP2 registered to WLC2
2. clients associated with AP1, SSID1, work as expected - no problems
3. clients associated with AP2, SSID1, work as expected - no problems
4. Register AP2 to WLC1; clients associate with SSID1, can ping internal and external by IP or by name; but CANNOT browse, ftp, etc.
5. Bring AP2 to central site, still registered with WLC1; everything works as expected - no problems
6. Bring AP1 to remote site and replace AP2, AP1 still registered with WLC1 - clients still can't browse, etc.
7. Bring AP2 back to remote site and re-register to WLC2 - everything works again, as expected.
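To make the pattern in steps 1-7 easier to see, here is a minimal sketch that tabulates the observations and checks which factor, or pair of factors, is always associated with the failure. The tuple layout and the `failing_combinations` helper are just illustrative names, not anything from the WLC or AP software.

```python
# Observations from steps 1-7 as (ap, site, wlc, works) tuples.
observations = [
    ("AP1", "central", "WLC1", True),   # steps 1-2
    ("AP2", "remote",  "WLC2", True),   # steps 1 and 3
    ("AP2", "remote",  "WLC1", False),  # step 4
    ("AP2", "central", "WLC1", True),   # step 5
    ("AP1", "remote",  "WLC1", False),  # step 6
    ("AP2", "remote",  "WLC2", True),   # step 7
]

FIELD_INDEX = {"ap": 0, "site": 1, "wlc": 2}


def failing_combinations(obs, fields):
    """Return the value combinations of `fields` that failed and never worked."""
    seen_ok, seen_bad = set(), set()
    for row in obs:
        key = tuple(row[FIELD_INDEX[f]] for f in fields)
        (seen_ok if row[3] else seen_bad).add(key)
    return sorted(seen_bad - seen_ok)


# No single factor explains the failure on its own...
for single in ("ap", "site", "wlc"):
    print(single, failing_combinations(observations, [single]))

# ...but the pair (site, wlc) does.
print("site+wlc", failing_combinations(observations, ["site", "wlc"]))
```

With the data above, each single factor comes back empty and only the remote site combined with WLC1 is consistently broken, so the AP hardware itself looks unlikely to be the cause.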
I took a packet capture (Wireshark) of the client experience when not working, but have not had the time to look at it in detail yet.
I found a bug that "sort of" describes my exact problem, CSCux92224, but it does not explain why the clients work when AP2 is located at the central site and registered to WLC1, but do not work when AP2 is located at the remote site while registered to the same controller.
The obvious variables here are the new code on the AP, the cisco4500X, and the difference between the access switches (2960XR vs 2960X). The access ports on the switches are configured the same, and the uplink ports from the access switches are configured the same and allow all VLANs.
I can ultimately open a TAC case on this, but wanted to float it here first. So, the questions I have are: Is this overly obvious and I'm missing something? Has anyone had this same problem? If so, what was the reason?
Thanks for any help.
Are you using FlexConnect? I recall there being some bug using FlexConnect with code around that release. It was something to do with SSID to VLAN mapping. I'm a bit grey on it now.
If you are using FlexConnect, and if you can turn off FlexConnect for the APs at the site, and if the issue goes away ... then you probably hit that bug.
Thanks Philip, but all the APs are in local mode; I should have mentioned that in my original post. I have been looking at the Wireshark capture of the client during the broken period and see a tremendous amount of "Dup ACKs". Also the following:
I'll have access again tomorrow and dig a bit deeper.
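For digging through the capture, a rough sketch like the one below can count duplicate ACKs per flow once the TCP fields have been exported from the capture. It assumes each packet has already been reduced to a dict with the keys shown (a made-up layout, not a Wireshark format), and it uses a simplified rule: an ACK with no payload that repeats the previous ack number and window in the same direction. Wireshark's own "Dup ACK" analysis applies a few more conditions, so the counts may not match exactly.

```python
from collections import defaultdict


def count_dup_acks(packets):
    """Count duplicate ACKs per (src, sport, dst, dport) flow direction.

    Simplified rule: a packet is a duplicate ACK when it has no payload and
    carries the same ack number and window as the previous packet seen in
    the same direction.
    """
    last_state = {}
    dups = defaultdict(int)
    for p in packets:
        flow = (p["src"], p["sport"], p["dst"], p["dport"])
        state = (p["ack"], p["win"])
        if p["payload_len"] == 0 and last_state.get(flow) == state:
            dups[flow] += 1
        last_state[flow] = state
    return dict(dups)


def ack(ack_no):
    """Build a payload-free ACK from a hypothetical client to a server."""
    return {"src": "192.0.2.10", "sport": 50000,
            "dst": "192.0.2.80", "dport": 80,
            "ack": ack_no, "win": 65535, "payload_len": 0}


# The client acknowledges 1000, then 2460 three times in a row,
# which gives two duplicates of the 2460 ACK.
sample = [ack(1000), ack(2460), ack(2460), ack(2460)]
print(count_dup_acks(sample))
```

A large duplicate-ACK count on the client side of a download generally means the client keeps asking for a segment it never received, so it is worth checking which segments are missing and how large they are.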
This is starting to smell like a Nexus config issue.
Is it possible to plug the WLCs into the cisco2960XR to see if the same issue happens?
I can't easily bring the 5508 to the remote site and connect it to the 2960X. What I did do today was bring a 2960XR to the remote site, trunk it to the 4500X, and connect AP1 to the 2960XR; I still have the same client browsing problem.
To your point about the configuration of the Nexus: first, I made sure an MTU of 1500 was set from end to end. The uplinks from both controllers are configured the same: 1x4GB aggregate to a port-channel, mode on. If the Nexus/path was the problem, then why would the APs at the remote site work fine when registered to WLC2 (old code), and not work when they register to WLC1 (new code)? It is still a head scratcher as to why the central site APs work fine when registered with either WLC1 or WLC2! It is driving me nuts, and all I can find is the "Dup ACKs" in the capture. I spoke with a Cisco Wireless engineer about this problem today and he said not to use this version of code (the 8.2.100) but rather to wait for the M3 release. I need to put this on the back burner for a while and start again, later, with a fresh perspective.
Thanks for looking and responding.