11-27-2012 04:42 AM - edited 07-03-2021 11:07 PM
Hi,
Some background regarding our setup. We have a WLC (5508) in our main office in Brisbane that is hosting two WLANs. One provides wireless access to our internal network and the second provides wireless guest access. The guest WLAN is anchored to a controller sitting in the DMZ at our Data Centre.
In the DMZ the anchor controller has a management interface and an interface in the DMZ for the wireless guest access. I am using the DHCP server on the anchor controller to provide IPs etc. to wireless guest clients. The default gateway is 10.8.144.1, which is a VIP on a pair of firewalls.
Initially everything works fine. Guests connect to the guest network, have to authenticate via a web portal (Cisco ISE server) and can then go on and use the internet. This works perfectly until the firewalls fail over and the secondary firewall takes over the VIP address. All access to the internet is lost at that point. If I disconnect and then reconnect a wireless client it connects, as in it gets an IP address, but DNS resolution stops and I do not get redirected to the web auth portal. If the firewalls are failed back to the primary then everything works again, no issues. However, if I reboot the WLC while the secondary firewall has the VIP, everything works fine as it did on the primary. If the firewalls then fail over to the primary again, everything goes to crap until either the firewalls are failed back or the anchor WLC is rebooted.
Initially I thought this was an issue on the firewall, but this doesn't appear to be the case. When the firewall fails over it sends out a gratuitous ARP advising of the change in MAC address for the 10.8.144.1 IP address. The WLC seems to update its ARP table, because if I run the command "show arp switch" it shows the 10.8.144.1 IP address with the MAC address of the active firewall. From the client perspective, I have run Wireshark and captured packets on the wireless interface when trying to connect. The laptop continuously sends ARP requests for 10.8.144.1 but gets no reply. Without a reply the client cannot send an Ethernet frame to the gateway, and hence cannot reach the DNS server or the web portal, so internet access breaks. Running tcpdump on the active firewall shows it receiving the ARP request and sending a reply. The reply just never gets to the wireless client. Debugging ARP packets on the anchor WLC seems to indicate that the controller is receiving the ARP replies from the firewall. So I'm at a loss as to why things should break when the firewalls fail over.
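For anyone comparing captures in a similar setup, here is a minimal Python sketch of how a gratuitous ARP can be told apart from an ordinary ARP request in raw frame bytes. It is only an illustration, not anything the WLC or firewall runs: the helper names (`parse_arp`, `is_gratuitous`, `build_arp`), the client address 10.8.144.50 and the MAC addresses are all made up for the example. The only rule it relies on is that a gratuitous ARP carries the same IP in the sender and target fields.

```python
import socket
import struct

ETHERTYPE_ARP = 0x0806


def _mac(raw: bytes) -> str:
    """Format 6 raw bytes as a colon-separated MAC string."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_arp(frame: bytes):
    """Parse an untagged Ethernet II frame carrying IPv4 ARP; return None otherwise."""
    if len(frame) < 42:
        return None
    _dst, _src, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != ETHERTYPE_ARP:
        return None
    htype, ptype, hlen, plen, oper, sha, spa, tha, tpa = struct.unpack(
        "!HHBBH6s4s6s4s", frame[14:42]
    )
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return None
    return {
        "op": oper,  # 1 = request, 2 = reply
        "sender_mac": _mac(sha),
        "sender_ip": socket.inet_ntoa(spa),
        "target_mac": _mac(tha),
        "target_ip": socket.inet_ntoa(tpa),
    }


def is_gratuitous(arp: dict) -> bool:
    """A gratuitous ARP announces the sender's own IP: sender IP equals target IP."""
    return arp["sender_ip"] == arp["target_ip"]


def build_arp(op, sender_mac, sender_ip, target_mac, target_ip,
              eth_dst="ff:ff:ff:ff:ff:ff") -> bytes:
    """Build a test frame (hypothetical helper, used only to feed parse_arp)."""
    def to_mac(text):
        return bytes.fromhex(text.replace(":", ""))

    eth = to_mac(eth_dst) + to_mac(sender_mac) + struct.pack("!H", ETHERTYPE_ARP)
    arp = struct.pack(
        "!HHBBH6s4s6s4s", 1, 0x0800, 6, 4, op,
        to_mac(sender_mac), socket.inet_aton(sender_ip),
        to_mac(target_mac), socket.inet_aton(target_ip),
    )
    return eth + arp


# Made-up MACs: the newly active firewall announcing the VIP, and a guest client.
garp = build_arp(1, "00:00:5e:00:53:02", "10.8.144.1",
                 "00:00:00:00:00:00", "10.8.144.1")
request = build_arp(1, "00:00:5e:00:53:10", "10.8.144.50",
                    "00:00:00:00:00:00", "10.8.144.1")
```

Running `is_gratuitous(parse_arp(garp))` on the first frame should flag it, while the client's ordinary request for the gateway should not be flagged.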
To make things even weirder... I have a 3750 switch in the DMZ with an SVI of 10.8.144.4. I thought I could get a workaround by making this the default gateway, the theory being that this interface's MAC address would never change. However, I was wrong. Even with this IP set as the gateway address for the wireless clients, I see the exact same behaviour when the firewalls fail over. I can't explain it other than to say that the gratuitous ARP sent by the firewalls seems to kill the ability of ARP replies to be sent back to the wireless client.
I'm at a total loss at the moment. Any suggestions, no matter how crazy will be appreciated.
Cheers guys.
11-27-2012 05:17 AM
David
Good detail in your question. First, I have an identical setup and had a recent firewall failover and didn't have any issues.
I'm curious, what code are your anchor and foreign controllers on?
The only difference in my design is that I am using a real DHCP server. But that shouldn't have any bearing on this.
11-27-2012 04:52 PM
Hi George,
The software versions that I am running are:
1. Anchor: 7.2.110.0
2. Foreign: 7.2.103.0
I don't have a lot of experience with setting up the Cisco wireless network so I have perhaps not configured something correctly.
Also, I took another Wireshark capture, this time spanning the port on the foreign controller where the EoIP traffic originates / terminates. Once again I see the ARP requests being forwarded by the foreign controller, but no replies coming back the other way. It would appear as though the issue is with the anchor controller.
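In case it helps anyone doing the same comparison, the "requests go out, no replies come back" check can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions: the `ArpEvent` tuple, the one-second wait and the client address 10.8.144.50 are invented for the example, and the events would have to be extracted from the capture by whatever tool you prefer.

```python
from typing import Iterable, List, NamedTuple


class ArpEvent(NamedTuple):
    time: float      # capture timestamp in seconds
    op: int          # 1 = request, 2 = reply
    sender_ip: str
    target_ip: str


def unanswered_requests(events: Iterable[ArpEvent],
                        wait: float = 1.0) -> List[ArpEvent]:
    """Return requests with no reply from the target to the requester within `wait` seconds."""
    ordered = sorted(events, key=lambda e: e.time)
    replies = [e for e in ordered if e.op == 2]
    missing = []
    for req in (e for e in ordered if e.op == 1):
        answered = any(
            r.sender_ip == req.target_ip
            and r.target_ip == req.sender_ip
            and 0 <= r.time - req.time <= wait
            for r in replies
        )
        if not answered:
            missing.append(req)
    return missing


# Hypothetical capture: the first request is answered, the later ones are not.
capture = [
    ArpEvent(0.00, 1, "10.8.144.50", "10.8.144.1"),
    ArpEvent(0.01, 2, "10.8.144.1", "10.8.144.50"),
    ArpEvent(5.00, 1, "10.8.144.50", "10.8.144.1"),
    ArpEvent(6.00, 1, "10.8.144.50", "10.8.144.1"),
]
lost = unanswered_requests(capture)
```

Running the same check on captures taken at the client, the foreign controller port and the firewall should show the hop after which the replies disappear.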
11-30-2012 02:17 PM
When the firewall fails over, are you able to do an mping and an eping from the anchor to the foreign controller?
Alex
01-27-2014 06:45 AM
This behavior seems to be a bug. Upgrade the foreign and anchor controllers to 7.4.121.0.
11-09-2016 03:31 PM
I know this is an old thread, but I have the exact same anchor controller setup, running version 8.0.133. I was experiencing this with version 8.0.121 and upgraded to 8.0.133 (that's the only code I could get for now), hoping it would solve the problem... it does not. The firewalls are Cisco ASAs with FirePOWER IPS. This happens whenever the firewalls fail over in either direction. The only fix is to clear the host xlate entries on the active firewall, or to reboot the anchor controller. I don't see a bug report on this issue in any release. Is anyone experiencing this on the 8.0 train? Any help? I can't engage TAC right now, as I'm waiting for SmartNet issues to be resolved, but I need to work this out sooner than it will take to get those sorted.
Thanks
09-10-2013 08:56 PM
Just to share with you:
There is a good discussion available on Guest Access with Mobility Anchor Chalk Talk:
01-27-2014 02:16 AM
Hi David,
Have you got your issue resolved? I'm facing the same problem in an almost identical
configuration (anchor controller in the DMZ, 2 Check Point firewalls in between, no ARP reply).
Robert
01-27-2014 03:01 PM
Hi Robert,
Yes I did get this resolved. I was told by TAC that it was a bug with how gratuitous ARP was being handled by the controllers. At the time it was recommended that I upgrade to 7.3.101 version of the software. This fixed the issue for me. Hope it does for you also.
Cheers,
David
01-28-2014 12:44 AM
Thanks David,
my WLCs are still on 7.2.111.3. The only version I can upgrade to right now
is 7.2.115.2 - but I see nothing in the release notes indicating the problem
being fixed there.
Cannot upgrade to 7.3 or higher, because we still use Cisco NCS 1.1.
So first I have to upgrade the NCS, then I can upgrade the controllers.
But I'll have to upgrade the NCS either way, cause we just purchased
Cisco ISE 1.2, which requires the NCS upgrade.
So I'm in a kinda version hell right now, looking for an optimal upgrade
path;-)
12-07-2023 06:47 PM
I faced a similar issue recently. We have an environment similar to the one in the description. I don't know what triggered the issue, but we had replaced the firewalls in the DC. The 5520 foreign was on 8.7.106.0 and the 5508 anchor was on 8.5.164.0. Somehow the anchor was working fine with another foreign controller, which is a 9800. ARP replies from the firewall were reaching the anchor but not the foreign. The issue was resolved after we reloaded the anchor.
12-08-2023 09:31 AM
> 5520 foreign was on 8.7.106.0 and 5508 anchor was on 8.5.164.0
5520 should be running 8.10.190.0 and 5508 should be running 8.5.182.11 (link below)
Always refer to TAC recommended link below to find what version of software you should be using.
If you still see the problem on the latest software then you need to look at other possible issues. We've seen problems with gratuitous ARP occasionally not being received by some devices. So first make sure the firewall is sending the gratuitous ARP and that the WLC has received it. If it gets missed for some reason then you're going to need a shorter ARP timeout to force the WLC to update its ARP cache.
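To illustrate why a shorter timeout helps when a gratuitous ARP is missed, here is a toy Python model of an ARP cache. It is a sketch under stated assumptions, not how the WLC actually implements its cache: the `ArpCache` class, the `stale_window` helper, the 300 and 60 second timeouts and the MAC address are all hypothetical, and the model ignores any refresh of entries by ongoing traffic.

```python
class ArpCache:
    """Toy ARP cache: an entry expires `timeout` seconds after it was learned."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self._entries = {}  # ip -> (mac, learned_at)

    def learn(self, ip: str, mac: str, now: float) -> None:
        """Record (or overwrite) the MAC for an IP, as an ARP reply or gratuitous ARP would."""
        self._entries[ip] = (mac, now)

    def lookup(self, ip: str, now: float):
        """Return the cached MAC, or None if there is no entry or it has expired."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if now - learned_at >= self.timeout:
            del self._entries[ip]  # expired: the next lookup must re-ARP
            return None
        return mac


def stale_window(timeout: float, learned_at: float, failover_at: float) -> float:
    """Seconds after a failover during which a missed gratuitous ARP leaves the old MAC cached."""
    return max(0.0, learned_at + timeout - failover_at)


# Hypothetical numbers: entry learned at t=0, failover at t=100, gratuitous ARP missed.
cache = ArpCache(timeout=300)
cache.learn("10.8.144.1", "00:00:5e:00:53:01", now=0)
still_old = cache.lookup("10.8.144.1", now=200)   # old firewall MAC is still cached
after_expiry = cache.lookup("10.8.144.1", now=300)  # expired, forces a fresh ARP
long_gap = stale_window(300, learned_at=0, failover_at=100)
short_gap = stale_window(60, learned_at=0, failover_at=100)
```

With these made-up numbers the stale entry survives for 200 seconds after the failover with the longer timeout, and not at all with the shorter one.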