We have two 3560 switches configured for L2 only, with an EtherChannel between them, and two 3800 routers (IOS 12.4) configured with HSRP. Each router is connected to a different 3560 switch. We have a sporadic problem where it is not possible to ping the VIP. When the problem occurs, it is also not possible to ping the other router from either the switch or the first router. The logs show nothing on either the switches or the routers; the only thing that looks wrong is an incomplete MAC address entry. If the router is restarted, service returns. The link between the routers is direct (no switch connection).
There is an added complication in that the routers are managed externally and the LAN internally. Any suggestions as to how to track this problem down? Currently neither party can see anything glaringly obvious.
Does your topology look like the following?
Are there subordinate switches?
Is the link between routers routed?
Is there a routing protocol between the routers?
Which switch is the spanning tree root for the VLAN that the HSRP relationship is manifested over?
Thanks for responding,
There are no subordinate switches; however, there are two ASAs (Active/Passive). I am not sure about routing between the two routers, as they are managed by someone else, but I would imagine there is routing. Switch A is the root bridge for this configuration. Just a thought on the ASA: should the access list allow both actual IP addresses, just the VIP address, or everything through the firewall? I also noticed Proxy ARP is enabled on that interface. There is no NAT over that interface, so I have tried turning Proxy ARP off. I have tried swapping the routers round and running with just one router, but the problem still occurs intermittently, and not always at a quiet time. I have checked that the ARP cache timeout has not been changed.
Any suggestions gratefully received.
I think the topology is beginning to look like this.
If this is the case, what is the source of the ping during the time of failure? When the HSRP address doesn’t respond to ping, does all production traffic fail to traverse the 3800s? Is a ping successful from any of the devices below during a failure situation?
Proxy ARP shouldn't be causing problems in this topology with regard to the HSRP address. I'm assuming the ASAs are provisioned for 'routed mode' and the default route points to the HSRP address? Something like:
route outside 0.0.0.0 0.0.0.0 H.S.R.P
The ACL on the ASA should only need to account for the HSRP address, unless you also want to ping the physical addresses.
Spanning-tree portfast should be enabled on the ports facing the 3800s and the ASAs. Spanning-tree portfast trunk should be considered for the trunk between the two 3560s.
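As a rough sketch of the edge-port part of that advice, the 3560 configuration might look something like the following. The interface numbers and descriptions are hypothetical, and the ports are assumed to be plain access ports; the trunk between the switches is deliberately left out, since that setting is worth weighing separately.

```
! On each 3560. Port numbers below are placeholders; match them to the real cabling.
interface GigabitEthernet0/1
 description Link to 3800 router
 spanning-tree portfast
!
interface GigabitEthernet0/2
 description Link to ASA
 spanning-tree portfast
```

If a router or ASA port is actually a trunk, the interface form would be spanning-tree portfast trunk instead.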
Enable link state logging on the 3560s so you'll be aware of the 3800s dropping link during a failure situation. Also take careful notice of the syslog traps on the 3560s regarding MAC address movement.
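A minimal sketch of that logging setup on the 3560s, assuming the same hypothetical port numbering as before and a local log buffer rather than an external syslog server:

```
! Timestamp log entries so they can be lined up with the time of an outage
service timestamps log datetime msec
! Keep a local buffer of log messages (the size here is an arbitrary choice)
logging buffered 64000 informational
!
! Log up/down transitions on the ports facing the 3800s (placeholder port)
interface GigabitEthernet0/1
 logging event link-status
!
! MAC move notification; availability depends on the IOS release on the switch
mac address-table notification mac-move
```

After a failure, show logging on each switch should then show any link flaps or MAC address moves with timestamps.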
While troubleshooting during a failure situation, the following should be evaluated:
ASA failover – has the ASA failover state toggled?
Does the Enterprise switch fabric show the ASA inside MAC address behind the proper switch port?
Does the ASA have an ARP entry for the HSRP address, and is it what it's supposed to be?
Do the 3560s show the ASA outside MAC address behind the proper port?
Do the 3560s show the HSRP MAC address behind the proper port?
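The checks above map roughly onto the following show commands. This is only a sketch: the HSRP group number (1), the resulting virtual MAC and the port number are assumptions. With HSRP version 1 and no custom MAC configured, the virtual MAC is 0000.0c07.acXX, where XX is the group number in hex.

```
! On the ASA
show failover
show failover history
show arp

! On each 3560 (older releases spell it show mac-address-table)
show mac address-table address 0000.0c07.ac01
show mac address-table interface GigabitEthernet0/1

! On the 3800s (via whoever manages them)
show standby brief
show ip arp
```

Capturing this output once while everything is working gives a baseline to compare against during the next failure.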