Catalyst 6500/Sup 720 ARP discards

Alan Boyd · ‎12-12-2011

Hi,

We have a failover pair of loadbalancers (non-Cisco) which are connected to each other via Catalyst 6509Es with Sup720 supervisor cards. On failover, the newly active loadbalancer sends gratuitous ARPs (GARPs) for all its service IP addresses with the relevant MAC address in order to update nearby ARP tables; these failover GARPs are sent at a rate of 200 per second. Failing over services between these loadbalancers has proved problematic, with numerous services not failing over in a timely manner.

Some of the loadbalanced networks involved are routed on the Sup720s; the rest are routed on FWSM modules in the same chassis. Problems occur only with VLANs routed on the Sup720s; all VLANs routed on the FWSMs fail over without issue.

Investigation has shown that this is because a proportion of the ARP table entries on the Sup720 are not updated and, with the default 4-hour ARP table timeout, the "wrong" ARP entries then have to be flushed manually.

Testing by throwing GARPs at both the FWSM and the Sup720 has revealed the following:

- We can quite happily throw ~200 GARPs per second at the FWSM and all the relevant ARP table entries are updated with the correct MAC address. This fits with the successful failovers for any FWSM-routed networks.
- However, repeating the same test with a VLAN routed on the Sup720 results in GARPs being dropped by the Sup720. The Sup720 keeps discarding a proportion of the GARPs until we drop the GARP rate below roughly 75 per second.
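
For anyone wanting to reproduce this kind of test, here is a minimal sketch of a paced GARP generator. It is not the tool used for the results above. It assumes a Linux host with raw-socket privileges, and the function names, interface name and addresses are all hypothetical. It builds a gratuitous ARP request (sender IP equal to target IP, broadcast destination) and sends frames at a fixed rate.

```python
import socket
import struct
import time

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def build_garp(mac: str, ip: str) -> bytes:
    """Build a 42-byte Ethernet frame carrying a gratuitous ARP request."""
    hw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    addr = socket.inet_aton(ip)
    eth = BROADCAST_MAC + hw + struct.pack("!H", ETHERTYPE_ARP)
    # htype=1 (Ethernet), ptype=0x0800 (IPv4), hlen=6, plen=4, oper=1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += hw + addr            # sender MAC, sender IP
    arp += b"\x00" * 6 + addr   # target MAC (zeroed), target IP == sender IP
    return eth + arp


def send_paced(frames, rate_per_sec, send):
    """Call send(frame) for each frame, spaced to approximate rate_per_sec."""
    interval = 1.0 / rate_per_sec
    next_at = time.monotonic()
    for frame in frames:
        delay = next_at - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        send(frame)
        next_at += interval


def make_linux_sender(ifname: str):
    """Assumption: Linux AF_PACKET raw socket; needs root or CAP_NET_RAW."""
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW)
    sock.bind((ifname, 0))
    return sock.send
```

As a usage example, `send_paced([build_garp("00:11:22:33:44:55", f"10.0.0.{i}") for i in range(1, 201)], 75, make_linux_sender("eth0"))` would announce 200 made-up addresses at roughly 75 per second. Varying the rate argument is one way to look for the threshold at which entries stop being updated.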

We're not rate-limiting ARP anywhere in the Sup720, in hardware or otherwise, and the FWSM handles the same GARP rate without issue. Is there a built-in restriction on the Sup720 that we're not aware of that would cause this behaviour and, if so, is it configurable? Failing that, can anyone suggest what else could be causing it?

Thanks

amikat · ‎12-14-2011

Hi,

Will you please post outputs of these commands:

"show policy-map control-plane",

"show mls qos protocol",

"show mls rate-limit".

Thanks & Regards,

Antonin