HSRP Issue (HSRP flap logs on one router only)

Mayank Nauni · ‎02-27-2011

Hi all,

Greetings. For the past couple of months we have been facing a strange problem on most of our Cisco routers (Cisco 2811): we keep receiving HSRP flapping alerts from them. When we check the HSRP logs on both routers of a pair, to our surprise the flapping is visible on one router only, and there is no production impact from these false alerts. It is happening on at least 85 such devices. When we reproduce the same setup (with the same configuration) in the lab environment, we still get these HSRP flap alerts. Can somebody help us address this issue?

The relevant configuration is below:

Router1

______________________________________

track 1 interface Tunnel2 line-protocol

______________________________________

interface FastEthernet0/1
 description LAN Network
 ip address 190.16.128.242 255.255.255.224 secondary
 ip address 10.204.5.112 255.255.255.0
 ip access-group BROKER-IN in
 duplex full
 speed 10
 standby 11 ip 10.204.5.113
 standby 11 timers msec 100 msec 300
 standby 11 preempt
 standby 11 track 1
 standby 12 ip 170.16.128.243
 standby 12 timers msec 100 msec 300
 standby 12 preempt
 standby 12 track 1

________________________________________

Router2

track 1 interface Tunnel1 line-protocol

________________________________________

interface FastEthernet0/1
 description LAN Network
 ip address 190.16.128.241 255.255.255.224 secondary
 ip address 10.204.5.111 255.255.255.0
 ip access-group BROKER-IN in
 ip accounting output-packets
 ip route-cache flow
 duplex full
 speed 10
 no cdp enable
 standby 11 ip 10.204.5.113
 standby 11 timers msec 100 msec 300
 standby 11 priority 105
 standby 11 preempt
 standby 11 track 1
 standby 12 ip 170.16.128.243
 standby 12 timers msec 100 msec 300
 standby 12 priority 105
 standby 12 preempt
 standby 12 track 1

-Regards,
Mayank Nauni
CCIE#48541
Cisco Champion 2019

lgijssel · ‎02-27-2011

It would help a lot if we could see (a relevant part of) the log.

Also, for this type of problem a full configuration is more appropriate.

regards,

Leo

Mayank Nauni · ‎02-27-2011

Hi Leo,

Thanks for the prompt response. The requested information is attached for your reference.

Thanks and Regards,

Mayank Nauni


GauravGambhir · ‎02-28-2011

Are you running the same IOS on all 85 devices where you see this problem, and did you use the same IOS in the lab setup where you were able to reproduce it? If yes, try moving the routers to the latest T train and see whether that fixes the issue. It sounds like a bug to me. Also try configuring HSRP without tracking and see if that makes any difference.
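
As a rough sketch of the "without track" test, using the interface and group numbers from the configuration posted above (an illustration of the suggestion, not a tested change):

```
configure terminal
 interface FastEthernet0/1
  ! temporarily remove object tracking from both HSRP groups
  no standby 11 track 1
  no standby 12 track 1
 end
```

If the flap messages continue with tracking removed, the tunnel tracking object is unlikely to be the trigger.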

milan.kulik · ‎03-01-2011

Hi,

Have you tried using HSRP version 2?

And increasing your HSRP timers? I'd say 100/300 ms is an extremely short HSRP timer!
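
A minimal sketch of what that could look like on the interface from the original post. The 1 s / 3 s values are only an assumed example; the IOS defaults are a 3 s hello and a 10 s hold time. HSRP versions 1 and 2 do not interoperate, so the version would have to be changed on both routers of a pair.

```
interface FastEthernet0/1
 ! HSRPv2 is set per interface and applies to all groups on it
 standby version 2
 ! hello 1 s, hold 3 s (assumed example values)
 standby 11 timers 1 3
 standby 12 timers 1 3
```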

HTH,

Milan

Tracer Germany · ‎02-28-2011

Hello,

I have had a similar issue on our routers twice.

The first time it was caused by a foreign or misconfigured device using the same standby group ID

- easy to solve by using “Standby Authentication”

Standby authentication - in interface config mode

interface Vlan1
 description **** LAN Link ****
 ip address 172.16.1.2 255.255.255.0
 standby 1 ip 172.16.1.1
 standby 1 preempt delay minimum 60 reload 180
 ! authentication string (password)
 standby 1 authentication hsrppassword
 ! name of the group
 standby 1 name nameofgroup

The second time it was caused by a VLAN misconfiguration
(trunk encapsulation and/or VLAN mismatch between the switches and the router, or somewhere along the transport path).

It was between a Cisco 1802 and a 1841, both connected via switches in a VTP domain.
The first one was in "access mode" and the second in "trunk mode". Both of them were reachable via IP, but HSRP had problems.
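
A few standard show commands that could help spot this kind of mismatch. The port name is a placeholder, and the switch commands assume Catalyst IOS:

```
! on each switch in the path
show interfaces trunk
show interfaces FastEthernet0/1 switchport
show vlan brief
! on both routers: they should agree on who is active and who is standby
show standby brief
```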

Hope that helps

Regards

Tracer

Kishore Chennupati · ‎02-28-2011

Hi,

If it's not impacting the production network and you are running the same IOS across the 85 devices, then I guess it's an IOS bug.

What IOS are you running, and what logs do you get? It might be good if you can paste them here.

You could possibly check for any caveats in the Cisco Bug Toolkit.
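
For reference, the output of something like the following, taken on both routers of a pair, would cover the IOS version, the HSRP state, the tracked object and the relevant log lines (interface and track numbers are taken from the configuration posted above):

```
show version
show standby FastEthernet0/1
show track 1
show logging | include HSRP|TRACKING
```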

Regards

Kishore

Mayank Nauni · ‎03-16-2011

Mates,

I am sure this has to do with the configuration of the devices, as the issue is model- and IOS-independent. See my observations below:

The test below was carried out on GNS3:

Step 1: Preempt removed from the active router and logs taken

*Mar 1 05:19:41.090: %HSRP-6-STATECHANGE: FastEthernet0/0 Grp 10 state Active -> Speak

*Mar 1 05:21:47.986: %SYS-5-CONFIG_I: Configured from console by console

ST-INS-WRT1#

ST-INS-WRT1#sh clo

ST-INS-WRT1#sh clock

*05:23:01.726 UTC Fri Mar 1 2002

Observation: The HSRP flap logs stopped

Step 2: Removed the aggressive timers on both routers

no standby 10 timers msec 100 msec 300

Observation: The HSRP flap logs stopped
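
For anyone repeating the test, the two steps can be sketched roughly as follows. The interface and group number are taken from the log line above, and the final show command is just one way to confirm what is in effect:

```
! Step 1: on the active router only
interface FastEthernet0/0
 no standby 10 preempt
!
! Step 2: on both routers, fall back to the default timers
interface FastEthernet0/0
 no standby 10 timers msec 100 msec 300
 end
!
! from exec mode, confirm the timers and preempt setting now in effect
show standby FastEthernet0/0 10
```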

Interpretation:

Since this emulation was done in an ideal environment, physical media problems can be ruled out as the cause of the HSRP flaps.
Since this emulation was done with various IOS versions (12.2, 12.4), the problem doesn't seem to be tied to the IOS code.
Since this emulation was done with various platforms (Cisco 2811, Cisco 3600, Cisco 3700, Cisco 7200 VXR), the problem doesn't seem to be tied to the hardware model either.


Peter Paluch · ‎03-17-2011

Mayank,

I agree that the aggressive timing may be at the core of this issue, but I would personally not draw any conclusion from a simulation run in GNS3. GNS3 and the underlying Dynamips engine may skew the timing considerably, as all code flow of the IOS is reinterpreted and may not flow linearly in time. It is absolutely not guaranteed that 100/300 ms in Dynamips are 100/300 ms in real time.
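
One possible way to check the real hello spacing on physical hardware is sketched below. This is only an assumption about how one might verify it: the exact debug keywords can differ between IOS releases, and with 100 ms hellos the debug output is heavy, so it is better suited to a lab router than to production.

```
configure terminal
 ! millisecond timestamps so the hello spacing is visible
 service timestamps debug datetime msec
 service timestamps log datetime msec
 end
! lab only: this produces a lot of output at 100 ms hello intervals
debug standby packets hello
! observe for a few seconds, then stop all debugging
undebug all
```

If the timestamps on received hellos show gaps approaching 300 ms, the hold timer is expiring on timing alone rather than because of a real failure.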

Best regards,

Peter