HSRP flapping without a known trigger

jose_jimbo
Level 1

Hey guys,

 

A few days ago I had a maintenance window (MW) and everything went fine. The next morning I got a call from the NOC saying they had a service interruption and asking me to check everything on the routers, so I logged in and found these logs:

Router1:

Sep 23 04:39:30 ECU: %HSRP-5-STATECHANGE: Vlan308 Grp 140 state Active -> Speak
Sep 23 04:39:30 ECU: %HSRP-5-STATECHANGE: Vlan775 Grp 71 state Active -> Speak
Sep 23 04:39:30 ECU: %HSRP-5-STATECHANGE: Vlan103 Grp 140 state Active -> Speak
Sep 23 04:39:31 ECU: %HSRP-5-STATECHANGE: Vlan308 Grp 140 state Speak -> Standby
Sep 23 04:39:31 ECU: %HSRP-5-STATECHANGE: Vlan775 Grp 71 state Speak -> Standby
Sep 23 04:39:31 ECU: %HSRP-5-STATECHANGE: Vlan103 Grp 140 state Speak -> Standby
Sep 23 04:39:33 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 71 state Standby -> Active
Sep 23 04:39:33 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 71 state Standby -> Active
Sep 23 04:39:33 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 72 state Standby -> Active
Sep 23 04:39:33 ECU: %HSRP-5-STATECHANGE: Vlan305 Grp 71 state Standby -> Active
Sep 23 04:39:33 ECU: %HSRP-5-STATECHANGE: Vlan985 Grp 229 state Standby -> Active
Sep 23 04:39:33 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 71 state Active -> Speak
Sep 23 04:39:33 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 71 state Active -> Speak
Sep 23 04:39:33 ECU: %HSRP-5-STATECHANGE: Vlan985 Grp 229 state Active -> Speak
Sep 23 04:39:33 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 72 state Active -> Speak
Sep 23 04:39:33 ECU: %HSRP-5-STATECHANGE: Vlan305 Grp 71 state Active -> Speak
Sep 23 04:39:34 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 71 state Speak -> Standby
Sep 23 04:39:35 ECU: %HSRP-5-STATECHANGE: Vlan985 Grp 229 state Speak -> Standby
Sep 23 04:39:35 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 71 state Speak -> Standby
Sep 23 04:39:35 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 72 state Speak -> Standby
Sep 23 04:39:35 ECU: %HSRP-5-STATECHANGE: Vlan305 Grp 71 state Speak -> Standby
Sep 23 04:39:40 ECU: %HSRP-5-STATECHANGE: Vlan103 Grp 140 state Standby -> Active
Sep 23 04:39:40 ECU: %HSRP-5-STATECHANGE: Vlan775 Grp 71 state Standby -> Active
Sep 23 04:39:40 ECU: %HSRP-5-STATECHANGE: Vlan308 Grp 140 state Standby -> Active
Sep 23 04:39:40 ECU: %HSRP-5-STATECHANGE: Vlan103 Grp 140 state Active -> Speak
Sep 23 04:39:40 ECU: %HSRP-5-STATECHANGE: Vlan775 Grp 71 state Active -> Speak
Sep 23 04:39:40 ECU: %HSRP-5-STATECHANGE: Vlan308 Grp 140 state Active -> Speak
Sep 23 04:39:41 ECU: %HSRP-5-STATECHANGE: Vlan103 Grp 140 state Speak -> Standby
Sep 23 04:39:42 ECU: %HSRP-5-STATECHANGE: Vlan775 Grp 71 state Speak -> Standby
Sep 23 04:39:42 ECU: %HSRP-5-STATECHANGE: Vlan308 Grp 140 state Speak -> Standby

 

Router2:

Sep 23 05:05:22 ECU: %HSRP-5-STATECHANGE: Vlan110 Grp 70 state Active -> Speak
Sep 23 05:05:22 ECU: %HSRP-5-STATECHANGE: Vlan511 Grp 222 state Active -> Speak
Sep 23 05:05:22 ECU: %HSRP-5-STATECHANGE: Vlan102 Grp 70 state Active -> Speak
Sep 23 05:05:22 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 72 state Active -> Speak
Sep 23 05:05:23 ECU: %HSRP-5-STATECHANGE: Vlan110 Grp 70 state Speak -> Standby
Sep 23 05:05:23 ECU: %HSRP-5-STATECHANGE: Vlan511 Grp 222 state Speak -> Standby
Sep 23 05:05:23 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 70 state Speak -> Standby
Sep 23 05:05:23 ECU: %HSRP-5-STATECHANGE: Vlan102 Grp 70 state Speak -> Standby
Sep 23 05:05:23 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 72 state Speak -> Standby
Sep 23 05:05:25 ECU: %HSRP-5-STATECHANGE: Vlan648 Grp 143 state Standby -> Active
Sep 23 05:05:25 ECU: %HSRP-5-STATECHANGE: Vlan523 Grp 220 state Standby -> Active
Sep 23 05:05:25 ECU: %HSRP-5-STATECHANGE: Vlan646 Grp 141 state Standby -> Active
Sep 23 05:05:25 ECU: %HSRP-5-STATECHANGE: Vlan132 Grp 72 state Standby -> Active
Sep 23 05:05:25 ECU: %HSRP-5-STATECHANGE: Vlan740 Grp 66 state Standby -> Active
Sep 23 05:05:25 ECU: %HSRP-5-STATECHANGE: Vlan133 Grp 140 state Standby -> Active
Sep 23 05:05:26 ECU: %HSRP-5-STATECHANGE: Vlan740 Grp 66 state Active -> Speak
Sep 23 05:05:26 ECU: %HSRP-5-STATECHANGE: Vlan648 Grp 143 state Active -> Speak
Sep 23 05:05:26 ECU: %HSRP-5-STATECHANGE: Vlan523 Grp 220 state Active -> Speak
Sep 23 05:05:26 ECU: %HSRP-5-STATECHANGE: Vlan133 Grp 140 state Active -> Speak
Sep 23 05:05:26 ECU: %HSRP-5-STATECHANGE: Vlan132 Grp 72 state Active -> Speak
Sep 23 05:05:26 ECU: %HSRP-5-STATECHANGE: Vlan646 Grp 141 state Active -> Speak
Sep 23 05:05:27 ECU: %HSRP-5-STATECHANGE: Vlan740 Grp 66 state Speak -> Standby
Sep 23 05:05:27 ECU: %HSRP-5-STATECHANGE: Vlan133 Grp 140 state Speak -> Standby
Sep 23 05:05:27 ECU: %HSRP-5-STATECHANGE: Vlan648 Grp 143 state Speak -> Standby
Sep 23 05:05:27 ECU: %HSRP-5-STATECHANGE: Vlan523 Grp 220 state Speak -> Standby
Sep 23 05:05:27 ECU: %HSRP-5-STATECHANGE: Vlan132 Grp 72 state Speak -> Standby
Sep 23 05:05:27 ECU: %HSRP-5-STATECHANGE: Vlan646 Grp 141 state Speak -> Standby
Sep 23 05:05:29 ECU: %HSRP-5-STATECHANGE: Vlan102 Grp 70 state Standby -> Active
Sep 23 05:05:29 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 72 state Standby -> Active
Sep 23 05:05:29 ECU: %HSRP-5-STATECHANGE: Vlan110 Grp 70 state Standby -> Active
Sep 23 05:05:29 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 70 state Standby -> Active
Sep 23 05:05:29 ECU: %HSRP-5-STATECHANGE: Vlan511 Grp 222 state Standby -> Active
Sep 23 05:05:30 ECU: %HSRP-5-STATECHANGE: Vlan511 Grp 222 state Active -> Speak
Sep 23 05:05:30 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 70 state Active -> Speak
Sep 23 05:05:30 ECU: %HSRP-5-STATECHANGE: Vlan102 Grp 70 state Active -> Speak
Sep 23 05:05:30 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 72 state Active -> Speak
Sep 23 05:05:30 ECU: %HSRP-5-STATECHANGE: Vlan110 Grp 70 state Active -> Speak
Sep 23 05:05:31 ECU: %HSRP-5-STATECHANGE: Vlan511 Grp 222 state Speak -> Standby
Sep 23 05:05:31 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 70 state Speak -> Standby
Sep 23 05:05:31 ECU: %HSRP-5-STATECHANGE: Vlan102 Grp 70 state Speak -> Standby
Sep 23 05:05:31 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 72 state Speak -> Standby
Sep 23 05:05:31 ECU: %HSRP-5-STATECHANGE: Vlan110 Grp 70 state Speak -> Standby

 

The logs started at 02:12:21 am for no apparent reason and went on all morning until 8 am, when I shut down the VLAN interfaces on the backup router. I tried to analyze the root cause, but there seems to be no trigger for the event. We got so many logs that the buffer filled up, and I had to pull the logs from the syslog server to find the start time of the incident.

After that, I checked some older logs from other MWs and found the following:

Router1:

Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan305 Grp 72 state Standby -> Active
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan985 Grp 229 state Standby -> Active
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 71 state Standby -> Active
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 72 state Standby -> Active
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan310 Grp 140 state Standby -> Active
Sep 16 02:51:26 ECU: %HSRP-5-STATECHANGE: Vlan985 Grp 229 state Active -> Speak
Sep 16 02:51:26 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 71 state Active -> Speak
Sep 16 02:51:26 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 72 state Active -> Speak
Sep 16 02:51:26 ECU: %HSRP-5-STATECHANGE: Vlan310 Grp 140 state Active -> Speak
Sep 16 02:51:26 ECU: %HSRP-5-STATECHANGE: Vlan305 Grp 72 state Active -> Speak
Sep 16 02:51:27 ECU: %HSRP-5-STATECHANGE: Vlan985 Grp 229 state Speak -> Standby
Sep 16 02:51:27 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 72 state Speak -> Standby
Sep 16 02:51:27 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 71 state Speak -> Standby
Sep 16 02:51:27 ECU: %HSRP-5-STATECHANGE: Vlan310 Grp 140 state Speak -> Standby
Sep 16 02:51:27 ECU: %HSRP-5-STATECHANGE: Vlan305 Grp 72 state Speak -> Standby
!
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 71 state Standby -> Active
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 71 state Standby -> Active
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan985 Grp 229 state Standby -> Active
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 73 state Standby -> Active
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 72 state Standby -> Active
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 72 state Active -> Speak
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 71 state Active -> Speak
Sep 21 16:02:21 ECU: %HSRP-5-STATECHANGE: Vlan985 Grp 229 state Active -> Speak
Sep 21 16:02:21 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 73 state Active -> Speak
Sep 21 16:02:21 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 71 state Active -> Speak
Sep 21 16:02:21 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 72 state Speak -> Standby
Sep 21 16:02:21 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 71 state Speak -> Standby
Sep 21 16:02:22 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 71 state Speak -> Standby
Sep 21 16:02:22 ECU: %HSRP-5-STATECHANGE: Vlan985 Grp 229 state Speak -> Standby
Sep 21 16:02:22 ECU: %HSRP-5-STATECHANGE: Vlan100 Grp 73 state Speak -> Standby
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan310 Grp 140 state Standby -> Active
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan305 Grp 71 state Standby -> Active
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan305 Grp 72 state Standby -> Active
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 71 state Standby -> Active
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan305 Grp 71 state Active -> Speak
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan305 Grp 72 state Active -> Speak
Sep 21 16:05:57 ECU: %HSRP-5-STATECHANGE: Vlan310 Grp 140 state Active -> Speak
Sep 21 16:05:57 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 71 state Active -> Speak
Sep 21 16:05:57 ECU: %HSRP-5-STATECHANGE: Vlan305 Grp 71 state Speak -> Standby
Sep 21 16:05:58 ECU: %HSRP-5-STATECHANGE: Vlan305 Grp 72 state Speak -> Standby
Sep 21 16:05:58 ECU: %HSRP-5-STATECHANGE: Vlan128 Grp 71 state Speak -> Standby
Sep 21 16:05:58 ECU: %HSRP-5-STATECHANGE: Vlan310 Grp 140 state Speak -> Standby
!

Router2:

!
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan318 Grp 218 state Standby -> Active
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan523 Grp 220 state Standby -> Active
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan750 Grp 67 state Standby -> Active
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 70 state Standby -> Active
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan984 Grp 228 state Standby -> Active
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan740 Grp 66 state Standby -> Active
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan648 Grp 143 state Standby -> Active
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan646 Grp 141 state Standby -> Active
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan309 Grp 140 state Standby -> Active
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan750 Grp 67 state Active -> Speak
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan318 Grp 218 state Active -> Speak
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan309 Grp 140 state Active -> Speak
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 70 state Active -> Speak
Sep 16 02:51:25 ECU: %HSRP-5-STATECHANGE: Vlan984 Grp 228 state Active -> Speak
Sep 16 02:51:26 ECU: %HSRP-5-STATECHANGE: Vlan740 Grp 66 state Active -> Speak
Sep 16 02:51:26 ECU: %HSRP-5-STATECHANGE: Vlan648 Grp 143 state Active -> Speak
Sep 16 02:51:26 ECU: %HSRP-5-STATECHANGE: Vlan646 Grp 141 state Active -> Speak
Sep 16 02:51:26 ECU: %HSRP-5-STATECHANGE: Vlan523 Grp 220 state Active -> Speak
Sep 16 02:51:27 ECU: %HSRP-5-STATECHANGE: Vlan318 Grp 218 state Speak -> Standby
Sep 16 02:51:27 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 70 state Speak -> Standby
Sep 16 02:51:27 ECU: %HSRP-5-STATECHANGE: Vlan309 Grp 140 state Speak -> Standby
Sep 16 02:51:27 ECU: %HSRP-5-STATECHANGE: Vlan750 Grp 67 state Speak -> Standby
Sep 16 02:51:27 ECU: %HSRP-5-STATECHANGE: Vlan984 Grp 228 state Speak -> Standby
Sep 16 02:51:27 ECU: %HSRP-5-STATECHANGE: Vlan740 Grp 66 state Speak -> Standby
Sep 16 02:51:27 ECU: %HSRP-5-STATECHANGE: Vlan648 Grp 143 state Speak -> Standby
Sep 16 02:51:27 ECU: %HSRP-5-STATECHANGE: Vlan646 Grp 141 state Speak -> Standby
Sep 16 02:51:27 ECU: %HSRP-5-STATECHANGE: Vlan523 Grp 220 state Speak -> Standby
!
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan319 Grp 219 state Standby -> Active
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan648 Grp 143 state Standby -> Active
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan647 Grp 142 state Standby -> Active
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan523 Grp 220 state Standby -> Active
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan647 Grp 142 state Active -> Speak
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan524 Grp 221 state Standby -> Active
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan524 Grp 221 state Active -> Speak
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan319 Grp 219 state Active -> Speak
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan523 Grp 220 state Active -> Speak
Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan648 Grp 143 state Active -> Speak
Sep 21 16:02:21 ECU: %HSRP-5-STATECHANGE: Vlan647 Grp 142 state Speak -> Standby
Sep 21 16:02:21 ECU: %HSRP-5-STATECHANGE: Vlan524 Grp 221 state Speak -> Standby
Sep 21 16:02:21 ECU: %HSRP-5-STATECHANGE: Vlan319 Grp 219 state Speak -> Standby
Sep 21 16:02:21 ECU: %HSRP-5-STATECHANGE: Vlan523 Grp 220 state Speak -> Standby
Sep 21 16:02:21 ECU: %HSRP-5-STATECHANGE: Vlan648 Grp 143 state Speak -> Standby
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan647 Grp 142 state Standby -> Active
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan775 Grp 72 state Standby -> Active
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan319 Grp 219 state Standby -> Active
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan102 Grp 70 state Standby -> Active
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan775 Grp 72 state Active -> Speak
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan647 Grp 142 state Active -> Speak
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan102 Grp 70 state Active -> Speak
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan524 Grp 221 state Standby -> Active
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan524 Grp 221 state Active -> Speak
Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan319 Grp 219 state Active -> Speak
Sep 21 16:05:57 ECU: %HSRP-5-STATECHANGE: Vlan775 Grp 72 state Speak -> Standby
Sep 21 16:05:57 ECU: %HSRP-5-STATECHANGE: Vlan647 Grp 142 state Speak -> Standby
Sep 21 16:05:57 ECU: %HSRP-5-STATECHANGE: Vlan524 Grp 221 state Speak -> Standby
Sep 21 16:05:57 ECU: %HSRP-5-STATECHANGE: Vlan102 Grp 70 state Speak -> Standby
Sep 21 16:05:57 ECU: %HSRP-5-STATECHANGE: Vlan319 Grp 219 state Speak -> Standby
!

So this was not the first time. From what I can see, the flapping lasts 2-3 seconds and stops, then starts again for 2-3 seconds and stops, and so on. I'm going to run debug standby events/errors and debug spanning-tree events to see if they show anything useful, but I wanted to ask whether you guys have seen anything similar. Or should I just open a case with TAC? The thing is, it happens randomly and I can't even reproduce it. Btw, Router1 and Router2 are connected at L2 using a port-channel that bundles 4 GigabitEthernet interfaces.
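To pin down how often those 2-3 second episodes happen, the syslog export can be grouped into bursts offline. Below is a rough Python sketch, not a finished tool. It assumes the exact line format pasted above; the function names `parse_events` and `group_bursts`, the 5-second gap threshold, and the placeholder year (syslog lines carry none) are all my own assumptions.

```python
import re
from datetime import datetime, timedelta

# Matches lines in the format quoted in this thread, e.g.
# "Sep 23 04:39:30 ECU: %HSRP-5-STATECHANGE: Vlan308 Grp 140 state Active -> Speak"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) .*?%HSRP-5-STATECHANGE: "
    r"(?P<intf>\S+) Grp (?P<grp>\d+) state (?P<old>\w+) -> (?P<new>\w+)"
)


def parse_events(lines, year=2000):
    """Return (timestamp, interface, group, old_state, new_state) tuples.

    The year is a placeholder (assumption): the log lines do not include one.
    Lines that are not HSRP state changes are skipped.
    """
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        events.append((ts, m.group("intf"), int(m.group("grp")),
                       m.group("old"), m.group("new")))
    return events


def group_bursts(events, max_gap=timedelta(seconds=5)):
    """Group events into bursts; a new burst starts after a quiet gap > max_gap."""
    bursts = []
    for ev in sorted(events, key=lambda e: e[0]):
        if bursts and ev[0] - bursts[-1][-1][0] <= max_gap:
            bursts[-1].append(ev)
        else:
            bursts.append([ev])
    return bursts


if __name__ == "__main__":
    sample = [
        "Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 71 state Standby -> Active",
        "Sep 21 16:02:20 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 71 state Active -> Speak",
        "Sep 21 16:02:21 ECU: %HSRP-5-STATECHANGE: Vlan304 Grp 71 state Speak -> Standby",
        "Sep 21 16:05:56 ECU: %HSRP-5-STATECHANGE: Vlan310 Grp 140 state Standby -> Active",
    ]
    for burst in group_bursts(parse_events(sample)):
        start, end = burst[0][0], burst[-1][0]
        vlans = sorted({ev[1] for ev in burst})
        print(f"{start:%b %d %H:%M:%S} +{int((end - start).total_seconds())}s "
              f"{len(burst)} changes on {', '.join(vlans)}")
```

Running both routers' exports through the same grouping makes it easier to see whether the bursts line up in time on Router1 and Router2, which is the kind of summary TAC is likely to ask for.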

Thanks for the help.

Jose Luis

 

 


Mark Malone
VIP Alumni

Did you move cables around during your MW? That looks like it could be an L1/L2 issue.

Try this command to track L2 STP changes, in case that's the cause. It should show you where the changes are coming from; you may have to trace switch to switch depending on the physical setup. Be careful running STP debugging, and make sure it's logging to the buffer.

What's the setup here in terms of topology? Is there just one switch between the routers, or more, linking the HSRP peers?

sw#sh spanning-tree de | i ieee|occurr|from|is exec
 VLAN0001 is executing the ieee compatible Spanning Tree protocol
  Number of topology changes 1791 last change occurred 02:12:46 ago
          from GigabitEthernet0/41
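
If that command has to be run on several switches while tracing, a small script can pull out the fields that matter. This is only a sketch in Python: it assumes output shaped exactly like the three lines above, and the function names `parse_stp_summary` and `busiest`, as well as the dictionary keys, are made up for illustration. The "last change" value is kept as raw text rather than converted to a time.

```python
import re

VLAN_RE = re.compile(r"^\s*(VLAN\d+) is executing")
TC_RE = re.compile(
    r"Number of topology changes (\d+) last change occurred (\S+) ago"
)
FROM_RE = re.compile(r"^\s*from (\S+)")


def parse_stp_summary(text):
    """Parse filtered 'show spanning-tree detail' output into one dict per VLAN."""
    records = []
    for line in text.splitlines():
        m = VLAN_RE.match(line)
        if m:
            # A VLAN header line starts a new record.
            records.append({"vlan": m.group(1), "changes": None,
                            "last_change": None, "from_port": None})
            continue
        if not records:
            continue  # ignore anything before the first VLAN header
        m = TC_RE.search(line)
        if m:
            records[-1]["changes"] = int(m.group(1))
            records[-1]["last_change"] = m.group(2)
            continue
        m = FROM_RE.match(line)
        if m:
            records[-1]["from_port"] = m.group(1)
    return records


def busiest(records):
    """Sort VLANs by topology change count, highest first."""
    return sorted(records, key=lambda r: r["changes"] or 0, reverse=True)


if __name__ == "__main__":
    sample = (
        " VLAN0001 is executing the ieee compatible Spanning Tree protocol\n"
        "  Number of topology changes 1791 last change occurred 02:12:46 ago\n"
        "          from GigabitEthernet0/41\n"
    )
    for rec in busiest(parse_stp_summary(sample)):
        print(rec["vlan"], rec["changes"], rec["last_change"], rec["from_port"])
```

The port in the `from` line is the one to follow to the next switch when tracing where the topology changes originate.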
 

Hey Mark,

First of all, thank you for taking the time to review this issue. I've attached a diagram of the physical connections.

Answering your questions: we did move cables around during the MW, but that was part of the MW. We migrated one network element from the old routers to the new routers. That migration finished at 00:26 am, and the first HSRP flap was at 02:12 am. I ran the STP show command you mentioned at 07:30 am, but the last topology change was at 00:30 or so, the last time STP converged. I will make sure the debugs log to the buffer.

As I was saying, this was not the first HSRP flapping. We had two before: the first on September 16 and the next on September 21. This flap on September 23 was actually the third. This is why I think the migration window didn't trigger this event.

If you need more information, just let me know. I'm about to open a case with Cisco TAC, but I also wanted to know if anyone in the community has faced a similar situation.

Thanks,

JL

Yes, going by what you're saying I wouldn't say it was a MW issue. It's difficult to suggest something when the only thing the logs show is a flap. This issue will probably require live troubleshooting or a good collection of debugs from when the issue occurs.

Are all your uplinks OK? Are the individual links of your port-channels (Po) showing any issues? Did the syslog show anything before this started that hints at the trigger?

How's the CPU? It's not maxing out on either side and preventing the HSRP hello packets from reaching each other? Is there any config that could block them, even temporarily? Are your timers default, or have you set them more aggressively?

This doc may help when troubleshooting; it has some good points to look out for:

Understanding and Troubleshooting HSRP Problems in Catalyst Switch Networks

http://www.cisco.com/c/en/us/support/docs/ip/hot-standby-router-protocol-hsrp/10583-62.html

Hey Mark,

I also think live troubleshooting is required. The thing is, since there is no known trigger for the event, it's difficult to recreate it. I was thinking of leaving the debugs on for a week or so and then collecting the outputs to verify whether the issue happened again.

Interfaces are just fine, and the syslog server only showed the moment the flapping started and nothing else. CPU is OK, but it went up from 5% to 10% while the flapping occurred; once it was solved, CPU went back to its normal range.

HSRP timers are set aggressively, yes, but this configuration was copied from the old routers and the event didn't occur before. These are the timers:

Router1:

 standby 141 ip 10.116.176.174
 standby 141 timers msec 300 1
 standby 141 preempt

Router2:

 standby 141 ip 10.116.176.174
 standby 141 timers msec 300 1
 standby 141 preempt
 standby 141 priority 80

 

The same for all VLANs with HSRP configured. Thanks for the doc btw, I'm gonna take a look at it.

 

BR,

 

Those are very aggressive timers: a 1-second hold time with 300 msec hellos. If you take even a 2-second hit you'll flap, and if an interface packs up for even a second and hellos can't get through, you'll flap. Ours are set to 3 sec and 5 sec hold time, and it's very stable.

BFD can handle much faster timers like that. We have had issues before with STP and aggressive timers.
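
To put rough numbers on why these timers are fragile, here is a back-of-the-envelope sketch in Python. It is a simplified model of my own, not how IOS actually schedules packets: it assumes hellos arrive at exact multiples of the hello interval after the last one received, with no jitter or processing delay, and the function name `hello_loss_window` is made up for illustration.

```python
def hello_loss_window(hello_ms, hold_ms):
    """Estimate how little hello loss is needed for the hold timer to expire.

    Simplified model (assumption): after the last received hello at t=0,
    further hellos arrive at hello_ms, 2*hello_ms, ... and the peer is
    declared down at hold_ms unless one of them gets through.

    Returns (hellos_in_window, shortest_blackout_ms):
      hellos_in_window     hellos that would arrive before the hold timer expires
      shortest_blackout_ms shortest blackout that can swallow all of them
                           (it only has to span from the first to the last)
    """
    if hello_ms <= 0 or hold_ms <= hello_ms:
        raise ValueError("expected 0 < hello_ms < hold_ms")
    hellos_in_window = (hold_ms - 1) // hello_ms
    shortest_blackout_ms = (hellos_in_window - 1) * hello_ms
    return hellos_in_window, shortest_blackout_ms


if __name__ == "__main__":
    for label, hello, hold in [
        ("thread config (msec 300 / 1 s)", 300, 1000),
        ("HSRP defaults (3 s / 10 s)", 3000, 10000),
    ]:
        n, blackout = hello_loss_window(hello, hold)
        print(f"{label}: {n} hellos per hold window, "
              f"a blackout of about {blackout} ms can already expire the timer, "
              f"and one of {hold} ms always does")
```

Under this model both settings tolerate three missed hellos, but with `timers msec 300 1` those three hellos fit inside roughly 600 ms, while with the defaults the same loss takes about 6 seconds. A brief STP reconvergence or port-channel hiccup fits easily into the first window and not the second.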
