03-28-2023 02:02 AM
Hello
When doing maintenance we had an unexpected outage due to HSRP moving from Active to Init when ONE interface in a port-channel went up after being down for some time:
063116: Mar 25 09:08:09.343 CET: %LINEPROTO-SW1-5-UPDOWN: Line protocol on Interface TenGigabitEthernet6/6/6, changed state to down
063130: Mar 25 09:58:13.811 CET: %LINK-SW1-3-UPDOWN: Interface TenGigabitEthernet6/6/6, changed state to up
063131: Mar 25 09:58:18.944 CET: %LINEPROTO-SW1-5-UPDOWN: Line protocol on Interface TenGigabitEthernet6/6/6, changed state to up
063132: Mar 25 09:58:19.888 CET: %HSRP-SW1-5-STATECHANGE: Vlan666 Grp 1 state Active -> Init
The other interface in the port-channel, and thus the port-channel, was unaffected.
Can anyone understand why we saw this behaviour?
BR
//joakim
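For anyone correlating these timestamps, here is a minimal Python sketch that parses the four syslog lines quoted above and prints the gaps between them. The regular expression, the helper names and the year 2023 (the log lines carry no year) are assumptions made for illustration; this is not part of any Cisco tooling.

```python
import re
from datetime import datetime

# The four syslog lines exactly as quoted in the post above.
LOG_LINES = [
    "063116: Mar 25 09:08:09.343 CET: %LINEPROTO-SW1-5-UPDOWN: Line protocol on Interface TenGigabitEthernet6/6/6, changed state to down",
    "063130: Mar 25 09:58:13.811 CET: %LINK-SW1-3-UPDOWN: Interface TenGigabitEthernet6/6/6, changed state to up",
    "063131: Mar 25 09:58:18.944 CET: %LINEPROTO-SW1-5-UPDOWN: Line protocol on Interface TenGigabitEthernet6/6/6, changed state to up",
    "063132: Mar 25 09:58:19.888 CET: %HSRP-SW1-5-STATECHANGE: Vlan666 Grp 1 state Active -> Init",
]

# Assumed layout: "<seq>: <Mon DD HH:MM:SS.mmm> <TZ>: %<MNEMONIC>: <text>"
PATTERN = re.compile(
    r"^(\d+): (\w{3} +\d+ \d{2}:\d{2}:\d{2}\.\d{3}) \w+: %([\w-]+): (.*)$"
)


def parse(line, year=2023):
    """Return (sequence number, timestamp, mnemonic, message text)."""
    match = PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognised log line: {line!r}")
    seq, stamp, mnemonic, text = match.groups()
    # The year is an assumption; the log lines do not include one.
    when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S.%f")
    return int(seq), when, mnemonic, text


def gap_seconds(earlier, later):
    """Seconds elapsed between two parsed events."""
    return (later[1] - earlier[1]).total_seconds()


down, link_up, proto_up, hsrp = [parse(line) for line in LOG_LINES]

print(f"interface down for       : {gap_seconds(down, link_up):.3f} s")
print(f"link up -> line proto up : {gap_seconds(link_up, proto_up):.3f} s")
print(f"line proto up -> HSRP    : {gap_seconds(proto_up, hsrp):.3f} s")
print(f"messages not shown between down and up: {link_up[0] - down[0] - 1}")
```

The sequence numbers jump from 063116 to 063130, so the excerpt leaves out several messages logged while the interface was down. Those may be worth checking too.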
03-28-2023 05:24 AM
If VLAN 666 has dual paths and is spanned across both in STP, you should not encounter this issue.
You may need to troubleshoot by adding a new VLAN for testing and repeating the test.
You also need to correlate the logs from all devices. I am sure one of the links in this path went into blocking mode. When the other link goes down, how fast does STP convergence take place? If you are using Rapid STP, it should be quick.
You only provided a small excerpt of the logs. What do the full logs show? I am sure you should see only 1 or 2 lost pings and it should work as expected per your diagram, unless it is configured incorrectly, so you need to post the config as well.
=====Preenayamo Vasudevam=====
***** Rate All Helpful Responses *****
03-28-2023 02:26 AM
Hello @JoakimA
The reason for this behavior could be related to how HSRP works. HSRP is a protocol used to provide redundancy for IP networks by allowing multiple routers to participate in a virtual router group. One router in the group is elected as the active router, while the others are in standby mode. The active router is responsible for forwarding packets sent to the virtual IP address, while the standby routers monitor the active router and take over its duties if it fails.
When the interface on the active router goes down, the other routers in the HSRP group detect the failure and one of them takes over as the active router. When the interface comes back up, the original active router will have to compete with the current active router to regain its status.
It's possible that the delay in bringing the interface back up caused the HSRP protocol to initiate a state change from Active to Init. This could be due to the HSRP timers, which determine how quickly the routers detect and respond to changes in the network.
Additional information about your network configuration and HSRP settings would help us find the issue.
03-28-2023 04:00 AM
Hello, cheers. I was unclear/wrong about the port-channel: it was down (the box at the other end was moved), but the other port-channel carrying VLAN 666 was never affected, so the VLAN interface was always up. Could something else trigger the move to Init? An STP block perhaps, as mentioned below?
03-28-2023 04:07 AM
Yes, it's possible that the HSRP state change was triggered by STP blocking. If the port-channel was down, STP would have blocked the interface and put it in the Listening and Learning states before forwarding traffic. During the Listening and Learning states, HSRP packets would not be forwarded, and if the standby router did not receive any HSRP packets within the hold-time, it would initiate a state change. This could explain why the HSRP state changed from Active to Init when the port-channel came back up.
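To put rough numbers on that argument, here is a small Python sketch comparing a classic 802.1D Listening plus Learning window with the HSRP hold time. The forward delay of 15 seconds is taken from the spanning-tree output posted in this thread; the HSRP hello of 3 seconds and hold time of 10 seconds are the usual Cisco defaults and are an assumption here, since the actual HSRP configuration was not posted. The function names are made up for illustration.

```python
# From "Forward Delay 15 sec" in the spanning-tree output in this thread.
FORWARD_DELAY_S = 15

# Assumed: Cisco HSRP default timers (the real config was not posted).
HSRP_HELLO_S = 3
HSRP_HOLD_S = 10


def legacy_stp_blackout(forward_delay_s):
    """Classic 802.1D: Listening and Learning each last one forward delay."""
    return 2 * forward_delay_s


def hellos_missed(blackout_s, hello_s):
    """How many hello intervals fit inside the blackout window."""
    return blackout_s // hello_s


def hold_timer_expires(blackout_s, hold_s):
    """True if the peer would stop hearing hellos for longer than the hold time."""
    return blackout_s > hold_s


blackout = legacy_stp_blackout(FORWARD_DELAY_S)
print(f"worst-case blackout : {blackout} s")
print(f"hellos missed       : {hellos_missed(blackout, HSRP_HELLO_S)}")
print(f"hold timer expires  : {hold_timer_expires(blackout, HSRP_HOLD_S)}")
```

This is the worst case only. The switch in this thread reports rstp, and Rapid STP normally moves a point-to-point link to forwarding much faster than two full forward delays, so the real window may be far shorter.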
03-28-2023 02:36 AM
That is expected behaviour if the HSRP VLAN transits the same port-channel and the port-channel is shut down. If the VLAN does not have any other path, we expect an HSRP split brain here (both peers go active/active, since there is no communication between the HSRP peers).
In this case, when doing maintenance, what I do is shut down the VLAN interface on the standby side, so the active router always stays active and passes traffic, as long as the end devices can reach the gateway VIP.
03-28-2023 02:59 AM
Can you share the output of:
show etherchannel summary
show spanning-tree
I think there is an STP issue here.
03-28-2023 04:02 AM
I was wrong: Po17 was down, and when 6/6/6 came up the Active -> Init occurred. Po500 was never affected.
17 Po17(SU) LACP Te1/1/1(P) Te6/6/6(P)
500 Po500(SU) LACP Fo1/1/25(P) Fo2/1/25(P)
VLAN0666
Spanning tree enabled protocol rstp
Root ID Priority 50
Address x.x.y
Cost 1000
Port 5766 (Port-channel17)
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Bridge ID Priority 4146 (priority 4096 sys-id-ext 50)
Address x.x.x
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Aging Time 600
Interface Role Sts Cost Prio.Nbr Type
------------------- ---- --- --------- -------- --------------------------------
Po17 Root FWD 1000 128.5766 P2p
Po500 Desg FWD 250 128.5772 P2p
03-28-2023 04:22 AM
With all the info above, let's deep dive here.
HSRP goes from Active -> Init when:
1- there is a better active router than it, which can happen by
A- adding a new active router (you confirmed that you did not add a new HSRP router)
B- using tracking to increase the priority
2- there is a split brain, as @balaji.bandi mentioned above
Can I know if your topology is as shown below?
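As a rough illustration of point 1 above, the sketch below models how HSRP picks the "better" router: the higher priority wins, and the higher interface IP address breaks a tie. The router names, addresses, priorities and the tracking decrement are all hypothetical, since the real configuration was not posted. Which states the losing router passes through is not modelled here.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class HsrpRouter:
    name: str
    priority: int       # 0-255; 100 is the HSRP default
    ip: IPv4Address     # interface address, used as the tie-breaker


def effective_priority(base, decrement, tracked_up):
    """Tracking lowers the priority while the tracked object is down."""
    return base if tracked_up else base - decrement


def is_better(candidate, current):
    """Higher priority wins; on a tie the higher IP address wins."""
    return (candidate.priority, candidate.ip) > (current.priority, current.ip)


# Hypothetical values for illustration only.
local = HsrpRouter("SW1", 100, IPv4Address("192.0.2.2"))

for tracked_up in (False, True):
    peer = HsrpRouter(
        "SW2",
        effective_priority(base=105, decrement=10, tracked_up=tracked_up),
        IPv4Address("192.0.2.3"),
    )
    print(f"tracked link up={tracked_up}: peer priority {peer.priority}, "
          f"peer better than local: {is_better(peer, local)}")
```

In this sketch the peer only becomes the better router once its tracked link is back up, and it would only take over the active role if preemption is enabled on it.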
03-28-2023 04:51 AM
Hello, it's more like this, where R2 was the one taken down for maintenance.
03-28-2023 05:32 AM
Cheers, I will look into the logs. Unfortunately they are very sparse, but I will focus on the STP.
Many thanks!
03-28-2023 05:32 AM
As @balaji.bandi mentioned, dual links with STP should not cause any issue,
but:
VLAN0666
Spanning tree enabled protocol rstp
Root ID Priority 50
Address x.x.y
Cost 1000
Port 5766 (Port-channel17)
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Bridge ID Priority 4146 (priority 4096 sys-id-ext 50)
The VLAN is 666 but the sys-id-ext is 50!
Can you check this again?
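For reference, the arithmetic behind that observation can be sketched in Python. With the extended system ID, the 16-bit bridge priority carries the configured priority in the upper 4 bits (multiples of 4096) and the sys-id-ext, normally the VLAN ID, in the lower 12 bits. The function names below are made up for illustration.

```python
def split_bridge_priority(value):
    """Split a 16-bit bridge priority into (configured priority, sys-id-ext)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("bridge priority must fit in 16 bits")
    return value & 0xF000, value & 0x0FFF


def expected_bridge_priority(configured_priority, vlan_id):
    """Priority expected when the sys-id-ext equals the VLAN ID."""
    if configured_priority % 4096 != 0 or not 0 <= configured_priority <= 61440:
        raise ValueError("configured priority must be a multiple of 4096")
    if not 0 <= vlan_id <= 4095:
        raise ValueError("sys-id-ext is a 12-bit field")
    return configured_priority + vlan_id


# Values shown in the output quoted above.
print(split_bridge_priority(4146))           # bridge ID priority
print(split_bridge_priority(50))             # root ID priority
# What a VLAN 666 instance with priority 4096 would be expected to show.
print(expected_bridge_priority(4096, 666))
```

So a bridge priority of 4146 decodes to priority 4096 with sys-id-ext 50, while an instance for VLAN 666 with the same configured priority would be expected to show 4762.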
03-28-2023 05:34 AM
I was obfuscating and missed that one! 
Thanks for the help!
03-28-2023 06:12 AM
You need to look at this from a wider angle rather than focusing only on HSRP. That will help you avoid repeating the same issue the next time you do maintenance.