11-03-2010 12:02 PM - edited 03-07-2019 12:35 AM
All,
I've been having weird failover issues between my two routers. Today I got an alert that my backup router became active and then immediately went back to standby, but the primary never changed state:
RouterA (Pri)
Log:
Oct 27 12:44:46: %CONTROLLER-5-UPDOWN: Controller T3 1/0, changed state to down
Oct 27 12:44:47: %CONTROLLER-5-UPDOWN: Controller T3 1/0, changed state to up
Oct 27 12:53:17: %CONTROLLER-5-UPDOWN: Controller T3 1/0, changed state to down
Oct 27 12:53:18: %CONTROLLER-5-UPDOWN: Controller T3 1/0, changed state to up
sh standby:
GigabitEthernet0/0 - Group 100
State is Active
32 state changes, last state change 1w0d
Virtual IP address is 10.125.100.1
Secondary virtual IP address 10.125.99.1
Secondary virtual IP address 10.128.100.1
Secondary virtual IP address 10.129.100.1
Secondary virtual IP address 10.131.100.1
Active virtual MAC address is 0000.0c07.ac64
Local virtual MAC address is 0000.0c07.ac64 (v1 default)
Hello time 3 sec, hold time 10 sec
Next hello sent in 1.388 secs
Preemption enabled
Active router is local
Standby router is 10.125.100.3, priority 100 (expires in 9.404 sec)
Priority 105 (configured 105)
Track object 1 state Up decrement 10
IP redundancy name is "hsrp-Gi0/0-100" (default)
RouterB:
log:
Nov 3 13:52:23: %HSRP-5-STATECHANGE: FastEthernet0/0 Grp 100 state Speak -> Standby
Nov 3 13:54:35: %HSRP-5-STATECHANGE: FastEthernet0/0 Grp 100 state Standby -> Active
Nov 3 13:54:35: %HSRP-5-STATECHANGE: FastEthernet0/0 Grp 100 state Active -> Speak
Nov 3 13:54:45: %HSRP-5-STATECHANGE: FastEthernet0/0 Grp 100 state Speak -> Standby
sh standby:
FastEthernet0/0 - Group 100
State is Standby
397 state changes, last state change 00:06:04
Virtual IP address is 10.125.100.1
Secondary virtual IP address 10.125.99.1
Secondary virtual IP address 10.128.100.1
Secondary virtual IP address 10.129.100.1
Secondary virtual IP address 10.131.100.1
Active virtual MAC address is 0000.0c07.ac64
Local virtual MAC address is 0000.0c07.ac64 (v1 default)
Hello time 3 sec, hold time 10 sec
Next hello sent in 1.296 secs
Preemption enabled
Active router is 10.125.100.2, priority 105 (expires in 7.276 sec)
Standby router is local
Priority 100 (default 100)
IP redundancy name is "hsrp-Fa0/0-100" (default)
Is there anything I can look at? The standby unit has more than 10x as many state changes (397 versus 32).
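One way to see which transitions make up the flapping is to tally the %HSRP-5-STATECHANGE messages from the log. The following is only a rough sketch: the function name `count_transitions` and the regular expression are illustrative assumptions based on the log format pasted above, not a Cisco tool.

```python
import re
from collections import Counter

# Pattern assumed from the syslog lines shown in this thread, e.g.
# "%HSRP-5-STATECHANGE: FastEthernet0/0 Grp 100 state Standby -> Active"
STATE_CHANGE = re.compile(
    r"%HSRP-5-STATECHANGE: (\S+) Grp (\d+) state (\w+) -> (\w+)"
)

def count_transitions(lines):
    """Count (old_state, new_state) pairs found in HSRP syslog lines."""
    counts = Counter()
    for line in lines:
        match = STATE_CHANGE.search(line)
        if match:
            counts[(match.group(3), match.group(4))] += 1
    return counts
```

Each Standby -> Active entry corresponds to one moment when the standby router gave up on the active router, so that count is probably the most useful one to watch over time.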
Thanks,
John
11-03-2010 12:17 PM
It looks like the backup router is frequently missing hellos from the primary.
This causes the HSRP process on the backup to assume the primary has died and that it needs to take over.
These symptoms could indicate a unidirectional connection issue between the routers.
At the very least, I would check the LAN between them.
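The hold-timer behavior described above can be modeled roughly as follows. This is a simplified sketch, not the actual HSRP state machine: the function name `active_timer_expiries` is made up for illustration, and the 10 second default comes from the "hold time 10 sec" line in the sh standby output.

```python
def active_timer_expiries(hello_times, hold_time=10.0):
    """Return the times at which a standby router's active timer would expire.

    hello_times: sorted arrival times (in seconds) of hellos received
                 from the active router.
    hold_time:   seconds without a hello before the active router is
                 considered dead.
    """
    expiries = []
    for previous, following in zip(hello_times, hello_times[1:]):
        # The timer is restarted by every hello, so it only expires when
        # the gap between two received hellos exceeds the hold time.
        if following - previous > hold_time:
            expiries.append(previous + hold_time)
    return expiries
```

With hellos every 3 seconds, losing two in a row leaves a 9 second gap and nothing happens, while losing three in a row leaves a 12 second gap and the standby router takes over.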
regards,
Leo
11-03-2010 12:20 PM
Hi,
Usually, this means there is a unidirectional connection problem.
Router A keeps receiving HSRP hellos from Router B, so it does not change its HSRP state.
Router B did not receive HSRP hellos from Router A, so it became active; it then heard from Router A again and changed back to standby.
Check the Router A and Router B interfaces to see whether any error counters are increasing.
If not, most of the time it is cabling. Swap the cable.
KK.
11-03-2010 12:24 PM
I have no errors on either of the interfaces, and each router can reach the other fine.
** I'm going to try replacing the cable anyway. In the time it took me to type the above, there have been 3 more state changes:
FastEthernet0/0 - Group 100
State is Standby
400 state changes, last state change 00:00:59
Virtual IP address is 10.125.100.1
Secondary virtual IP address 10.125.99.1
Secondary virtual IP address 10.128.100.1
Secondary virtual IP address 10.129.100.1
Secondary virtual IP address 10.131.100.1
Active virtual MAC address is 0000.0c07.ac64
Local virtual MAC address is 0000.0c07.ac64 (v1 default)
Hello time 3 sec, hold time 10 sec
Next hello sent in 0.780 secs
Preemption enabled
Active router is 10.125.100.2, priority 105 (expires in 9.772 sec)
Standby router is local
Priority 100 (default 100)
IP redundancy name is "hsrp-Fa0/0-100" (default)
Thanks,
John
11-03-2010 12:56 PM
I've seen this behavior where HSRP hellos must traverse a switched network and the spanning tree root for that particular VLAN is moving around. If the two routers reach each other through multiple switches, review the spanning tree topology.
Chris
11-03-2010 01:10 PM
Okay, it's not a cabling issue. I've replaced the cable with a known good one, and it doesn't look like an STP issue. (These routers are connected to the same switch.)
Thanks,
John
11-03-2010 01:15 PM
I'm debugging it now. (Don't know why I didn't do this before.) I'll post results.
Thanks,
John
11-03-2010 01:37 PM
Okay, here's what I have, and it does look like it's losing communication: (possible interface problem?)
Nov 3 20:30:31.210: HSRP: Fa0/0 Redirect adv out, Passive, active 0 passive 1
Nov 3 20:30:32.842: HSRP: Fa0/0 Grp 100 Hello in 10.125.100.2 Active pri 105 vIP 10.125.100.1
Nov 3 20:30:33.850: HSRP: Fa0/0 Grp 100 Hello out 10.125.100.3 Standby pri 100 vIP 10.125.100.1
Nov 3 20:30:36.850: HSRP: Fa0/0 Grp 100 Hello out 10.125.100.3 Standby pri 100 vIP 10.125.100.1
Nov 3 20:30:39.850: HSRP: Fa0/0 Grp 100 Hello out 10.125.100.3 Standby pri 100 vIP 10.125.100.1
Nov 3 20:30:42.842: HSRP: Fa0/0 Grp 100 Standby: c/Active timer expired (10.125.100.2)
Nov 3 20:30:42.842: HSRP: Fa0/0 Grp 100 Active router is local, was 10.125.100.2
Nov 3 20:30:42.842: HSRP: Fa0/0 Grp 100 Standby router is unknown, was local
Nov 3 20:30:42.842: HSRP: Fa0/0 Grp 100 Standby -> Active
Nov 3 15:30:42: %HSRP-5-STATECHANGE: FastEthernet0/0 Grp 100 state Standby -> Active
Nov 3 20:30:42.842: HSRP: Fa0/0 Redirect adv out, Active, active 1 passive 1
Nov 3 20:30:42.842: HSRP: Fa0/0 Grp 100 Redundancy "hsrp-Fa0/0-100" state Standby -> Active
Nov 3 20:30:42.842: HSRP: Fa0/0 Grp 100 Hello out 10.125.100.3 Active pri 100 vIP 10.125.100.1
Nov 3 20:30:42.846: HSRP: Fa0/0 API arp proto filter, 0000.0c07.ac64 is active vMAC for grp 100 - filter
Nov 3 20:30:42.846: HSRP: Fa0/0 API arp proto filter, 0000.0c07.ac64 is active vMAC for grp 100 - filter
Nov 3 20:30:42.846: HSRP: Fa0/0 API arp proto filter, 0000.0c07.ac64 is active vMAC for grp 100 - filter
Nov 3 20:30:42.846: HSRP: Fa0/0 API arp proto filter, 0000.0c07.ac64 is active vMAC for grp 100 - filter
Nov 3 20:30:42.846: HSRP: Fa0/0 API arp proto filter, 0000.0c07.ac64 is active vMAC for grp 100 - filter
Nov 3 20:30:42.846: HSRP: Fa0/0 Grp 100 Coup in 10.125.100.2 Active pri 105 vIP 10.125.100.1
Nov 3 20:30:42.846: HSRP: Fa0/0 Grp 100 Active: j/Coup rcvd from higher pri router (105/10.125.100.2)
Nov 3 20:30:42.846: HSRP: Fa0/0 Grp 100 Active router is 10.125.100.2, was local
Nov 3 20:30:42.846: HSRP: Fa0/0 Grp 100 Active -> Speak
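To confirm from the debug that the active timer really ran out for lack of hellos, the gap between the last "Hello in" from the peer and the "Active timer expired" line can be measured. The sketch below assumes the debug line format shown above; the name `hello_gap_before_expiry` and the regular expression are illustrative, and the code ignores the date, so it would not handle a capture that crosses midnight.

```python
import re
from datetime import datetime

# Matches debug lines such as
# "Nov 3 20:30:32.842: HSRP: Fa0/0 Grp 100 Hello in 10.125.100.2 Active ..."
# Syslog lines without millisecond timestamps are skipped on purpose.
DEBUG_LINE = re.compile(
    r"^\w{3}\s+\d+\s+(\d{2}:\d{2}:\d{2}\.\d{3}): HSRP: (.*)$"
)

def hello_gap_before_expiry(lines, peer_ip):
    """Return seconds between the last hello from peer_ip and each timer expiry."""
    last_hello = None
    gaps = []
    for line in lines:
        match = DEBUG_LINE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        message = match.group(2)
        if f"Hello in {peer_ip} " in message:
            last_hello = stamp
        elif "Active timer expired" in message and last_hello is not None:
            gaps.append((stamp - last_hello).total_seconds())
    return gaps
```

For the capture above, the last hello from 10.125.100.2 arrived at 20:30:32.842 and the timer expired at 20:30:42.842, which is exactly the 10 second hold time: three consecutive hellos from the primary never reached this router.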
11-03-2010 01:41 PM
Okay, I see something else. The switch is reporting a multicast storm on the port that the primary router connects to, not on the port of the router I'm having a problem with. That could be the reason the secondary keeps losing contact with the primary.
John
11-04-2010 06:36 AM
Apparently the multicast storm threshold on the switch was too low. After raising it, the failovers stopped.
Thanks,
John