Help with HSRP

John Blakley · 11-03-2010

All,

I've been having weird failover issues between my two routers. Today I got an alert that my backup router became active and then immediately went back to standby, but the primary never changed state:

RouterA (Pri)

Log:

Oct 27 12:44:46: %CONTROLLER-5-UPDOWN: Controller T3 1/0, changed state to down
Oct 27 12:44:47: %CONTROLLER-5-UPDOWN: Controller T3 1/0, changed state to up
Oct 27 12:53:17: %CONTROLLER-5-UPDOWN: Controller T3 1/0, changed state to down
Oct 27 12:53:18: %CONTROLLER-5-UPDOWN: Controller T3 1/0, changed state to up

sh standby:

GigabitEthernet0/0 - Group 100
State is Active
32 state changes, last state change 1w0d
Virtual IP address is 10.125.100.1
    Secondary virtual IP address 10.125.99.1
    Secondary virtual IP address 10.128.100.1
    Secondary virtual IP address 10.129.100.1
    Secondary virtual IP address 10.131.100.1
Active virtual MAC address is 0000.0c07.ac64
    Local virtual MAC address is 0000.0c07.ac64 (v1 default)
Hello time 3 sec, hold time 10 sec
    Next hello sent in 1.388 secs
Preemption enabled
Active router is local
Standby router is 10.125.100.3, priority 100 (expires in 9.404 sec)
Priority 105 (configured 105)
    Track object 1 state Up decrement 10
IP redundancy name is "hsrp-Gi0/0-100" (default)

RouterB:

log:

Nov 3 13:52:23: %HSRP-5-STATECHANGE: FastEthernet0/0 Grp 100 state Speak -> Standby
Nov 3 13:54:35: %HSRP-5-STATECHANGE: FastEthernet0/0 Grp 100 state Standby -> Active
Nov 3 13:54:35: %HSRP-5-STATECHANGE: FastEthernet0/0 Grp 100 state Active -> Speak
Nov 3 13:54:45: %HSRP-5-STATECHANGE: FastEthernet0/0 Grp 100 state Speak -> Standby

sh standby:

FastEthernet0/0 - Group 100
State is Standby
   397 state changes, last state change 00:06:04
Virtual IP address is 10.125.100.1
    Secondary virtual IP address 10.125.99.1
    Secondary virtual IP address 10.128.100.1
    Secondary virtual IP address 10.129.100.1
    Secondary virtual IP address 10.131.100.1
Active virtual MAC address is 0000.0c07.ac64
    Local virtual MAC address is 0000.0c07.ac64 (v1 default)
Hello time 3 sec, hold time 10 sec
    Next hello sent in 1.296 secs
Preemption enabled
Active router is 10.125.100.2, priority 105 (expires in 7.276 sec)
Standby router is local
Priority 100 (default 100)
IP redundancy name is "hsrp-Fa0/0-100" (default)

Is there anything I can look at? The standby unit has more than 10x as many state changes (397 vs. 32).

Thanks,

John

HTH, John *** Please rate all useful posts ***

lgijssel · 11-03-2010

It looks like the backup router is frequently missing hellos from the primary.

This causes its HSRP process to assume the primary has died and that it needs to take over.

These symptoms could indicate a unidirectional connectivity issue between the routers.

At the very least, I would check the LAN between them.

regards,

Leo

kyukim · 11-03-2010

Hi,

Usually, this means there is a unidirectional connection problem.

Router A keeps receiving HSRP hellos from Router B, so it does not change its HSRP state.

Router B stopped receiving HSRP hellos from Router A and became active; it then received a hello from Router A again and changed back to standby.

Check the interfaces on Router A and Router B to see whether any error counters are increasing.

If not, most of the time it is the cabling. Swap the cable.
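
To make the timing concrete, here is a minimal Python sketch of the hold timer logic, using the 3 sec hello / 10 sec hold values from the `sh standby` output above. The function name `hold_timer_expiries` and the simulated hello schedule are illustrative assumptions, not anything taken from IOS. It only models the rule "no hello from the active router within the hold time means the standby takes over".

```python
def hold_timer_expiries(received, hold_time=10.0):
    """Return the times at which the standby's active timer would expire.

    `received` is a list of times (in seconds) at which a hello from the
    active router actually arrived. Hypothetical helper for illustration.
    """
    expiries = []
    received = sorted(received)
    for prev, nxt in zip(received, received[1:]):
        # The timer restarts on every hello; it fires if the next one is late.
        if nxt - prev > hold_time:
            expiries.append(prev + hold_time)
    return expiries


HELLO = 3.0  # hello time from "sh standby"
sent = [i * HELLO for i in range(8)]  # hellos sent at 0, 3, 6, ... 21 sec

# Drop two, then three, consecutive hellos on the way to the standby.
two_lost = [t for t in sent if t not in (3.0, 6.0)]
three_lost = [t for t in sent if t not in (3.0, 6.0, 9.0)]

print(hold_timer_expiries(two_lost))    # no expiry: the gap is 9 sec
print(hold_timer_expiries(three_lost))  # expiry at t=10: the gap is 12 sec
```

With these timers, two lost hellos in a row are tolerated, but a third consecutive loss is enough to make the standby go active, even if the link recovers a moment later.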

KK.

John Blakley · 11-03-2010

I have no errors on either of the interfaces, and each router can see the other fine.

** I'm going to try replacing the cable anyway. In the time it took me to type the above, there have been 3 more state changes:

FastEthernet0/0 - Group 100
State is Standby
   400 state changes, last state change 00:00:59
Virtual IP address is 10.125.100.1
    Secondary virtual IP address 10.125.99.1
    Secondary virtual IP address 10.128.100.1
    Secondary virtual IP address 10.129.100.1
    Secondary virtual IP address 10.131.100.1
Active virtual MAC address is 0000.0c07.ac64
    Local virtual MAC address is 0000.0c07.ac64 (v1 default)
Hello time 3 sec, hold time 10 sec
    Next hello sent in 0.780 secs
Preemption enabled
Active router is 10.125.100.2, priority 105 (expires in 9.772 sec)
Standby router is local
Priority 100 (default 100)
IP redundancy name is "hsrp-Fa0/0-100" (default)

Thanks,

John


gatlin007 · 11-03-2010

I've seen this behavior where HSRP hellos must traverse a switched network and the spanning tree root for that particular VLAN is moving around. If the path between the routers runs through multiple switches, review the spanning tree topology.

Chris

John Blakley · 11-03-2010

Okay, it's not a cabling issue. I've replaced the cable with a known good one, and it doesn't look like an STP issue either. (These routers are connected to the same switch.)

Thanks,

John


John Blakley · 11-03-2010

I'm debugging it now. (Don't know why I didn't do this before.) I'll post results.

Thanks,

John


John Blakley · 11-03-2010

Okay, here's what I have, and it does look like Router B is losing communication with Router A (possible interface problem?):

Nov 3 20:30:31.210: HSRP: Fa0/0 Redirect adv out, Passive, active 0 passive 1
Nov 3 20:30:32.842: HSRP: Fa0/0 Grp 100 Hello in 10.125.100.2 Active pri 105 vIP 10.125.100.1
Nov 3 20:30:33.850: HSRP: Fa0/0 Grp 100 Hello out 10.125.100.3 Standby pri 100 vIP 10.125.100.1
Nov 3 20:30:36.850: HSRP: Fa0/0 Grp 100 Hello out 10.125.100.3 Standby pri 100 vIP 10.125.100.1
Nov 3 20:30:39.850: HSRP: Fa0/0 Grp 100 Hello out 10.125.100.3 Standby pri 100 vIP 10.125.100.1
Nov 3 20:30:42.842: HSRP: Fa0/0 Grp 100 Standby: c/Active timer expired (10.125.100.2)
Nov 3 20:30:42.842: HSRP: Fa0/0 Grp 100 Active router is local, was 10.125.100.2
Nov 3 20:30:42.842: HSRP: Fa0/0 Grp 100 Standby router is unknown, was local
Nov 3 20:30:42.842: HSRP: Fa0/0 Grp 100 Standby -> Active
Nov 3 15:30:42: %HSRP-5-STATECHANGE: FastEthernet0/0 Grp 100 state Standby -> Active
Nov 3 20:30:42.842: HSRP: Fa0/0 Redirect adv out, Active, active 1 passive 1
Nov 3 20:30:42.842: HSRP: Fa0/0 Grp 100 Redundancy "hsrp-Fa0/0-100" state Standby -> Active
Nov 3 20:30:42.842: HSRP: Fa0/0 Grp 100 Hello out 10.125.100.3 Active pri 100 vIP 10.125.100.1
Nov 3 20:30:42.846: HSRP: Fa0/0 API arp proto filter, 0000.0c07.ac64 is active vMAC for grp 100 - filter
Nov 3 20:30:42.846: HSRP: Fa0/0 API arp proto filter, 0000.0c07.ac64 is active vMAC for grp 100 - filter
Nov 3 20:30:42.846: HSRP: Fa0/0 API arp proto filter, 0000.0c07.ac64 is active vMAC for grp 100 - filter
Nov 3 20:30:42.846: HSRP: Fa0/0 API arp proto filter, 0000.0c07.ac64 is active vMAC for grp 100 - filter
Nov 3 20:30:42.846: HSRP: Fa0/0 API arp proto filter, 0000.0c07.ac64 is active vMAC for grp 100 - filter
Nov 3 20:30:42.846: HSRP: Fa0/0 Grp 100 Coup in 10.125.100.2 Active pri 105 vIP 10.125.100.1
Nov 3 20:30:42.846: HSRP: Fa0/0 Grp 100 Active: j/Coup rcvd from higher pri router (105/10.125.100.2)
Nov 3 20:30:42.846: HSRP: Fa0/0 Grp 100 Active router is 10.125.100.2, was local
Nov 3 20:30:42.846: HSRP: Fa0/0 Grp 100 Active -> Speak
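
For anyone who wants to check the timing in a debug like the one above, here is a rough Python sketch that measures how long Router B went without a hello from 10.125.100.2 before the active timer expired. The sample lines are copied from the debug output. The regular expression and the variable names are my own assumptions about this log format, so treat it as a starting point rather than a general parser.

```python
import re
from datetime import datetime

# Lines copied from the debug output above.
DEBUG = """\
Nov 3 20:30:32.842: HSRP: Fa0/0 Grp 100 Hello in 10.125.100.2 Active pri 105 vIP 10.125.100.1
Nov 3 20:30:33.850: HSRP: Fa0/0 Grp 100 Hello out 10.125.100.3 Standby pri 100 vIP 10.125.100.1
Nov 3 20:30:36.850: HSRP: Fa0/0 Grp 100 Hello out 10.125.100.3 Standby pri 100 vIP 10.125.100.1
Nov 3 20:30:39.850: HSRP: Fa0/0 Grp 100 Hello out 10.125.100.3 Standby pri 100 vIP 10.125.100.1
Nov 3 20:30:42.842: HSRP: Fa0/0 Grp 100 Standby: c/Active timer expired (10.125.100.2)
"""

# Assumed timestamp layout: "Mon D HH:MM:SS.mmm:" at the start of each line.
STAMP = re.compile(r"^\w{3}\s+\d+\s+(\d{2}:\d{2}:\d{2}\.\d{3}):")


def stamp(line):
    """Extract the time of day from a debug line as a datetime."""
    match = STAMP.match(line)
    return datetime.strptime(match.group(1), "%H:%M:%S.%f")


gaps = []
last_hello_in = None
for line in DEBUG.splitlines():
    if "Hello in 10.125.100.2" in line:
        last_hello_in = stamp(line)
    elif "Active timer expired" in line and last_hello_in is not None:
        gaps.append((stamp(line) - last_hello_in).total_seconds())

for gap in gaps:
    print(f"active timer expired {gap:.3f} s after the last hello in")
```

On these lines the gap comes out to exactly the 10 sec hold time, while Router B's own hellos kept going out every 3 sec in between. That matches the picture of hellos being lost in one direction only.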


John Blakley · 11-03-2010

Okay, I see something else. The switch is reporting a multicast storm on the port that the primary router connects to, not the one that I'm having a problem with. Since HSRP hellos are multicast, that could be the reason the secondary keeps losing contact with the primary.

John


John Blakley · 11-04-2010

Apparently the multicast storm threshold on the switch was set too low. After raising it, the failovers stopped happening.

Thanks,

John
