HSRP-Issue: Both Routers Active

mullzkBern_2
Level 1

I have a strange issue with HSRP on my Nexus 7000 that results in an Active/Active state.
Does anyone see where the problem lies or where I should look next?

Thx in advance and Greetings from Berne,
Stefan Mueller

Layout

  • 2 Nexus 7000 with NX-OS 5.1(3) as Distribution-Switch, with all the Access-Switches attached to each Nexus, bundled with vPC.
  • N7K Providing L3 with SVIs on 49 Vlans. Nexus1 always takes the IP x.11, Nexus2 is x.12. Default Gateway is x.10, provided via HSRP. 48 Vlans work out fine. 1 Vlan (with identical Configuration) has a Problem:

Issue

  • Both Nexus think that they are HSRP Active on Vl 783. Standby-Router is unknown.

Config-Snippet Nexus 1

interface Vlan783
  ip address 10.34.195.11/25
  ip router eigrp 41
  ip passive-interface eigrp 41
  hsrp 1
    authentication text somethingelse
    preempt
    priority 150
    timers msec 300 msec 1000
    ip 10.34.195.10
  no shutdown

Config-Snippet Nexus 2

interface Vlan783
  ip address 10.34.195.12/25
  ip router eigrp 41
  ip passive-interface eigrp 41
  hsrp 1
    authentication text somethingelse
    preempt
    priority 130
    timers msec 300 msec 1000
    ip 10.34.195.10
  no shutdown

debug hsrp engine packet hello interface vlan 783

=> on N2 (which should be Standby. IP: .12), only the following lines are repeating:

2011 Oct 11 16:58:36.880624 hsrp: Vlan783[1/V4]: Hello out Active pri 130 ip 10.34.195.10
2011 Oct 11 16:58:36.880651 hsrp: Vlan783[1/V4]: hel 0 hol 0 auth somethingelse
2011 Oct 11 16:58:37.184802 hsrp: Vlan783[1/V4]: Hello out Active pri 130 ip 10.34.195.10
2011 Oct 11 16:58:37.184827 hsrp: Vlan783[1/V4]: hel 0 hol 0 auth somethingelse

=> on N1 (which should be Active. IP: .11), I receive two Hellos for each Hello sent:

2011 Oct 11 17:07:56.405711 hsrp: Vlan783[1/V4]: Hello out Active pri 150 ip 10.34.195.10
2011 Oct 11 17:07:56.405735 hsrp: Vlan783[1/V4]: hel 0 hol 0 auth somethingelse
2011 Oct 11 17:07:56.491349 hsrp: Vlan783[1/V4]: Hello in from 10.34.195.12 State Active pri 130 ip 10.34.195.10
2011 Oct 11 17:07:56.491450 hsrp: Vlan783[1/V4]: hel 0 hol 0 auth somethingelse
2011 Oct 11 17:07:56.491546 hsrp: Vlan783[1/V4]: Hello in from 10.34.195.12 State Active pri 130 ip 10.34.195.10
2011 Oct 11 17:07:56.491559 hsrp: Vlan783[1/V4]: hel 0 hol 0 auth somethingelse
2011 Oct 11 17:07:56.705691 hsrp: Vlan783[1/V4]: Hello out Active pri 150 ip 10.34.195.10
2011 Oct 11 17:07:56.705715 hsrp: Vlan783[1/V4]: hel 0 hol 0 auth somethingelse
2011 Oct 11 17:07:56.791414 hsrp: Vlan783[1/V4]: Hello in from 10.34.195.12 State Active pri 130 ip 10.34.195.10
2011 Oct 11 17:07:56.791437 hsrp: Vlan783[1/V4]: hel 0 hol 0 auth somethingelse
2011 Oct 11 17:07:56.791532 hsrp: Vlan783[1/V4]: Hello in from 10.34.195.12 State Active pri 130 ip 10.34.195.10
2011 Oct 11 17:07:56.791546 hsrp: Vlan783[1/V4]: hel 0 hol 0 auth somethingelse
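
For a longer capture, the comparison between the two sides can be done offline. The following is only a minimal Python sketch, assuming the debug lines keep exactly the format shown above. The names `HELLO_RE`, `summarize_hellos` and `hears_no_peer` are made up for this illustration and are not part of any Cisco tool. It counts the hellos a router sends and the hellos it receives per peer, and flags a capture in which the router sends hellos but never hears one.

```python
import re
from collections import Counter

# Matches the two hello line shapes in the debug output above, e.g.
#   hsrp: Vlan783[1/V4]: Hello out Active pri 130 ip 10.34.195.10
#   hsrp: Vlan783[1/V4]: Hello in from 10.34.195.12 State Active pri 130 ip 10.34.195.10
# (Assumed format: taken only from the lines posted in this thread.)
HELLO_RE = re.compile(
    r"hsrp: \S+\[\d+/V\d+\]: Hello "
    r"(?:(?P<out>out)|in from (?P<peer>\S+) State) "
    r"(?P<state>\w+) pri (?P<pri>\d+) ip (?P<vip>\S+)"
)


def summarize_hellos(log_text):
    """Return (hellos_sent, {peer_ip: hellos_received}) for one capture."""
    sent = 0
    received = Counter()
    for line in log_text.splitlines():
        match = HELLO_RE.search(line)
        if match is None:
            continue  # skip blank lines and the "hel 0 hol 0 auth ..." lines
        if match.group("out"):
            sent += 1
        else:
            received[match.group("peer")] += 1
    return sent, dict(received)


def hears_no_peer(log_text):
    """True if the capture contains outbound hellos but no inbound ones."""
    sent, received = summarize_hellos(log_text)
    return sent > 0 and not received


# Excerpts copied from the two captures above.
N2_CAPTURE = """\
2011 Oct 11 16:58:36.880624 hsrp: Vlan783[1/V4]: Hello out Active pri 130 ip 10.34.195.10
2011 Oct 11 16:58:36.880651 hsrp: Vlan783[1/V4]: hel 0 hol 0 auth somethingelse
2011 Oct 11 16:58:37.184802 hsrp: Vlan783[1/V4]: Hello out Active pri 130 ip 10.34.195.10
2011 Oct 11 16:58:37.184827 hsrp: Vlan783[1/V4]: hel 0 hol 0 auth somethingelse
"""
N1_CAPTURE = """\
2011 Oct 11 17:07:56.405711 hsrp: Vlan783[1/V4]: Hello out Active pri 150 ip 10.34.195.10
2011 Oct 11 17:07:56.405735 hsrp: Vlan783[1/V4]: hel 0 hol 0 auth somethingelse
2011 Oct 11 17:07:56.491349 hsrp: Vlan783[1/V4]: Hello in from 10.34.195.12 State Active pri 130 ip 10.34.195.10
2011 Oct 11 17:07:56.491450 hsrp: Vlan783[1/V4]: hel 0 hol 0 auth somethingelse
2011 Oct 11 17:07:56.491546 hsrp: Vlan783[1/V4]: Hello in from 10.34.195.12 State Active pri 130 ip 10.34.195.10
2011 Oct 11 17:07:56.491559 hsrp: Vlan783[1/V4]: hel 0 hol 0 auth somethingelse
"""

if __name__ == "__main__":
    print("N2:", summarize_hellos(N2_CAPTURE), "hears no peer:", hears_no_peer(N2_CAPTURE))
    print("N1:", summarize_hellos(N1_CAPTURE), "hears no peer:", hears_no_peer(N1_CAPTURE))
```

On these excerpts, N2 sends hellos and receives none, while N1 receives two hellos from 10.34.195.12 for each one it sends. That matches the one-way loss described in the post.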

Further Observations:

  • sh ip arp: N1 sees the SVI address of N2 and vice versa. Both, of course, have an ARP entry for the HSRP address.
  • sh mac add: N1 sees the N2 SVI MAC on the vPC peer-link and vice versa.
  • Both N1 and N2 can ping all involved addresses (10.34.195.10, 10.34.195.11 and 10.34.195.12) and all host addresses as well.
  • Earlier this morning, N1 could not ping the SVI of N2 and vice versa, although they could see each other in the MAC address table (I don't remember about the ARP table). This also caused issues for end-host traffic, notably DHCP. I then deleted HSRP group 1 and created HSRP group 2 without authentication and with default timers. This led to the same situation as above (ping possible, both HSRP Active), so I changed back to our standard configuration.
  • The VLAN was working until at least three weeks ago. We are not aware of any relevant changes since then (we did attach more access switches via vPC uplinks, though).

Hi Stefan,

How did you configure your peer links? Can you share some config?

Best regards,

Alex

Hi Alex

vPC-Config is pretty straightforward, although single-link:

'Active' Nexus: 
vpc domain 2
  role priority 100
  peer-keepalive destination 10.1.1.2 source 10.1.1.1 vrf pkal-VDC2
interface port-channel101
  vpc peer-link
interface Ethernet1/1
  description Po101 vPC Peer
  switchport
  switchport mode trunk
  rate-mode dedicated force
  udld aggressive
  channel-group 101 mode active
  no shutdown
interface Ethernet2/3
  description Keepalive n91005:e2/3
  vrf member pkal-VDC2
  ip address 10.1.1.1/30
  no shutdown

'Passive' Nexus (Admin and Operational vPC Role Secondary):
feature vpc
vpc domain 2
  role priority 200
  peer-keepalive destination 10.1.1.1 source 10.1.1.2 vrf pkal-VDC2
interface port-channel101
  vpc peer-link
interface Ethernet1/1
  description Po101 vPC Peer
  switchport
  switchport mode trunk
  rate-mode dedicated force
  udld aggressive
  channel-group 101 mode active
  no shutdown
interface Ethernet2/3
  description Keepalive n91005:e2/3
  vrf member pkal-VDC2
  ip address 10.1.1.2/30
  no shutdown

Try removing preempt and re-adding the command?

Do you have tracking on an interface?

It may be best to remove preempt altogether (I know it's not ideal, but it may be best).

Hi Scott

- Removing preempt did not help, and neither did re-adding the command.

- There is no tracking configured.

The problem has been narrowed down: the should-be-standby router either does not receive the HSRP packets via the vPC link, or receives them but silently ignores them.

I will focus on troubleshooting with TAC - if we find a reason/solution, I will post it in this thread.

Thanks and greetings from Switzerland

Stefan

Hi Stefan,

Please add "switchport mode trunk" to your interface port-channel 101.

I do not see "feature vpc" on the "active" nexus but I assume it is there, right?

Best regards,

Alex

Hi Alex

Sorry, I edited the config in order to keep the post shorter and forgot to note it. The full po101-Config on both Nexus is:

interface port-channel101
  description 1/1 VPC-Peer
  switchport
  switchport mode trunk
  spanning-tree port type network
  spanning-tree guard loop
  vpc peer-link

feature vpc is enabled on both Nexus, yes.

I will focus on troubleshooting with TAC - if we find a reason/solution, I will post it in this thread.
Thanks for your input and greetings from Switzerland

Stefan

WILLIAM STEGMAN
Level 4

Is vlan 783 allowed on the trunk between your Nexus 7ks? 

Hi William

Yes, all Vlans are allowed on the trunk.

I will focus on troubleshooting with TAC - if we find a reason/solution, I will post it in this thread. Thanks for your input and greetings from Switzerland

Stefan

Things that need to be checked:

=========================

+ Turn on ethanalyzer on the Nexus 7K and check the HSRP hello packets on both sides, i.e. both sending and receiving.

N7KSW1#ethanalyzer local int inband capture-filter "host x.x.x.x" limit-cap

where x.x.x.x is the VLAN IP

+ Check the control-plane policy and see whether any drop counters are incrementing.

+ Check spanning tree on the VLANs.

regards

syed.

mullzkBern_2
Level 1

hi @all

Thanks for your interest in our problem and all your input.

I think we are at a point where the forum is no longer an expedient instrument for troubleshooting.

Since yesterday, we have had a TAC case open for the problem, and I will focus my troubleshooting there. If we find a reason for this issue, I will post it here...

Thanks again and greetings from Switzerland

Stefan

Hi,

Today I ran into the same issue on a nearly identical architecture; all symptoms match. Could you tell me in a few words what TAC's troubleshooting involved? Do you have any indications or clues?

Many thanks

Hi delahais

For Indications / Clues see my post below.

The important symptom to match is that one Nexus says it sends packets on the vPC link, but the other does not receive them. In our case, a

debug hsrp engine packet hello interface vl 783

proved this point very easily.

Greetings from Berne

Stefan

darren.g
Level 5

mullzkBern wrote:

Layout

  • 2 Nexus 7000 with NX-OS 5.1(3) as Distribution-Switch, with all the Access-Switches attached to each Nexus, bundled with vPC.
  • N7K Providing L3 with SVIs on 49 Vlans. Nexus1 always takes the IP x.11, Nexus2 is x.12. Default Gateway is x.10, provided via HSRP. 48 Vlans work out fine. 1 Vlan (with identical Configuration) has a Problem:

I think the issue is in the "49 Vlans" part of the above.

Do you have *all* the interfaces in the one HSRP group (i.e. hsrp1)? I remember from way, way back some kind of limitation with HSRP version one: it has to be in multiples of 16 or something similar, and there's an upper limit on the number of interfaces which can be in a single HSRP group.

What happens if you reconfigure this VLAN and put it into HSRP group 2 (hsrp2) - or some other group number?

I know the TAC is looking into it, but thought I'd answer anyway.

Cheers.

mullzkBern_2
Level 1

Hi there

As expected, the problem was not HSRP but vPC. As seen in the debug above, one Nexus said that it sent the HSRP packets, but the other one did not receive them. In fact, the primary Nexus did not send anything over the vPC link in one specific VLAN, although the VLAN was allowed on the trunk.

Under the hood, it seems to be related to bug CSCti95293, though this bug has quite different symptoms (but this is what TAC and huge show-tech files are for). What is funny is that the bug was resolved in 5.0(5) and 5.1(1), and we have 5.1(3). The reason seems to be that we upgraded from 5.0(3) via ISSU and the bug persisted through this soft update. A reboot of the system would probably have cleared it. However....

We could resolve the problem fairly easily by removing the VLAN from the vPC trunk and re-adding it (interface port-channel101; switchport trunk allowed vlan remove 783; switchport trunk allowed vlan all). Since then, the VLAN is forwarded on the vPC trunk, the secondary Nexus receives the HSRP packets from the primary and changes its state to HSRP Standby. Everything is fine again.

Greetings from Berne

Stefan
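
Why one-way loss of hellos leaves both routers Active can be illustrated with a toy model. This is a deliberately simplified Python sketch and not the HSRP state machine from the specification: it ignores timers, the intermediate states (Listen, Speak), authentication and the IP-address tie-break. The `Router` class, the `elect` function and the `delivered` set are all hypothetical names for this illustration. The only rule modelled is that a router gives up the Active role if it actually hears a hello from a peer with a higher priority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    name: str
    priority: int


def elect(routers, delivered):
    """Return {router name: state} for a simplified two-router HSRP group.

    `delivered` is a set of (sender, receiver) name pairs whose hellos
    actually arrive. A router yields only to a higher-priority peer that
    it hears; if it hears no such peer, it claims the Active role.
    """
    states = {}
    for me in routers:
        heard = [p for p in routers
                 if p is not me and (p.name, me.name) in delivered]
        if any(p.priority > me.priority for p in heard):
            states[me.name] = "Standby"
        else:
            states[me.name] = "Active"
    return states


# Priorities taken from the configs in this thread.
N1 = Router("N1", 150)
N2 = Router("N2", 130)

# Symptom in this thread: N2's hellos reach N1, but N1's never reach N2.
broken = elect([N1, N2], delivered={("N2", "N1")})

# After the VLAN is forwarded on the trunk again, hellos flow both ways.
fixed = elect([N1, N2], delivered={("N1", "N2"), ("N2", "N1")})

if __name__ == "__main__":
    print("one-way loss:", broken)
    print("after the fix:", fixed)
```

In the one-way case N2 never hears the priority-150 hello, so it has no reason to step down, while N1 hears only a lower-priority peer and stays Active. Once hellos flow in both directions, N2 falls back to Standby.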

Hi Folks,

I'm running version 5.1(3)N2(1) and have run into this exact problem.  Could you let me know what version I should upgrade to to resolve the issue?

Many thanks.

Shane.
