
HSRP Issue between Two Datacenter Cores

4octet
Level 1

I'm seeing an interesting issue that I believe is related to how vPC and HSRP work in NX-OS. We have two datacenters connected together via a layer 2 trunk. The goal is to be able to fail over the ESX environment to either datacenter if the other should go down. Each core is running NX-OS 9.2.3 with layer3 peer-router enabled.

 

Nexus 9k Core 1 -----------> 10GB Layer 2 -----------> DR Nexus 9k Core 3
vPC HSRP Primary

Nexus 9k Core 2 -----------> Backup 1GB Layer 2 (STP blocked) -----------> DR Nexus 9k Core 4
vPC HSRP Secondary

 

The issue appears when we try to add a third and fourth HSRP instance for the SVIs at the DR site, whose cores are in their own vPC domain. Routing starts breaking even though HSRP looks correct at DR: it shows Nexus 9k Core 1 as active and Core 2 as standby. What ends up happening is that people lose connection to the gateway. Core 4 becomes completely unroutable except via the vPC peer.

 

My question: is this breaking because HSRP technically runs active-active when using vPC, and having two separate vPC domains is causing traffic to be black-holed somehow? Unfortunately I can't lab this up, and leaving it in production for any length of time to troubleshoot is detrimental.

 

Nexus 9k Core 1

vpc domain 100
  peer-switch
  role priority 1000
  peer-keepalive destination 10.0.0.130 source 10.0.0.129
  peer-gateway
  layer3 peer-router
  auto-recovery
  fast-convergence
  ip arp synchronize

interface Vlan11
  no shutdown
  no ip redirects
  ip address 10.100.1.18/28
  no ipv6 redirects
  ip router eigrp 1
  hsrp version 2
  hsrp 11
    authentication md5 key-string XXXXXXXXX
    preempt
    priority 110
    timers 2 6
    ip 10.100.1.17

 

Nexus 9k Core 2

vpc domain 100
  peer-switch
  role priority 2000
  peer-keepalive destination 10.0.0.129 source 10.0.0.130
  peer-gateway
  layer3 peer-router
  auto-recovery
  fast-convergence
  ip arp synchronize

interface Vlan11
  no shutdown
  no ip redirects
  ip address 10.100.1.19/28
  no ipv6 redirects
  ip router eigrp 1
  hsrp version 2
  hsrp 11
    authentication md5 key-string XXXXXXXX
    preempt
    priority 105
    timers 2 6
    ip 10.100.1.17

 

 

DR Nexus 9k Core 3

vpc domain 110
  peer-switch
  role priority 1000
  peer-keepalive destination 10.0.0.130 source 10.0.0.129
  peer-gateway
  layer3 peer-router
  auto-recovery
  fast-convergence
  ip arp synchronize

interface Vlan11
  shutdown
  no ip redirects
  ip address 10.100.1.29/28
  no ipv6 redirects
  ip router eigrp 1
  hsrp version 2
  hsrp 11
    authentication md5 key-string NxOSHSRP@!
    preempt
    priority 100
    timers 2 6
    ip 10.100.1.17

 

 

DR Nexus 9k Core 4

vpc domain 110
  peer-switch
  role priority 2000
  peer-keepalive destination 10.0.0.129 source 10.0.0.130
  peer-gateway
  layer3 peer-router
  auto-recovery
  fast-convergence
  ip arp synchronize

interface Vlan11
  shutdown
  no ip redirects
  ip address 10.100.1.30/28
  no ipv6 redirects
  ip router eigrp 1
  hsrp version 2
  hsrp 11
    authentication md5 key-string XXXXXXX
    preempt
    priority 95
    timers 2 6
    ip 10.100.1.17
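
As a quick sanity check on the addressing in the four Vlan11 configs, the following is a minimal Python sketch that confirms every SVI and the HSRP virtual IP fall in the same /28 and that no SVI reuses the virtual IP. The device labels are just names chosen for this sketch; the addresses are copied from the configs as posted.

```python
import ipaddress

# SVI addresses copied from the Vlan11 configs in this post.
# The dictionary keys are labels for this sketch only.
svi_addresses = {
    "Core 1": "10.100.1.18/28",
    "Core 2": "10.100.1.19/28",
    "DR Core 3": "10.100.1.29/28",
    "DR Core 4": "10.100.1.30/28",
}
virtual_ip = ipaddress.ip_address("10.100.1.17")

interfaces = {
    name: ipaddress.ip_interface(addr) for name, addr in svi_addresses.items()
}
networks = {iface.network for iface in interfaces.values()}

# All SVIs must share one subnet, and the VIP must sit inside it
# without colliding with any real interface address.
same_subnet = len(networks) == 1
vip_in_subnet = all(virtual_ip in net for net in networks)
vip_collides = any(iface.ip == virtual_ip for iface in interfaces.values())

print("subnets:", sorted(str(net) for net in networks))
print("same subnet:", same_subnet)
print("VIP inside subnet:", vip_in_subnet)
print("VIP collides with an SVI:", vip_collides)
```

This only rules out an addressing mistake; it says nothing about how the vPC peers forward traffic for the virtual MAC.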

 

Thanks in advance for any advice!

 

Chris D. Groves

CCIE#25899

 

2 Replies

Remi Astruc
Level 1

Hi Chris,

Try to remove "peer-gateway" from the DR N9K Core 3 and 4.

It may be that Core 3 and Core 4 each try to route on behalf of their peer while in fact neither of them is HSRP "Active".

I would not set peer-gateway on HSRP "Listen" devices.
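
To see which devices would end up in "Listen", here is a minimal Python sketch of the HSRP election using the priorities and addresses from the posted configs. It assumes all four SVIs are up, sit in one layer 2 domain, and accept each other's hellos (matching authentication), with preempt enabled as configured. The `Router` type and `hsrp_roles` function are names made up for this sketch, not part of any Cisco tooling.

```python
import ipaddress
from typing import NamedTuple


class Router(NamedTuple):
    name: str       # label for this sketch only
    priority: int   # HSRP priority from the config
    ip: str         # SVI address, used as the tie-breaker


def hsrp_roles(routers):
    """Return {name: role}: highest priority wins, highest IP breaks ties."""
    ranked = sorted(
        routers,
        key=lambda r: (r.priority, int(ipaddress.ip_address(r.ip))),
        reverse=True,
    )
    roles = {}
    for position, router in enumerate(ranked):
        if position == 0:
            roles[router.name] = "Active"
        elif position == 1:
            roles[router.name] = "Standby"
        else:
            # Only one active and one standby exist per group.
            roles[router.name] = "Listen"
    return roles


group_11 = [
    Router("Core 1", 110, "10.100.1.18"),
    Router("Core 2", 105, "10.100.1.19"),
    Router("DR Core 3", 100, "10.100.1.29"),
    Router("DR Core 4", 95, "10.100.1.30"),
]

roles = hsrp_roles(group_11)
for name, role in roles.items():
    print(f"{name}: {role}")
```

Under those assumptions both DR cores land in Listen, which is the situation the suggestion above is concerned with.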

Let us know.

 

Remi Astruc

nazimkha
Level 4
The issue is not caused by the separate vPC domains but by how vPC, STP, and HSRP work together. You may want to adjust the design based on the recommended best practices.
Someone with live access to your system (TAC) will be able to pinpoint exactly where packets are being dropped or where there is a race condition in the HSRP states.

However, per vPC best practices, you may want to implement STP and HSRP isolation between the data centers and, if feasible, use a vPC-based DCI.
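
For context on what HSRP isolation has to deal with, the sketch below derives the values an HSRP version 2 group uses on the wire: the virtual MAC (0000.0c9f.fXXX, where XXX is the group number in hex) and the hello destination (224.0.0.102, UDP port 1985). The function name is made up for this sketch, and the actual filter syntax and placement on the DCI link depend on the platform and are not shown.

```python
# HSRP version 2 sends hellos to this multicast group and UDP port.
HSRP_V2_HELLO_GROUP = "224.0.0.102"
HSRP_UDP_PORT = 1985


def hsrp_v2_virtual_mac(group: int) -> str:
    """Return the HSRPv2 IPv4 virtual MAC for a group in Cisco dotted notation."""
    if not 0 <= group <= 4095:
        raise ValueError("HSRP version 2 group numbers range from 0 to 4095")
    # The last 12 bits of 0000.0c9f.fXXX carry the group number.
    return f"0000.0c9f.f{group:03x}"


vmac = hsrp_v2_virtual_mac(11)
print("group 11 virtual MAC:", vmac)
print("hello destination:", f"{HSRP_V2_HELLO_GROUP} udp/{HSRP_UDP_PORT}")
```

Because all four cores in the original post use version 2 and group 11, both sites share the same virtual MAC and virtual IP, so without isolation the four devices form a single HSRP group across the layer 2 trunk.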

A sample design
https://www.cisco.com/c/en/us/td/docs/solutions/Enterprise/Data_Center/DCI/4-0/EMC/dciEmc/EMC_2.html

https://www.ciscolive.com/global/on-demand-library.html?search=vpc%20best%20practices#/session/14479207929320017eHp
https://www.ciscolive.com/global/on-demand-library.html?search=vpc%20best%20practices#/session/1488328831048001rdKC