06-24-2019 06:39 AM
I'm having an interesting issue that I believe is related to how vPC and HSRP work in NX-OS. We have two datacenters connected via a Layer 2 trunk. The goal is to be able to fail over to either datacenter for the ESX environment if the other one goes down. Each switch is running NX-OS 9.2.3 with layer3 peer-router enabled.
Nexus 9k Core 1 ------------>10GB Layer 2 -----------> DR Nexus 9k Core3
VPC HSRP Primary
Nexus 9k Core 2 ----------->Backup 1GB layer 2 (STP blocked)----> DR Nexus 9k Core4
VPC HSRP Secondary
The issue arises when we try to add a third and fourth HSRP instance for the SVIs at the DR site, where the switches are in their own vPC domain. Routing starts breaking even though HSRP looks correct: at DR it shows Nexus 9k1 as active and 9k2 as secondary. But what ends up happening is that people lose connection to the gateway. Core 4 becomes completely unroutable except via the vPC peer.
My question: is this breaking because HSRP technically runs active-active when using vPC, and having two separate vPC domains is causing traffic to be black-holed somehow? Unfortunately I can't lab this up, and leaving this in production for any length of time to troubleshoot is detrimental.
Nexus 9k1
vpc domain 100
  peer-switch
  role priority 1000
  peer-keepalive destination 10.0.0.130 source 10.0.0.129
  peer-gateway
  layer3 peer-router
  auto-recovery
  fast-convergence
  ip arp synchronize
interface Vlan11
  no shutdown
  no ip redirects
  ip address 10.100.1.18/28
  no ipv6 redirects
  ip router eigrp 1
  hsrp version 2
  hsrp 11
    authentication md5 key-string XXXXXXXXX
    preempt
    priority 110
    timers 2 6
    ip 10.100.1.17
Nexus 2
vpc domain 100
  peer-switch
  role priority 2000
  peer-keepalive destination 10.0.0.129 source 10.0.0.130
  peer-gateway
  layer3 peer-router
  auto-recovery
  fast-convergence
  ip arp synchronize
interface Vlan11
  no shutdown
  no ip redirects
  ip address 10.100.1.19/28
  no ipv6 redirects
  ip router eigrp 1
  hsrp version 2
  hsrp 11
    authentication md5 key-string XXXXXXXX
    preempt
    priority 105
    timers 2 6
    ip 10.100.1.17
DR Nexus 9k 3
vpc domain 110
  peer-switch
  role priority 1000
  peer-keepalive destination 10.0.0.130 source 10.0.0.129
  peer-gateway
  layer3 peer-router
  auto-recovery
  fast-convergence
  ip arp synchronize
interface Vlan11
  shutdown
  no ip redirects
  ip address 10.100.1.29/28
  no ipv6 redirects
  ip router eigrp 1
  hsrp version 2
  hsrp 11
    authentication md5 key-string NxOSHSRP@!
    preempt
    priority 100
    timers 2 6
    ip 10.100.1.17
DR Nexus 4
vpc domain 110
  peer-switch
  role priority 2000
  peer-keepalive destination 10.0.0.129 source 10.0.0.130
  peer-gateway
  layer3 peer-router
  auto-recovery
  fast-convergence
  ip arp synchronize
interface Vlan11
  shutdown
  no ip redirects
  ip address 10.100.1.30/28
  no ipv6 redirects
  ip router eigrp 1
  hsrp version 2
  hsrp 11
    authentication md5 key-string XXXXXXX
    preempt
    priority 95
    timers 2 6
    ip 10.100.1.17
Thanks in advance for any advice!
Chris D. Groves
CCIE#25899
06-25-2019 06:51 AM - edited 06-25-2019 07:18 AM
Hi Chris,
Try removing "peer-gateway" from DR N9K Core 3 and Core 4.
It may be that Core 3 and Core 4 each try to route on behalf of their peer while in fact neither of them is HSRP "Active".
I would not set peer-gateway on HSRP "Listen" devices.
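As a rough sketch of that change, assuming the DR pair keeps vpc domain 110 exactly as posted above, something like the following would be applied on both Core 3 and Core 4. One caveat I am not certain about for 9.2.3: "layer3 peer-router" is generally documented as depending on peer-gateway, so the switch may ask you to remove that first. Please check the configuration guide before trying it.

```
! DR Nexus 9k Core 3 and Core 4 (vpc domain 110 as posted above)
! Assumption: layer3 peer-router may have to be removed before
! peer-gateway can be disabled; verify for your release.
vpc domain 110
  no peer-gateway
```

After the change, "show vpc" and "show hsrp brief" on all four switches should help confirm the vPC state and which device is HSRP Active for group 11.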
Let us know.
Remi Astruc
06-25-2019 07:12 AM