Re: Nexus 9372 Secondary suspends vPC on Primary Manual Reload

CSCO10235163 · 06-21-2018

Hi all,

I've implemented back-to-back vPC in a data center network using four Nexus 9372X switches distributed across two sites. Each distributed pair of Nexus switches forms a vPC domain, and the auto-recovery feature is configured on it. Two of the Nexus switches run L3; the other two are L2 only. Because of hardware and geographical limitations, I used an L3 connection traversing external switches to carry the peer-keepalive messages. The release is 7.0(3)I3(1).

Now I've run into this issue: one of the two L3 Nexus switches was manually reloaded. I expected the other switch in the vPC domain to take over the traffic, but instead it shut down all of its vPC interfaces, creating a traffic black hole that lasted until the primary vPC domain member came back online. These are the significant messages found in the log:

15:13:00 NX_BF1_Pri %VPC-2-VPC_SUSP_ALL_VPC: Peer-link going down, suspending all vPCs on secondary

15:21:04 NX_BF1_Pri %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 10, VPC peer keep-alive receive has failed

I found some documentation describing similar expected behavior on the Nexus 7k, but it refers to a scenario that uses the mgmt interface for the peer keepalive and a software-initiated reload. Furthermore, it should have been solved by the auto-recovery feature.

Do you know of any related issue on the Nexus 9k? What am I missing?

Thanks

Chiara

Reza Sharifi · 06-21-2018

Hi,

That should not be the case if everything is configured correctly. Do you have the vPC peer-keepalive configured?
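
In case it helps, here is a minimal sketch of the vPC domain settings I would check, assuming domain 10 (taken from your log). The IP addresses and the VRF name VPC-KEEPALIVE are placeholders I made up for illustration, not values from your setup:

```
! Sketch only: addresses and VRF name are hypothetical placeholders
vpc domain 10
  ! Peer-keepalive carried over a routed (L3) path in a dedicated VRF
  peer-keepalive destination 192.0.2.2 source 192.0.2.1 vrf VPC-KEEPALIVE
  ! Auto-recovery, as mentioned in the original post
  auto-recovery
```

Comparing this against the output of show vpc and show vpc peer-keepalive on both peers should confirm whether the keepalive was actually up before the reload.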

See Figure 2-2 in this link and the explanation of what happens when the vPC peer-link goes down.

https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus5000/sw/operations/n5k_vpc_ops.html

HTH