Solved: Regarding vPC in ACI

tuanquangnguyen · ‎08-26-2020

Hi folks,

How does ACI handle vPC EP sync and downlink failure? Any official or unofficial documentation on this?

For vPC EP sync, theoretically if data plane traffic is hashed ONLY towards a single Leaf, does the other Leaf sync EP info as local by asking the Spine, or wait for the Spine to push it (which would be odd, since that's not what I'd expect from COOP)? Or does it happen automatically?

For downlink failure (e.g. endpoint vPC connection is impaired), how does ACI handle it so that it would only forward to the Leaf with available downlink? Does the Leaf with failed downlink create any bounce entries for the endpoint?

In case of peer-keepalive and peer-link both being down which causes the NX-OS vPC domain to be dual-active, is there an equivalent situation in ACI and how is it dealt with?

Also, is ZMQ used for Peer-reachability the same as the ZMQ used for COOP?

Thanks in advance.

Edit: I think I just found the answer for downlink failure in BRKACI-2001. The other questions remain unclear to me. Also, how does ACI handle uplink failure on only a single vPC Leaf? If the member ports are not shut down or disabled, the endpoint might still hash and forward traffic to the Leaf with the failed uplinks.

RedNectar · ‎08-26-2020

Hi @tuanquangnguyen ,

How does ACI handle vPC EP sync and downlink failure? Any official or unofficial documentation on this?

For vPC EP sync, theoretically if data plane traffic is hashed ONLY towards a single Leaf, does the other Leaf sync EP info as local by asking the Spine, or wait for the Spine to push it (which would be odd, since that's not what I'd expect from COOP)? Or does it happen automatically?

When either leaf learns a new MAC/IP, this information is immediately sent to the Spine COOP database AND to the Peer Leaf, so their shared Local Station tables are synchronised.
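
To make that flow concrete, here is a rough toy model in Python. It is only a sketch of the behaviour described above, not how the switch software is written: the class names, method names, node names and the MAC address are all made up for illustration.

```python
# Toy model only: SpineCoop and VpcLeaf are hypothetical names, not ACI code.
class SpineCoop:
    def __init__(self):
        self.db = {}  # endpoint MAC -> VTEP the endpoint is reachable through

    def publish(self, mac, vtep):
        self.db[mac] = vtep


class VpcLeaf:
    def __init__(self, name, vpc_vtep, spine):
        self.name = name
        self.vpc_vtep = vpc_vtep   # anycast VTEP shared by both vPC peers
        self.spine = spine
        self.peer = None
        self.local_endpoints = {}  # MAC -> vPC interface

    def learn_from_dataplane(self, mac, vpc_port):
        # 1. Install the endpoint locally.
        self.local_endpoints[mac] = vpc_port
        # 2. Report it to the spine COOP database against the shared VTEP.
        self.spine.publish(mac, self.vpc_vtep)
        # 3. Sync it to the vPC peer, which installs it as local too.
        if self.peer is not None:
            self.peer.sync_from_peer(mac, vpc_port)

    def sync_from_peer(self, mac, vpc_port):
        self.local_endpoints[mac] = vpc_port


spine = SpineCoop()
leaf_a = VpcLeaf("leaf-101", "vpc-vtep-1", spine)
leaf_b = VpcLeaf("leaf-102", "vpc-vtep-1", spine)
leaf_a.peer, leaf_b.peer = leaf_b, leaf_a

# All traffic from this host happens to hash to leaf-101 only.
leaf_a.learn_from_dataplane("00:50:56:aa:bb:cc", "vpc-po1")
print(leaf_b.local_endpoints)
print(spine.db)
```

The point of the sketch is that leaf-102 ends up with the endpoint as local even though it never saw a frame from it, and it did not have to ask the spine for it.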

In case of peer-keepalive and peer-link both being down which causes the NX-OS vPC domain to be dual-active, is there an equivalent situation in ACI and how is it dealt with?

For "peer-keepalive" - or whatever the equivalent is in ACI - to fail, a switch would have to lose ALL uplinks to ALL spines. In which case the spines would then have only one path to the VPC Anycast address, so problem solved.

Also, is ZMQ used for Peer-reachability the same as the ZMQ used for COOP?

I'm not sure about that one, but sounds reasonable.

I hope this helps

Don't forget to mark answers as correct if it solves your problem. This helps others find the correct answer if they search for the same problem

RedNectar aka Chris Welsh.
Forum Tips: 1. Paste images inline - don't attach. 2. Always mark helpful and correct answers, it helps others find what they need.

tuanquangnguyen · ‎08-26-2020

Dear Chris @RedNectar,

Thanks for your response. Just want to ask something more:

For "peer-keepalive" - or whatever the equivalent is in ACI - to fail, a switch would have to lose ALL uplinks to ALL spines. In which case the spines would then have only one path to the VPC Anycast address, so problem solved.

True. But if the Leaf only has an uplink failure (the hardware is otherwise healthy), does ACI take any action on its downlinks? If not, I think local endpoints with a vPC connection might still forward traffic to that Leaf.

I'm reading up on Port Tracking, hoping that it will solve the problem. But I'm also wary: since it is a global knob, and my Leafs have different numbers of uplinks towards Spines (ranging from 2 to 4), I might accidentally shut down access ports on those with 4 uplinks if I set it to 2 (i.e. 2/2 uplinks down on Compute Leaf 1 => OK, but 2/4 uplinks down on Border Leaf => not OK to bring ports down).

On NX-OS, I think at least if vPC Peer-link is down, the secondary member would suspend its own vPC member ports.

Thanks heaps.

RedNectar · ‎08-26-2020

Hi @tuanquangnguyen ,

I ignored a promise to myself that I would not answer posts late at night - so I missed thinking about the problem you have with the external switch still being connected to both leaves.

No, you are correct. Port Tracking is probably your best bet. FWIW, I did a quick test to isolate a VPC peer from the fabric while it had a VPC active to a switch. I had a host attached to that switch, and set it off on three pings, hoping that at least one would still be balanced towards the soon-to-be-isolated peer switch.

Sure enough, when I shut down the spine ports towards that leaf, one of the ping streams died, never to fail over.
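
For anyone who wants to reason about that result, here is a small Python sketch of the failure mode. It is a deliberately simplified model with assumptions of my own: the attached switch's port-channel hash is reduced to "flow id modulo number of member links", and the leaf names and flow ids are invented.

```python
# Hypothetical model of an attached switch hashing flows over its vPC member links.
LEAVES = ["leaf-101", "leaf-102"]


def surviving_flows(flow_ids, uplinks_up, downlink_up):
    """Return the flows that still reach the fabric.

    uplinks_up[leaf]  - does the leaf still have a working path to the spines?
    downlink_up[leaf] - is the vPC member port towards the attached switch up?
    """
    # The attached switch only hashes over member links it sees as up.
    members = [leaf for leaf in LEAVES if downlink_up[leaf]]
    if not members:
        return []
    delivered = []
    for flow in flow_ids:
        leaf = members[flow % len(members)]  # stand-in for the real hash
        if uplinks_up[leaf]:
            delivered.append(flow)
    return delivered


flows = [1, 2, 3]
isolated = {"leaf-101": False, "leaf-102": True}

# Spine-facing ports of leaf-101 are down, but its downlink stays up:
# the flow hashed to leaf-101 is black-holed and never fails over.
print(surviving_flows(flows, isolated, {"leaf-101": True, "leaf-102": True}))

# Same failure, but the downlink is also brought down (what Port Tracking does):
# the attached switch rehashes everything onto leaf-102.
print(surviving_flows(flows, isolated, {"leaf-101": False, "leaf-102": True}))
```

In this model one of the three flows dies in the first case and all three survive in the second, which lines up with the lab test above.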

I'm reading up on Port Tracking, hoping that it will solve the problem. But I'm also wary: since it is a global knob, and my Leafs have different numbers of uplinks towards Spines (ranging from 2 to 4), I might accidentally shut down access ports on those with 4 uplinks if I set it to 2 (i.e. 2/2 uplinks down on Compute Leaf 1 => OK, but 2/4 uplinks down on Border Leaf => not OK to bring ports down).

I think that narrows it down to having to set port-tracking count to zero :-(

On NX-OS, I think at least if vPC Peer-link is down, the secondary member would suspend its own vPC member ports.

One would imagine that this shouldn't be too hard to implement on ACI - but I guess someone of influence needs to request it.

On a side note - I found the following link a useful read on ZMQ: https://augustl.com/blog/2013/zeromq_instead_of_http/

tuanquangnguyen · ‎08-26-2020

Dear Chris @RedNectar,

Thank you for your help so far.

I think that narrows it down to having to set port-tracking count to zero :-(

I misread the configuration for Port Tracking, as it apparently says "number of active spine links that triggers" it. So I think it's safe in my scenario to set it to 0 (i.e. a leaf has lost all of its uplinks). One thing that comes to mind: why isn't this enabled by default? If you lose all uplinks, your services are affected anyway.

And given the result of your testing, I suppose that the "Implicit Uplink Tracking" only applies to traffic going towards the vPC endpoints (so that the Spine only forwards traffic to the non-affected Leaf) and not to traffic coming from the vPC endpoints themselves.
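
A quick way to sanity-check that reading is a few lines of Python. This is only a sketch: the function name is made up, and the "at or below the configured count" comparison is my assumption about how the setting is evaluated, based on the wording quoted above.

```python
def downlinks_disabled(active_uplinks: int, trigger_count: int) -> bool:
    """Assumed rule: port tracking fires when the number of ACTIVE spine
    links is at or below the configured trigger count."""
    return active_uplinks <= trigger_count


# Global setting of 0: only a leaf with no active uplinks is affected.
print(downlinks_disabled(active_uplinks=0, trigger_count=0))  # compute leaf, 2 of 2 uplinks down
print(downlinks_disabled(active_uplinks=2, trigger_count=0))  # border leaf, 2 of 4 uplinks down
```

Under that assumption the compute leaf that lost both uplinks brings its access ports down, while the border leaf with two of four uplinks still active is left alone, regardless of how many uplinks each leaf started with.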

Again, thank you kindly for your assistance.

Sergiu.Daniluk · ‎08-26-2020

Hello again @tuanquangnguyen,

Chris already covered everything. I just wanted to add the following on vPC peer reachability: routing triggers are used to detect it.

The vPC manager registers with URIB for peer route notifications.
When ISIS discovers a route to the peer, URIB notifies the vPC manager, which in turn attempts to open a ZMQ socket with the peer.
When the peer route is withdrawn by ISIS, the vPC manager is again notified by URIB, and it brings the MCT link down.

Source: https://unofficialaciguide.com/2018/04/10/aci-vpc-in-aci/
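
The three steps above can be pictured as a small event-driven model. The Python below is purely illustrative: Urib and VpcManager are stand-in classes I made up, the peer address is invented, no real ZMQ socket is opened, and treating the MCT as "up" as soon as the socket opens is an assumption on my part.

```python
# Toy model only: illustrative stand-ins, not switch software.
class Urib:
    def __init__(self):
        self.listeners = []

    def register(self, callback):
        self.listeners.append(callback)

    def route_update(self, prefix, present):
        # Called when IS-IS installs (present=True) or withdraws (present=False) a route.
        for callback in self.listeners:
            callback(prefix, present)


class VpcManager:
    def __init__(self, peer_tep, urib):
        self.peer_tep = peer_tep
        self.zmq_connected = False
        self.mct_up = False
        urib.register(self.on_route_event)  # step 1: register for peer route notifications

    def on_route_event(self, prefix, present):
        if prefix != self.peer_tep:
            return  # not the peer's route, ignore
        if present:
            self.zmq_connected = True   # step 2: stands in for opening the ZMQ socket
            self.mct_up = True          # assumption: MCT comes up once the socket is open
        else:
            self.zmq_connected = False  # step 3: peer route withdrawn
            self.mct_up = False         # MCT link is brought down


urib = Urib()
manager = VpcManager("10.0.72.65/32", urib)  # hypothetical peer TEP address

urib.route_update("10.0.72.65/32", present=True)
state_after_add = (manager.zmq_connected, manager.mct_up)

urib.route_update("10.0.72.65/32", present=False)
state_after_withdraw = (manager.zmq_connected, manager.mct_up)

print(state_after_add, state_after_withdraw)
```

The useful takeaway is that peer reachability follows the routing table: there is no separate keepalive link to fail, only the presence or absence of the peer's route.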

Stay safe,

Sergiu