L3out stale endpoint by remote endpoint learning

Hi all,

 

The ACI endpoint learning white paper describes a stale endpoint issue that occurs when a server is deployed on a border node:

https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-739989.html#Figure8Staleendpointafterendpoin

The white paper states "This behavior is observed only when a packet to L3Out is sourced from a first-generation leaf switch."

What is different on 2nd generation leaf switches that solves this problem?

 

Thanks!

 

Accepted Solutions

micgarc2
Cisco Employee

Hi Bram,

 

It is due to GEN1 switches setting the DL (don't learn) bit in the iVXLAN header. The DL bit tells a remote leaf not to do dataplane learning for that particular frame.

 

Consider the following:

 

Leaf 101 - Compute Leaf

Leaf 103 - Compute Leaf

Leaf 105 - Border Leaf (BL)

 

  • Host-A is learned on Leaf 101 and Host-B is learned on BL 105
  • Host-A has triggered an XR learn on BL 105
  • Host-A then moves from Leaf 101 to Leaf 103. It no longer sends frames to any host on BL 105, but it keeps sending frames toward the L3Out on BL 105
  • Leaf 101 maintains a bounce entry for Host-A
  • Leaf 103 is a GEN1 leaf and the VRF is set to ingress enforcement. Due to a hardware restriction in GEN1, traffic sent to the L3Out has the DL (don't learn) bit set in the iVXLAN header
  • When BL 105 receives the frame, it updates the aging hit bit but does not update the learned entry, since DL is set
  • Eventually, the bounce entry on Leaf 101 times out, but BL 105 still has an XR entry pointing to Leaf 101. Any traffic destined to Host-A is then dropped
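
The sequence above can be sketched as a toy model in Python. This is only an illustration of the logic described in the bullets, not ACI code: the `BorderLeaf` class, its `receive` method, and the `simulate` helper are all hypothetical names made up for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class BorderLeaf:
    """Toy model of the XR (remote endpoint) table on BL 105. Hypothetical, not an ACI API."""
    xr_table: dict = field(default_factory=dict)  # endpoint -> leaf the entry points to
    hit_bits: dict = field(default_factory=dict)  # endpoint -> aging hit bit

    def receive(self, src_ep: str, src_leaf: int, dl_bit: bool) -> None:
        if dl_bit:
            # DL set: refresh aging on an existing entry, but never learn or move it.
            if src_ep in self.xr_table:
                self.hit_bits[src_ep] = True
            return
        # DL not set: normal dataplane learning, entry follows the source leaf.
        self.xr_table[src_ep] = src_leaf
        self.hit_bits[src_ep] = True


def simulate(new_leaf_sets_dl: bool) -> int:
    """Return the leaf that BL 105 believes Host-A is behind after the move."""
    bl = BorderLeaf()
    # Host-A on Leaf 101 talks to Host-B on BL 105: initial XR learn.
    bl.receive("Host-A", src_leaf=101, dl_bit=False)
    # Host-A moves to Leaf 103 and only sends traffic toward the L3Out.
    bl.receive("Host-A", src_leaf=103, dl_bit=new_leaf_sets_dl)
    return bl.xr_table["Host-A"]


gen1_result = simulate(new_leaf_sets_dl=True)   # GEN1 behavior: DL bit set
gen2_result = simulate(new_leaf_sets_dl=False)  # GEN2 behavior: DL bit not set
print(gen1_result, gen2_result)
```

With the DL bit set, the entry keeps pointing at Leaf 101 and keeps getting its hit bit refreshed, so it never ages out on its own. Without the DL bit, the same frame moves the entry to Leaf 103.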

GEN2 switches do not have this issue. For GEN1, disabling remote EP learning is the fix:

 

  • No XR IP learning on BL
    • L3 out deployed with VRF in ingress policy enforcement mode
  • Prevents stale EP caused by Generation 1 sending traffic to L3 out with DL bit set
  • NOTE *Routed multicast traffic will still trigger an XR IP Learn on BL with Gen 2 switches*

 

Also, in 3.2 we introduced a new feature called EP Announce, which should prevent stale endpoint issues (I have yet to see a stale EP issue since then). Basically, when the bounce timer expires, the leaf sends an EP announce delete message, which triggers an XR delete on any leaf still pointing to the old leaf.
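
As a rough sketch of that behavior (again a toy model, not the actual implementation): the function name `ep_announce_delete`, the table layout, and Leaf 107 are assumptions added purely for illustration.

```python
def ep_announce_delete(xr_tables: dict, endpoint: str, old_leaf: int) -> list:
    """Delete `endpoint` from every leaf's XR table that still points at `old_leaf`.

    Hypothetical model of what happens when the bounce timer on `old_leaf` expires.
    Returns the IDs of the leaves whose stale entry was removed.
    """
    flushed = []
    for leaf_id, table in xr_tables.items():
        if table.get(endpoint) == old_leaf:
            del table[endpoint]  # stale XR entry removed
            flushed.append(leaf_id)
    return flushed


# XR tables per leaf: endpoint -> leaf the entry points to.
xr_tables = {
    105: {"Host-A": 101},  # BL 105 still points at the old leaf (stale)
    107: {"Host-A": 103},  # hypothetical leaf that already learned the new location
}

flushed = ep_announce_delete(xr_tables, "Host-A", old_leaf=101)
print(flushed, xr_tables)
```

Only entries that still point at the old leaf are removed, so a leaf that already learned the new location is left alone.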

 

Hope this helps.

 

Thank you for participating in the Cisco Support Forum for ACI! If you have other questions related to this post, please let us know. If this response answers your questions, please mark this post "answered" and assign a rating to the response(s) provided. This will help notify other viewers that your question(s) is answered and this helps us provide better responses for this and future questions.
 
Regards,
Michael G.


3 Replies

Thanks Michael for the detailed explanation!

Does this mean that when traffic is sent from a Gen2 leaf to an L3Out on a border leaf, the border leaf does learn the remote endpoint?

Correct, it doesn't set the DL bit. I highly recommend upgrading to 3.2 or higher, though. It fixes almost all stale EP issues with the EP Announce feature.
