ACI forwarding question:

jesteen-wrangler
Community Member

ACI forwarding question: usually, in a single pod, when an endpoint wants to reach a destination within the same subnet but the destination is not known, the leaf sends the packet to the spine proxy on one of the spines. The spines check for the endpoint; if it is not present, they drop the packet and won't flood it.

Q1) How would the source endpoint eventually communicate if the packet is dropped? (Assume that 'hardware proxy' is configured on the fabric.)

Q2) In ACI Multi-Pod, this will trigger an ARP glean, whereas in a single pod ARP glean is not triggered. How does ACI differentiate between these behaviors?

Accepted Solutions

RedNectar
VIP Alumni

Hi @jesteen-wrangler ,

Let me sort out a little misconception - where you say "The spines check for the endpoint; if it is not present, they drop the packet and won't flood it."

That's not quite true.

Firstly, let's understand exactly WHAT is sent "when an endpoint wants to reach a destination within the same subnet but the destination is not known"

I'm sure you remember from your first TCP/IP lesson, that if a device wishes to communicate with another on the same subnet, the first packet sent is an ARP request - so let's make it clear that the packet we must assume you are talking about in your question is an ARP packet (or frame - let's skip that argument). The point being that it is highly unlikely to be an IP packet being sent to another at this stage. (I'll discuss that exception below)

Now, back to the ARP frame (yes frame - because I'm discussing the L2 header). The ARP frame will be a L2 broadcast.
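To make the "L2 broadcast" point concrete, here is a minimal sketch that builds the bytes of a standard ARP request frame. The MAC and IP addresses are made up for illustration, and the helper name `build_arp_request` is hypothetical; the field layout itself is the ordinary Ethernet/ARP format, nothing ACI-specific.

```python
import socket
import struct

BROADCAST_MAC = bytes.fromhex("ffffffffffff")  # L2 broadcast destination
ETHERTYPE_ARP = 0x0806


def build_arp_request(src_mac: bytes, src_ip: str, target_ip: str) -> bytes:
    """Return an Ethernet frame carrying an ARP request for target_ip."""
    # Ethernet header: destination MAC, source MAC, EtherType.
    eth_header = BROADCAST_MAC + src_mac + struct.pack("!H", ETHERTYPE_ARP)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        1,                            # opcode: request
        src_mac,                      # sender MAC
        socket.inet_aton(src_ip),     # sender IP
        bytes(6),                     # target MAC: unknown, all zeros
        socket.inet_aton(target_ip),  # target IP: what the fabric can key on
    )
    return eth_header + arp_payload


# Illustrative addresses only.
frame = build_arp_request(bytes.fromhex("020000000001"), "10.0.0.10", "10.0.0.20")
```

Note that the destination MAC in the L2 header is the broadcast address, while the target IP sits inside the ARP payload. That target IP is the field the discussion below relies on.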

When it gets to the leaf, one of two things happens, depending on the state of the ARP Flooding option for the BD:

  1. The leaf treats it like a L2 broadcast and sends it to all leaves that belong to the source BD. This is the default behaviour since about ACI v4.x when setting up an IP address for a BD - i.e. ARP Flooding option for the BD is Enabled
  2. The leaf treats it like a L3 IP packet, and looks at the destination IP address within the ARP packet. This used to be the default behaviour before someone changed it. I.e. ARP Flooding option for the BD is Disabled
    1. If the leaf knows the destination IP address is local, it floods the ARP packet just on its own ports in that BD
    2. If the leaf knows the destination IP address is present on another leaf, it forwards the ARP request to just that leaf.
    3. If the leaf doesn't know the destination IP, it sends the ARP packet to the spine proxy. I believe this is the option that you seem to be exploring

      When the proxy gets the ARP, it will do one of two things:
      1. If the proxy knows the destination IP address is present on some leaf, it forwards the ARP request to just that leaf.
      2. If the spine doesn't know the destination IP address, it holds on to that ARP for a short time while it sends an ARP glean request to all leaves that have ports in the BD, including the leaf that forwarded the original ARP packet to the spine.
        1. When each leaf receives the ARP glean request, it generates its own ARP request sourced from the default gateway IP address of the BD and sends it on all ports in the BD
        2. Assuming the endpoint exists, the endpoint will then reply to that ARP, and the leaf will learn that IP
          1. The leaf will report this IP to the spine proxy
          2. Meanwhile, the original ARP packet is still sitting in memory in the spine - so go back to the first proxy case above (the proxy knows the destination IP address) and you'll see the original ARP being sent to the target leaf. If the original ARP has timed out (likely) then the 2nd or 3rd ARP will make it through.
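The decision tree above can be sketched as a toy model. This is only an illustration of the logic as described in this post, not how the switches implement it; the function name, the table shapes and the leaf name `leaf-102` are all hypothetical.

```python
def handle_arp_request(target_ip, arp_flooding, local_ips, remote_ips, proxy_table):
    """Return the action a toy fabric takes for an ARP request arriving at a leaf.

    local_ips:   set of IPs the ingress leaf has learned on its own ports
    remote_ips:  dict of IP -> leaf name, IPs the ingress leaf knows on other leaves
    proxy_table: dict of IP -> leaf name, as known to the spine proxy
    """
    if arp_flooding:
        # ARP Flooding enabled: treat the ARP as an ordinary L2 broadcast.
        return "flood to all leaves in the BD"
    # ARP Flooding disabled: key on the target IP inside the ARP packet.
    if target_ip in local_ips:
        return "flood on local ports in the BD"
    if target_ip in remote_ips:
        return f"forward to {remote_ips[target_ip]}"
    # Ingress leaf miss: the ARP goes to the spine proxy.
    if target_ip in proxy_table:
        return f"spine proxy forwards to {proxy_table[target_ip]}"
    # Proxy miss: no drop here, the proxy triggers an ARP glean instead.
    return "spine proxy sends ARP glean to all leaves in the BD"


proxy = {}  # the proxy knows nothing about 10.0.0.20 yet
first = handle_arp_request("10.0.0.20", False, set(), {}, proxy)

# The endpoint answers the gleaned ARP and its leaf reports the IP to the proxy.
proxy["10.0.0.20"] = "leaf-102"
retry = handle_arp_request("10.0.0.20", False, set(), {}, proxy)
```

In this model `first` is the glean action and `retry` is a forward to `leaf-102`, which mirrors the "2nd or 3rd ARP will make it through" behaviour.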

Note that during that whole discussion, your statement "The spines check for the endpoint; if it is not present, they drop the packet and won't flood it." did not happen.

Now, in the highly unlikely event that a device already knows the MAC address of another device (perhaps a static ARP entry) and sends a frame to that MAC, the whole ARP process discussed above is bypassed. Note we are now discussing a L2 scenario, so the IP addresses are irrelevant, although in reality we are going to be talking about two endpoints on the same subnet.

In this case, when the endpoint sends the MAC frame and it reaches the leaf, the leaf does a L2 lookup and:

  1. If the destination MAC is local to the leaf, it forwards the frame out that interface
  2. If the destination MAC is known to exist on another leaf, it forwards the frame to that leaf
  3. If the destination MAC is unknown on that leaf, it will do one of two things depending on the state of the L2 Unknown Unicast option for the BD
    1. If the L2 Unknown Unicast option is Hardware Proxy (default) then the leaf will send it to the spine proxy
      1. If the spine proxy knows where the destination MAC address is, it will re-write the destination TEP of the encapsulated frame and send it to the correct leaf.
      2. If the spine proxy does not know where the destination MAC address is, it will drop the frame (AT LAST - we get to your "drop" scenario)
    2. If the L2 Unknown Unicast option is Flood, then it will flood the frame, as per normal L2 BUM traffic.
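The L2 lookup above can be sketched the same way. Again this is a toy model under assumptions: the function name, the mode strings `"proxy"` and `"flood"`, and the interface and leaf names are hypothetical stand-ins for the BD setting and tables described in the list.

```python
def handle_l2_unicast(dst_mac, unknown_unicast_mode, local_macs, remote_macs, proxy_macs):
    """Return the action a toy fabric takes for a unicast frame arriving at a leaf.

    local_macs:  dict of MAC -> local interface on the ingress leaf
    remote_macs: dict of MAC -> leaf name, MACs the ingress leaf knows on other leaves
    proxy_macs:  dict of MAC -> leaf name, as known to the spine proxy
    unknown_unicast_mode: "proxy" (Hardware Proxy) or "flood"
    """
    if dst_mac in local_macs:
        return f"forward out {local_macs[dst_mac]}"
    if dst_mac in remote_macs:
        return f"forward to {remote_macs[dst_mac]}"
    # Destination MAC unknown on the ingress leaf: the BD setting decides.
    if unknown_unicast_mode == "flood":
        return "flood in the BD"
    if unknown_unicast_mode != "proxy":
        raise ValueError(f"unknown mode: {unknown_unicast_mode!r}")
    if dst_mac in proxy_macs:
        # The proxy rewrites the destination TEP and sends it to the right leaf.
        return f"spine proxy forwards to {proxy_macs[dst_mac]}"
    # The one case where the frame really is dropped.
    return "spine proxy drops the frame"


dropped = handle_l2_unicast("00:00:5e:00:53:01", "proxy", {}, {}, {})
flooded = handle_l2_unicast("00:00:5e:00:53:01", "flood", {}, {}, {})
```

Only the last branch produces a drop, and only when the BD is in Hardware Proxy mode and neither the leaf nor the proxy knows the MAC.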

Now to your actual questions

Q1) How would the source endpoint eventually communicate if the packet is dropped? (Assume that 'hardware proxy' is configured on the fabric.)

The source endpoint will rely on the destination endpoint sending a frame at some point in time. Until the destination endpoint sends a frame, the ACI fabric will never know that it exists, just like in any traditional L2 network. However, this is a reasonably unlikely (BUT NOT IMPOSSIBLE) situation, given that the original endpoint is almost certainly going to send an ARP request for the destination at some point. The type of situation that might cause a MAC to be unknown would be if there is an L2 loop in an attached switch, and the leaf switches keep receiving BPDU TCNs, causing the leaf switches to continually drop their MAC tables.

Q2) In ACI Multi-Pod, this will trigger an ARP glean, whereas in a single pod ARP glean is not triggered. How does ACI differentiate between these behaviors?

Again, not quite right. ARP glean packets are sent by the proxy that does not have the destination IP address in its global station table. This could be a remote proxy or a local proxy - the story is pretty much the same.

Here's some more reading. Most of these were written by me

https://community.cisco.com/t5/application-centric-infrastructure/aci-arp-deluge/td-p/3677900 This is a link to when I asked a similar question of this community in 2018

https://rednectar.net/2018/08/13/aci-arp-gleaning/ Sergiu does a great explanation

https://rednectar.net/2018/08/13/aci-arp-gleaning/ (my blog - don't feel obliged to visit)

https://community.cisco.com/t5/application-centric-infrastructure/communication-among-same-ip-subnets-in-different-epgs-in-one-bd/m-p/3686124/highlight/true#M5147

RedNectar aka Chris Welsh.
Forum Tips: 1. Paste images inline - don't attach. 2. Always mark helpful and correct answers, it helps others find what they need.

Wonderful! Thanks @RedNectar for such an insightful answer. I didn't expect such a thorough one. The catch lay in a detail I had misunderstood: I assumed an ARP packet is handled the same way as an L2 unknown unicast packet.

In fact, an ARP request packet is handled the same way as an L3 unicast packet, using the target IP.

Q2 is moot, as ARP glean does happen.

Sharing another resource in case anyone else finds it useful:

https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2020/pdf/BRKACI-3545.pdf

Hi @jesteen-wrangler ,

Glad I could help (lucky you caught me on a lazy day). And yes - BRKACI-3545 is a great resource - I should have included it in my list!

