EVPN VXLAN - Distributed Anycast GW + ARP Suppress + Silent Host

jbekk
Level 1

I have been working on an environment where a design was completed prior to my engagement. This is my first foray into configuring VXLAN but I have some experience with overlays/ACI/etc so am fairly comfortable. Design has 2x border leafs, 2x spines, 2x leafs. It has numerous VLANs and VRFs. Lots of external routing on the border leafs. No multi-site connectivity (full L3 segmentation between sites).

Currently we are mid-migration and are having Layer 2 reachability issues on one specific VLAN with a silent host on the old network fabric not being able to communicate with VLAN-adjacent hosts on the new fabric. All other inter-fabric communications are working fine (both L2/L3). 

Some basic setup notes:

  1. Distributed Anycast GW is enabled (i.e. fabric forwarding anycast-gateway-mac xxxx.xxxx.xxxx + (at SVI level) fabric forwarding mode anycast-gateway)
  2. Host learning is being done via BGP/EVPN
  3. ARP Suppression is enabled per VLAN

interface nve1
  no shutdown
  host-reachability protocol bgp
  source-interface loopback1
  member vni 200002
    suppress-arp
    mcast-group 239.0.1.1
  member vni 200003
    suppress-arp
    mcast-group 239.0.1.5
!
interface Vlan2
  no shutdown
  mtu 9216
  no ip redirects
  ip address 192.168.1.1/24
  no ipv6 redirects
  ip pim sparse-mode
  ip pim neighbor-policy NONE*
  fabric forwarding mode anycast-gateway

Some thoughts I wanted someone to comment on:

  1. With a Distributed Anycast gateway SVI, should I have it configured on both border leaf and leaf nodes if end-hosts are connected to both node types? The design has Anycast GW either on leaf or border leaf but not both. During migration we have hosts connected to both. At end-state, hosts for each VLAN won't be connected to both types. I suspect the Anycast GW configuration matches this "end-state" scenario but won't work for the migration phase.
  2. With a silent host (i.e. one that doesn't communicate unless it is specifically spoken to), is it best-practice to disable ARP suppression on the associated VLAN the silent-host lives on? My thinking here is that ARP suppression in combination with EVPN host-learning just doesn't work when a silent host is in the mix. More context on silent host situation below...

Silent Host Issue Notes:

  • FHRP GW IP for VLAN on old network still.
  • Some hosts on VLAN are connected to new VXLAN fabric but can't communicate with silent hosts on old fabric (all Layer 2 traffic).
  • On the border leaf, if I configure an SVI with a (non-GW) IP on the associated VLAN and ping the silent host, the VXLAN fabric learns reachability for the silent host and all is well.
  • The border leaf is where old fabric connects to new fabric (Layer 2 connection between).
  • Layer 2 forwarding works for other VLANs in this situation, it's just this silent host that isn't playing nice until I configure a gateway and manually setup the ping.

pwtn
Level 1

I'm pretty sure I'm hitting a very similar issue. I just posted a new thread about it, but basically I have two nodes connected via Layer 2 BGP EVPN. The first VLAN worked fine. The second VLAN has a single very quiet host, so the MAC times out and I lose the BGP route and NVE peering on the non-silent host side. I can force it up by configuring an SVI on the silent host side. This works even when the SVI has no IP, which I can't even explain. As soon as I delete the SVI, the MAC eventually times out again and I lose the BGP route and NVE peering for the VNI. This feels like a logic bug in the software, because even ARP does not appear to get "flooded" until there is an NVE peering, and you don't get one until the BGP route gets advertised, and it just won't be advertised until some traffic puts the silent host back in the MAC address table on the VTEP. It feels like the NVE peering should be based on having an active VLAN in the DB, not on local MAC address table entries.

Dawei
Cisco Employee

This may not apply to C9K, but on N9K, if a VLAN has a silent host, you should redistribute the direct route (the SVI subnet) in the tenant VRF.
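
A minimal NX-OS sketch of that suggestion, as I read it. The AS number 65001, the VRF name TENANT-A and the route-map name RM-TENANT-DIRECT are placeholders, not values from this thread:

```
! Placeholder names and AS number; adjust to the actual tenant VRF.
route-map RM-TENANT-DIRECT permit 10
!
router bgp 65001
  vrf TENANT-A
    address-family ipv4 unicast
      advertise l2vpn evpn
      redistribute direct route-map RM-TENANT-DIRECT
```

The idea is that the SVI subnet is advertised into EVPN as a prefix route, so routed traffic toward a host that has no host route yet can still reach a VTEP that owns the subnet, which can then ARP for the host. It only applies where the VTEPs have an SVI in the tenant VRF.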

The problem is I have a Layer 2 EVPN network; that is to say, there are only type 2 host routes in BGP and no SVIs configured on the VTEPs. This Layer 2 EVPN network is just connecting two segments of the same VLAN which are separated by a routed underlay network. In my scenario the VTEPs don't need to be involved in routing IP traffic for the subnet; we just need them to forward Ethernet frames between the two sites and replicate broadcast frames using the multicast underlay. The problem with a silent host (at least in the C9K implementation) is that the VTEP on the silent side will withdraw its BGP routes when the MAC address times out, taking down the VNI peering on the loud side. Once the VNI peering is gone, the loud-side VTEP doesn't seem to be able to use the multicast underlay to forward BUM traffic to the other side. Thus the loud hosts are not even able to "wake" the silent host with ARP to trigger BGP updates and bring up the VNI.

The creation of an SVI for the VLAN on the silent side (even with no IP address) seems to keep the MAC address in the table and BGP up. I'm not sure what it's doing; it doesn't look like it should do anything, but it's clearly doing something. Ideally I don't want to deploy anycast gateway on every VTEP, and from the C9K guide on Layer 2 EVPN it doesn't seem necessary, or at least there is no mention of how to deal with silent hosts in the documentation.

Pavel Tarakanov
Cisco Employee

>With a Distributed Anycast gateway SVI, should I have it configured on both border leaf and leaf nodes if end-hosts are connected to both node types?

 

In BGP EVPN, it is recommended to use the anycast gateway feature on all VTEPs.

https://www.cisco.com/c/en/us/td/docs/dcn/nx-os/nexus9000/103x/configuration/vxlan/cisco-nexus-9000-series-nx-os-vxlan-configuration-guide-release-103x/m_configuring_vxlan_93x.html

Also CiscoLive p. 22

https://www.ciscolive.com/c/dam/r/ciscolive/global-event/docs/2022/pdf/BRKDCN-2106.pdf

>With a silent host (i.e. one that doesn't communicate unless it is specifically spoken to), is it best-practice to disable ARP suppression on the associated VLAN the silent-host lives on?

I'd say that with a properly configured AGW (on all leaf switches, etc.) it should work correctly with a silent host.

In your case the best approach is to configure AGW across all leaf switches and maybe disable ARP suppression until the end of the migration. (It's also worth considering whether you need ARP suppression at all, as there is not much traffic generated under normal circumstances.)

Also:

 

  • ARP suppression is only supported for a VNI if the VTEP hosts the First-Hop Gateway (Distributed Anycast Gateway) for this VNI. The VTEP and the SVI for this VLAN have to be properly configured for the distributed Anycast Gateway operation, for example, global Anycast Gateway MAC address configured and Anycast Gateway feature with the virtual IP address on the SVI.

  • The ARP suppression setting must match across the entire fabric. For a specific VNID, all VTEPs must be either configured or not configured.

https://www.cisco.com/c/en/us/td/docs/dcn/nx-os/nexus9000/103x/configuration/vxlan/cisco-nexus-9000-series-nx-os-vxlan-configuration-guide-release-103x/m_configuring_vxlan_bgp_evpn.html

 

So for ARP suppression, the anycast gateway should be configured on all VTEPs.
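
As a rough sketch of the interim option, ARP suppression is toggled per VNI under the NVE interface. VNI 200002 is taken from the original post; per the guideline quoted above, the same change would have to be made on every VTEP carrying that VNI:

```
interface nve1
  member vni 200002
    no suppress-arp
```

The per-VNI state can then be compared across VTEPs with "show nve vni".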

It's been months since I raised this but since there was some traffic on this post I wanted to share some information. The answer provided by Pavel is correct.

Some pointers to keep others out of trouble...

An Anycast SVI needs to be configured on all leaf nodes (BGW/LEAF) where endpoints will be connected to the VLAN. This is needed for ARP flows to function properly across the old/new switches. In my situation, the GW remained on the old routers in the old environment, so I didn't have an SVI on the LEAF switches yet. The plan was to bring this up as a Layer 3 cutover at a later point. So, if you get into this situation and ARP suppression is on, configure a dummy unused IP on the LEAF nodes as an interim. This allows each LEAF node to resolve ARP for you even if ARP suppression is turned on.
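
A sketch of such an interim SVI, reusing the VLAN and anycast settings from the original post. The address 192.168.1.250 is a made-up placeholder for an unused IP in the subnet, and the MAC is left as xxxx.xxxx.xxxx as in the original. A "vrf member" line may also be needed if the VLAN belongs to a tenant VRF:

```
fabric forwarding anycast-gateway-mac xxxx.xxxx.xxxx
!
interface Vlan2
  no shutdown
  ! Placeholder: an unused address, NOT the real gateway IP (still on the old routers)
  ip address 192.168.1.250/24
  fabric forwarding mode anycast-gateway
```

At the Layer 3 cutover the placeholder would be replaced with the real gateway IP on every leaf.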

When you move the gateway across to the VXLAN switches (i.e. as an Anycast gateway), every leaf node that has devices on the VLAN connected to it needs the gateway IP configured on its Anycast SVI.

ARPs are not forwarded by LEAF nodes when ARP suppression is enabled. The first leaf node that sees the ARP will try to resolve it but won't forward it to other LEAF nodes.

Get familiar with the "show bgp l2vpn evpn" command on Nexus switches. It shows you how the switch will route L2 traffic destined to IP (i.e. packets) and/or MAC address (i.e. frames). Very useful to understand whether ARP is working.

Run a "show arp" on each leaf node to understand whether ARP resolution is working on each LEAF node.
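
The checks above might look like this on a Nexus leaf. VLAN 2 and VNI 200002 come from the original post, and TENANT-A is a placeholder VRF name. Note that on NX-OS the ARP table command is "show ip arp":

```
! Is the silent host's MAC known locally, or learned from a remote VTEP?
show mac address-table vlan 2
! EVPN routes (MAC and MAC+IP) for the VNI
show bgp l2vpn evpn vni-id 200002
! Which remote VTEPs are currently NVE peers
show nve peers
! ARP entries on the SVI, and the ARP suppression cache
show ip arp vrf TENANT-A
show ip arp suppression-cache detail
```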

I can't recall the specifics on this from memory, but there is some funky stuff with running pings from LEAF nodes. Don't assume that a ping will work just because the LEAF node has the correct Anycast gateway on it. EVERY switch has the same IP and will intercept the response packets on return flows.

That's about all I can remember for the moment. I hope it helps.

>ARPs are not forwarded by LEAF nodes when ARP suppression is enabled. The first leaf node that sees the ARP will try to resolve it but won't forward it to other LEAF nodes.

 

ARPs are not forwarded only if the switch has a record in its ARP cache. Otherwise, they should be forwarded as usual.

Also, a couple of useful commands for the ARP suppression cache:

"ip arp suppression-cache clear remote vlan <vlan> <ip>"
"ip arp suppression-cache download remote vlan <vlan> <ip>"

 

 

>I can't recall the specifics on this from memory, but there is some funky stuff with running pings from LEAF nodes.

 

Indeed, it's not a valid test, as the ping reply can be consumed by another switch with the same IP.

Two options here:

- ping from client to AGW

- configure Loopback with unique IP in tenant and use it as a source.
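
A sketch of the second option. The loopback number, its address 10.255.0.11, the VRF name TENANT-A and the target host 192.168.1.50 are all placeholders. The loopback address also has to be reachable from the rest of the fabric (for example, advertised into EVPN from the tenant VRF) for replies to come back to this switch:

```
interface loopback100
  vrf member TENANT-A
  ip address 10.255.0.11/32
!
! From exec mode, source the ping from the unique loopback address
ping 192.168.1.50 source 10.255.0.11 vrf TENANT-A
```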
