
Route leaking between VRF using EVPN (not centralized) BROKEN?

f00z
Level 3

I am having an issue where traffic leaked between VRFs over EVPN gets blackholed. I swear I had this working in the lab the other day, and now it's not.

Example setup is two leaves and one spine.

 

Leaf A 

vrf Test1

L2VNI 1000 L3VNI 1001

 

Leaf B

vrf Test2

L2VNI 2000 L3VNI 2001

 

Using auto RD/RT, except for these manual route targets:

Vrf Test1

route-target export 123:1

route-target export 123:1 evpn

route-target import 123:2

route-target import 123:2 evpn

 

Vrf Test2

route-target export 123:2

route-target export 123:2 evpn

route-target import 123:1

route-target import 123:1 evpn
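
As a sanity check on the manual route targets above, here is a minimal Python sketch (not NX-OS tooling; the dictionary layout and the `leaks_into` helper are made up for illustration) that confirms each VRF imports what the other exports. It models only the manual RTs listed here, not the auto-derived ones.

```python
# Manual route targets copied from the VRF config above.
# The data structure and helper name are illustrative assumptions.
vrfs = {
    "Test1": {"export": {"123:1"}, "import": {"123:2"}},
    "Test2": {"export": {"123:2"}, "import": {"123:1"}},
}

def leaks_into(src, dst):
    """Return the RTs exported by VRF src that VRF dst imports."""
    return vrfs[src]["export"] & vrfs[dst]["import"]

for src, dst in (("Test1", "Test2"), ("Test2", "Test1")):
    shared = leaks_into(src, dst)
    status = ", ".join(sorted(shared)) if shared else "no matching RT"
    print(f"{src} -> {dst}: {status}")
```

If both directions report a matching RT, the control-plane side of the leak is at least consistent on paper, which fits with show bgp looking correct.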

 

nve1 on leaf A has 1000 1001 2001 

nve1 on leaf B has 2000 2001 1001

 

BGP is redistributing static and direct routes.

Everything looks perfectly fine in show forwarding, show bgp, etc.

In fact, it all works if I configure a loopback interface on Leaf A and Leaf B.

For example, I can give loopback 10 an IP address on each leaf, and traffic between them works from the CLI,

like ping 5.5.5.5 vrf Test2 source 6.6.6.6 (where 5.5.5.5 is the loopback on Leaf A and 6.6.6.6 is the loopback on Leaf B).

Sniffing the traffic, it looks right.

 

THE PROBLEM IS: it doesn't work in the hardware. The route is blackholed in the hardware if you look at bcm-shell:

2112 4 4.4.4.4/30 00:00:00:00:00:00 100017 0 0 0 0 y

This is the 'defip' entry (the ALPM route), which is set to egress interface 100017 in bcm-shell.

If you look at the egress interface object table:

Entry Mac Vlan INTF PORT MOD MPLS_LABEL ToCpu Drop RefCount L3MC

100017 00:00:00:00:00:00 4095 4095 1 110 -1 no yes no

 

It's set to Null, drop.  

If you look at: show system internal forwarding vrf Test1 detail

it has an entry:

4.4.4.4/30 ,

with nothing after it.

Dev| Prefix | PfxIndex | AdjIndex | LIF

0 4.4.4.4/30 0xcdc967fc 0x186b1 0xfff

 

The LIF is set to 0xfff, and AdjIndex 0x186b1 is 100017 in decimal, when it should be set to the remote VTEP's RMAC adjacency index and LIF.
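
To double-check the hex-to-decimal mapping between the two outputs, a quick Python sketch (the variable names are mine; the values are taken from the output quoted above):

```python
# Values copied from the outputs above; variable names are illustrative.
adj_index = 0x186b1      # AdjIndex from "show system internal forwarding"
lif = 0xfff              # LIF from the same line
egress_entry = 100017    # Entry column in the bcm-shell egress object table
egress_intf = 4095       # INTF column in the same bcm-shell row

# The forwarding AdjIndex and the bcm-shell egress entry are the same number.
assert adj_index == egress_entry
# LIF 0xfff is 4095 in decimal, matching the INTF value on the drop entry.
assert lif == egress_intf

print(adj_index, lif)
```

So both views agree that the prefix points at the egress object with the all-zero MAC and Drop set to yes.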

 

The routes that work have entries with VLAN IDs, adjacencies, etc.

So this bug is setting it to null for some strange reason?

 

The switch is setting it to the wrong egress object. It should be set to 100016 in my list, which is the tunnel MAC for the other switch.

Why is it doing this? I spent days trying to figure this out and I'm utterly frustrated at this point.  

 

Like I said, it uses the right interface from the NX-OS CLI, but if a server or other device is connected to a port, its traffic is blackholed. This has to be a bug, although it looks intentional? I have tried 7.0.3.I7.8 and 9.3.4 so far.

 

Would appreciate it if someone else can test this in a lab.  
