Re: ping problem in vpc anycast vtep environment

ez9 · ‎05-08-2020

Hello,

I have a topology (attached) using two Cisco Nexus 9396PX switches in vPC mode. I enabled the VXLAN feature and implemented an anycast VTEP.

The anycast IP is 192.168.93.93. SwitchA has 192.168.93.101 and SwitchB has 192.168.93.102.

I also have separate loopbacks for BGP EVPN peering.

Between the two switches I have a dedicated L3 link (e1/5) running OSPF for underlay loopback reachability.

I have advertise-pip and advertise virtual-rmac enabled.
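For readers following along, the setup described so far would look roughly like the fragment below on switchA. This is only a sketch: the loopback number, the BGP AS (65000), and the nve interface layout are assumptions, not taken from the attached topology. Only the IP addresses and the two advertise commands come from the post.

```
! switchA (switchB would use 192.168.93.102 as its primary address)
interface loopback1
  ! primary address = this switch's own VTEP IP (PIP)
  ip address 192.168.93.101/32
  ! secondary address = anycast VTEP IP shared by both vPC peers (VIP)
  ip address 192.168.93.93/32 secondary

interface nve1
  no shutdown
  host-reachability protocol bgp
  source-interface loopback1
  advertise virtual-rmac

router bgp 65000
  address-family l2vpn evpn
    advertise-pip
```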

The problem is that client1 cannot ping the loopback on the R1 router. Both switch1 and switch2 have a route in the VRF table for this loopback IP. I captured the traffic and can see that the packet reaches switch2, but it is not forwarded to R1 via interface e1/3.

I also tried creating a separate loopback in the VRF on each switch (switchA: 10.10.10.93, switchB: 10.10.10.193). I announced these IPs in BGP and tried to ping between them. Unfortunately, the ping fails again. In the packet capture I can see the packets reaching the other switch with VXLAN encapsulation, but the reply is never generated.

The switches are currently running version 7.0(3)I7(7). I tried upgrading to 7.0(3)I7(8), but with the same results.

Can you provide any insights on this?

f00z · ‎05-13-2020

Seeing the same results here: a routing loop inside sw1, and the packets never leave switch 1.

An entry gets put in the forwarding table for the switch2 loopback and for the external router (via redistribution or a network statement), but it doesn't work. Traffic keeps looping inside switch 1 as if the route pointed to sw1 instead of sw2, even though the next hop is sw2's VTEP PIP.
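A few standard NX-OS show commands help compare what the control plane believes with what the switch actually does. This is a hedged checklist rather than a diagnosis; the VRF name TENANT is a placeholder, and 10.10.10.193 is the switchB test loopback from the first post.

```
! Route in the tenant VRF: with advertise-pip the next hop should be the peer's PIP, not the anycast VIP
show ip route 10.10.10.193 vrf TENANT
! EVPN type-5 entry, including the router MAC carried with it
show bgp l2vpn evpn 10.10.10.193
! NVE peers learned by this switch and their router MACs
show nve peers detail
! vPC role, peer-link state and peer-gateway status
show vpc
```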

Without vPC it works fine. With vPC and the exact same setup, adding only the peer link and vPC config plus advertise-pip and advertise virtual-rmac, it creates a routing loop.

Before advertise-pip, we had to have an iBGP session between the vPC peers to share VRF routes, but advertise-pip is supposed to remove that requirement. I am wondering if it's a bug? Might try some older code.

I'll keep digging at this because I need this scenario to work as well, without an iBGP session between the vPC pair.

ez9 · ‎05-14-2020

Hello,

Thank you for the reply and for the effort to solve this problem. I am going to try with older versions too.

At the moment I have the same results with versions 7.0(3)I7(7), 7.0(3)I7(8), 9.2(3), 9.2(4), and 9.3(4).

I also tried pruning the L3VNI VLAN (901 in my case) from the vPC peer link. In this case I can see the packets reaching SW2 with VXLAN encapsulation, but they loop somewhere. In Wireshark, for one ICMP request I capture many packets with decreasing TTL, e.g. starting at TTL 255 and going down to TTL 1.
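For anyone repeating that test, pruning the L3VNI VLAN from the peer link would look something like the following. The port-channel number is an assumption (use whichever port-channel carries `vpc peer-link`); VLAN 901 is the L3VNI VLAN mentioned above.

```
interface port-channel1
  ! remove only the L3VNI VLAN from the peer-link trunk
  switchport trunk allowed vlan remove 901
```

The change has to be applied on both vPC peers, otherwise the allowed-VLAN lists on the two ends of the peer link no longer match.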

f00z · ‎05-14-2020

Yes, a few things are happening here. Because of how vPC works with VXLAN, the switches in the pair ignore routes from the other switch in the pair (since they should be shared over CFS). However, since this is a type-5 route it shouldn't be ignored, but it looks like the MAC address associated with it is being ignored. The packets never get encapsulated in VXLAN because the switch sees the MAC address over the peer link.

The peer-gateway feature lets the local switch route packets addressed to the other one, which makes traffic loop internally on sw1.

The routing table clearly shows the traffic is destined to the VXLAN tunnel, but the switch is not encapsulating it, or it is encapsulating internally and then picking it back up on the same switch; I can't tell which.

To me it seems like this is a bug and the route is not being programmed correctly in hardware. It works from other VTEPs to reach an orphan type-5 route on sw1 (which is what advertise-pip fixed), but the PIP should also fix routes between the vPC pair so you don't need a per-VRF iBGP session sharing the routes, and that doesn't look like it is working.

The type-5 routes carry the router MAC, which matches the MAC of the vPC peer, and I think that is where it isn't getting programmed right.

Have to do some more digging. It seems silly that the other vPC peer can't access an orphan-connected host that is on another interface.

Example: put a Layer 3 port on sw1 with a /29, connect a host to it, and advertise this /29 as a type-5 route:

interface eth1/5
  description server
  vrf member VNI1000
  ip address 6.6.6.1/29

Then in BGP: network 6.6.6.0/29
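Spelled out, the BGP side of that example would be roughly the following. The AS number 65000 is a placeholder, and `advertise l2vpn evpn` is assumed to be what exports the VRF prefix as a type-5 route in this setup; adjust if your VRF uses a route-map or redistribution instead.

```
router bgp 65000
  vrf VNI1000
    address-family ipv4 unicast
      ! originate the directly connected /29
      network 6.6.6.0/29
      ! export IPv4 routes from this VRF into EVPN as type-5
      advertise l2vpn evpn
```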

This works for all other VTEPs on the network to connect to this host EXCEPT the vPC peer switch :/

Obviously it would work in the old scenario where you have per-VRF iBGP sessions between the vPC peers, but that's a really bad design option.
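For comparison, the "old scenario" with a per-VRF iBGP session between the vPC peers can be sketched as below. VLAN 3000, the 10.255.255.0/30 addressing, and AS 65000 are all made-up values for illustration; the VLAN would need to be allowed on the peer link and `feature interface-vlan` enabled.

```
! sw1 (sw2 mirrors this with 10.255.255.2 and neighbor 10.255.255.1)
vlan 3000

interface Vlan3000
  no shutdown
  vrf member VNI1000
  ip address 10.255.255.1/30

router bgp 65000
  vrf VNI1000
    neighbor 10.255.255.2
      remote-as 65000
      address-family ipv4 unicast
```

This has to be repeated for every VRF, which is exactly why it scales poorly as a design.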

Also, connecting orphan hosts to vPC pairs isn't best practice either, but sometimes we have to do it, and this should work.

ez9 · ‎05-15-2020

I just tried to downgrade to 7.0(3)I7(6) but the result is the same.

I also tried to configure the fabric via DCNM (to avoid possible configuration errors), but the problem remains.

So, if this is a bug, I suppose I should wait for Cisco to resolve it. Unfortunately, I don't have an active service contract for these devices, so I can't report this (possible) bug.

I wonder if this issue also exists in newer generations of nexus 9300.

Marc Luethi · ‎09-23-2021

@ez9 wrote:

So, if this is a bug, I suppose I should wait for Cisco to resolve it. Unfortunately, I don't have an active service contract for these devices, so I can't report this (possible) bug.

https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvh68802

We observe the very same symptoms on N9K-C9332PQ with NX-OS 9.3(3) and 9.3(5):
- (in VRF) can't ping from loopback to loopback between vPC members, despite the type-5 route being present
- (in VRF) traffic from an LACP-attached host to the loopbacks only works if the host's LACP happens to select the "correct" uplink to the vPC member.

Did you ever find a solution to this problem other than per-VRF routing directly between the VPC peers?