BGP EVPN session flapping between Spines in ACI Multisite

bmcgahan
Level 1

I have a 2-site ACI fabric test lab that I'm deploying from scratch using Nexus Dashboard 2.3(2d), and Nexus Dashboard Orchestrator 4.1.2h. APICs at both sites are running 5.2(6e), and Spines are N9K-C9364C running 15.2(6e).

The config appears to deploy fine from NDO to the Spines: OSPF is up in VRF overlay-1, and the Spines are learning the Loopbacks for BGP EVPN peering. However, the BGP sessions periodically go up and down. The output of show bgp event-history events and show bgp event-history errors on the Spines doesn't show anything useful, other than the BGP session being stuck in the Open state.

Does anyone know why iBGP EVPN sessions would flap between the Spines? Any further troubleshooting recommendations?

 

Accepted Solutions

bmcgahan
Level 1

Sorry for the duplicate posts; the first one was originally marked as spam. I wanted to post the resolution here in case anyone searches for this problem in the future.

This problem turned out to be an MTU issue in the IPN transport between the spines. Small OSPF hellos were getting through and establishing adjacency, but then large BGP update packets would get dropped and cause session resets. 
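
To make the size difference concrete, here is a rough back-of-the-envelope sketch in Python. The header sizes are the standard IPv4, TCP, and OSPFv2 ones; the 9000-byte sender MTU and the 1500-byte IPN path MTU are illustrative assumptions, not values measured in this lab, and the sketch assumes the oversized packets are dropped rather than fragmented.

```python
# Standard header sizes in bytes (no IP or TCP options).
IPV4_HEADER = 20
TCP_HEADER = 20
OSPF_HEADER = 24        # OSPFv2 common header
OSPF_HELLO_FIXED = 20   # fixed hello fields before the neighbor list


def ospf_hello_size(neighbors: int) -> int:
    """IP packet length of an OSPFv2 hello listing `neighbors` neighbors."""
    return IPV4_HEADER + OSPF_HEADER + OSPF_HELLO_FIXED + 4 * neighbors


def full_tcp_segment_size(sender_mtu: int) -> int:
    """IP packet length of a full-size TCP segment when the sender derives
    its MSS from its own interface MTU (MSS = MTU - 40)."""
    mss = sender_mtu - IPV4_HEADER - TCP_HEADER
    return IPV4_HEADER + TCP_HEADER + mss


def fits(ip_packet_len: int, path_mtu: int) -> bool:
    """True if the packet can cross the path without fragmentation."""
    return ip_packet_len <= path_mtu


if __name__ == "__main__":
    SENDER_MTU = 9000   # assumption: jumbo MTU on the spine side
    PATH_MTU = 1500     # assumption: what the IPN box actually forwards

    hello = ospf_hello_size(neighbors=1)
    segment = full_tcp_segment_size(SENDER_MTU)
    print(f"OSPF hello: {hello} bytes, fits: {fits(hello, PATH_MTU)}")
    print(f"Full BGP/TCP segment: {segment} bytes, fits: {fits(segment, PATH_MTU)}")
```

Under those assumptions the hello is well under 100 bytes and always fits, while a TCP segment filled with BGP updates is as large as the sender's MTU and does not.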

The fix was to move to a different physical transport box that doesn’t have an issue routing jumbo frames. 

I’m still not sure how to troubleshoot/verify this from ACI though. 
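
One generic way to verify a suspected MTU problem, independent of ACI, is to probe the path with don't-fragment packets of different sizes and search for the largest one that gets through. The sketch below is only that, a sketch: `find_path_mtu` and `linux_df_ping` are hypothetical helper names, the search bounds are arbitrary defaults, and the ping helper assumes a Linux host with iputils `ping` attached to the IPN, not a command run on the spines themselves.

```python
import subprocess
from typing import Callable


def find_path_mtu(probe: Callable[[int], bool], low: int = 576, high: int = 9216) -> int:
    """Binary-search the largest IP packet size in [low, high] that passes.

    `probe(size)` must return True if a DF-bit packet of `size` bytes
    (IP length) reaches the far end. Assumes that if a size passes,
    every smaller size passes too.
    """
    if not probe(low):
        raise ValueError(f"even {low}-byte packets do not get through")
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


def linux_df_ping(host: str) -> Callable[[int], bool]:
    """Build a probe using Linux iputils ping with the DF bit set (-M do).

    ping's -s option is the ICMP payload size, so subtract 28 bytes
    (20 IPv4 + 8 ICMP) from the desired IP packet length.
    """
    def probe(ip_len: int) -> bool:
        result = subprocess.run(
            ["ping", "-M", "do", "-c", "1", "-W", "1", "-s", str(ip_len - 28), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0
    return probe
```

For example, `find_path_mtu(linux_df_ping("192.0.2.1"))` would search toward a placeholder address; substitute the remote spine's reachable address. A single lost ping can skew the result, so it is worth repeating the run before drawing conclusions.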
