OTV Multihomed Issue

Richard Clayton
Level 1
Calling all OTV experts
 
I have configured multihomed OTV in a virtual lab on EVE-NG using Cisco CSRs. The lab has two CSRs at one site, both connected to a Layer 2 'customer' switch, and a single CSR at a remote site.
Everything works well apart from one thing. At the dual-router site, when I drop the OTV WAN/overlay interface on CSR/R1 (the active router for my extended VLAN), the remote MAC appears in the CSR/R2 bridge-domain (as it should), but the MAC address table on the 'customer' Layer 2 switch still shows the MAC address as facing the CSR/R1 LAN interface. After 5 minutes the MAC table entry times out and the extended VLAN traffic works again over the CSR/R2 path.
Is there any way R1/R2 can update the customer L2 switch when the remote MAC moves over to R2, to make the failover quicker?
I did read a Cisco article that said that if spanning tree is enabled on the OTV routers, they will send out a TCN when the overlay drops, which will update the 'customer' L2 switch. I have spanning tree enabled on the OTV routers, but when I drop the OTV WAN/overlay interface no TCN is sent. I had debug running on the CSR and Wireshark capturing, but saw no TCN.
I can give access to the lab if it will help the right person understand/fix the issue.
 
Thanks
Rick

Hello Richard,

 

Without knowing your configuration: what happens if you disable ARP caching on the overlay interfaces (no otv suppress arp-nd)?

Hi Georg

 

I have now attached the 3 CSR configs to the original post. I have tried with and without ARP caching enabled, though not to fix this issue, just to see ARP caching in action. I thought ARP caching just cached the remote ARP responses locally to save ARP traffic across the overlay?

Can you see anything missing from the config which would prevent the documented TCN from being generated?

 

I have pasted an extract from Cisco on the TCN generation.

 

Source

https://www.cisco.com/c/en/us/products/collateral/routers/asr-1000-series-aggregation-services-routers/guide-c07-735942.html

 

If multihoming is used, it is highly recommended to enable spanning tree on the OTV routers. Doing so enables the OTV router to send out a topology change notification (TCN), which will cause the adjacent Layer 2 switch device (along with other switches in the spanning tree) to reduce their aging timer from the default setting to 15 seconds. This will greatly speed convergence when there is a failure or recovery between the multihomed pair.
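To put numbers on the quoted behavior, here is a minimal Python sketch of how long a stale MAC entry could linger on the customer switch with and without a TCN. It is only a simplified model: the 300-second value is the 5-minute default seen in this thread, the 15-second value is the shortened aging time quoted above, and the function name is made up for illustration. Real switches may differ in the details of how they age entries.

```python
# Simplified model of MAC aging on the 'customer' switch (not vendor code).
DEFAULT_AGING_S = 300  # 5-minute default aging observed in this thread
TCN_AGING_S = 15       # shortened aging time quoted from the Cisco guide


def stale_entry_lifetime(last_seen_s: float, failure_s: float, tcn_received: bool) -> float:
    """Return how many seconds after the failure the stale entry survives.

    last_seen_s:  time the switch last saw a frame from the remote MAC on the old port
    failure_s:    time the overlay on the active OTV router went down
    tcn_received: whether the switch received a TCN at the time of the failure
    """
    aging = TCN_AGING_S if tcn_received else DEFAULT_AGING_S
    expires_at = last_seen_s + aging
    # An entry that is already older than the aging time is dropped immediately.
    return max(0.0, expires_at - failure_s)


# Worst case: the entry was refreshed at the moment of failure.
print(stale_entry_lifetime(0, 0, tcn_received=False))
print(stale_entry_lifetime(0, 0, tcn_received=True))
```

In this model the worst-case outage drops from the full default aging time to the shortened one, which is why the missing TCN matters so much in the lab.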

 

Thanks

Rick

Hello,

 

The configs look OK as far as I can tell (except for the MTU size of 1600; I guess you have a specific reason for that?).

Can you verify which switch is the root for VLAN 99 and VLAN 100 (show spanning-tree vlan 99/100)?

I increased the WAN MTU to allow for the encapsulation overhead added by OTV. I could have used a lower value, but 1600 covers it. The spanning tree root for VLANs 99-100 is the 'customer' switch connected to CSR1 at site 1.
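As a rough check on the MTU headroom, the arithmetic can be sketched in Python. The 42-byte figure is an assumption here (it is the OTV encapsulation overhead commonly cited in Cisco documentation, not a value taken from this thread), and the helper names are hypothetical.

```python
# Assumption: 42 bytes is the commonly cited OTV encapsulation overhead.
OTV_OVERHEAD_BYTES = 42


def required_wan_mtu(payload_mtu: int, overhead: int = OTV_OVERHEAD_BYTES) -> int:
    """Smallest WAN MTU that carries a full-size LAN packet without fragmentation."""
    return payload_mtu + overhead


def has_headroom(wan_mtu: int, payload_mtu: int = 1500) -> bool:
    """True if the WAN MTU can carry the encapsulated packet."""
    return wan_mtu >= required_wan_mtu(payload_mtu)


print(required_wan_mtu(1500))  # 1542
print(has_headroom(1600))      # True
print(has_headroom(1500))      # False
```

Under that assumption, 1600 leaves comfortable room above the minimum for standard 1500-byte LAN packets.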

Richard Clayton
Level 1

Hi Guys

 

I think I have found the reason for the behavior in my lab. I have the 'silent host' issue, which happens in labs but generally doesn't happen in live networks. For my host devices I used Cisco routers with an IP address on a single interface, and all these devices were doing was a ping and an ARP. In a production network these hosts would be workstations and servers and would be a lot more chatty, generating broadcast traffic. When I drop the CSR1 site 1 WAN overlay, the remote Cisco host does not generate any new broadcast traffic. Had it done so, that traffic would flood from CSR1 at site 2 across the overlay and eventually into the 'customer' Layer 2 switch at site 1 through R2, and the switch would relearn the remote MAC on the R2-facing port.

So in summary, in a production network the hosts would generate enough broadcast traffic to keep failover connectivity issues to a minimum. In a lab with silent hosts, you have to wait 5 minutes for the entry in the 'customer' Layer 2 MAC address table to age out before connectivity is restored.
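The difference between a silent and a chatty remote host can be illustrated with a toy MAC table in Python. This is a sketch of generic Layer 2 learning, not of any Cisco implementation; the class, the MAC label, and the port names `to-R1` and `to-R2` are all hypothetical.

```python
class MacTable:
    """Toy MAC address table with source learning and aging."""

    def __init__(self, aging_s: int = 300):
        self.aging_s = aging_s
        self.entries = {}  # mac -> (port, last_seen)

    def learn(self, mac: str, port: str, now: float) -> None:
        """Record (or move) a MAC when a frame sourced from it arrives on a port."""
        self.entries[mac] = (port, now)

    def lookup(self, mac: str, now: float):
        """Return the egress port, or None if unknown (the switch would flood)."""
        entry = self.entries.get(mac)
        if entry is None:
            return None
        port, last_seen = entry
        if now - last_seen >= self.aging_s:
            del self.entries[mac]  # entry aged out
            return None
        return port


# Silent host: the stale entry points at R1 until it ages out.
silent = MacTable()
silent.learn("remote-host", "to-R1", now=0)
print(silent.lookup("remote-host", now=299))  # to-R1 (traffic is black-holed)
print(silent.lookup("remote-host", now=300))  # None (flooded, reaches R2)

# Chatty host: one broadcast arriving through R2 moves the entry at once.
chatty = MacTable()
chatty.learn("remote-host", "to-R1", now=0)
chatty.learn("remote-host", "to-R2", now=21)  # broadcast seen on the R2-facing port
print(chatty.lookup("remote-host", now=22))   # to-R2
```

A single frame from the remote host arriving on the R2-facing port is enough to correct the table, which matches the observation that real, talkative hosts hide the problem.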

I still don't fully understand why the OTV router doesn't generate a TCN as documented, so if anyone could get an answer on that it would be great.

For now I am happy to design OTV into my customer solution.

 

Thanks

Rick

borman.bravo
Level 1

Richard, just wondering which CSR image version you used in EVE-NG. I've tried the same configuration and cannot get the adjacency up, even though the site is up at each site. Both CSRs display "No, overlay DIS not elected".

Thank you
