Multihoming to Two SPs with Backdoor Links

Hi all, I have a customer who is currently connected to the MPLS VPN clouds of two service providers (SP1 and SP2). OSPF was chosen as the PE-CE routing protocol. Each branch is connected to both SP1 and SP2, and some branches also have a backdoor link (please refer to the attached diagram). Further details of what has been configured:

  1. All CE routers run OSPF in Area 0, except the one marked "BGP" on the left-hand side of SP1's cloud.
  2. Capability VRF Lite is configured on each PE router, so that the down-bit is ignored when a route reaches a PE router belonging to the other SP.
  3. Different Domain-IDs are used on SP1 and SP2: 100 and 200 respectively.
  4. Standard OSPF-BGP (VPNv4) mutual redistribution is configured on all PEs.
  5. No sham links are used, nor are they allowed.

Issue: Whenever a route prefix goes missing (because a branch link is down), a routing loop occurs. For example, a traceroute to that prefix shows the packet travelling from E1 to E2, to SP2's PE, across SP2's core, to B2, to B3, to A4 via the backdoor link, to A1, to SP1's PE, across SP1's core, and then around the same path again.

I have studied the issue and believe I have found the root cause, but I would like a second opinion. My view is that this network design will not work in this environment, especially with OSPF as the PE-CE routing protocol towards two SPs. Capability VRF Lite creates a ring, and whenever a route prefix goes missing (for example, because a branch link flaps or goes down), a routing loop occurs. This affects both OSPF internal and external routes. Also, with two separate SP clouds, the OSPF attributes carried with the VPNv4 routes are lost as the routes are mutually redistributed on all PEs. If you know a way to overcome this routing loop, please share it with us. Your input is greatly appreciated. Thanks!
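
To illustrate the ring effect described above, here is a minimal Python sketch. It is a deliberately simplified toy model, not a simulation of real OSPF or BGP behaviour: it only tracks whether the down-bit is set and whether a PE honours it. The names (`Lsa`, `pe_imports_into_bgp`, `ring_crossings`) and the prefix 10.1.1.0/24 are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lsa:
    prefix: str
    down_bit: bool  # set when a PE redistributes a VPNv4 route into OSPF


def pe_imports_into_bgp(lsa: Lsa, vrf_lite: bool) -> bool:
    """Toy rule: a PE skips LSAs carrying the down-bit,
    unless capability vrf-lite has disabled that check."""
    return vrf_lite or not lsa.down_bit


def pe_exports_into_ospf(prefix: str) -> Lsa:
    """Toy rule: the far-end PE sets the down-bit on the LSA it generates."""
    return Lsa(prefix, down_bit=True)


def ring_crossings(vrf_lite: bool, max_crossings: int = 6) -> int:
    """Count how many times the route is carried across an SP core
    (SP1, backdoor link, SP2, backdoor link, SP1, ...)."""
    lsa = Lsa("10.1.1.0/24", down_bit=False)  # originated by a CE (hypothetical prefix)
    crossings = 0
    while crossings < max_crossings and pe_imports_into_bgp(lsa, vrf_lite):
        lsa = pe_exports_into_ospf(lsa.prefix)
        crossings += 1
    return crossings


print("down-bit honoured:", ring_crossings(vrf_lite=False))
print("down-bit ignored: ", ring_crossings(vrf_lite=True))
```

In this model, a route crosses one SP core and then stops when the down-bit is honoured. With the check disabled, it keeps being re-imported until the artificial `max_crossings` cap is reached, which is the ring I suspect is feeding stale routes back once the original prefix disappears.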

Laurent Aubert
Cisco Employee


As you said, the down-bit is there to avoid exactly this situation, so why did you disable it on the PE connecting your dual-homed site?