We currently have multiple locations interconnected through an MPLS Layer 3 VPN that spans multiple providers. Each location learns all routes as OSPF E2 routes passed from the PE router to the CE router. For failover we have a DMVPN over a secondary ISP that connects every site to our data center using EIGRP. Each tunnel terminates at the data center, where the device acts as the hub for the spoke locations. The DMVPN configuration does not build spoke-to-spoke tunnels, so all traffic goes through the hub and then out to the spokes.

Failover works properly when the OSPF adjacency between the CE and PE is lost and the OSPF routes are withdrawn from the routing table. The EIGRP route through the DMVPN then populates the routing table and failover happens automatically. This also works if the LSA is withdrawn from the OSPF database and not passed to the PE router.

However, as I understand it, MP-BGP inside the ISP's network does not depend on the end-to-end LDP transport path through the MPLS cloud; it relies only on the ISP's local routing table to establish the MP-BGP peering. This allows a specific customer's IGP routes to be advertised to the PEs that participate in the MP-BGP peering. I believe this is where the problem lies: when we lose end-to-end connectivity to a remote site because of an LDP failure, we still receive the LSA through the MP-BGP peering (PE to PE, then to CE), but the transit LDP path is down, so our traffic is black-holed and failover never takes place automatically. We then have to manually shut the CE-to-PE interface to force the failover.

Is there any way to verify the end-to-end transit path without using static routes with IP SLA? That approach would not scale and would be difficult to manage with so many sites. We would like to keep OSPF for this, since it is already in place over the MPLS, but we are open to using any other dynamic routing protocol as well.
I have looked everywhere and cannot find any best practices or case reviews discussing this issue.
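
To make the failure mode concrete, here is a small Python sketch of the route selection described above. It is only a toy model, not router code: the class and function names are made up for illustration, and the administrative distances are assumptions (OSPF at the IOS default of 110, and the DMVPN routes at a higher value, shown here as EIGRP external at the default of 170). The point it illustrates is that the installed route depends only on what is still being advertised, not on whether the transit path actually forwards traffic.

```python
from dataclasses import dataclass

# Assumed administrative distances (Cisco IOS defaults). The real values
# depend on how the CE routers are configured.
AD_OSPF = 110
AD_EIGRP_EXTERNAL = 170


@dataclass(frozen=True)
class Candidate:
    protocol: str
    admin_distance: int
    advertised: bool   # control plane: is the route still being received?
    path_works: bool   # data plane: does traffic actually reach the remote site?


def best_route(candidates):
    """Return the installed route: lowest AD among routes still advertised."""
    live = [c for c in candidates if c.advertised]
    return min(live, key=lambda c: c.admin_distance) if live else None


def outcome(candidates):
    """Describe what happens to traffic sent along the installed route."""
    route = best_route(candidates)
    if route is None:
        return "no route"
    status = "delivered" if route.path_works else "black-holed"
    return f"{route.protocol}: {status}"


def scenario(ospf_advertised, mpls_path_works):
    """Hypothetical CE view of one remote prefix; the DMVPN is assumed healthy."""
    return outcome([
        Candidate("OSPF E2 via MPLS", AD_OSPF, ospf_advertised, mpls_path_works),
        Candidate("EIGRP via DMVPN", AD_EIGRP_EXTERNAL, True, True),
    ])


if __name__ == "__main__":
    print(scenario(True, True))    # normal operation
    print(scenario(False, False))  # adjacency lost, OSPF route withdrawn
    print(scenario(True, False))   # LDP failure, LSA still received
```

The third case is the one we are hitting: the OSPF route stays installed because it is still advertised, so the EIGRP route is never used even though the MPLS path is dead.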