Hello!
My apologies for the incoming wall of text! We have a medium-sized SP network running MPLS on ASR9K routers in our core and at our edges. I was able to consolidate the responsibilities of our edge routers by bringing them into the underlay network and using an L3VPN with a VRF (BORDER) for our border peering connections (full tables). We have a second VRF (INTERNET) that is used for public IP routing for our internet subscribers. All customer internet gateways live in VRF INTERNET, and a default route is leaked from VRF BORDER for customer internet connectivity.
Now for the nitty gritty. I am using RPL on our edge routers to mark certain routes in our BORDER VRF with a specific route target. Then, I am importing that RT into our INTERNET VRF on all of our routers. This is working well for leaking specific routes (default route included) into VRF INTERNET, which keeps us from having to learn a full table on all of our devices. Each edge router has a static default route pointing to the connected upstream peer, which gets redistributed, marked by the route-policy, and then imported (leaked) into VRF INTERNET. This gives VRF INTERNET a default route to VRF BORDER, where a more specific route exists since a full table is present.
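To make the setup easier to picture, here is a rough sketch of the leaking described above in IOS XR syntax. It is illustrative only, not a copy of the running config: the AS number 65000, the RD and RT values, and the policy name BORDER-LEAK are placeholders, and attaching the policy as a VRF export route-policy is an assumption about where the marking happens. x.x.x.x stands for the connected upstream peer.

```
! Placeholder values throughout (AS 65000, RTs 65000:100/200/999, BORDER-LEAK)
route-policy BORDER-LEAK
  ! Tag only the routes that should be leaked, here just the default route
  if destination in (0.0.0.0/0) then
    set extcommunity rt (65000:999) additive
  endif
end-policy
!
vrf BORDER
 address-family ipv4 unicast
  import route-target
   65000:100
  !
  export route-policy BORDER-LEAK
  export route-target
   65000:100
!
vrf INTERNET
 address-family ipv4 unicast
  import route-target
   65000:200
   ! Extra import picks up the routes tagged by BORDER-LEAK
   65000:999
  !
  export route-target
   65000:200
!
router static
 vrf BORDER
  address-family ipv4 unicast
   ! Static default toward the connected upstream peer
   0.0.0.0/0 x.x.x.x
!
router bgp 65000
 vrf BORDER
  rd 65000:100
  address-family ipv4 unicast
   redistribute static
```

With this layout only the tagged routes carry the extra RT, so the other routers import the default into VRF INTERNET without ever seeing the full table.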
Now for the problem. I am finding that it takes much longer than expected (10-15 minutes) to import the BORDER routes that carry the route target set by our BORDER route-policy. For example, if our edge peer interface goes down, the connected route for that interface is removed, which in turn removes the static default route from the routing table. This causes an outage for our customers. It takes 10-15 minutes for our other edge router's default route to get imported into VRF INTERNET on our other routers, which restores connectivity. Is there a more elegant way to handle this? I was thinking I could simply use a default route to Null0 in VRF BORDER to prevent it from ever being removed from the table, but I am unsure if this is the best way. Alternatively, I was considering creating a static default route on our edge routers in VRF INTERNET pointing to the next hop in VRF BORDER (0.0.0.0/0 vrf BORDER x.x.x.x), but I am worried that the same problem may exist. Does anyone have experience with this?
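For clarity, the two alternatives I am weighing would look roughly like this in IOS XR syntax. Both are untested sketches, and x.x.x.x is the upstream next hop as above. They are meant as separate options, not to be applied together.

```
! Option 1: anchor the default in VRF BORDER on Null0 so it never leaves
! the table when the peer interface goes down. Traffic still follows the
! more specific routes from the full table in VRF BORDER.
router static
 vrf BORDER
  address-family ipv4 unicast
   0.0.0.0/0 Null0
!
! Option 2: inter-VRF static default on the edge routers, configured in
! VRF INTERNET with the next hop resolved in VRF BORDER.
router static
 vrf INTERNET
  address-family ipv4 unicast
   0.0.0.0/0 vrf BORDER x.x.x.x
```

One thing I already see with option 1: the default would keep being advertised even if every upstream on that edge router is down, so traffic without a more specific match would be dropped there instead of shifting to the other edge router.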
Community feedback is always excellent and I look forward to hearing your thoughts! Thanks in advance!