We are running a textbook MPLS VPN using EIGRP as the IGP on ASR 9006s running a mix of IOS XR 6.2.3 and 5.3.4.
Before Non-Stop Routing (NSR) was enabled for LDP we experienced a 60+ second outage on VPN traffic when performing a 'redundancy switchover' on a transit router.
After enabling NSR for LDP this outage was reduced to ~10 seconds.
mpls ldp
 nsr
 router-id 172.30.0.2
 interface TenGigE0/1/0/16
 !
 interface TenGigE0/1/0/17
 !
 interface Serial0/0/0/0/1/1/3:0
Both routing and LDP neighborships stay up during the switchover.
How do we troubleshoot this brief traffic outage?
EIGRP is not supported by NSR. Could this be a cause of the issue?
Unfortunately, our core infrastructure runs EIGRP, so the loopback addresses used for the VPN are distributed via EIGRP.
We don't route with our customers, so the only routing is in the core.
Is there a way to avoid an outage on the MPLS VPN traffic during an RSP failover with such a setup?
You have NSF enabled, correct (it is on by default)? The default NSF timer is 240 seconds, so that should not be what is causing the problem.
Indeed, NSR is not configurable for EIGRP on the ASR9K.
What are your EIGRP timers?
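For reference, here is a minimal sketch of where the NSF route-hold timer sits in an IOS XR EIGRP configuration. The AS number 100 is a placeholder rather than something taken from your setup, and 240 seconds is simply the default written out explicitly. Please check the exact syntax against the command reference for your release.

```
router eigrp 100
 address-family ipv4
  ! how long an NSF-aware router holds routes for a restarting peer (default 240 s)
  timers nsf route-hold 240
 !
!
```

If the configured value is much lower than the time the RSP switchover takes, neighbors could flush routes before the restarting router has recovered.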
Do you see EIGRP go down or just traffic loss?
Are any NP drop counters incrementing? You can check with show drops np.
How is BGP configured? Does it have NSR enabled?
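As a rough sketch, BGP NSR on IOS XR is enabled with the nsr keyword under the BGP process. The AS number 65000 below is a placeholder, not your real AS, and the default may differ between releases, so verify against the configuration guide for 5.3.4 and 6.2.3.

```
router bgp 65000
 ! keep BGP sessions and state synchronized to the standby RSP
 nsr
!
```

This would need to be present on the PEs holding the VPNv4 sessions, not only on the transit router being switched over.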
Do the routes actually leave CEF or the RIB? We can check with show route vrf <vrf name>, which shows the age of each route, or with show cef vrf <vrf name> <prefix> detail, which shows when the entry was last programmed.
We can also check BGP to see the last time the route was learned / refreshed.
We should also check the CE-facing EIGRP table to see if it's flushing and relearning routes.
On the 5.3.4 router do you have CSCvk01967 installed?
Unfortunately, the same SMU for 6.2.3 was rejected; however, make sure to install CSCvi83758 there.
If installing the above SMUs and making sure BGP has NSR doesn't help, I would recommend emailing me so I can get more details and put this in a lab. Alternatively, open a service request so an engineer can lab this up and debug where the loss is occurring.