
bjberginz · ‎04-25-2014

I have a simple ring network of four 3600Xs with a 10-gig IP/MPLS backbone between all units (with OSPF running in the core). Per the 3600 design guide, I turned on IPFRR under OSPF for fast reroute of traffic around faults. I have an L3VPN on the 3600s that I'm using to test. FRR works quite well when the repair route is an ECMP (equal-cost multipath) route; I don't even notice an interruption in ping between L3VPN sites when an 'active' link goes down.

The issue arises when the repair route is a remote-LFA (loop-free alternate) MPLS tunnel. I've done a few tests, and the failover time with a remote-LFA repair route is the same as when FRR isn't turned on at all: it's just the normal route convergence time, with a significant traffic interruption (compared to FRR with an ECMP repair route).

The thing is, I'm not quite sure how to even diagnose this. I was thinking that maybe the remote-LFA tunnel was using the link that failed, so in essence it was 'down' as well, hence the traffic interruption while routing fully converged. But I looked at the remote-LFA interfaces, and as far as I understand them they are taking the right path out of the router (that is, away from the link whose failure would activate the remote-LFA route).

Are there any resources or tips to help troubleshoot why these remote-LFA tunnel repair routes don't seem to be working well?

Nagendra Kumar Nainar · ‎04-28-2014

Hi,

While identifying the PQ node (to which the LDP tunnel will be established), rLFA makes sure that the tunnel does not go over the link to be protected. So I don't think that is the reason.

Do you see the backup path installed in the RIB/FIB table? Just to make sure it is not taking the failing link, can you check that the path to reach the backup tunnel does not go over the failing link?

-Nagendra
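
For readers who want to see why the PQ node cannot sit behind the protected link, here is a minimal Python sketch of the selection using plain shortest-path distance comparisons. It is an illustration only, not the router's implementation. The node names R1..R4, the equal link cost of 1, and the function names are all assumptions made for the example; the thread does not give the real topology details beyond "four routers in a ring".

```python
import heapq


def shortest_dists(graph, src):
    """Dijkstra: shortest distance from src to every reachable node."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def pq_nodes(graph, s, e):
    """Candidate PQ nodes for router s protecting the link s-e.

    Assumes symmetric link costs. Uses the extended P-space, i.e. the
    union of the P-spaces of s's neighbours other than e.
    """
    dist = {n: shortest_dists(graph, n) for n in graph}
    link = graph[s][e]
    others = [n for n in graph if n not in (s, e)]

    # Extended P-space: some neighbour of s reaches x strictly cheaper
    # than any path that goes back through s and across the link s-e.
    ext_p = set()
    for nbr in graph[s]:
        if nbr == e:
            continue
        for x in others:
            if dist[nbr][x] < dist[nbr][s] + link + dist[e][x]:
                ext_p.add(x)

    # Q-space: x reaches e strictly cheaper than any path via s and s-e.
    q = {x for x in others if dist[x][e] < dist[x][s] + link}
    return ext_p & q


# Hypothetical 4-node ring R1-R2-R3-R4-R1 with equal costs.
ring = {
    "R1": {"R2": 1, "R4": 1},
    "R2": {"R1": 1, "R3": 1},
    "R3": {"R2": 1, "R4": 1},
    "R4": {"R3": 1, "R1": 1},
}

print(pq_nodes(ring, "R1", "R2"))  # {'R3'}
```

In this assumed ring, R1 protecting the R1-R2 link ends up with R3 as the only PQ node, reached via R4, so the tunnel leaves R1 on the side away from the protected link. With unequal costs the result can differ, which is why checking the actual tunnel endpoint and its egress interface on the routers is still worthwhile.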

bjberginz · ‎04-28-2014

Thanks for the reply, Nagendra. When you ask if I've seen the backup path installed in RIB/FIB, I'm not exactly sure what you mean. I do see repair paths referencing remote LFAs on both the 3600 that would be the source and the 3600 that would be the destination of the test traffic. Like this:

* 172.16.0.3, from 10.10.10.3, 01:55:50 ago, via TenGigabitEthernet0/2
Route metric is 2, traffic share count is 1
Repair Path: 10.10.10.4, via MPLS-Remote-Lfa40

and on the other router:

* 172.16.0.2, from 10.10.10.1, 01:56:34 ago, via TenGigabitEthernet0/1
Route metric is 2, traffic share count is 1
Repair Path: 10.10.10.2, via MPLS-Remote-Lfa32

If you're looking for some specific command output, let me know.

Nagendra Kumar Nainar · ‎04-28-2014

Hi,

Can you check whether 10.10.10.4 is using Teng0/2 as its egress interface (and whether 10.10.10.2 is using Teng0/1)?

If not, then it is the expected behaviour and it is programmed as expected.

A few common misunderstandings I have seen in the past from different people are below:

1. rLFA is applied in one direction (which will work fine) but there is no rLFA for the return traffic. This creates the impression that it waits for convergence (as is the case for the return packets).

2. How are you simulating the link failure? With an interface shut on the router where you want to trigger LFA?

-Nagendra
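
The egress-interface check suggested above can be sketched in Python as follows. This is only an illustration under stated assumptions: the regular expression is written against the three-line route output quoted earlier in this thread and may not match other IOS versions, and `parse_route_entry`, `shares_protected_link`, and the `egress_of` mapping are hypothetical names. The mapping from repair next hop to egress interface has to be filled in by hand from the router's own routing table; the value used in the example is made up.

```python
import re

# Matches the three-line route entry format quoted earlier in the thread.
ROUTE_RE = re.compile(
    r"\*\s+(?P<next_hop>\S+), from (?P<source>\S+), .*? via (?P<primary_if>\S+)\s+"
    r"Route metric is (?P<metric>\d+), traffic share count is \d+\s+"
    r"Repair Path: (?P<repair_nh>\S+), via (?P<repair_if>\S+)"
)


def parse_route_entry(text):
    """Extract primary and repair path details, or None if no match."""
    m = ROUTE_RE.search(text)
    return m.groupdict() if m else None


def shares_protected_link(entry, egress_of):
    """True if the repair next hop is reached over the primary interface.

    egress_of maps a repair next hop to the interface this router uses
    to reach it (collected manually; hypothetical input).
    """
    return egress_of[entry["repair_nh"]] == entry["primary_if"]


sample = """* 172.16.0.3, from 10.10.10.3, 01:55:50 ago, via TenGigabitEthernet0/2
Route metric is 2, traffic share count is 1
Repair Path: 10.10.10.4, via MPLS-Remote-Lfa40"""

entry = parse_route_entry(sample)

# Made-up egress value, for illustration only.
egress_of = {"10.10.10.4": "TenGigabitEthernet0/1"}
print(shares_protected_link(entry, egress_of))  # False
```

A result of `True` would mean the tunnel endpoint is reached over the same interface as the primary path, so the repair path would fail together with the link it is supposed to protect.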

bjberginz · ‎04-29-2014

Nagendra, I'm not completely sure what you are asking, but 10.10.10.4 does use t0/2 to get to 10.10.10.3, and 10.10.10.2 uses t0/1 to get to 10.10.10.1.

1. It does appear that there are rLFAs, and proper ones, in both directions, so I don't think this is the problem.

2. I am physically pulling the cable to simulate a link down.

IP-Fast Reroute with MPLS remote LFA tunnels