Re: PIM-DM timer question

Laszlo Zoltan · ‎05-16-2019

Hi all,

I have a multicast network like this:

       +-----+
       | SRC |
       +--+--+
          |
 ----+----+----+----
     |         |
     ~         ~
     |         |
  +--+--+   +--+--+
  | R1  +---+ R2  |
  +--+--+   +--+--+
     |         |
  +--+--+   +--+--+
  | RCV1|   | RCV2|
  +-----+   +-----+

There is a source node (SRC) that sends multicast traffic to two routers (R1 and R2).

Each router forwards this traffic to its own receiver (RCV1 or RCV2). In the normal case the paths are:

SRC -> R1 -> RCV1

SRC -> R2 -> RCV2

R1 and R2 are far from SRC. Each RCVx is near its Rx.

If the link between SRC and R1 fails, the multicast traffic must take this path:

SRC -> R2 -> R1 -> RCV1

The link failure is detected by an IP SLA ping to a host near SRC. While the ping succeeds, a static route to SRC via the direct line is installed in the router, and multicast is received over the direct line. When the ping fails, that route is withdrawn and a persistent unicast static route to SRC with a higher metric, via the other router, becomes active, so the multicast traffic is received from the other router.
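The failover logic can be sketched as a small Python model. This is only an illustration under assumptions: the class and function names, the next-hop labels and the preference values 1 and 200 are hypothetical and not taken from the real configuration. The only behaviour taken from the description is that the tracked route wins while the probe succeeds and the persistent, less-preferred route takes over when it fails.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class StaticRoute:
    next_hop: str     # hypothetical label, not a real address
    preference: int   # lower value wins (stands in for the "metric" above)
    tracked: bool     # True if the route depends on the IP SLA probe


def active_route(routes: Sequence[StaticRoute], probe_ok: bool) -> Optional[StaticRoute]:
    """Return the route to SRC that would be installed, or None if there is none."""
    # A tracked route is usable only while the probe succeeds.
    usable = [r for r in routes if probe_ok or not r.tracked]
    return min(usable, key=lambda r: r.preference, default=None)


# Assumed route set on R1: tracked direct route plus a persistent backup.
ROUTES_ON_R1 = [
    StaticRoute("direct-line", preference=1, tracked=True),
    StaticRoute("via-R2", preference=200, tracked=False),
]

if __name__ == "__main__":
    print(active_route(ROUTES_ON_R1, probe_ok=True).next_hop)   # direct-line
    print(active_route(ROUTES_ON_R1, probe_ok=False).next_hop)  # via-R2
```

The model only covers the unicast side; it says nothing about how long the multicast state takes to follow the route change.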

Multicast routing works fine this way.

But the transition from the normal path to the alternate path needs about 90 secs. IP SLA can detect the line break in 3 secs. So the 90 secs are because of PIM timers. Unfortunately I do not know which PIM timer is responsible for this.

How can I tune PIM-DM to make the transition faster?

Thanks,

Laszlo

ngkin2010 · ‎05-16-2019

Hello,

I am learning multicast routing and am interested in your problem; I hope I can help.

First, according to your description, R1 and R2 are far away from SRC, so I assume that neither R1 nor R2 is the gateway of SRC and that there are one or more other routers between SRC and (R1, R2).

Could you also tell us whether the two receivers join different multicast groups?

If so, then in the normal situation the multicast traffic for RCV2 is forwarded to RCV2 by R2, but because you are using PIM dense mode, R2 also floods that traffic to R1.

When R1 receives that multicast traffic, it finds that none of its downstream devices is interested in this multicast group, so it sends a Prune message to R2 (which tells R2 to stop sending the multicast traffic to R1).

So, in the steady state, R2's interface facing R1 is pruned (you can verify this with show ip mroute).

But when the link goes down, R2's interface does not return from the pruned state to the forwarding state immediately. Instead, it waits until the prune timer expires (180 seconds by default), and only then returns to the forwarding state.
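To make the timing in this hypothesis concrete, here is a minimal Python sketch. It assumes the 180-second default quoted above and assumes that nothing clears or refreshes the prune state early; the function and constant names are made up for illustration. It models the hypothesis only, not the actual router behaviour.

```python
PRUNE_TIMEOUT_S = 180.0  # default prune timer quoted in this post


def seconds_until_forwarding(prune_age_s: float,
                             prune_timeout_s: float = PRUNE_TIMEOUT_S) -> float:
    """Time left before a pruned interface falls back to forwarding.

    prune_age_s is how long ago the prune state was (re)started when the
    failure happens. Assumes nothing ends the prune state earlier.
    """
    if not 0 <= prune_age_s <= prune_timeout_s:
        raise ValueError("prune age must lie between 0 and the timeout")
    return prune_timeout_s - prune_age_s


if __name__ == "__main__":
    print(seconds_until_forwarding(0))    # 180.0, failure right after the prune
    print(seconds_until_forwarding(90))   # 90.0, failure halfway through the timer
    print(seconds_until_forwarding(180))  # 0.0, failure just as the timer expires
```

Under these assumptions the wait can be anywhere between 0 and 180 seconds, depending on when the failure occurs relative to the timer.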

Not sure whether this makes sense.

Laszlo Zoltan · ‎05-16-2019

Hello,

Thank you for the fast response.

There is no router between SRC and (R1, R2).

Both receivers join the same multicast group.

I think that in the normal case R1 drops the multicast traffic arriving from R2 because R1 has a better route to the multicast source via the direct line. In this case, does R1 send a prune message to R2?

Thanks,

Laszlo

ngkin2010 · ‎05-17-2019

Hi Laszlo,

"I think that in the normal case R1 drops the multicast traffic arriving from R2 because R1 has a better route to the multicast source via the direct line."

If that is the case, the RPF check will fail and R1 will discard that multicast traffic. But you said IP SLA can detect the link failure in 3 seconds, so shouldn't the route change accordingly? Could you verify?

Laszlo Zoltan · ‎05-17-2019

Hello,

In the normal case the IP SLA ping installs a route to SRC via the direct line in R1, so the RPF check accepts multicast traffic from SRC on the direct line and fails on the link between R2 and R1.

When the direct line fails, the IP SLA ping fails and removes the previously installed route in R1, so the persistent unicast static route with the higher metric to SRC via the other router becomes active, and the RPF check accepts multicast traffic from SRC via the other router.
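The two cases can be written down as a tiny Python model of the RPF check on R1. The interface labels and function names are hypothetical; the sketch only assumes what is described here, namely that the RPF interface follows whichever unicast route to SRC is currently installed.

```python
def rpf_interface(direct_route_installed: bool) -> str:
    """Interface R1 would use to reach SRC according to its unicast routes."""
    return "direct-line" if direct_route_installed else "link-to-R2"


def rpf_accepts(arrival_interface: str, direct_route_installed: bool) -> bool:
    """Traffic from SRC passes the RPF check only on the RPF interface."""
    return arrival_interface == rpf_interface(direct_route_installed)


if __name__ == "__main__":
    # Normal case: the tracked route via the direct line is installed.
    print(rpf_accepts("direct-line", direct_route_installed=True))   # True
    print(rpf_accepts("link-to-R2", direct_route_installed=True))    # False
    # Failure case: only the backup route via R2 remains.
    print(rpf_accepts("link-to-R2", direct_route_installed=False))   # True
```

The model switches instantly, which matches the unicast side; the delay under discussion is in how quickly traffic actually starts arriving on the new RPF interface.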

This works well; only the transition time is too long.

Thanks,

Laszlo

ngkin2010 · ‎05-17-2019

Hello Laszlo,

So, the actual problem is that the "route to SRC via direct line" takes a long time (90 seconds) to be removed from the routing table by IP SLA? Please confirm, thanks.

Best,
Ngkin

Laszlo Zoltan · ‎05-17-2019

Hi Ngkin,

No.

The static route via the direct line is removed within 5 seconds of SRC becoming unreachable. That part is fine.

The problem is getting the pruned multicast traffic enabled again.

I am going to try "ip pim join-prune-interval" on the interface to the other router.

Thanks,

Laszlo

Giuseppe Larosa · ‎05-16-2019

Hello Laszlo,

Is it possible to use a dynamic unicast routing protocol such as OSPF, with fast-reroute features or fast hellos, instead of IP SLA + static routes?

I think the behaviour could be different.

However, you could try reducing the PIM hello interval on the link between the two devices with

ip pim hello-interval 5

The default is 30 seconds, and three times 30 seconds is exactly 90 seconds.

With a 5-second interval you should recover in 15 seconds.
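The arithmetic behind this suggestion, as a short Python sketch. It assumes, as stated in this post, that a PIM neighbor is declared down after three missed hello intervals; the function name and the multiplier parameter are made up for illustration.

```python
def neighbor_timeout_s(hello_interval_s: float, multiplier: int = 3) -> float:
    """Seconds until a silent PIM neighbor expires, assuming the hold time
    is a fixed multiple of the hello interval (three, per this post)."""
    if hello_interval_s <= 0 or multiplier <= 0:
        raise ValueError("interval and multiplier must be positive")
    return hello_interval_s * multiplier


if __name__ == "__main__":
    print(neighbor_timeout_s(30))  # 90, the default case
    print(neighbor_timeout_s(5))   # 15, with the reduced interval
```

This timer only matters if a PIM neighbor actually disappears when the link fails.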

With a routing protocol the change would be propagated to PIM as soon as the IGP converges.

Hope to help

Giuseppe

ngkin2010 · ‎05-17-2019

Hello,

I do not quite understand whether the hello interval has any effect in this case. I would be grateful if you could explain it further.

As I understand it, the PIM neighborship between R1 and R2 remains unchanged, so the hold timer is not related to this case?

Giuseppe Larosa · ‎05-17-2019

Hello ngkin2010,

thanks for your useful note.

I based my previous answer on the following lines from the original poster:

>> But the transition from the normal path to the alternate path needs about 90 secs. IP SLA can detect the line break in 3 secs. So the 90 secs are because of PIM timers. Unfortunately I do not know which PIM timer is responsible for this.

So I assumed that there were other routers upstream of R1 and R2, and that the 90 seconds were the expiration of the PIM neighborship on R1's uplink to the upstream router (not R2).

The original poster explained to you in later posts that there is no router upstream of R1 and R2.

But he also says that the source is far from R1 and R2 (not directly connected, I suppose), which suggests intermediate routers or multilayer switches in the middle.

I was referring to the R1 uplink, on the assumption that there is a PIM neighbor there, not to the R1-R2 link, which is not affected by the fault.

My answer may be wrong, but I do not find another explanation why it takes 90 seconds to converge when the static route fails.

By the way, a router will accept multicast traffic only on a PIM enabled interface and multicast traffic cannot travel over non multicast enabled routers unless using a GRE tunnel.

The original poster may be able to provide additional information about his network scenario (for example, is there an MPLS service provider in the middle?).

Hope to help

Giuseppe

Laszlo Zoltan · ‎05-17-2019

Hi all,

There are Layer2 switches between SRC and (R1, R2), no MPLS. My initial figure was Layer3.

I cannot use dynamic routing instead of static routes because there is no router at SRC.

I do not think that PIM hello interval could help me because the R1 and R2 can see each other continuously.

I am going to try the "ip pim join-prune-interval" on the interface between R1 and R2.

I can do it in the maintenance time window only because this is a production network.

Thanks for the help,

Laszlo

Giuseppe Larosa · ‎05-17-2019

Hello Laszlo,

>> There are Layer2 switches between SRC and (R1, R2), no MPLS. My initial figure was Layer3.

OK, but if the devices in the middle are only L2, what is the target of the static route?

In any case the source should be in the same IP subnet as the R1 uplink. Are you using the static route to track the availability of the source?

>> I do not think that PIM hello interval could help me because the R1 and R2 can see each other continuously.

I agree: if there is no upstream PIM router, it is not useful.

ip pim join-prune-interval may be helpful in your network scenario.

Note, however, that the default value for IPv4 is 60 seconds.

Inform us about the results of your tests.

Thanks for your feedback.

Hope to help

Giuseppe

pieterh · ‎05-17-2019

Maybe I'm wrong, but isn't it the CLIENT that subscribes to the multicast?

The routers will enable another route, but will not automatically take over the subscriptions.

The client must resubscribe to the new PIM

-> my suggestion: it's a client timer

Laszlo Zoltan · ‎05-20-2019

Hello Pieterh,

Thanks for your suggestion. I do not think so, because R1 and R2 use "ip igmp static-group" on the outbound interface. The client does not need to send an IGMP request in order to get the multicast traffic.

Thanks,

Laszlo

Laszlo Zoltan · ‎05-20-2019

Hello Giuseppe,

You are right. My system is a little more complicated than I described in the first message of this conversation. I use host routes as the static routes, and the Layer 2 network carries more than one IP network:

1. A common IP network with the multicast source, R1 and R2
2. An IP network which belongs to R1 and a host which is pinged by R1
3. An IP network which belongs to R2 and another host which is pinged by R2

So R1 cannot ping the host which is pinged by R2, and vice versa. On the R1-R2 link I use a static route to host SRC only.

I will inform you about the test of changing "ip pim join-prune-interval".

Thanks,

Laszlo