I have a multicast network like this:
      SRC
     /   \
   R1 --- R2
   |       |
  RCV1    RCV2
There is a source node (SRC) that sends multicast traffic to two routers (R1 and R2).
Each router forwards this traffic to its receiver (RCV1 or RCV2). In the normal case the paths are:
SRC -> R1 -> RCV1
SRC -> R2 -> RCV2
R1 and R2 are far from SRC. Each RCVx is near its Rx.
If the link between SRC and R1 fails, the multicast traffic must take this path:
SRC -> R2 -> R1 -> RCV1
The link failure is detected by an IP SLA ping to a host near SRC. If the ping succeeds, a static route to SRC via the direct link is installed in the router, and multicast is received over the direct link. If the ping fails, a persistent unicast static route to SRC with a higher metric via the other router becomes active, and the multicast traffic is received from the other router.
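The route selection described above can be modelled in a few lines of Python. This is only a sketch of the logic, not router configuration: the `StaticRoute` class, the next-hop labels and the distance values are assumptions made up for illustration, and "distance" stands in for what the post calls the metric.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StaticRoute:
    next_hop: str          # hypothetical label, not a real address
    distance: int          # lower value wins
    tracked: bool = False  # True: installed only while the probe succeeds


def active_route(routes: List[StaticRoute], probe_ok: bool) -> Optional[StaticRoute]:
    """Return the installed route with the lowest distance, or None."""
    installed = [r for r in routes if probe_ok or not r.tracked]
    return min(installed, key=lambda r: r.distance, default=None)


# Routes toward SRC as seen from R1 (values are illustrative assumptions).
R1_ROUTES = [
    StaticRoute(next_hop="direct-link", distance=1, tracked=True),
    StaticRoute(next_hop="via-R2", distance=200),
]

print(active_route(R1_ROUTES, probe_ok=True).next_hop)   # direct-link
print(active_route(R1_ROUTES, probe_ok=False).next_hop)  # via-R2
```

The persistent route is always present but only wins when the tracked route has been withdrawn, which is why the unicast side of the failover follows the probe result almost immediately.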
Multicast routing works fine this way.
But the transition from the normal path to the alternate path takes about 90 seconds. IP SLA can detect the link failure in 3 seconds, so the 90 seconds must be due to PIM timers. Unfortunately, I do not know which PIM timer is responsible.
How can I configure PIM-DM for a faster transition?
I am learning multicast routing myself and was interested in your problem; I hope I can help.
First, according to your description, R1 and R2 are far away from SRC, so I assume neither R1 nor R2 is the gateway of SRC and there are one or more other routers between SRC and (R1, R2).
Could you also tell us whether the two receivers join different multicast groups?
If so, in the normal situation the multicast traffic for RCV2 is forwarded to RCV2 by R2, but because you are using PIM dense mode, R2 also forwards it to R1.
When R1 receives that multicast traffic, it finds that none of its downstream devices is interested in this multicast group, so it sends a Prune message to R2 (telling R2 to stop sending that multicast traffic to R1).
So, in steady state, R2's interface facing R1 is pruned (verify with show ip mroute).
But if the link goes down, R2's interface does not return from the pruned state to the forwarding state immediately. Instead, it waits until the prune timer expires (180 seconds by default) and only then resumes forwarding.
Not sure if this makes sense or not.
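To make the prune-timer reasoning concrete, here is a small Python sketch of the waiting time. It assumes the 180-second default quoted above and models only timer expiry; anything that would cancel the prune earlier is deliberately left out, and the function name and timestamps are hypothetical.

```python
PRUNE_HOLD_SECONDS = 180  # default quoted in the post above; may differ per platform


def seconds_until_forwarding(prune_received_at: int,
                             link_failed_at: int,
                             hold: int = PRUNE_HOLD_SECONDS) -> int:
    """Remaining wait, counted from the link failure, before a pruned
    interface returns to forwarding purely by prune-timer expiry."""
    expiry = prune_received_at + hold
    return max(0, expiry - link_failed_at)


# Prune received at t=0, link fails 60 s later: 120 s still to wait.
print(seconds_until_forwarding(0, 60))   # 120
# Link fails after the prune has already expired: no extra wait.
print(seconds_until_forwarding(0, 200))  # 0
```

Under this model the outage length depends on when the last Prune arrived relative to the failure, so it would vary from one failure to the next rather than being a fixed value.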
Thank you for the fast response.
There is no router between SRC and (R1, R2).
Both receivers join the same multicast group.
I think that in the normal case R1 drops the multicast traffic arriving from R2, because R1 has a better route to the multicast source via the direct link. In that case, does R1 send a Prune message to R2?
In the normal case the IP SLA ping installs a route to SRC via the direct link in R1, so the RPF check accepts multicast traffic from SRC on the direct link and fails on the link between R2 and R1.
If the direct link fails, the IP SLA ping fails and the previously installed route is removed from R1, so the persistent unicast static route with the higher metric to SRC via the other router becomes active, and the RPF check accepts multicast traffic from SRC via the other router.
It works well; only the transition time is too long.
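The RPF behaviour described in this post can be written down as a tiny Python sketch. The interface labels "to-SRC" and "to-R2" are made-up names for R1's two interfaces, and the functions are illustrative only, not how any router implements the check.

```python
def rpf_interface_for_r1(direct_link_up: bool) -> str:
    """Interface R1's active unicast route toward SRC points out of."""
    return "to-SRC" if direct_link_up else "to-R2"


def rpf_accepts(arrival_interface: str, rpf_interface: str) -> bool:
    """RPF check: accept a packet only if it arrived on the interface
    that the unicast route back to the source uses."""
    return arrival_interface == rpf_interface


for link_up in (True, False):
    rpf = rpf_interface_for_r1(link_up)
    for arrival in ("to-SRC", "to-R2"):
        verdict = "accept" if rpf_accepts(arrival, rpf) else "drop"
        print(f"direct link up={link_up}, packet on {arrival}: {verdict}")
```

With the direct link up, only packets arriving on "to-SRC" pass; once the tracked route is withdrawn, the check flips and only packets arriving from R2 pass.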
Is it possible to use a dynamic unicast routing protocol like OSPF, with fast reroute features or fast hellos, instead of IP SLA + static route?
I think the behaviour could be different.
However, you could try reducing the PIM hello interval on the link between the two devices:
ip pim hello-interval 5
because the default is 30 seconds, and three times 30 seconds is exactly 90 seconds.
You should then recover in about 15 seconds.
With a routing protocol, the change would be propagated to PIM as soon as the IGP converges.
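The arithmetic behind those two numbers, as a short Python sketch. The multiplier of 3 is taken from the sentence above; treat it as an assumption of this thread, since the exact hold time can depend on the platform.

```python
HOLD_MULTIPLIER = 3  # "three times" the hello interval, as stated above


def neighbor_timeout(hello_interval_s: int, multiplier: int = HOLD_MULTIPLIER) -> int:
    """Seconds without hellos before a PIM neighbor is declared down."""
    return hello_interval_s * multiplier


print(neighbor_timeout(30))  # 90 (default interval)
print(neighbor_timeout(5))   # 15 (with the hello interval set to 5)
```

This only applies when the failure has to be noticed through a lost PIM neighbor; if the neighbor stays reachable, this timer never comes into play.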
Hope to help
Thanks for your useful note.
I based my previous answer on the following lines from the original poster:
>> But the transition from the normal path to the alternate path needs about 90 secs. IP SLA can detect the line break in 3 secs. So the 90 secs are because of PIM timers. Unfortunately I do not know which PIM timer is responsible for this.
So I assumed that there were other routers upstream of R1 and R2, and that those 90 seconds were the expiration of the PIM neighborship on R1's uplink to the upstream router (not R2).
The original poster explained in later posts that there is no router upstream of R1 and R2.
But he also says that the source is far from R1 and R2 (not directly connected, I suppose), which would imply intermediate routers or multilayer switches in between.
I was referring to the R1 uplink, on the assumption that there is a PIM neighbor there, not to the R1-R2 link, which is not affected by the fault.
My answer may be wrong, but I cannot find another explanation for why it takes 90 seconds to converge when the static route fails.
By the way, a router accepts multicast traffic only on a PIM-enabled interface, and multicast traffic cannot cross non-multicast-enabled routers unless a GRE tunnel is used.
The original poster may provide additional information about his network scenario (for example, is there an MPLS service provider in the middle?).
Hope to help
There are Layer 2 switches between SRC and (R1, R2), no MPLS. My initial figure showed Layer 3 only.
I cannot use dynamic routing instead of static routes because there is no router at SRC.
I do not think that the PIM hello interval could help me, because R1 and R2 can see each other continuously.
I am going to try "ip pim join-prune-interval" on the interface between R1 and R2.
I can do it only in a maintenance window because this is a production network.
Thanks for the help,
>> There are Layer2 switches between SRC and (R1, R2), no MPLS. My initial figure was Layer3.
OK, but if the devices in the middle are only L2, what is the target of the static route?
In any case, the source should be in the same IP subnet as the R1 uplink. Are you using the static route to track the availability of the source?
>> I do not think that PIM hello interval could help me because the R1 and R2 can see each other continuously.
I agree: if there is no upstream PIM router, it is not useful.
"ip pim join-prune-interval" may be helpful in your network scenario.
However, the default value for IPv4 is 60 seconds.
Let us know the results of your tests.
Thanks for your feedback.
Hope to help