
Problems using MSDP without BGP

Hi all,

I want to use MSDP to exchange multicast sources between different customer network domains where no BGP is used, only OSPF.

Each domain uses its own two RPs for redundancy with anycast RP (without MSDP mesh groups). The multicast routing protocol is PIM-SM. The routers learn a default route via OSPF from the firewall (plus a few more routes not relevant for this scenario). The routers are Cat6K; the firewall is third party.

 

I want to use MSDP peerings in a hierarchical design as shown below. (Image: Unbenannt.PNG)

 

MSDP SAs originated in central site are accepted at the other sites and vice versa. Also multicast traffic works fine between central site and the other sites which means pim works as expected.

But MSDP SAs originated e.g. at site A (and "relayed" from central site) are not accepted at site B.

I played with local mesh groups between rp, originator id, and rfc3618 compliance activation, but I didn't find a working solution.

In debugging I see the following error message e.g. on site B router:

Peer RPF check failed for <rp ip address>, used ? route's peer 0.0.0.0

 

After activating rfc3618 compliance (ip msdp rpf rfc3618) error message is 

<msdp peer ip address>: RPF check failed for <rp ip address>

 

I read cisco documentation (https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/ipmulti_pim/configuration/xe-16/imc-pim-xe-16-book/imc-msdp-im-pim-sim.pdf) and also rfc3618 and 4611.

In my understanding it seems that a scenario like this is not supported because of the Peer-RPF check which can not be successful here.

 

Does anybody know if it's possible to get MSDP working in this scenario?

Unfortunately using BGP is not an option.

 

I also thought about using a meshgroup between all rp but this does not scale well because there are several sites and many msdp peerings. Also SA filtering would be easy in the scenario above where it could be done at central site.
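To put rough numbers on the scaling concern, here is a small Python sketch that counts the MSDP sessions needed for a full mesh among all RPs. The site counts are made-up inputs, and both helper functions are hypothetical; the only thing taken from the post is the assumption of two RPs per site, including the central site.

```python
def full_mesh_sessions(num_rps: int) -> int:
    """Number of MSDP peerings if every RP peers with every other RP."""
    if num_rps < 0:
        raise ValueError("num_rps must be non-negative")
    return num_rps * (num_rps - 1) // 2


def rps_in_network(remote_sites: int, rps_per_site: int = 2) -> int:
    """Total RPs: all remote sites plus the central site (assumed layout)."""
    return (remote_sites + 1) * rps_per_site


for sites in (2, 5, 10):  # illustrative site counts only
    n = rps_in_network(sites)
    print(f"{sites} remote sites -> {n} RPs, "
          f"{n - 1} peers per RP, {full_mesh_sessions(n)} sessions in total")
```

The session count grows quadratically with the number of RPs, which is the reason a single mesh group across all sites becomes hard to manage.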

 

Thanks a lot in advance

Thorsten

 

 

 

19 Replies

Giuseppe Larosa
Hall of Fame

Hello Thorsten,

As enabling M-BGP is not an option, you are in the scenario described in section 2.3 of RFC 4611:

 

In this case, an enterprise maintains its own RP and has an MSDP
   peering with its service provider but does not BGP peer with them.
   MSDP relies upon BGP path information to learn the MSDP topology for
   the SA peer-RPF check.  MSDP can be deployed without BGP, however,
   and as a result, there are some special cases where the requirement
   to perform a peer-RPF check on the BGP path information is suspended.
   These cases are:

   o  There is only a single MSDP peer connection.

   o  A default peer (default MSDP route) is configured.

   o  The originating RP is directly connected.

   o  A mesh group is used.

   o  An implementation is used that allows for an MSDP peer-RPF check
      using an IGP.

 

You say that each pair of C6500 multilayer switches learns a default route 0.0.0.0/0 via OSPF from the firewalls, plus a few other routes not related to MSDP.

 

The first suggestion is to find a way to propagate the MSDP endpoint addresses in OSPF to all remote sites, so that an explicit, specific unicast route exists for the RP address contained in the MSDP SA message.

This should allow the RPF peer check to pass, because it is evident that the default route is not considered a valid route for this check.

 

If the RP addresses are advertised as internal addresses, you can use or modify an area filter-list command to have them propagated to other areas.

Unless you are using totally stub areas you should be able to use this method.

 

Hope to help

Giuseppe

 

 

Hello Giuseppe,

many thanks for your explanation.

Do we need explicit host routes (one for each loopback IP address), or would a network route (e.g. a /16) also be sufficient?

At the moment we use OSPF summarization, so the firewall learns only one summary route from each site, and the loopback IP addresses for RP and MSDP are part of this summary network.

What we need is that the firewall forwards these summary routes between the sites.

 

Thorsten

Hello Thorsten,

I think the summary routes can be enough to pass the RPF check, as they are more specific than a default route.

 

>> What we need is that the firewall forwards these summary routes between the sites.

Yes I do agree on this.

 

Hope to help

Giuseppe

 

Hello Giuseppe,

we let the firewall forward the loopbacks of rp and msdp peers between the sites, so every msdp peer knows next hop for each other msdp peer or rp in its rib.

Unfortunately this didn't help. 

Debugging messages are the same:

  • without activating rfc3618 compliance: Peer RPF check failed for <rp ip address>, used ? route's peer 0.0.0.0
  • with activating rfc3618 compliance:   <msdp peer ip address>: RPF check failed for <rp ip address>

List of rejected SAs (sh ip msdp sa-cache rejected-SA detail read-only) shows also messages concerning rpf-fail:
86842768.616, (<sender>,<group>), RP: <rp>, Peer: <peer>, Reason: rpf-fail


Do you have any idea what else to check?

 

I don't know if it's relevant, but I'm using VRFs, so all routing is done not in the global routing table but in separate VRFs on the switches.

 

Best regards
Thorsten

 

Hello Thorsten,

can you post the output of

show ip rpf <remote-MSDP address>

show ip route <remote-MSDP-address>

 

RPF lookups do not necessarily use the most specific route first; they can prefer the lower administrative distance first.

 

Hope to help

Giuseppe

 

taken on site A RT1 which receives SA from msdp peer on central site RT1 (10.1.255.241)

 

sh ip rpf 10.1.255.241 
RPF information for ? (10.1.255.241)
RPF interface: Fo9/1.1024
RPF neighbor: ? (10.83.252.86)
RPF route/mask: 10.1.255.241/32
RPF type: unicast (ospf 4)
Doing distance-preferred lookups across tables
RPF topology: ipv4 multicast base, originated from ipv4 unicast base

 

sh ip route 10.1.255.241
Routing Table: TEST1
Routing entry for 10.1.255.241/32
Known via "ospf 4", distance 110, metric 150, type extern 2, forward metric 2
Last update from 10.83.252.86 on FortyGigabitEthernet9/1.1024, 02:33:18 ago
Routing Descriptor Blocks:
* 10.83.252.86, from 10.204.120.99, 02:33:18 ago, via FortyGigabitEthernet9/1.1024
Route metric is 150, traffic share count is 1

Hello Thorsten,

I was referring to this sentence:

>> Doing distance-preferred lookups across tables

 

can you provide an example of an MSDP peer address of Site B as seen by RT1 of Site A ?

Again both commands show ip rpf <address> and show ip route <address>

 

The output you provided for the central site address looks fine, and that peering is working for MSDP, isn't it?

 

Hope to help

Giuseppe

 

 

Hi Giuseppe,

site A has no MSDP peering to site B. Both sites peer only with the central site, which plays the intermediate peer role in this scenario. The MSDP peerings are shown in the picture above.

 

Central site accepts SAs from site A and B because Peer and RP are identical in the SAs. Outputs of sh ip rpf and sh ip route seem the same for me as the ones in site B RT1.

 

central site RT1:

sh ip msdp sa-cache 
MSDP Source-Active Cache - 2 entries
(10.80.148.101, 239.192.0.1), RP 10.2.255.241, AS ?,1d23h/00:05:42, Peer 10.2.255.241 -> learned from RT1 at site A (which is 10.2.255.241)
(10.88.84.101, 239.192.0.2), RP 10.3.255.241, AS ?,1d21h/00:05:31, Peer 10.3.255.241 -> learned from RT1 at site B (which is 10.3.255.241)

sh ip rpf 10.2.255.241
RPF information for ? (10.2.255.241)
RPF interface: Fo9/1.1035
RPF neighbor: ? (10.111.252.86)
RPF route/mask: 10.2.255.241/32
RPF type: unicast (ospf 15)
Doing distance-preferred lookups across tables
RPF topology: ipv4 multicast base, originated from ipv4 unicast base

sh ip route 10.2.255.241
Routing Table: CLNT1
Routing entry for 10.2.255.241/32
Known via "ospf 15", distance 110, metric 150, type extern 2, forward metric 2
Last update from 10.111.252.86 on FortyGigabitEthernet9/1.1035, 04:06:26 ago
Routing Descriptor Blocks:
* 10.111.252.86, from 10.205.120.195, 04:06:26 ago, via FortyGigabitEthernet9/1.1035
Route metric is 150, traffic share count is 1

 

Central site RT2 also does not accept SAs from its neighbor central site RT1, although they are directly connected. It seems there is an RPF problem in every case where the peer and the RP listed in an SA are not identical.

central site RT2:

show ip msdp sa-cache rejected-SA det read-only
<snip>
86854209.328, (10.80.148.101, 239.192.0.1), RP: 10.2.255.241, Peer: 10.1.255.241, Reason: rpf-fail -> learned from central site RT1 but not accepted (originated from site A RT1)
86854209.328, (10.88.84.101, 239.192.0.2), RP: 10.3.255.241, Peer: 10.1.255.241, Reason: rpf-fail -> learned from central site RT1 but not accepted (originated from site B RT1)

show ip rpf 10.1.255.241
RPF information for ? (10.1.255.241)
RPF interface: Vlan10
RPF neighbor: ? (10.111.254.9)
RPF route/mask: 10.1.255.241/32
RPF type: unicast (ospf 15)
Doing distance-preferred lookups across tables
RPF topology: ipv4 multicast base, originated from ipv4 unicast base

show ip route 10.1.255.241
Routing Table: CENT1
Routing entry for 10.1.255.241/32
Known via "ospf 15", distance 110, metric 3, type intra area
Last update from 10.111.254.9 on Vlan10, 1d22h ago
Routing Descriptor Blocks:
* 10.111.254.9, from 10.205.0.197, 1d22h ago, via Vlan10
Route metric is 3, traffic share count is 1

 

Best Regards

Thorsten

 

 

Hello Thorsten,

 

>> Central site RT2 also does not accept SAs from central site RT1 neighbor, they are directly connected. Seems there is a rpf problem in any case where peer and rp listed in a SA are not identical.

 

This is an interesting note confirmed by the following output:

 

show ip msdp sa-cache rejected-SA det read-only
<snip>
86854209.328, (10.80.148.101, 239.192.0.1), RP: 10.2.255.241, Peer: 10.1.255.241, Reason: rpf-fail -> learned from central site RT1 but not accepted (originated from site A RT1)
86854209.328, (10.88.84.101, 239.192.0.2), RP: 10.3.255.241, Peer: 10.1.255.241, Reason: rpf-fail -> learned from central site RT1 but not accepted (originated from site B RT1)

 

So here the MSDP peer is 10.1.255.241, the SA entry lists a different RP (10.2.255.241), and you get an RPF failure on RT2 of the central site.

 

I will take another look at RFC 4611 and RFC 3618.

 

For the moment, just as a test: what happens if you configure something like this on RT2?

ip mroute 10.2.255.241 255.255.255.255 10.1.255.241

ip mroute 10.3.255.241 255.255.255.255 10.1.255.241

 

You may prefer to use a physical next hop on an interface with PIM enabled.

 

Edit:

according to RFC 3618 section 10.1.3 the RPF check on the peer and on the received SA message is done in the following way:

 

An SA message originated by R and received by X from N is accepted if
   N is the peer-RPF neighbor for X, and is discarded otherwise.

              MPP(R,N)                 MP(N,X)
      R ---------....-------> N ------------------> X
              SA(S,G,R)                SA(S,G,R)

   MP(N,X) is an MSDP peering between N and X.  MPP(R,N) is an MSDP
   peering path (zero or more MSDP peers) between R and N, e.g.,
   MPP(R,N) = MP(R, A) + MP(A, B) + MP(B, N).  
   SA(S,G,R) is an SA message for source S on group G originated by an
   RP R.

   The peer-RPF neighbor N is chosen deterministically, using the first
   of the following rules that matches.  In particular, N is the RPF
   neighbor of X with respect to R if

   (i).   N == R (X has an MSDP peering with R).
          ! this applies to direct MSDP peering and this works

   (ii).  N is the eBGP NEXT_HOP of the Peer-RPF route for R.
          ! NA

   (iii). The Peer-RPF route for R is learned through a distance-vector
          or path-vector routing protocol (e.g., BGP, RIP, DVMRP) and N
          is the neighbor that advertised the Peer-RPF route for R
          (e.g., N is the iBGP advertiser of the route for R), or N is
          the IGP next hop for R if the route for R is learned via a
          link-state protocol (e.g., OSPF [RFC2328] or IS-IS [RFC1142]).

   (iv).  N resides in the closest AS in the best path towards R.  If
          multiple MSDP peers reside in the closest AS, the peer with
          the highest IP address is the rpf-peer.
          ! NA

   (v).   N is configured as the static RPF-peer for R.
          ! to be investigated with ip mroute command

 

comment:

Looking at the sentence at step (iii) regarding link state IGP it is clear that it is not possible to satisfy this condition.

The IGP next-hop for a remote RP R contained in a SA entry is NOT the loopback address of the directly connected MSDP peer.

So the suggestion to try an ip mroute with the loopback address of central site RT1 as next hop is much more appropriate than I thought before reading this.

So I would give a try to the following on Central site RT2:

 

ip mroute 10.2.255.241 255.255.255.255 10.1.255.241

ip mroute 10.3.255.241 255.255.255.255 10.1.255.241
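The rule selection quoted above can be modelled in a few lines of Python. This is only a sketch under stated assumptions, not how IOS implements the check: it covers rules (i), (iii) for the link-state case, and (v), and skips the BGP rules (ii) and (iv). Letting the lookup fall through from (iii) to (v) is a simplification. The next hop 10.111.254.9 used for 10.2.255.241 is an assumed value for illustration (the thread only shows it as RT2's next hop towards 10.1.255.241), and `static_rpf_peer` is a generic stand-in for rule (v) that does not correspond to any particular IOS command.

```python
from typing import Dict, Optional, Set


def peer_rpf_neighbor(rp: str,
                      msdp_peers: Set[str],
                      igp_next_hop: Dict[str, str],
                      static_rpf_peer: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Pick the peer-RPF neighbor for an SA originated by `rp` (no BGP in use)."""
    # Rule (i): the originating RP is itself an MSDP peer.
    if rp in msdp_peers:
        return rp
    # Rule (iii), link-state case: the IGP next hop towards the RP
    # only qualifies if it is also an MSDP peer.
    next_hop = igp_next_hop.get(rp)
    if next_hop in msdp_peers:
        return next_hop
    # Rule (v): a statically configured RPF peer for this RP (assumed fall-through).
    if static_rpf_peer and static_rpf_peer.get(rp) in msdp_peers:
        return static_rpf_peer[rp]
    return None


def accept_sa(rp: str, received_from: str,
              msdp_peers: Set[str],
              igp_next_hop: Dict[str, str],
              static_rpf_peer: Optional[Dict[str, str]] = None) -> bool:
    """An SA is accepted only if it arrives from the peer-RPF neighbor."""
    return peer_rpf_neighbor(rp, msdp_peers, igp_next_hop, static_rpf_peer) == received_from


# Central site RT2 as seen in the thread: one relevant MSDP peer (central RT1).
peers_rt2 = {"10.1.255.241"}
next_hops_rt2 = {"10.2.255.241": "10.111.254.9"}  # assumed next hop, not from the thread

print(accept_sa("10.1.255.241", "10.1.255.241", peers_rt2, next_hops_rt2))
print(accept_sa("10.2.255.241", "10.1.255.241", peers_rt2, next_hops_rt2))
print(accept_sa("10.2.255.241", "10.1.255.241", peers_rt2, next_hops_rt2,
                {"10.2.255.241": "10.1.255.241"}))
```

In this model the first call succeeds through rule (i), the second fails because the IGP next hop is not an MSDP peer, and the third succeeds only because a static RPF peer is supplied. That matches the pattern in the rejected-SA output, where every entry with a differing peer and RP is dropped.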

 

Hope to help

Giuseppe

 

 

Hi Giuseppe,

I edited the picture, hope that helps.

 

Unbenannt.PNG

 

So Site A RT1 which is nearest RP for Source1 sends SA to local neighbor RT2 and to Central Site RT1 which both works fine -> because in msdp message rp and peer are identical.

Then Central Site RT1 forwards SA to local RT2 and Site B RT1. Both routers do not accept the SA, that's what we see in our tests.

I also read the rfc section you mentioned, and I can follow your explanation.

I configured both mroutes on Central Site RT2 but this did not help, I get exactly the same debug error messages:

10.1.255.241: Peer RPF check failed for 10.2.255.241, used ? route's peer 0.0.0.0
10.1.255.241: Peer RPF check failed for 10.3.255.241, used ? route's peer 0.0.0.0

 

Best Regards

Thorsten

Just another idea: would it help to use different IP interfaces for the MSDP sessions that interconnect the sites? That is, use loopbacks for the local anycast MSDP peering, and SVIs or physical subinterfaces to connect to other sites?

On the other hand, this won't help for msdp communication between Central Site RT1 and Site B RT1 because next hop will be a local router and not remote msdp peer.

The RFC seems very complicated to me and does not consider modern high-availability enterprise networks. Maybe a full mesh with a mesh group is the only option to make MSDP work in our environment.
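For reference, the mesh-group behaviour described in RFC 3618 (an SA from a fellow mesh-group member is accepted without a peer-RPF check and is not forwarded to other members) can be sketched in Python as follows. The router names A1, A2, B2, C1, C2 and the `handle_sa` helper are hypothetical and only illustrate the rule; this is not vendor code.

```python
from typing import List, Set, Tuple


def handle_sa(peers: Set[str], mesh_group: Set[str],
              received_from: str, passes_peer_rpf: bool) -> Tuple[bool, List[str]]:
    """Return (accepted, forward_to) for one SA arriving at a mesh-group member.

    peers:       all MSDP peers of the receiving router
    mesh_group:  those peers that share the receiver's mesh group
    """
    if received_from in mesh_group:
        # From a fellow member: accept without peer-RPF check and
        # forward only to peers outside the mesh group.
        return True, sorted(peers - mesh_group - {received_from})
    if not passes_peer_rpf:
        # From outside the mesh group the normal peer-RPF check still applies.
        return False, []
    # Accepted from outside: forward to all members and all other peers.
    return True, sorted(peers - {received_from})


# Site B RT1 in a full mesh of six RPs; the SA arrives directly from A1.
all_other_rps = {"A1", "A2", "B2", "C1", "C2"}
print(handle_sa(all_other_rps, all_other_rps, "A1", passes_peer_rpf=False))
```

In a full mesh every RP hears the SA directly from the originator, so the intermediate hop that fails the peer-RPF check in the hierarchical design no longer exists.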

Hello Thorsten,

 

>> Maybe Full Mesh with mesh group is the only option to make msdp work in our environment

 

I am starting to think of this as the option that is easiest to manage.

 

Hope to help

Giuseppe

 

Hello Thorsten,

the new network diagram has much more details.

So you have tried to use ip mroute with no results.

 

Your network diagram shows only two sites, so I do not see the IP addresses that are listed in the error messages. You may have more than two sites, or you may have changed the IP addresses.

 

>> I configured both mroutes on Central Site RT2 but this did not help, I get exactly the same debug error messages:

10.111.255.241: Peer RPF check failed for 10.83.255.241, used ? route's peer 0.0.0.0
10.111.255.241: Peer RPF check failed for 10.91.255.241, used ? route's peer 0.0.0.0

If this fails, it is because the next hop (the loopback of RT1 at the central site) is not a PIM neighbor of RT2.

 

 

 

RFC 4611 in section 3.3 last block says:

 

If recent (but not currently widely deployed) router code is running
   that is fully compliant with the latest MSDP document, another
   option, to work around not having BGP to MSDP RPF peer, is to RPF
   using an IGP like OSPF, IS-IS, RIP, etc.  This new capability will
   allow for enterprise customers, who are not running BGP and who don't
   want to run mesh groups, to use their existing IGP to satisfy the
   MSDP peer-RPF rules.

RFC 4611 was published in 2006.

RFC 3618 was published in 2003.

I wonder what recent MSDP implementation they are referring to. I tried to look for more recent RFCs about MSDP but I couldn't find one.

 

You have written about a configuration option to be RFC3618 compliant. Can you try to enable it with the test ip mroute commands configured ?

>> ip msdp rpf rfc3618  + ip mroute

 

Hope to help

Giuseppe

 

Hello Giuseppe,

ip addresses were incorrect, sorry.

Correct messages on central site router 2 and site B router 1 are:

10.1.255.241: Peer RPF check failed for 10.2.255.241, used ? route's peer 0.0.0.0
10.1.255.241: Peer RPF check failed for 10.3.255.241, used ? route's peer 0.0.0.0

 Corrected that in my post above.

 

Activating rfc3618 compliance does not change the behaviour, already tested it. Only difference I see is that debug messages are less detailed then.

 

As I understand it, we can summarize the behaviour like this: if you use intermediate MSDP peers, MSDP SA messages are not accepted by the MSDP peers that communicate via these intermediate peers.

 

<source> --- <rp = msdp peer 1> --- <intermediate msdp peer 2> --- <msdp peer 3>
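The chain above can be walked through with a small Python sketch. It assumes only rules (i) and (iii) of RFC 3618 for a link-state IGP, and the node names (`peer1`, `peer2`, `peer3`, `firewall`) are placeholders rather than addresses from the thread. It suggests the outcome depends on what peer 3 uses as IGP next hop towards the RP.

```python
from typing import Dict, List, Tuple


def walk_chain(rp: str, chain: List[str],
               igp_next_hop: Dict[str, str]) -> List[Tuple[str, bool]]:
    """Follow one SA along a linear chain of MSDP peers.

    chain:        routers in order, starting with the originating RP
    igp_next_hop: {router: next hop that router uses towards the RP}
    Returns (router, accepted) for each receiver; stops at the first rejection.
    """
    results = []
    for sender, receiver in zip(chain, chain[1:]):
        # Rule (i): the sender is the RP itself.
        # Rule (iii), link-state case: the sender is the receiver's
        # IGP next hop towards the RP.
        accepted = sender == rp or igp_next_hop.get(receiver) == sender
        results.append((receiver, accepted))
        if not accepted:
            break  # a rejected SA is not forwarded any further
    return results


chain = ["peer1", "peer2", "peer3"]

# Next hop towards the RP is a non-MSDP device (as with the firewalls here).
print(walk_chain("peer1", chain, {"peer3": "firewall"}))

# Next hop towards the RP happens to be the intermediate MSDP peer.
print(walk_chain("peer1", chain, {"peer3": "peer2"}))
```

Under these assumptions peer 2 always accepts the SA, while peer 3 accepts it only in the second case, where its IGP next hop towards the RP is the intermediate MSDP peer itself.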

 

It's independent of using redundant routers.

I'll try to build a gns3 lab to verify this.

 

Best Regards

Thorsten
