dmarekatc · ‎01-21-2016

Hello Forum,

I’m hoping the community can provide some assistance with configuring loop prevention and avoiding route flapping across multiple connection points, given bi-directional redistribution of OSPF & BGP. Obviously the ultimate goal is a stable routing state that keeps all the redundancy and uses it efficiently (the goal for all of us managing a network).

I’ve attached a sample diagram to help illustrate the topology for various ways remote sites (Rx) are connected back to two Head-End routers, though in reality there are hundreds of remote routers involved in the different connectivity designs (standalone, daisy-chained, etc.).

A couple of items to note / assume they cannot be changed:

The network is currently all OSPF-based, but we are transitioning to MPLS (provider-based L3) for the Carrier circuits only, and that requires the use of BGP.
The P2P links are much higher bandwidth than the Carrier circuits, and thus OSPF would seemingly need to be favored over BGP.
Each router with a Carrier circuit has its own unique ASN, and OSPF Process 100 will be referenced on the routers.

Where I’m at: (you’ll want to view the diagram for the device references)

Using HE1 (Head-End Router) and R5 (Remote Site) as an easy starting point, as it doesn’t need OSPF for the LAN side of R5. No other BGP instance is running until stated.

For the MPLS/BGP connectivity, both HE1 & R5 are CERs of course and peered to their respective PERs. On R5, I’m redistributing Connected into BGP rather than specifying networks. All good there, and I can take it a step further by redistributing BGP into OSPF and OSPF into BGP on HE1, as well as adding the redistribution to HE2 but keeping its MPLS connection shut down. I’ve also increased the admin distance for BGP to 115, and when redistributing BGP into OSPF the metric is being increased. Again, no problem as you’d expect, and of course the purpose of the distance/metric increases is to ensure OSPF is favored in all cases; you can ping and connect from both PC1 & PC2.

As you may have guessed, a problem will arise if/when I enable the interface for the MPLS connection on HE2 as a loop would be introduced. So, first step was adding route-maps using tag attributes, but the caveat there is that you can’t use a set tag when redistributing an IGP into BGP.

You’ll get: % "<route-map name>" used as redistribute ospf into bgp route-map, set tag not supported

Alright, so I moved to a community attribute to get around that. Where I’m at now: when I do enable the HE2 interface for the MPLS circuit, thankfully there isn’t a router-crushing loop, but unfortunately a route flap appears to occur within BGP. You get about 30 seconds of connectivity from one side (say PC1), then loss on that side while the other side (PC2) “works”, and then it repeats.

So with the diagram and all of that background, and I’ll provide config snippets below, I’m hoping someone with more extensive BGP & integration knowledge can provide some configuration help within the parameters I’ve outlined above. Thank you in advance.

I’ve left off the neighbor statements and some other lines to keep things brief.

HE1:
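ip community-list 115 permit 115
!
! Do not redistribute into OSPF any BGP route carrying community 115
route-map bgp2ospf deny 10
 match community 115
!
! Tag everything else 110 as it enters OSPF
route-map bgp2ospf permit 20
 set tag 110
!
! Do not send back into BGP any OSPF route tagged 110
route-map ospf2bgp deny 10
 match tag 110
!
! Mark the remaining OSPF routes with community 115
route-map ospf2bgp permit 20
 set community 115
!
router ospf 100
 redistribute bgp 65001 metric 2000 subnets route-map bgp2ospf
!
router bgp 65001
 redistribute ospf 100 match internal external 1 external 2 route-map ospf2bgp
 ! BGP distance 115 for all neighbors, above OSPF's 110
 distance 115 0.0.0.0 255.255.255.255
 no auto-summary
!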

HE2:
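ip community-list 115 permit 115
!
! Do not redistribute into OSPF any BGP route carrying community 115
route-map bgp2ospf deny 10
 match community 115
!
! Tag everything else 110 as it enters OSPF
route-map bgp2ospf permit 20
 set tag 110
!
! Do not send back into BGP any OSPF route tagged 110
route-map ospf2bgp deny 10
 match tag 110
!
! Mark the remaining OSPF routes with community 115
route-map ospf2bgp permit 20
 set community 115
!
router ospf 100
 redistribute bgp 65002 metric 2000 subnets route-map bgp2ospf
!
router bgp 65002
 redistribute ospf 100 match internal external 1 external 2 route-map ospf2bgp
 ! BGP distance 115 for all neighbors, above OSPF's 110
 distance 115 0.0.0.0 255.255.255.255
 no auto-summary
!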

R5:
router bgp 65005
 redistribute connected
 distance 115 0.0.0.0 255.255.255.255
 no auto-summary
!

My hope and assumption is that once this single remote site and the redundant connections to/between the head-ends are sorted out, a similar route-map (or whatever the solution is) can be applied on all the other remote routers that also have both OSPF & BGP with similar bi-directional redistribution, such as R3 and R6.

All of that said, if someone has comments on alternate approach methods than bi-directional redistribution or the like, I’m certainly willing to listen – BGP is not my strong suit so tips & tricks are welcome. If there is need for any further clarity on config or what’s occurring, feel free to ask.

I also want to give a shout-out to Giuseppe and Harold Ritter, whose comments in two different older posts on similar situations helped me out.

Those were respectively: Mutual Redistribution BGP<>OSPF and TAG problem in OSPF-BGP

Thanks again!

Jon Marshall · ‎01-21-2016

There is nothing wrong with your configuration so it's difficult to say why you are seeing the symptoms you are.

What troubleshooting have you done in terms of debugging etc ?
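
As a starting point for that kind of troubleshooting, a few standard IOS commands could be run on HE1 and HE2 while the flap is happening. This is only a sketch: 10.5.0.0 is a placeholder for one of R5's connected networks, and x.x.x.x is the PER neighbor address, elided as in the original post.

```
! Which protocol currently owns the route, and how old is it?
show ip route 10.5.0.0
! Which BGP paths exist for the prefix, and which one is best?
show ip bgp 10.5.0.0
! Is the prefix also present in OSPF as an external LSA?
show ip ospf database external 10.5.0.0
! What is this head-end advertising to the carrier?
show ip bgp neighbors x.x.x.x advertised-routes
! Log routing-table adds/deletes as they happen (use with care in production)
debug ip routing
```

If the owner of the route alternates between OSPF and BGP at roughly the flap interval, that would point at the redistribution filters rather than at the BGP session itself.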

As mentioned by Giuseppe in one of the linked threads using network statements under BGP can make your life a lot simpler and I have used this method before.

So using R6 as an example if you advertise the internal networks with BGP network statements you then only have to worry about any received BGP routes being redistributed into OSPF which makes it a lot easier.
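
A minimal sketch of what this could look like on R6, under stated assumptions: AS 65006 and the 10.6.x.0/24 and 10.7.x.0/24 prefixes are invented for illustration and do not come from the thread. The distance and metric values simply mirror the ones already used on the head-ends.

```
router bgp 65006
 ! Advertise internal networks explicitly instead of redistributing OSPF.
 ! Each prefix must exist in the routing table with exactly this mask.
 network 10.6.0.0 mask 255.255.255.0
 network 10.7.0.0 mask 255.255.255.0
 distance 115 0.0.0.0 255.255.255.255
 no auto-summary
!
router ospf 100
 ! One-way redistribution only: BGP into OSPF
 redistribute bgp 65006 metric 2000 subnets
```

Because nothing flows from OSPF back into BGP on this router, there is no feedback path to filter with tags or communities here.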

The HE1 and HE2 routers would be slightly different because presumably they are meant to back each other up so you would need to advertise out the same networks from both sites but with BGP you can still influence which path is used to get to a specific network when both links are up.

Of course it depends on your IP addressing and whether at each site you can use summary addressing to make it simpler.

That is merely an alternative suggestion though and what you are currently trying to do should be possible if you are filtering correctly and like I say your configuration looks okay from what I can see so you need to try and work out why the BGP link is flapping.

Jon

dmarekatc · ‎01-21-2016

Thanks for looking the configuration over Jon.

One thing I did that was missing from the original post: on both HE1 & HE2, I added this to the BGP config: neighbor x.x.x.x send-community

I'm not sure it really did anything for me however, and don't know if it's better to leave it or remove it again?
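
For context, IOS does not send the community attribute to a neighbor unless send-community is configured for it, so without this line the community 115 set by ospf2bgp would never leave the router. A sketch for HE1 follows; x.x.x.x is the elided PER address, and 10.1.0.0 is a placeholder prefix. Whether the carrier passes communities through its MPLS network unchanged is an assumption that would need to be confirmed with the provider.

```
router bgp 65001
 neighbor x.x.x.x send-community
!
! On HE2, check whether a prefix redistributed from OSPF on HE1 arrives
! with the community attached; look for a "Community: 115" line
show ip bgp 10.1.0.0
```

If the community line is missing on the receiving head-end, the deny 10 clause of bgp2ospf has nothing to match.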

Regardless, I tried things again (enabling both HE1 & 2 interfaces) and still had the flapping. And actually, it may have just been from where I was at yesterday when I tried this (the PC2 side) where it looked like things were alternating, but from PC1 today, the flap causes a drop all around. e.g. I see it on PC1 & PC2 at approximately the same interval & duration.

I haven't done any debug commands as of yet, but I'm finding there may be a couple of other routers at the Head-End locations that may be adding to the complications, so that doesn't help the matter and is something I wasn't previously aware of. And of course, what I've presented in the forum is a mock-up so I'm sure there's some variance as a result. All of which doesn't help anyone here trying to lend some advice.

But onto your comment on R6 and use of network statements. I'm not sure if that would help or not, so let me say this and see if it changes your thoughts: The IP networks are not all contiguous in general, but could probably be summarized in about 4 class C's per site, and not every site necessarily runs BGP at all (and there could be any number of sites to add in a given production daisy-chain, along with potentially having a few sites like R6 along that chain with P2P connections and carrier circuits to contend with - it's a growing environment). Additionally, R6 would need to support R7 (and any others in the chain) in the event that the link between R7 & HE1 went down, i.e. the carrier circuit at R6 isn't just for R6, it's for R6 and any other sites that need to use it in the event of a link down somewhere along the chain. In other words, those sites on one side of the failure would still talk back to the closest head-end via OSPF, while the rest would need to get back via OSPF to R6 & then BGP to the HEs (and back into OSPF there). I should also mention that while this smaller network / mock-up isn't configured like this, the larger network I'll be dealing with is using NSSA for all the OSPF sites, if that matters. Now what would your position be?

A couple of additional questions for you / the forum:

There are also some static routes on the Head-Ends that get redistributed into OSPF there. The intent there is to "pull" select networks to one side versus the other, regardless of the path they take to get back, and has worked very well.

Of course this means that statics are getting redistributed into OSPF and then OSPF into BGP.

Would it be better to just redistribute those statics directly into BGP instead as well (even though statics into OSPF will remain)?

And lastly for now, is there a situation where "bgp redistribute-internal" could play a role here in any capacity to help the situation?

Regards.

Jon Marshall · ‎01-22-2016

I may not have fully understood the NSSA part ie. you can redistribute BGP into OSPF but the external routes will not then pass into different areas because you cannot do that.

Which looking at your diagram may not matter, just wanted to clarify ?

Jon

dmarekatc · ‎01-22-2016

I'll keep pondering the network statements route, but I suspect that when taking into account the various production details & changing environment, not having to go back to each router similar to R6 in how its connections are & update it with any new network statements for new sites that exist along the OSPF paths is more ideal (too easy to forget a site was added & that X number of other routers need to be updated) - we could be talking the need to add over a hundred network statement lines to each R6-type router.

My bad, perhaps I should have phrased it as: the OSPF sites are NSSA totally stubby areas (a remote site or daisy-chain of sites being in its/their own area, so area 123 nssa on the remote, and area 123 nssa no-summary on the two Head-Ends). So there are O routes on all those remote sites (such as R2 and R7), along with a default route being pushed (O*IA 0.0.0.0/0). And again, with a loss of link between two sites (let's say between R1 & R2), I still need to get R2 traffic back to the Head-Ends. Also, we're thinking about potentially removing NSSA altogether. Not sure if that helps answer your last question, but hopefully it provides some background on the why. What could be debated is the best way to simply get a default route over BGP & into OSPF on the R6-type sites.
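
Spelled out as configuration, the area setup described above would look roughly like this. Both area statements come straight from the post; area 123 is its example area number, and the head-ends are assumed to be the ABRs for that area.

```
! Remote site router(s) inside the area
router ospf 100
 area 123 nssa
!
! HE1 and HE2 (the ABRs): no-summary suppresses inter-area routes
! and injects a default route into the area instead
router ospf 100
 area 123 nssa no-summary
```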

Anyway, we may be digressing from the topic at hand. So to come back to it, it would seem my redistribution and route-map loop prevention config seems correct, and I more so now just need to figure out why & where I'm having the flapping still.

Also now wondering: with a default route being pushed out in OSPF, how best to duplicate that for BGP to the remote sites and then back into OSPF at that site & to any other sites attached? Think of the link between R7 & HE1 going down, and now R7 & R6 need to get back to HE1 and/or HE2 over the carrier circuit/BGP.

Thank you for your input, and I'm certainly open to your or anyone else's thoughts on the matter.

Jon Marshall · ‎01-22-2016

Yes, that is the main issue with using network statements, i.e. you need to remember to add to the site that has the MPLS connection or else it is not known.

The big advantage though is that in the worst-case scenario you just can't get to that subnet, and it would be easy to troubleshoot.

Whereas with mutual redistribution a mistake could lead to far more serious consequences and would be harder to troubleshoot.

But, as you say, let's assume mutual redistribution for the moment.

You don't need a default route in BGP because you are advertising the specific subnets, so R6 will know about them and so will R7, because R6 is redistributing into OSPF, which it can do even if it is in an NSSA.

I am beginning to think I am not understanding all of your design :)

Jon

dmarekatc · ‎01-22-2016

So, perhaps mutual redistribution can be limited to the Head-Ends (HE1 & HE2), and for any applicable remote site (R6, R3 or similar) network statements could be used... might be interesting to consider.

If that was done ( network statements in the remote router(s) ), I suppose I'd still need to redistribute there, but just not bi-directionally - BGP at the Head-Ends would know of the networks courtesy of the remote (R6), but I'd still need to redistribute on R6 BGP into OSPF so that in the event a link is down at R7 & HE1, R7 knows it can come to R6 to get back. Sound right?

LOL - Yeah, we kind of went off topic a bit into the alternate design considerations, which without you knowing the actual nuances of the network, makes things more challenging to discuss. Part of why I was trying to limit things to a smaller mock-up & scope, but I certainly appreciate the points of consideration you've made.

Jon Marshall · ‎01-22-2016

Edited - just seen bit about maybe not using NSSA so don't want to confuse the issue.

Jon

dmarekatc · ‎01-25-2016

OK, thanks Jon - I'm glad to hear my thought process isn't completely off on the concept of network statements & remote side redistribution need in that situation.

As for the areas & the current totally stubby NSSA setup, recall earlier that we redistribute static routes on HE1 & HE2 into OSPF (to pull particular traffic to one side versus the other), so in addition to the default route out on the remotes, there are also those routes (O N2 routes).

Not sure if that further clears up things to your question, in addition to potentially removing NSSA. Or maybe your "edit" is taking that all into account.

New Question: If redistribution on the applicable remotes, like R6 (of BGP into OSPF) is done, would a route-map(s) be needed there at all? ...Or wouldn't OSPF handle it by simply not putting those "duplicate" routes via BGP into the routing table, because OSPF would already have a better route via R7 & ultimately HE1.

Regards,

-Marek

Jon Marshall · ‎01-22-2016

Your second paragraph sounds exactly right.

One thing I am still not clear on is for the area R6 and R7 are in you must also be sending routes via OSPF into that area.

And you need these routes to be preferred over the OSPF externals from BGP.

So exactly what type of areas are these going to be ?

Jon

Jon Marshall · ‎01-22-2016

I definitely do not follow.

If the sites are NSSA totally stub then only a default route is being pushed into these areas.

But if you do mutual redistribution that means R6, for example, receives specific routes via BGP and redistributes them into the area.

Which means R6, R7 etc. will always choose the OSPF externals redistributed from BGP and go via MPLS to any remote sites.

The same applies to all sites ie. they will always choose the more specific routes which will mean traffic always ends up at the BGP MPLS router and is sent across the MPLS network which is not what you want.

Jon

milan.kulik · ‎01-23-2016

Hi,

do you really need to run a mutual redistribution here?

It might be quite tricky sometimes, as it might even depend on WHEN the prefix was redistributed.

An example:

You are receiving R5 prefixes on HE1 via BGP under normal conditions.

And redistributing them to OSPF.

But HE2 is also receiving them via BGP and redistributing them to OSPF.

So HE1 can receive the same R5 subnet via OSPF and due to a better AD start using it instead of the original BGP route.

The same is valid for HE2, so a nice routing loop can occur.

Another example:

R6 BGP line fails for a while.

HE2 will receive the R6 prefixes via OSPF from HE1.

And redistribute them to BGP.

But when the BGP line gets fixed on R6, HE2 will not choose the BGP prefixes received from R6 as the best ones within the BGP routing process, as the same prefixes redistributed from OSPF a while ago will have a better weight attribute (redistributed by HE2 itself).

So quite complicated to tune.
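
For reference, on IOS a route that the router originates itself (through network, redistribute or aggregate-address) gets a default weight of 32768, while a route learned from a neighbor gets weight 0, and weight is the first attribute compared in best-path selection. One way to check this on HE2 is sketched below; 10.6.0.0 is a placeholder for one of the R6 prefixes, not an address from the thread.

```
! Lists every path HE2 holds for the prefix. A locally redistributed
! path shows "weight 32768" and "sourced"; a path learned from the PER
! shows no weight, and the winner is marked "best".
show ip bgp 10.6.0.0
```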

Why don't you simply advertise default route to BGP from HE1 and HE2 without redistributing the OSPF prefixes at all? (And only redistribute BGP prefixes to OSPF, or even advertise also the default route only?)

Together with a better AD for OSPF it would make the OSPF prefixes always preferred.

And in a case of OSPF line failure on any site the BGP connection would be used as a backup?
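
A rough sketch of this idea on HE1, assuming the existing AS numbers and distance setting stay as posted. x.x.x.x is the PER neighbor address, elided as in the original config; HE2 would mirror this with AS 65002.

```
router bgp 65001
 ! No "redistribute ospf 100" here: OSPF prefixes stay out of BGP
 ! Send only a default route (0.0.0.0/0) to the carrier neighbor
 neighbor x.x.x.x default-originate
 distance 115 0.0.0.0 255.255.255.255
 no auto-summary
!
router ospf 100
 ! One-way redistribution: remote-site prefixes learned via BGP into OSPF
 redistribute bgp 65001 metric 2000 subnets
```

With redistribution running in one direction only, the tag and community filters are no longer needed on the head-ends to break the loop.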

BR,

Milan

dmarekatc · ‎01-25-2016

Hi Milan,

At first I was going to say the answer has to be bi-directional redistribution on the Head-Ends (and maybe one-way on the applicable remotes), but perhaps part of that is because the conversation Jon & I have been having was based on the premise that it's being used.

So now I say, tell me more, since we're talking pure design options... Let me make sure I'm tracking with you & what you're saying at the end:

I presume you're suggesting adding a default-originate in BGP on the Head-Ends (HE1 & HE2).
Redistribute BGP into OSPF on both Head-Ends (HE1 & HE2).
Continue to have a higher admin distance for BGP so that it's only ever used when OSPF is NOT available, e.g. the link between HE1 & R7 is down.

Do I have that correct?

If so...

I suppose that would still allow those static routes that are redistributed into OSPF on the Head-Ends to continue to do what they've been doing (pulling traffic to a particular side), just that it may not be as efficient, however it may be less complicated in design. I say less efficient because right now with just OSPF on remotes they know which way they should send the traffic, but if only a default route is given from both Head-Ends, R6 may put the default route of HE1 in its table, but the final destination of the traffic may be off of HE2 and it's only when it comes into HE1 that it'd know. Make sense?

What about the need for redistribution on the remote end? It would seem like there would again be the need to redistribute BGP into OSPF (Jon & I talk about this in that chain), agreed? E.g. the link between R7 & HE1 goes down, and R7 needs to know it can go to R6 and then out. Will that default route sent out in BGP redistribute into OSPF on R6 to accomplish that?

I appreciate your comments. Any configuration examples of what you're proposing would be welcomed as well.

Regards.

-Marek

dmarekatc · ‎01-25-2016

I've been pondering this more...

For a standalone site like R5, originating a default from HE1 & HE2, along with a redistribute connected on the remote R5 side, would solve the problem in a simple fashion for those types of sites.

In the more complex situations, like the R6-R7-HE1 daisy-chain, I would still need to, on R6, redistribute BGP into OSPF so the default route is there (just at a higher AD than the one coming from OSPF); so that when the link between R7 & HE1 is down, traffic from R7 knows it can get out via R6. Would you not agree?

And to take it a step further, in that same scenario, the Head-Ends need to know how to get out to R7 (and R6). So am I back to having to put all the network statements in for R7 (and any other sites along the path if there were others), on the R6 router? Or do bi-directional redistribution on the remote side(s) now??

Jon Marshall · ‎01-26-2016

I think Milan's suggestion is the way to go.

To answer your specific questions -

yes you would redistribute the default route from BGP into OSPF and yes it could mean that traffic goes to the "wrong" HE device but it is only an extra hop.

If you already have a default route in OSPF that should not matter, because if you are using stub areas then the default will be an inter-area route whereas the redistributed route from BGP will be an OSPF external, so the inter-area route will be preferred.

Yes, you still need R6, for example, to advertise its subnets and any other sites' subnets that rely on it for MPLS connectivity.

Again, if there are not a lot of subnets I would be inclined to use network statements under BGP because it gives you more control.

If you can advertise a summary address per site that includes unused subnets for future use then you also wouldn't need to remember to add a network statement for BGP but to be honest if you are adding a new subnet to a site you can just make it part of the process.
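
One common way to advertise such a summary with a network statement is sketched below. The 10.6.0.0/22 block and AS 65006 are invented for illustration, and it assumes the site's subnets actually fall inside one contiguous block.

```
! A network statement only advertises a prefix that exists in the
! routing table with the exact mask, so anchor the summary with a
! static route to Null0
ip route 10.6.0.0 255.255.252.0 Null0
!
router bgp 65006
 network 10.6.0.0 mask 255.255.252.0
```

The more specific OSPF routes still win locally by longest match. The trade-off is that the summary stays advertised even when the subnets behind it are down.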

Jon

Bi-directional Redistribution of OSPF and BGP with Multiple Connections Issue