
dmarekatc · ‎01-21-2016

Hello Forum,

I’m hoping the community can provide some assistance with configuration for loop prevention & route flapping with multiple connection points, given bi-directional redistribution of OSPF & BGP. Obviously the ultimate goal is a stable routing state that uses all the redundancy efficiently (the goal for all of us managing a network).

I’ve attached a sample diagram to help illustrate the topology for various ways remote sites (Rx) are connected back to two Head-End routers, though in reality there are hundreds of remote routers involved in the different connectivity designs (standalone, daisy-chained, etc.).

A couple of items to note / assume they cannot be changed:

The network is currently all OSPF-based, but is transitioning to MPLS (provider-based L3) for the Carrier circuits only, which requires the use of BGP.
The P2P links are much higher bandwidth than the Carrier circuits, and thus OSPF would seemingly need to be favored over BGP.
Each router with a Carrier circuit has its own unique ASN, and OSPF Process 100 will be referenced on the routers.

Where I’m at: (you’ll want to view the diagram for the device references)

Using HE1 (Head-End Router) and R5 (Remote Site) as an easy starting point, as it doesn’t need OSPF for the LAN side of R5. No other BGP instance is running until stated.

For the MPLS/BGP connectivity, both HE1 & R5 are CERs of course and peered to their respective PERs. On R5, I’m redistributing Connected into BGP rather than specifying networks. All good there, and I can take it a step further by redistributing BGP into OSPF and OSPF into BGP on HE1, as well as adding the redistribution to HE2 but keeping its MPLS connection shut down. I’ve also increased the admin distance for BGP to 115, and when redistributing BGP into OSPF the metric is increased. Again, no problem as you’d expect; the purpose of the distance/metric increases is to ensure OSPF is favored in all cases, and you can ping and connect from both PC1 & PC2.

As you may have guessed, a problem will arise if/when I enable the interface for the MPLS connection on HE2 as a loop would be introduced. So, first step was adding route-maps using tag attributes, but the caveat there is that you can’t use a set tag when redistributing an IGP into BGP.

You’ll get: % "<route-map name>" used as redistribute ospf into bgp route-map, set tag not supported

Alright, so I moved to a community attribute to get around that. Where I’m at now: when I do enable the HE2 interface for the MPLS circuit, there thankfully isn’t a router-crushing loop, but a route flap appears to occur within BGP. You get about 30 seconds of connectivity from one side (say PC1), then loss on that side while the other side (PC2) “works”, and then it repeats.

So with the diagram and all of that background, and I’ll provide config snippets below, I’m hoping someone with more extensive BGP & integration knowledge can provide some configuration help within the parameters I’ve outlined above. Thank you in advance.

I’ve left off the neighbor statements and some other lines to keep things brief.

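HE1:
ip community-list 115 permit 115
!
! Keep routes carrying community 115 (redistributed from OSPF on a head-end) out of OSPF
route-map bgp2ospf deny 10
 match community 115
!
! Everything else from BGP enters OSPF with tag 110
route-map bgp2ospf permit 20
 set tag 110
!
! Keep routes tagged 110 (redistributed from BGP on a head-end) out of BGP
route-map ospf2bgp deny 10
 match tag 110
!
! Everything else from OSPF enters BGP with community 115
route-map ospf2bgp permit 20
 set community 115
!
router ospf 100
 redistribute bgp 65001 metric 2000 subnets route-map bgp2ospf
!
router bgp 65001
 redistribute ospf 100 match internal external 1 external 2 route-map ospf2bgp
 distance 115 0.0.0.0 255.255.255.255
 no auto-summary
!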

HE2:
ip community-list 115 permit 115
!
route-map bgp2ospf deny 10
 match community 115
!
route-map bgp2ospf permit 20
 set tag 110
!
route-map ospf2bgp deny 10
 match tag 110
!
route-map ospf2bgp permit 20
 set community 115
!
router ospf 100
 redistribute bgp 65002 metric 2000 subnets route-map bgp2ospf
!
router bgp 65002
 redistribute ospf 100 match internal external 1 external 2 route-map ospf2bgp
 distance 115 0.0.0.0 255.255.255.255
 no auto-summary
!

R5:
router bgp 65005
 redistribute connected
 distance 115 0.0.0.0 255.255.255.255
 no auto-summary
!
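
One thing these snippets cannot show, because the neighbor statements were left off: on IOS a community set by a route-map is only carried to a BGP neighbor that has send-community configured, and the provider also has to pass it through the MPLS cloud before the other head-end can match on it. The lines below are a minimal sketch for HE1 under those assumptions; the neighbor address 192.0.2.1 and remote AS 64512 are placeholders, not values from the original post.

```
router bgp 65001
 ! placeholder PE neighbor; the real address and AS were not posted
 neighbor 192.0.2.1 remote-as 64512
 ! without this line, community 115 set by route-map ospf2bgp is not advertised
 neighbor 192.0.2.1 send-community
!
```

It may also be worth noting that list number 115 falls in the expanded community-list range (100-500), so "permit 115" is treated as a regular expression rather than an exact community value; a standard list (1-99) would match the value exactly.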

My hope and assumption is that once this single remote site and the redundant connections to/between the head-ends are sorted out, a similar route-map (or whatever the solution is) can be applied on all the other remote routers that also have both OSPF & BGP with similar bi-directional redistribution, such as R3 and R6.

All of that said, if someone has comments on approaches other than bi-directional redistribution, I’m certainly willing to listen. BGP is not my strong suit so tips & tricks are welcome. If there is need for any further clarity on config or what’s occurring, feel free to ask.

I also want to give a shout-out to Giuseppe and Harold Ritter, whose comments in two older posts on similar situations helped me out.

Those were respectively: Mutual Redistribution BGP<>OSPF and TAG problem in OSPF-BGP

Thanks again!

dmarekatc · ‎01-26-2016

Milan and Jon,

I want to say thank you for all your comments and considerations, not only with the very original topic issue per se, but also the various discussions on design considerations given certain parameters (Jon) and when those are removed (Milan).

Here's one more question, and perhaps it's more preference or doesn't really matter, but as it stands every site with a BGP connection (as depicted in the diagram: like R3, R5, R6, etc.) has its own ASN.

Would it have been "better" to have a single ASN that was used at all applicable sites, or just as fine with the unique?

One thing that comes to mind that would benefit by having them all the same ASN would be that it makes it easier for any consistent config change to be automated. E.g. If I wanted to change the admin distance on all routers via an automation tool/script, the same config works on all. Beyond that, I'm not sure which way would be favored.

So are there any other Pros / Cons that either of you (or anyone else) can think of?

In the actual environment, there may be a daisy-chain of routers coming off of HE1 as large as 20, with a few provider MPLS/BGP-based circuits scattered in strategically along the way - again, each one of those BGP instances having a unique ASN. Concerns?

milan.kulik · ‎01-27-2016

Hi Marek,

one Pro for using unique AS numbers per site is:

When troubleshooting, you can easily see which site originated the BGP prefix you are receiving.

Of course you can achieve the same by adding communities or SoO to the prefixes, but simply looking at the AS numbers is the easiest way.

The Cons are:

MPLS providers usually push the customers to use the same AS number on all sites - it's easier for them to manage. So you need an agreement with your provider.

And with hundreds of sites, it's not so easy to maintain the table of AS numbers used per site.

Best regards,

Milan

dmarekatc · ‎01-27-2016

Fair enough on all that - I couldn't think of anything very significant myself but good to know I'm not missing something that'd be detrimental given the design.

Each site with an MPLS circuit / BGP does in fact have its own ASN (no issue with the provider).

OK - So after mulling everything that's been discussed, this perhaps creative alternate solution came to mind. I'm curious if you, Jon, or someone new sees potential issues or a reason why this couldn't work.

So a recap of where things are at based on discussion:

On both Head-Ends (HE1 & HE2), BGP is redistributed into OSPF. And BGP is originating a default route out.

On Remote Sites (e.g. R6), BGP is redistributed into OSPF to pass along that default route so that in an outage condition (link between R7 & HE1 is down), both R7 & R6 can get back to the Head-Ends. Additionally, R7 networks are listed in BGP on R6 so the Head-Ends know how to get to them during said outage.

All of this is done at a higher administrative distance (and metric if/as applicable) to favor OSPF during normal operation, which includes a default route as well because the remote sites are totally NSSA.

The hesitation here for me is the need to ensure new networks get applied to any/all routers like R6, when a new site gets added to the daisy-chain (let's call it R8, inserted between R6 & R7); but this seems less prone to possible issue than doing mutual redistribution on the remote router(s) - e.g. R6.

Now then... what if the Head-End portion remains as described, but on the remote router(s) like R6, even though they are totally NSSA sites, we skip putting all the potential networks in BGP and don't redistribute BGP into OSPF? Rather, could you add a default (0.0.0.0/0) route into OSPF on R6 (by redistributing statics into OSPF or by adding a default originate - can't say I've ever tried the latter on an NSSA type area) and then redistribute OSPF into BGP instead, at a higher metric, which basically keeps the "flow"/direction the same as the redistribution on HE1 & HE2?

The thought here is that during the described outage condition, R6 still knows about R7 networks through OSPF and by redistributing OSPF into BGP there, the Head-Ends will now know of all the networks out on that daisy-chain so traffic can get to them even after changes to the environment (no config updates on R6 needed); and with the default zero route on R6 into OSPF, R7 would know how to get back. And because this would all be done at a higher metric, under normal network conditions, everything should flow over the OSPF links as desired. In theory, would seem very stable & simple (unlike mutual redistribution) and less administrative burden (no need to deal with a large number of networks statements in BGP that can change).
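
The one-way idea described above could be sketched roughly as follows on R6. This is only an illustration: AS 65006 is a placeholder (R6's real ASN is not given in the thread), and the metric value of 2000 is borrowed from the head-end snippets rather than taken from any R6 config.

```
! R6 sketch: OSPF into BGP only, nothing from BGP back into OSPF
router bgp 65006
 ! metric here sets the MED on the redistributed prefixes
 redistribute ospf 100 metric 2000 match internal
 distance 115 0.0.0.0 255.255.255.255
 no auto-summary
!
```

Note that redistribute static on its own does not inject 0.0.0.0/0 into OSPF on IOS; a default-information originate form would be needed for the default toward R7.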

Could all that be too good to be true?... Thoughts / comments??

Cheers,

-Marek

Jon Marshall · ‎01-27-2016

Marek

This is getting quite complicated :)

I have a few concerns about the design so far and they are both related to the areas you are using.

If you use NSSA then the HE devices as ABRs should inject a default route into OSPF for each area which is what you want.

That route would be inter area so it would take precedence over any BGP redistributed or locally generated route on R6, for example.

The problem is your HE devices are doing redistribution from BGP to OSPF, which also makes them ASBRs, so for any area they are directly connected to they will inject type 7 LSAs into that area for routes from every other area.

If you then redistributed OSPF into BGP on R6, for example, R6 would then be advertising all areas' subnets via BGP.

I believe what you need is this command on your HE devices -

"area x nssa no-redistribution" <-- where x is the area number

which tells your HE devices not to advertise type 7 LSAs into the areas because you don't need them with a default route being advertised by each HE device into each area.

That command is used when your ABR is also an ASBR which is the case with the HE devices in your setup.

That however raises another issue.

R6, whether it redistributes the BGP default route or generates its own, would then also become an ASBR and generate a type 7 LSA for the default route, and because the HE devices are ABRs one of them should translate this to a type 5 and flood it throughout the OSPF domain.

Again probably not what you want.

To be completely honest I cannot really tell you what will happen without testing it and my access to an online lab I use is not available at the moment so the above are just some concerns that perhaps you or Milan (or anyone else) can comment on.

I could well be mistaken about the above but I can't see it at the moment.

Jon

dmarekatc · ‎01-27-2016

...Well, what fun would it be to discuss if it wasn't complicated!?! HA HA ;)

Just to be clear - When you say there's a few concerns, are you referring to the idea I just described or the previous way discussed (BGP into OSPF & network statements at the applicable Remotes)?

Here's some additional background on the current environment, that perhaps addresses things for you or at least gives you a better lay of the land.

The Head-Ends are indeed ABRs / ASBRs and inject a default route, along with static routes that are redistributed (as previously mentioned).

For standalone sites today, like R4, the nssa area statement on the Head-Ends looks like this: area 1 nssa no-summary
For daisy-chained sites, like R6-R7-HE1, the nssa area statement on the Head-Ends looks like this (to address the very thing you mention): area 3 nssa no-redistribution no-summary
Remote Sites of course would simply be: area X nssa
The MPLS/BGP circuits are not active/production right now; instead the sites are on F/R with OSPF running over the PVCs (one PVC to each Head-End). E.g. on R6 there is a connection to HE1 & HE2 via the PVCs, in addition to the path via R7.
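
Put in context, the area statements listed above would sit under the OSPF process roughly like this. It is a sketch assembled from the lines quoted in this post; only the grouping under router ospf 100 and the choice of area 3 for the remote example are assumed.

```
! Head-End (ABR/ASBR)
router ospf 100
 ! standalone site such as R4: default only, no inter-area summaries
 area 1 nssa no-summary
 ! daisy-chained site such as R6-R7: also keep redistributed routes out of the area
 area 3 nssa no-redistribution no-summary
!
! Remote router inside area 3 (e.g. R6 or R7)
router ospf 100
 area 3 nssa
!
```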

So hopefully all that helps/answers more than it adds new questions.

I would agree, a mock-up in a lab is likely needed to get a better idea.

But back to my latest proposal, where on an R6 remote, OSPF redistributes into BGP. If that default static route was redistributed in OSPF (for R7's use during a link failure), would you not think that:

1. It would propagate to R7 & beyond to HE1, but because it would have a worse metric, it'd be ignored.
2. While it would also be redistributed into BGP, any traffic would still get back to the Head-End, but again the route itself would be ignored/not propagated at the Head-Ends because it'd have a worse metric.

If all of that still proves to be an issue, I suppose there could still be simplicity in redistributing OSPF into BGP at R6 (assuming #2 above is true), but rather than have R6 push a default route out, use a weighted default static on R7 pointed to R6 that would only come into play if a link is down / the more favorable OSPF route from HE1 is lost.

Thoughts there?

Jon Marshall · ‎01-27-2016

When you say remote sites, do these still have connections to the HE devices or not?

If they do I would think you want that command everywhere ie. you only need a default route in every area, nothing more.

I may have misunderstood your point :)

It would propagate to the HE device, but I'm not sure what you mean by worse metric - worse than what?

Within an area eg. R6's, the default from the HE device due to the NSSA would always be preferred because it is inter area and R6's default route would be external type 7 LSA so no issues there, it is just what exactly the HE devices would do with that type 7 from your MPLS routers that I would like to test.

If at all possible I would prefer to have R6 advertise a default route rather than using weighted statics simply because if you add a new site in the chain you then have to remember to add that route as well and we are trying to make it as simple as possible in terms of adding new sites etc.

I guess you have a number of alternatives to test with ie. using network statements or just doing one way redistribution from OSPF into BGP on the MPLS routers both of which would work as long as the only routes in the NSSA were those local to the area.

If I get access to the lab which I should do, hopefully soon, I can test out the default route issue unless you know for sure it won't be an issue ?

Jon

dmarekatc · ‎01-27-2016

Yes, all the Remotes are connected back to the Head-Ends "directly" (via an MPLS circuit) or indirectly (through other routers in the daisy-chain).

Worse metric in the sense that the Head-Ends are already sending out a default route - so R6 gets one via HE1 to R7; but if it too is sending a default out due to redistribution with an increased metric, then any other router (HE1, HE2, R7) would see it as "worse". I.e. it would be ignored by the Head-Ends always, and R7 would ignore it because the default from HE1 would be favored unless it wasn't available due to a link loss.

Ha - I found myself responding above as I was reading what you put. So now that I've read further, I think we're on the same page; which puts me down to the concern with the use of statics - I couldn't agree more with you on that... It's just merely a potential option (just not a very ideal one).

I do not know if the default route back across BGP from R6 would be an issue.

And again, I appreciate the discussions. As I go through this, it may be a case where use of an overlay design could make things easier to accomplish from an end-state perspective (avoid the redistribution altogether), but it'd be nice to figure out a solid/stable, workable design under these premises.

Thanks,

-Marek

Jon Marshall · ‎01-27-2016

Sorry, didn't answer your original question.

The concern about type 7 LSAs for all other areas was to do with your latest proposal of redistributing OSPF into BGP on R6, for example, but you have mainly answered that.

The concern about the default route and how the HE devices will treat the type 7 LSA received from R6 was to do with more the general design and the idea of just using a default route from BGP to OSPF.

Happy to admit the second concern may not be a concern at all :)

Jon

milan.kulik · ‎01-28-2016

Hi Marek,

I'd clarify:

"On Remote Sites (i.e. R6), only the default route (advertised to BGP from HE1/HE2) is redistributed from BGP into OSPF. So that in an outage condition (link between R7 & HE1 is down), both R7 & R6 can get back to the Head-Ends. Additionally, R7 networks are listed in BGP on R6 so Head-Ends know how to get to it during said outage."

In that case you just need to ensure the default route received from OSPF is always preferred on the remote routers like R6 and R7.

I'll leave this issue to your OSPF (total NSSA, etc.) discussion with Jon as it's quite deep and difficult to follow already.

Best regards,

Milan

dmarekatc · ‎01-28-2016

Hi Milan,

Thanks for the comment.

Indeed, the default route from OSPF would need to be preferred - and I believe that is addressed by the increased admin distance ("distance 115 0.0.0.0 255.255.255.255") being used, along with any increase in the metric when redistributing.
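
For reference, the preference being relied on here is administrative distance: OSPF defaults to 110 and eBGP to 20, so raising BGP to 115 lets the OSPF copy of a prefix win. A minimal sketch of an alternative form of the same knob, assuming only eBGP-learned routes need raising (AS 65006 is a placeholder for a remote router's ASN):

```
router bgp 65006
 ! external, internal, local distances; IOS defaults are 20 200 200
 distance bgp 115 200 200
!
```

Distance only breaks ties between routes to the exact same prefix. A more specific prefix learned via BGP would still be used ahead of the OSPF default, whatever the distance, because the longest match is chosen first.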

Appreciate all your input.

Best regards,

-Marek

milan.kulik · ‎01-26-2016

Hi Marek,

yes, you would need to redistribute the default route from BGP to OSPF on R6.

And you would need to advertise the R7 subnets to BGP from R6.

It would make good sense to advertise only them, not to redistribute all prefixes received from OSPF.

And of course, when redistributing BGP prefixes to OSPF on HE1/2 routers, use such a big metric value for the redistributed prefixes to be sure the same subnet received from native OSPF will win over the same subnet redistributed from BGP.
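
As a rough sketch of that suggestion on R6: the R7 subnets are advertised with network statements instead of redistributing OSPF, and the default is passed into the NSSA. Everything specific here is assumed for illustration: AS 65006, area 3 for the R6/R7 chain, and 10.7.0.0/24 as an R7 LAN. Because IOS does not take 0.0.0.0/0 into OSPF through a plain redistribute, the sketch uses the NSSA default-originate form, which on a non-ABR router depends on a default route being present in the routing table; whether it behaves as wanted in each failure state would need testing in a lab.

```
! R6 sketch
router bgp 65006
 ! advertise only the downstream R7 subnet(s); each must exist in the routing table
 network 10.7.0.0 mask 255.255.255.0
 distance 115 0.0.0.0 255.255.255.255
!
router ospf 100
 ! type 7 default; the inter-area default from the head-ends is preferred while it exists
 area 3 nssa default-information-originate
!
```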

Best regards,

Milan

Bi-directional Redistribution of OSPF and BGP with Multiple Connections Issue