Solved: BGP failover to IPSEC design - Configuration problems

Marc Bouchard · ‎06-01-2015

I am working on a design in GNS3 to fail over our MPLS links to IPSEC. At the moment, all I'm trying to do is set up BGP properly, and I'm having a heck of a hard time figuring this out (very rusty here...).

Based on the diagram below:

On the left is our datacenter. We have 2 MPLS links and an IPSEC router, as well as our core switches. Core switches run OSPF only. The others run both OSPF and BGP. The branch office on the right has a similar setup with a single MPLS link.

Objectives:

DataCenter: Load balance MPLS-A and MPLS-B, and fail over to the IPSEC tunnel if both are down (this will be a DMVPN config with all branch offices).

BranchOffice: Fail over to IPSEC if MPLS goes down.

- Fastest convergence possible

- Nice to have: failover if MPLS is up but connection has quality issues (packet loss for example).

- Nice to have: failback once all issues are resolved (but must avoid flapping)

BGP is using a single AS (iBGP only)

I created a generic configuration for all BGP routers. Since I don't want to fully mesh every single router in the branch offices with the ones in the datacenter, I will need to use route reflectors. I think this is where my issue is.

I am peering all local routers in a site, and peering with the corresponding remote router.

BGP Peerings:

DBL-MPLS-A: with DBL-MPLS-B, DBL-IPSEC and MTL-MPLS

DBL-MPLS-B: with DBL-MPLS-A, DBL-IPSEC and MTL-MPLS

MTL-MPLS: with DBL-MPLS-A, DBL-MPLS-B and MTL-IPSEC

DBL-IPSEC: with DBL-MPLS-A, DBL-MPLS-B and MTL-IPSEC

MTL-IPSEC: with MTL-MPLS and DBL-IPSEC

My basic configuration is this (both remote and local have the same config right now, but I figured I MIGHT need different settings, so I set up two peer-groups):

router bgp <as_number>
 bgp log-neighbor-changes
 bgp redistribute-internal
 bgp scan-time 20
 network <local_aggregated_subnet> mask 255.255.0.0
 aggregate-address <local_aggregated_subnet> 255.255.0.0 summary-only
 timers bgp 5 15
 redistribute ospf <instance> metric 1
 neighbor MFS-Local peer-group
 neighbor MFS-Local remote-as <as_number>
 neighbor MFS-Local update-source Loopback1
 neighbor MFS-Local route-reflector-client
 neighbor MFS-Local next-hop-self
 neighbor MFS-Local soft-reconfiguration inbound
 neighbor MFS-Remote peer-group
 neighbor MFS-Remote remote-as <as_number>
 neighbor MFS-Remote update-source Loopback1
 neighbor MFS-Remote route-reflector-client
 neighbor MFS-Remote next-hop-self
 neighbor MFS-Remote soft-reconfiguration inbound
 neighbor <remote_site_peer> peer-group MFS-Remote
 neighbor <local_site_peer> peer-group MFS-Local

My problem is that from DBL-Core, I can't ping MTL-Core (loopback addresses). It seems like the traffic dies somewhere (loops?). Any insight would be very much appreciated.

Thanks!

Marc

Jon Marshall · ‎06-01-2015

Marc

Is this a VPLS setup ?

If it is a L3 MPLS setup then you don't peer with the branch using IBGP.

Can you clarify because if it is L3 MPLS then I can't see the point of -

1) running IBGP anywhere because your core switches only run OSPF and I am assuming your core switches do all the routing for your vlans.

2) the purpose of the failover router ie. why not just connect the firewalls to the core switches.

Edit - are the failover routers for DMVPN, and if so, what routing protocol are you going to use with that?

Jon

Marc Bouchard · ‎06-01-2015

Hi Jon,

It's L3 MPLS. We were looking for a way to do the failover, and we hired a consultant who came up with this scenario. We do not manage the CEs. He proposed extending the BGP of the CEs to our failover router, which we are indeed adding for DMVPN purposes, as I do not want to use the ASAs for routing internal traffic.

I haven't looked at the specifics of the DMVPN yet. As per my diagram, both OSPF and BGP would be running on the failover router, and BGP would be used across to peer with the remote site (same AS). I do not want to extend the OSPF areas across (GRE tunnels, complex area setup etc...).

Would you have a better way to do this? I'm open to suggestions. And assuming we stay with this scenario, do you know what my issue would be (I'm guessing some peering/route reflection issues but not sure).

Thanks!

Jon Marshall · ‎06-01-2015

If it's L3 MPLS you don't peer your DC routers with the branch routers because that is an IBGP connection and you don't have this with L3 MPLS.

If you want to replicate this in GNS you could either build an MPLS network where your MPLS cloud is, or more likely just add some routers in the middle to act as PEs and then have those PEs peer with each other using IBGP.

Your CE devices would then peer with the PEs using EBGP and this would more closely replicate what you are trying to test.

In terms of extending IBGP between the MPLS router(s) and the failover router it depends on what else you are doing in terms of routing.

It sounds like you want to run EBGP on the MPLS connections (you have to - see above) and then run IBGP across the DMVPN connection, which would mean the EBGP routes would be preferred unless the MPLS link failed.

Then within each site you peer the routers using IBGP, although if that is what you are proposing, bear in mind you would need route reflectors on the DMVPN part.
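
As a rough sketch of what a route reflector on the DMVPN part could look like (Cisco IOS syntax; the AS number, tunnel interface, peer-group name and addresses are placeholders, not taken from the actual design), only the hub marks its peers as clients and the spokes run a plain IBGP session to the hub:

```
! DMVPN hub (e.g. DBL-IPSEC) acting as the route reflector
router bgp 65000
 neighbor DMVPN-SPOKES peer-group
 neighbor DMVPN-SPOKES remote-as 65000
 neighbor DMVPN-SPOKES update-source Tunnel0
 neighbor DMVPN-SPOKES route-reflector-client
 neighbor 172.16.0.2 peer-group DMVPN-SPOKES
!
! Spoke (e.g. MTL-IPSEC): ordinary IBGP session to the hub, no reflector config
router bgp 65000
 neighbor 172.16.0.1 remote-as 65000
 neighbor 172.16.0.1 update-source Tunnel0
```

The point of the sketch is that route-reflector-client is configured on the reflector only, not on every router towards every peer.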

But the part that isn't making sense to me is I assume your core switches are where your local subnets are routed ie. the client's default gateways are on the core switches. And these only run OSPF.

Which would mean you would have to redistribute any BGP learned routes into OSPF, and so any decision as to which link to use, ie. MPLS or DMVPN, would be taken by the core switches. Unless you are proposing to simply send everything to the MPLS router, and if its link is down it will redirect traffic to the failover router because of the IBGP peering between it and the failover router?

In terms of better ways I can't really make a sensible comment at the moment because I don't understand the full picture of what the consultant is proposing. I assume he knew about the proposed use of DMVPN as a backup solution so perhaps there is a good reason to run IBGP that I am simply not seeing.

If the consultant came up with a scenario for all of this did he do a design document for it which explains exactly how all of this is meant to work ?

If you paid good money for this then I don't want to be redesigning this for you on the fly because the design you have may be fit for purpose but there are definitely some things that are currently not clear.

Happy to try and help but I really need to understand exactly what the consultant has proposed to make any sensible suggestions.

That aside though like I say if you want to emulate the actual setup you are going to need some PE devices between your CE devices.

It's late where I am so I'm logging off but I'll check in with this tomorrow if someone else hasn't picked up the thread.

Jon

Marc Bouchard · ‎06-02-2015

Well, I had everything up and running until I added the second MPLS router in the datacenter. For design/test purposes, I was assuming that the MPLS link was a single AS, thus everything would be iBGP. I know very little of MPLS, but isn't the service provider running a single AS for our connections? We are indeed running everything on the core switches (default gateways etc...) which are Brocade switches unfortunately, until next year when we replace them.

"It sounds like you want to run EBGP on the MPLS connections (you have to - see above) and then run IBGP across the DMPVN connection which would mean the EBGP routes would be preferred unless the MPLS link failed"

Indeed, that is what we want to do. I am redistributing both ways between BGP and OSPF, and I assumed that if all paths from the remote sites are in the BGP table, then BGP would select a preferred path, and redistribute that one to OSPF. If that path dies, then the replacement path would be redistributed in OSPF.

To be honest, that wasn't my initial solution. I was simply going to run OSPF on the DMVPN router, no BGP, and leave the service provider to deal with that part. I will try to get more details as to the reasoning behind using BGP again today. I have stated all the facts here, so there are no other obscure reasons for DMVPN other than failover (well, some traffic will use IPSEC at all times, i.e. replication data between datacenter/DR), but other than that it's for failover.

Thanks for your time,

Marc

Jon Marshall · ‎06-02-2015

I know very little of MPLS, but isn't the service provider running a single AS for our connections?

The SP uses their own AS number but you use a different one for your sites.

Usually you use an AS number from the private range and you can use a different one per site or the same one in every site.

But either way it will be different from the SP's AS number for a L3 MPLS. Your CE devices form EBGP peerings with the SP's PE devices, not IBGP peerings, and the key point is that your sites do not peer with each other, only with their local PE device.

Indeed, that is what we want to do. I am redistributing both ways between BGP and OSPF, and I assumed that if all paths from the remote sites are in the BGP table, then BGP would select a preferred path, and redistribute that one to OSPF

This is where I am unclear of the design. Your MPLS routers receive EBGP routes from the PE devices and send these to the failover router with IBGP. The failover router receives IBGP routes via DMVPN and sends these to the MPLS routers.

Now your MPLS routers will choose the EBGP routes and redistribute them into OSPF for your core switches.

But what stops your failover router redistributing its BGP routes into OSPF as well?

And then the core switches have two sets of OSPF routes both externals and it comes down to the least cost path.

There are ways round that but I guess this is why I am struggling to see what IBGP gives you because you still have to manipulate the OSPF routes to favour the MPLS link whether you use IBGP or not.

Unless it is simply because you do not want to run OSPF on your DMVPN?

I was going to suggest using EIGRP on the DMVPN which is the favoured protocol and then running EIGRP as well as OSPF on the core switches and favouring the OSPF routes by changing the EIGRP AD and not running IBGP anywhere but you can't do that because your core switches don't support EIGRP.

You could still run EIGRP on the DMVPN but simply use a floating static route on the core switches in each site pointing to the failover router, again no need for IBGP anywhere.
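
A floating static route of that kind might look like the following in Cisco IOS syntax. The prefix, next hop and distance are placeholders, and the equivalent syntax on the Brocade core switches may differ:

```
! Remote-site summary via the failover router with administrative distance 250,
! so it is only installed once the OSPF-learned route (distance 110) disappears
ip route 10.20.0.0 255.255.0.0 10.10.255.3 250
```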

To be honest there are a multitude of ways of achieving what you want so let me have a think about it and also if you can come back with more details of how the IBGP part is meant to work that would be really helpful.

Jon

Marc Bouchard · ‎06-02-2015

Response from our consultants:

The plan is to only use BGP in the WAN edge.

- We prefer to use BGP in the WAN edge so as to be running the same protocol as the provider to simplify interoperability (especially since the MPLS connected router is not managed by you).
- We feel running BGP throughout the WAN edge will actually reduce complexity of the design.
- There are also fewer moving parts involved in running BGP over an IPSec tunnel than OSPF.
- BGP also offers some advantages for multi-topology routing.
- There are only two paths at each location and the MPLS will generally always be the preferred path and the second is for failover, so there is not a lot of thinking for the protocol to do; it just has to realize when the primary link fails and restores. Convergence time will be determined by BGP running on the MPLS routers, so we will not take advantage of OSPF's faster convergence time, as BGP will have to see the link is down before relaying that info to OSPF.

Jon Marshall · ‎06-02-2015

Marc

I don't think I am getting the full picture here.

Apologies for this but what I said yesterday about your MPLS routers preferring their EBGP routes was wrong.

The issue is the AS path of the routes over DMVPN will be shorter than the EBGP routes received from the MPLS PE routers and so when the failover routers send their routes via IBGP to the MPLS routers those will be preferred because AS path length comes before BGP type of route in the best path selection.

So any traffic from your core switches will go to the MPLS routers and then to the failover router, which is not what you want.

Even if you run EBGP over the DMVPN link the AS path is still shorter (because it doesn't include the SP AS number) so those routes would still be preferred.

You can manipulate the AS path length but there is no mention of that in anything said so far.
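
As a rough illustration of AS path manipulation, assuming EBGP is run over the DMVPN, the failover router could prepend its own AS on what it advertises across the tunnel so that the tunnel path ends up longer than the one through the SP. The AS numbers, neighbor address and route-map name below are placeholders only:

```
route-map DMVPN-PREPEND permit 10
 set as-path prepend 65000 65000 65000
!
router bgp 65000
 neighbor 172.16.0.2 remote-as 65001
 neighbor 172.16.0.2 route-map DMVPN-PREPEND out
```

How many prepends are needed depends on how long the AS path via the SP actually is.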

The other thing I don't understand is whether you are redistributing BGP into OSPF on the failover router or not.

If you are then, as previously mentioned, the core switches will now be getting OSPF routes from both routers per site and I don't see how this is going to work without manipulating OSPF or bouncing traffic between the MPLS and failover routers (assuming you fix the AS path issue).

It's just not clear how any of this is meant to work from what you have told me so far.

Like I say I am happy to help out but if you hired consultants then you must have a design document or something off them together with how it is all meant to fit together and possibly configuration details as well.

I would be happy to provide you with some questions to ask them to clarify the situation if that would help but as I said before I don't want to redesign this on the fly because perhaps with all the details it may make a lot of sense.

Jon

Marc Bouchard · ‎06-02-2015

Path selection will be a whole other issue. We will set local preference attributes to address this. Right now, I just want to be able to reach MTL from DBL and vice versa.
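
One way the local preference approach could look on an MPLS router, as a sketch only (the PE address, AS numbers, route-map name and the value 200 are placeholders):

```
route-map FROM-PE permit 10
 set local-preference 200
!
router bgp 65000
 neighbor 192.0.2.1 remote-as 64512
 neighbor 192.0.2.1 route-map FROM-PE in
```

Local preference defaults to 100, is carried to IBGP peers within the AS, and is compared before AS path length, so routes learned from the PE would win over the tunnel routes while the MPLS link is up.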

When redistributing from BGP to OSPF, I would assume the preferred path will be the one redistributed, so if the MPLS links are active, that route will be the one distributed in OSPF. If it fails, then the IPSEC router will be the active route in OSPF. That's the objective anyway.

We had very little from our consultant, other than the general idea, a presentation of the solution, a lab demo of the BGP failover (with static routes instead of OSPF).

Running BGP over the DMVPN was preferred due to unicast vs multicast requirements. We would have needed a GRE tunnel I think for OSPF and I didn't want that. I want each office to have its own area 0 backbone (at first I asked the MPLS provider for a superbackbone option, running area 0 over their network, but they said no).

To me everything seems logical, the proposal makes sense. It's the implementation that's fuzzy/problematic.

Jon Marshall · ‎06-02-2015

Okay, I wasn't aware that you were going to use local preference.

I don't think the solution will work.

What will happen is if the MPLS link fails the failover router will redistribute BGP into OSPF and the MPLS router will receive these routes.

The MPLS router will redistribute these into BGP and they will have a weight of 32768.

When the MPLS link comes back up it will receive the BGP routes from the PE but they will have a weight of 0.

Weight is the first thing used in the best path selection and the higher the better so the MPLS router will stick with the routes with a weight of 32768 which point to the failover router ie. it will not redistribute the routes it learnt from the PE into OSPF.

So your core switch sticks with the OSPF routes learnt from the failover router and you never fail back.

You can manipulate weight on the MPLS router ie. make the routes received from the PE > 32768 and then it should fail back.
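
A minimal sketch of that weight change on the MPLS router, assuming a PE at 192.0.2.1 in AS 64512 (both placeholders):

```
router bgp 65000
 neighbor 192.0.2.1 remote-as 64512
 ! Routes learned from the PE get a weight above the 32768 of locally originated routes
 neighbor 192.0.2.1 weight 40000
```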

However you should be filtering what is redistributed from OSPF into BGP anyway ie. you should only advertise local subnets for that site and not just redistribute all OSPF learned routes otherwise you are advertising out non local subnets from all the sites.

If you did that then the weight is a non issue.
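
Filtering the redistribution could be sketched like this, assuming the local site's subnets all fall inside 10.10.0.0/16 and OSPF process 1 (the prefix, process ID, AS number and names are placeholders):

```
ip prefix-list LOCAL-SUBNETS seq 10 permit 10.10.0.0/16 le 24
!
route-map OSPF-TO-BGP permit 10
 match ip address prefix-list LOCAL-SUBNETS
!
router bgp 65000
 redistribute ospf 1 route-map OSPF-TO-BGP
```

Anything not matched by the prefix list hits the implicit deny at the end of the route-map and is not advertised into BGP.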

I suspect the demo worked because they used static routes and not OSPF.

All that said if you are happy with their solution then by all means use it.

I am not sure what you mean by the implementation, ie. to build the lab in GNS you need what you already have but then you need -

1) a PE per site (or two PEs for the DC one for each CE).

2) your CEs use one AS number or a different AS number per site. If you use the same number at each site you also need to use the "allowas-in <no>" command so that each CE can accept routes with the same AS number in the path.

You do not peer CEs across the MPLS network.

You peer to the PE devices ie. the DC CEs peer with the DC PEs, the remote site CE peers with remote site PE device etc.

The PEs peer between themselves using IBGP and the PEs use their own AS number, ie. the same one for all PEs and different from any of the AS numbers you use for your CEs.
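
For the lab, the peerings described above could be sketched roughly as follows, using plain routers as stand-in PEs. All AS numbers and addresses are placeholders, the same customer AS is assumed at every site, and the PEs are assumed to reach each other's loopbacks through an IGP:

```
! CE at a site (customer AS 65000, reused at every site)
router bgp 65000
 neighbor 192.0.2.1 remote-as 64512
 neighbor 192.0.2.1 allowas-in 1
!
! PE facing that CE (provider AS 64512)
router bgp 64512
 neighbor 192.0.2.2 remote-as 65000
 neighbor 10.255.0.2 remote-as 64512
 neighbor 10.255.0.2 update-source Loopback0
 neighbor 10.255.0.2 next-hop-self
```

The CE only ever peers with its local PE; the second and third neighbor lines on the PE are the IBGP session to the PE at the other site.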

Personally I would build an MPLS network in GNS3 but if you have never done that before it is quite a bit of configuration to get working and you probably just want to test the scenario rather than learn MPLS.

Jon

jmattbullen · ‎06-02-2015

I have a very similar production network, with remote sites having MPLS primary and broadband/aircard secondary that connects into our DMVPN network. The only difference is we use EIGRP in the datacenter instead of OSPF. I would not do the full mesh BGP or even worry about route-reflectors. I would just run BGP between:

DBL-MPLS-B -> MPLS PE

DBL-MPLS-A -> MPLS PE

DBL-IPSEC -> MTL-IPSEC

and that's it.

You traffic engineer with your redistribution into OSPF on both sides.

For example, on MPLSA/B and MTL-MPLS you'd do something like:

route-map BGP->OSPF permit 100
 set metric 50
!
router ospf 1
 redistribute bgp 65000 route-map BGP->OSPF subnets

On dmvpn:

route-map BGP->OSPF permit 100
 set metric 100
!
router ospf 1
 redistribute bgp 65000 route-map BGP->OSPF subnets

From DBL-Core perspective you would have equal cost sharing between A/B for routes at the remote site and in the event that both those are down the route from DBL-IPSEC would kick in very quickly. The remote site will follow the same process.

Another way I've skinned this cat would be to drop MTL-core and OSPF at the remote site and just do HSRP between MTL-MPLS and MTL-IPSEC, with object tracking to change the active router if a route is no longer reachable. As for your question regarding changing routing based on performance, look into Cisco Performance Routing (PfR), though it looks like a beast to configure so I'd wait and do that as a separate project.
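
The HSRP option could be sketched roughly like this. Interface names, addresses, group and track numbers, the tracked prefix and the 60 second delay are all placeholders, not values from a real deployment:

```
! On MTL-MPLS: track the datacenter summary learned over MPLS
track 10 ip route 10.10.0.0/16 reachability
!
interface GigabitEthernet0/1
 ip address 10.20.1.2 255.255.255.0
 standby 1 ip 10.20.1.1
 standby 1 priority 110
 standby 1 preempt delay minimum 60
 standby 1 track 10 decrement 20
!
! On MTL-IPSEC: default priority 100, takes over when MTL-MPLS drops to 90
interface GigabitEthernet0/1
 ip address 10.20.1.3 255.255.255.0
 standby 1 ip 10.20.1.1
 standby 1 preempt
```

The preempt delay on MTL-MPLS is there so that it waits a little before taking the gateway address back, which helps if the MPLS route is flapping.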

Hope this helps. Let me know if you have any questions.

Jon Marshall · ‎06-03-2015

you traffic engineer with your redistribution into OSPF on both side.

Do you mean just not run IBGP at all, run EBGP between CEs and PEs and also the failover routers and then redistribute into OSPF and manipulate the costs of OSPF ?

If so I totally agree, and that was what I was trying to say but didn't express as well as you :-)

I can't see the point of IBGP here and if the core switches only have OSPF and they are making the routing decisions it seems to be unnecessary to me.

Jon

Marc Bouchard · ‎06-03-2015

A huge thank you to the both of you for taking the time to help me out. I went back to the initial design which was what jmattbullen suggested (using OSPF to do everything). We got the consultants involved because I wanted an alternate solution for the VPN/backup link section as I didn't want to use the ASAs for this, and that's what they came up with. They had set this up for other clients from what they told us, although instead of using OSPF, they were VRRP'ing the interfaces (single MPLS/single VPN link).

I got the GNS3 setup to work like this, but I'm getting my local routes re-injected into BGP from the remote site (due to the IPSEC link probably). So I have to block this. I have already set a priority by simply setting the redistribution metric from BGP at 500 on the IPSEC router, that leaves the two MPLS links as primary, with equal costs. Good approach?

Again, thank you very much for your insight/assistance.

Jon Marshall · ‎06-03-2015

Marc

You will need to redistribute only the local site's subnets into BGP from OSPF, otherwise yes, you will end up advertising other remote sites' subnets.

This applies to both MPLS/IPSEC and obviously your DC dual MPLS router setup as well as the IPSEC.

In terms of OSPF and how you influence the costs it's up to you really ie. whatever you feel comfortable as there are multiple ways to do it.

Doesn't really matter as long as the MPLS router is favoured when it is up.

So when you say you went back to the original design do you mean no IBGP ie. let OSPF choose the routes ?

Jon

Marc Bouchard · ‎06-03-2015

Indeed, I was going to do a simple OSPF network, one area, and simple redistribution from the provider. So no iBGP. Our network guy is very happy lol :)

BGP failover to IPSEC design - Configuration problems