Solved: Okay I think the issue is

jonesl1 · ‎03-16-2015

Hello All,

I need some assistance with an IP SLA setup that I'm having trouble wrapping my head around. I have a router (HQ-Rtr) that has two connections to a remote site router. One is a serial connection to an MPLS cloud, using BGP as its routing protocol. The other is an Ethernet connection to an ASA, which uses an IPsec tunnel to the remote router. Here is what I'd like to do: use the MPLS cloud connection as the primary connection, and have the Ethernet connection to the ASA kick in once the primary fails. However, I would like it to fall BACK to the MPLS cloud connection when the serial connection comes back online.

With that said, I know I'll have to use IP SLA to make this work but I'm running into an issue getting it to fall BACK to the primary route. I'm not sure why. I'm basically doing the following:

track 1 ip sla 1 reachability

ip sla auto discovery
ip sla 1
 icmp-echo a.a.a.a source-interface Serial1/0
 request-data-size 32
ip sla schedule 1 life forever start-time now

ip route 1.1.1.0 255.255.255.0 x.x.x.x track 1 (learned via bgp from cloud, metric 20)

ip route 1.1.1.0 255.255.255.0 y.y.y.y 25

So, to me, this says that as long as I can ping a.a.a.a from source int s1/0, the route for network 1.1.1.0 should go to x.x.x.x. Then once it fails, it falls over to y.y.y.y. It requires the 25 cost so that it doesn't take precedence over the x.x.x.x route. Am I seeing this correctly so far? Then once x.x.x.x comes back online, it SHOULD fall back over to that one, since it has the lower cost of 20. Is this right?
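
For reference, a minimal sketch of the tracked primary plus floating static backup described above. Two points worth flagging: the trailing number on an "ip route" statement is the administrative distance (AD) rather than a cost, and a static route with no distance specified uses the default AD of 1 (the 20 belongs to the eBGP-learned copy of the route, not to the tracked static). The frequency and delay values are assumptions for illustration and are not taken from the original config.

```
! Probe a.a.a.a out of the MPLS-facing serial interface
ip sla 1
 icmp-echo a.a.a.a source-interface Serial1/0
 request-data-size 32
 frequency 10
ip sla schedule 1 life forever start-time now
!
! Track 1 follows the probe; the delay timers (assumed values) damp flapping
track 1 ip sla 1 reachability
 delay down 10 up 30
!
! Primary static: in the routing table only while track 1 is up (default AD 1)
ip route 1.1.1.0 255.255.255.0 x.x.x.x track 1
! Floating static backup: AD 25, used only when no better route exists
ip route 1.1.1.0 255.255.255.0 y.y.y.y 25
```

With this layout the backup is used only while the tracked static is withdrawn and no eBGP route (AD 20) for the same prefix is present.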

Regardless, it's not working quite as expected, so someone mentioned setting up another track object with a 'Boolean and' statement. I'm not 100% sure about this, so if anyone can explain how it would work in the above scenario, or why it would be done, that would help too. Here is what they suggested:

track 2 list boolean and
 object 1 not

and to change my routes to look like the following:

------Remove route for x.x.x.x all-together-------

ip route 1.1.1.0 255.255.255.0 y.y.y.y track 2

I'm guessing (and I do mean guessing) this says to monitor track 1 and if track 1 is NOT true (false) then apply the route for y.y.y.y?
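
For what it's worth, an inverted track list does behave that way. A short sketch, assuming track 1 is the reachability object defined earlier: track 2 reports up only while track 1 is down, so the y.y.y.y route is installed only during a primary failure. The show commands at the end are one way to check the state and are not part of the original suggestion.

```
! Track 2 is the logical NOT of track 1
track 2 list boolean and
 object 1 not
!
! Backup route exists only while track 2 is up (that is, track 1 is down)
ip route 1.1.1.0 255.255.255.0 y.y.y.y track 2
!
! Verification (exec mode)
! show track brief
! show ip route 1.1.1.0
```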

Can someone take a look and help me with the concept? Maybe I'm out in left field here, but I'm struggling a bit to make this work.

Thanks in advance,

jonesl1 · ‎03-16-2015

On the central side, I'm redistributing Statics, which is where I'm introducing the route.

Jon Marshall · ‎03-16-2015

Okay that makes more sense, I thought it was me :-)

So when you shut down the loopback interface on router A, your MPLS central site router doesn't receive the route and so should fail over to the ASA?

So what route is the MPLS router showing for that loopback, ie. prefix and mask, when it's a BGP route?

And what route are you using on the central site router to point to the ASA, again prefix and mask?

Jon

jonesl1 · ‎03-16-2015

Not you at all. :) I was just trying to understand the concept and implement it in my scenario, hopefully eliminating a lot of the convoluted portions. However, you needed some of those pieces and I wasn't giving them. I do apologize.

Routing entry for xx.xxx.1.164/32
Known via "bgp xxxxx", distance 20, metric 0
Tag xxxxx, type external

Last update from xx.xxx.253 01:18:48 ago
Routing Descriptor Blocks:
* xx.xxx.3.253, from xx.xxx.3.253, 01:18:48 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag xxxxx
      MPLS label: none

Central site router looks like this:

ip route xx.xxx.1.164 255.255.255.255 xx.xxx.2.226 200

Jon Marshall · ‎03-16-2015

Okay I think the issue is this.

You are redistributing statics into BGP. So what happens is that when your MPLS link is up you receive the loopback via BGP, and because it has an AD of 20 (better than the static's 200) it goes into the IP routing table.

When the MPLS link fails your static route with an AD of 200 is placed into the IP routing table and redistributed into BGP.

When the MPLS link comes back up your router receives the same route from the PE router.

Now your MPLS router has two routes in BGP for the same destination and prefix.

The one from the PE has a weight of 0 but the static that was redistributed locally has a weight of 32768 because that is the weight assigned to locally generated routes.

The higher the weight the better so BGP sticks with the route via the ASA.

The usual solution to this is to modify the routes received from the PE so they have a weight > 32768 and so once the MPLS link comes back up they are preferred and installed in the IP routing table.
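
A sketch of that weight change, assuming the PE neighbor is xx.xxx.3.253 (the address the BGP route was learned from above). The route-map and prefix-list names, the weight of 40000, and the <local-as> placeholder are illustrative; any weight above 32768 would do.

```
! Match only the test loopback learned from the PE
ip prefix-list TEST-LOOPBACK seq 5 permit xx.xxx.1.164/32
!
! Give that prefix a weight above the 32768 of locally originated routes
route-map PE-IN permit 10
 match ip address prefix-list TEST-LOOPBACK
 set weight 40000
! Pass every other prefix from the PE unchanged
route-map PE-IN permit 20
!
router bgp <local-as>
 neighbor xx.xxx.3.253 route-map PE-IN in
!
! Re-apply inbound policy without resetting the session (exec mode)
! clear ip bgp xx.xxx.3.253 soft in
```

Dropping the match line from sequence 10 would raise the weight on everything received from the PE rather than on the single test route.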

You can do that as a test if you want. However, you are just unfortunate to have hit this issue with your test, because in production it wouldn't occur.

The reason being you would have a summary route or default static route pointing to your ASA, so the more specific routes via MPLS are always used if they are available.

Hope that makes sense.

Jon

jonesl1 · ‎03-16-2015

So how would I go about modifying the routes received from the PE to test that? I guess I'm not familiar with that part. It makes sense what you are explaining though.

Jon Marshall · ‎03-16-2015

If you want to confirm this is what is happening, run the following on the MPLS router once the MPLS link comes back up -

"sh ip bgp x.x.x.164"

should show you two routes.

One will have the next hop as the PE and one the ASA. The ASA route should be the best path and it will be because of the higher weight.

Jon

Jon Marshall · ‎03-16-2015

I can give you a config if you like, but a more realistic test would be to configure the static for the loopback with a 255.255.255.252 mask instead of a 255.255.255.255 mask.

This obviously assumes you don't have any other loopbacks in that range. If you do you may want to create a new loopback using an unused IP subnet for the test.

Your BGP route is still advertised with the existing mask but your static is a less specific route.

So when the MPLS link comes back up the more specific BGP route should be used.

The above would emulate what you are actually going to be doing in production.
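
Roughly, the test change would look like this on the central site router, assuming the same next hop (xx.xxx.2.226) as the existing static. Since .164 falls on a /30 boundary, the less specific route covers .164 through .167, which is why that range needs to be free of other loopbacks.

```
! Remove the host-route static used so far
no ip route xx.xxx.1.164 255.255.255.255 xx.xxx.2.226 200
!
! Less specific /30 toward the ASA; no AD needed because the /32 learned
! via BGP wins on longest match whenever it is present
ip route xx.xxx.1.164 255.255.255.252 xx.xxx.2.226
```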

Like I say though if you want to modify the weight just let me know.

Jon

jonesl1 · ‎03-16-2015

You ARE AMAZING!!! You are absolutely right. That's exactly what it was. I created a route-map and changed the weight of that single route and sure enough....works GREAT! I do thank you. I guess I obviously won't need that route map nor that weight change once I get the device (new router) on the opposite end like it should be. That sound right? Again, you have helped me a ton. Sorry I took up the majority of your day trying to hash this out, but I do appreciate all you've done and you'll definitely get the 5 star rating from me! :) Thanks a ton Jon!

Jon Marshall · ‎03-16-2015

No problem, glad to have helped.

You shouldn't need the route map once you go into production as long as you are using a summary route or a default route pointing to your ASA ie. a less specific route.

One thing I was going to mention but didn't want to confuse the issue because I didn't know how it would affect your testing is whether or not you need to advertise that loopback from the central site at all.

Basically for your test you were advertising from the central site a route for a network that belonged to the remote site.

It depends on how many sites you have and how you want them to reach the site with the failed MPLS link.

If you advertise out the failed site's subnets from the central site, then all other sites will use MPLS to get to this site and go to the central site, which then sends the traffic via VPN to the remote site. The return traffic follows the same path, so you are using both networks.

If you didn't advertise out the failed site's networks from the central site, then every other site would have to use their VPN connections to reach the remote site.

It's not clear whether you have other sites and how you envisage it working.

I don't want to confuse the issue, so please feel free to ignore this, but I just don't want you to implement it and then find it isn't working the way you thought it would.

Jon

jonesl1 · ‎03-17-2015

Well you are correct in your thinking. I do have multiple other sites that would need to communicate back to that remote site. So I don't really want to advertise it from the central side. However, I guess I don't know how to not distribute it into bgp when I have 'redistribute static' set up in my BGP instance. How do I advertise the other statics, but not that one?

I actually have 100+ remotes.

I do appreciate your concern and your regards for not just slapping in a config that partially works. That's what makes you respectable! Thank you!

Jon Marshall · ‎03-17-2015

So if a remote site fails it will use its VPN link.

Is this going to be a VPN mesh in effect ie. each remote site can create a VPN to any other remote site or is it going to be a hub and spoke VPN where all VPNs have to go via the central site ?

If it is mesh then no problem and the solution is easy.

If it is a hub and spoke then VPN traffic would involve two tunnels for every site the remote site needs to get to, ie. one from the remote site to the central site and another from the central site to the other site.

It may be there is no remote site to remote site connectivity, which would make it a lot simpler, ie. a remote site that loses its MPLS link would only ever need to use its VPN to the central site.

So it basically comes down to how do you envisage the failover working ?

Jon

jonesl1 · ‎03-17-2015

So ultimately, we currently have one site that has a horrible connection to its provider via MPLS. We are trying to prevent downtime at this site (and on other problem circuits) by implementing a redundant scenario. Our intent is to add a router that has both the serial for the MPLS and an additional cellular card for when MPLS fails (whereas the current router only has the MPLS serial connection).

There's nothing within the MPLS cloud preventing it from talking to other remotes, so in that respect it's mesh. However, the devices it needs to talk to typically reside on the central side.

So we are merely trying to allow for a redundant connection when the provider's connection starts running horribly. It's very intermittent, but it continues to happen and the telco advises that they can't see any problems, hence this whole project. Ultimately, we are just attempting to use the redundant ASA VPN option for the sites with circuit issues.

Jon Marshall · ‎03-17-2015

Okay, so let's concentrate on a remote site to central site connection.

You do not need any static routes for specific remote networks at either site, only static routes for that site's local subnets.

And these are redistributed into BGP presumably.

Then at the remote site you need either -

1) a default route pointing to the ASA

or

2) a summary route for the private IPs for the central site pointing to the ASA.

At the central site you need the same, ie. either a default or a summary covering the remote site networks, pointing to the ASA.

That static should not be redistributed into BGP at either site. So when you redistribute statics into BGP you need to use a route map to stop that route being advertised, eg. deny that route and then permit all other statics, which should only be for the subnets local to that site.
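
As a sketch of that filter, assuming the backup static is a 10.0.0.0/8 summary (the prefix, the names BACKUP-STATIC and STATIC-TO-BGP, and the <local-as> placeholder are all illustrative):

```
! The one static that must stay out of BGP
ip prefix-list BACKUP-STATIC seq 5 permit 10.0.0.0/8
!
! Deny the backup summary, then permit all remaining statics
route-map STATIC-TO-BGP deny 10
 match ip address prefix-list BACKUP-STATIC
route-map STATIC-TO-BGP permit 20
!
router bgp <local-as>
 redistribute static route-map STATIC-TO-BGP
```

The prefix-list entry matches the /8 exactly, so more specific local statics inside 10.0.0.0/8 still fall through to the permit.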

This way, if a remote site loses its MPLS connection, it uses its static pointing to the ASA. Because its MPLS connection is down it is no longer advertising its subnets via MPLS, so the central site also uses its static to get to the remote site via the ASA.

Because all the other sites are still using MPLS the central site will only use the VPN for the failed site because it will still be receiving more specific routes for all the other sites via BGP.

So the static route you add pointing to the ASA does not need an AD added but it must be less specific than the routes received from BGP.

The above should work fine for remote site to central site connectivity in case of an MPLS failure at a remote site.

If a remote site loses its MPLS link and needs to talk to another remote site, how it works depends on how you have set up the VPNs. The principle above is still the same, ie. the remote site that has failed uses its VPN because of the static. Because it is no longer advertising its subnets via MPLS, the other remote site also uses its VPN.

But the point I was making is that it depends on how these VPNs are set up, ie. can a remote site make a direct VPN connection to another remote site, or does it have to go via the central site ASA? If it has to go via the central site, that requires more configuration on your central site ASA.

Like I say remote site to remote site may not be an issue for you in which case don't worry about it too much.

Hope all the above makes sense.

Any queries, clarifications etc. please feel free to ask.

Jon

jonesl1 · ‎03-16-2015

And you are correct. When I shut down the loopback on Rtr A, the central side MPLS router doesn't receive the route and DOES fail over to the ASA. It just doesn't fail back.

Jon Marshall · ‎03-16-2015

I should say, if you don't want to use a default route because of potential internet access, then presumably you are using private addressing, so you can use a summary route covering all your private IPs for remote sites pointing to the ASA.

Again it would only be used if the more specific routes weren't there ie. the MPLS link has gone down.
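
For example, assuming the remote sites live in RFC 1918 space and y.y.y.y stands in for the ASA next hop, the summaries could look like this. No AD is set; the more specific BGP routes win on prefix length while the MPLS link is up.

```
! Summaries covering all private address space, pointing at the ASA
ip route 10.0.0.0 255.0.0.0 y.y.y.y
ip route 172.16.0.0 255.240.0.0 y.y.y.y
ip route 192.168.0.0 255.255.0.0 y.y.y.y
```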

Jon

IP SLA and BGP