Solved: BGP default route injection for WAN failover connectivity

Travis Hysuick · ‎09-06-2012

Hi all,

I've run into a bit of a wall while trying to come up with a design for our WAN in terms of providing failover Internet connectivity.

Let me explain the topology first:

We have two corporate head offices (COs), one in Canada and one in the US, with an Internet presence at each location. The COs are connected to a number of remote sites via an MPLS WAN (each remote site is essentially a stub with no Internet connectivity of its own). At present, CO2 supplies a default route to the PE router, which is then distributed via BGP to the remote sites.

OSPF is the IGP in use at the COs, with eBGP at the WAN edge.

Default static routes are defined on the two CE routers at the respective COs; however, as mentioned, only CO2 advertises this route to the carrier PE. Because the next hop of CO2's default route is not the Internet Edge, the route stays in place when the Internet connection at CO2 goes down, and Internet traffic for the remote sites is black-holed.

What I am trying to design is failover-capable Internet access for the remote MPLS sites. Ideally, I would like to get rid of all static routes except the default routes on the Internet Edge devices at the respective COs, and let the IGP and EGP propagate the default routes in a predictable manner. The end result should be that all Internet traffic for the remote sites exits at CO2, unless that connection is unavailable. Additionally, each CO should use its own local Internet connection unless it becomes unavailable, in which case it should redirect its Internet traffic through the other CO. I know that realistically we should be multihoming our Internet Edge services to different carriers for redundancy; however, that isn't an option at this time.

I've looked into using tracked (IP SLA monitored) static routes with different metrics, as well as BGP conditional advertisements, however the problem I am running into is that I can get the failOVER to work, but the failBACK is kind of stumping me; once CO2 gets a default route via BGP (with an AD of 20), it never fails back to the internal OSPF route (AD 110) once the local Internet Edge comes back up.
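To make the failback problem concrete, here is a minimal Python sketch of route selection by administrative distance, using the two AD values quoted above. The `Route` class, the `best_route` helper, and the next-hop labels are hypothetical names for illustration only, not anything taken from a router.

```python
from dataclasses import dataclass

# Administrative distances quoted in the post.
AD_EBGP = 20
AD_OSPF = 110


@dataclass(frozen=True)
class Route:
    prefix: str
    source: str          # "ebgp" or "ospf"
    next_hop: str        # hypothetical label, for illustration only
    admin_distance: int


def best_route(candidates):
    """Return the candidate with the lowest administrative distance, or None."""
    return min(candidates, key=lambda r: r.admin_distance, default=None)


ospf_default = Route("0.0.0.0/0", "ospf", "local-internet-edge", AD_OSPF)
bgp_default = Route("0.0.0.0/0", "ebgp", "mpls-pe", AD_EBGP)

# The local edge is back up (OSPF default present again), but the other CO
# is still advertising its default through the WAN.
installed = best_route([ospf_default, bgp_default])
print(installed.source, installed.next_hop)
```

As long as both candidates exist, the eBGP route wins on AD alone, which matches the behaviour described: the OSPF default only returns if the BGP default is withdrawn or something with a lower AD takes over.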

Thoughts or comments would be appreciated, thanks to all in advance!

Mohamed Sobair · ‎09-06-2012

Travis,

follow these steps and you should be fine:

1- Create two static default routes, one at each CO, pointing to the Internet Edge. Make sure you have tracking (IP SLA) for those static default routes, so that they are removed from the routing table when tracking fails.

2- Configure iBGP between your COs.

3- Redistribute the static default routes you have created at both COs into BGP, but make sure CO2 advertises it with a lower MED value and CO1 advertises it with a higher MED value.

With the above, you ensure every location uses its own Internet link and fails over to the secondary site in case of a failure. It also ensures each location returns to its original link once that link comes back online.

The above also ensures that remote sites always use CO2 as the exit point to the Internet and fail over to CO1 in case of a failure. Why? Because if the CO2 Internet link is down, CO2 should be receiving a default from CO1; once the link is back online, CO2 should use its own static default to the Internet.
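The reason failback works with this approach is that a static route has a default administrative distance of 1, so it displaces a BGP-learned default as soon as the track object recovers. A rough Python sketch of that state logic follows; `installed_default` and its flags are hypothetical, and the BGP distance of 20 is the value quoted in the original question.

```python
AD_STATIC = 1   # default administrative distance of a static route
AD_BGP = 20     # value quoted in the original post for the BGP default


def installed_default(track_up, bgp_default_present):
    """Return which default route the CE would install for a given state."""
    candidates = []
    if track_up:
        # The tracked static is only offered while the probe succeeds.
        candidates.append(("static", AD_STATIC))
    if bgp_default_present:
        candidates.append(("bgp", AD_BGP))
    if not candidates:
        return None
    # Lowest administrative distance wins.
    return min(candidates, key=lambda c: c[1])[0]


# Local link up, then down, then restored; the other CO advertises throughout.
for track_up in (True, False, True):
    print(track_up, installed_default(track_up, bgp_default_present=True))
```

The last iteration is the failback case: the static default returns without any manual clearing of routes.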

Regards,

Mohamed


Marwan ALshawi · ‎09-06-2012

You might try EEM to refresh the OSPF or BGP route after the failback, for example by running the command clear ip route ospf.

Or use EEM, tied to your IP SLA track, to trigger anything else that refreshes the route.

Hope this helps

Sent from Cisco Technical Support iPad App

Florin Barhala · ‎09-06-2012

Hi mate,

The topology is pretty foggy for me, so I would ask you this:

- How many routers do you have at each CO site?

- Is the Internet connection provided from a different router than the PE, or is it a combined Internet/MPLS service?

- You mentioned you use OSPF at each site. Which routers participate in OSPF?

- How are the two COs connected? Via the MPLS service?

- Are you running eBGP at each site using the same AS number? Do you export/announce any prefixes on any of the eBGP sessions?

- It would be useful to add a rectangle/circle to the diagram to encompass CO1, CO2, the MPLS equipment, and the Internet equipment.

Travis Hysuick · ‎09-06-2012

Hi Florin,

Thanks for the feedback. I realized shortly after posting how much info was missing from the topology; attached is a more detailed version.

To answer some of your questions:

1. At CO1, we have 5 routers exchanging OSPF routes internally. The Internet edge (STNASAFW) is not performing any dynamic routing with the carrier; it just has a static route pointing at the carrier (the other device, STNASAVPN, is a dedicated VPN solution that isn't relevant to the discussion).

On the MPLS side, there are 2 routers, each connecting to a different carrier (for the purpose of this discussion, the CDN MPLS side is not relevant). The WAN router I'm concerned with (STNATTMPLS) exchanges eBGP routes with the carrier PE (STNATTPE) over an L3 MPLS VPN, and redistributes routes learned from OSPF into the BGP process.

At CO2, the topology is identical, the only exception being that CO2 does not have a leg into the CDN MPLS service. The CE router at CO2 is currently injecting the default route into the BGP process, so all the remote sites in the US get their Internet connectivity through CO2.

2. The Internet connections at the COs are provided from separate carrier links, denoted as (SKTELINET) and (ATTINET) respectively.

3. All of the routers included in the respective CO 'boxes' in the topology diagram are participating in the OSPF routing.

4. COs are connected via the same MPLS VPN that the remote sites are connected to.

5. Each site on the US MPLS VPN is using the same ASN (which I don't like), and the carrier is providing the AS-override service on the VRF. We are using eBGP between the CE/PE routers at each site and redistributing summary prefixes for each site (and some limited specific prefixes) to the eBGP peer.

6. Done, hope it looks a little better this time around!

Marwan ALshawi · ‎09-10-2012

You can use the concept described by Mohamed above.

However, there is a question here:

If the CO2 Internet link is down, remote sites should use CO1 as the Internet gateway.

What about the CO2 site itself? Do you want it to use CO1's Internet over the MPLS cloud?

Again, a default static route with IP SLA tracking at each site's Internet edge router can monitor the availability of the link / Internet next hop.

Redistributing this route into OSPF and then into eBGP toward the MPLS PE will ensure the default route is distributed from both sites.

A BGP attribute, such as AS-path prepending, must be used to influence route preference on the MPLS PEs.

Hope this helps


Travis Hysuick · ‎09-10-2012

The solution I came up with is a bit of a hybrid of both Mohamed's and Marwan's offerings.

Ultimately, I used an IP SLA tracked static route at the CE MPLS edge devices in CO1 and CO2. In each case, the IP SLA monitor sends an ICMP echo to that CO's ISP gateway interface. This allows the CE MPLS routers to inject a default route into the local OSPF process for failover, pointing to the MPLS WAN rather than the local carrier interface for Internet access.

A fairly elegant little trick I came up with to ensure that the IP SLA traffic always takes the same path was to advertise the local ISP subnet via OSPF to the routers in each CO. Otherwise, the IP SLA may consider the ISP gateway to be up if it is able to get a response by sending the request across the WAN and out of CO2's Internet connection.
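The trick relies on longest-prefix matching: a specific route for the ISP subnet is preferred over the default for the probe's destination. Here is a small Python sketch of that lookup. The addresses come from a documentation range and the next-hop labels are made up, so treat it as an illustration rather than my actual routing table.

```python
import ipaddress


def lookup(rib, destination):
    """Longest-prefix match: next hop of the most specific matching route."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, next_hop) for net, next_hop in rib if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


ISP_GATEWAY = "203.0.113.1"  # hypothetical ISP gateway address

# Only a default route pointing at the WAN: the probe can leak across the WAN.
rib_without = [(ipaddress.ip_network("0.0.0.0/0"), "mpls-wan")]

# With the local ISP subnet also learned via OSPF, the probe stays local.
rib_with = rib_without + [
    (ipaddress.ip_network("203.0.113.0/30"), "local-internet-edge"),
]

print(lookup(rib_without, ISP_GATEWAY))
print(lookup(rib_with, ISP_GATEWAY))
```

With the more specific route present, the probe's path no longer depends on whichever default route happens to be installed at the time.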

Additionally, I'm using a route map to match the default route advertised from the CO1 CE router and perform an AS-prepend, such that both COs advertise a default route to their BGP PE peers, but Internet-bound traffic from the remote sites will flow through CO2 under 'normal' circumstances based on the AS-path.
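For anyone following along, the effect of the prepend can be sketched in a few lines of Python. This only models the AS-path length comparison and assumes every other BGP attribute is equal; the ASN, the number of prepends, and the `best_by_as_path` helper are all hypothetical.

```python
LOCAL_AS = 65001  # hypothetical private ASN shared by the sites


def best_by_as_path(paths):
    """Pick the advertisement with the shortest AS path, or None if empty.

    Assumes all attributes compared before AS-path length are equal.
    """
    if not paths:
        return None
    return min(paths, key=lambda p: len(p["as_path"]))


co2_default = {"origin": "CO2", "as_path": [LOCAL_AS]}
# CO1 prepends its own ASN twice, making its path longer and less preferred.
co1_default = {"origin": "CO1", "as_path": [LOCAL_AS] * 3}

# Normal operation: both COs advertise a default.
print(best_by_as_path([co1_default, co2_default])["origin"])

# CO2's tracked static goes down and its default is withdrawn.
print(best_by_as_path([co1_default])["origin"])
```

When CO2's default reappears, the shorter path wins again, which is the failback behaviour seen from the remote sites.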

Originally, I tried using MED values to set the preference between the BGP default routes; however, I was unable to get this working without the PE routers having the 'bgp always-compare-med' option set. As this is a global BGP parameter change, it is unlikely that the carrier would enable it simply for our benefit. That said, the AS-path approach works predictably and reliably.

I still have some work to do in terms of filtering the advertised routes between the MPLS WAN connections and the COs, but as it stands, I am able to perform automatic failover / failback reliably.

Thank you all for the replies, this was a great learning experience putting some (what I consider to be) advanced routing techniques to use.

Travis