Help with WAN/MPLS BGP-OSPF Route Preference

cmonks119
Level 1

Hello All, I'm looking for some suggestions on a new network design. We have two data centers (directly connected) and many remote sites (DMVPN), all running OSPF. We are looking at adding both MPLS and SD-WAN solutions to replace DMVPN. These services normally prefer to run BGP, so I'm looking at how best to integrate BGP with our infrastructure. 

 

The issue I'm running into is BGP/OSPF route preference. I would like to do mutual redistribution between OSPF and BGP, so that all WAN and MPLS sites follow the shortest path to data center resources, and to allow for primary path failover, etc. The main sticking point right now is that if I advertise data center subnets to WAN/MPLS, those same subnets are learned back through the second data center, and the eBGP routes (AD 20) are preferred over the directly connected OSPF routes (AD 110). I currently do not have separate sets of WAN routers, so my WAN/MPLS connections terminate on my core routers, hence the eBGP vs OSPF route preference issue.

 

Basically, I need the WAN subnets preferred via eBGP and the LAN subnets preferred via OSPF, with each acting as a failover for the other. I'm trying as much as possible to accomplish this without relying on prefix lists and ACLs, keeping it all dynamic. Our data centers contain hundreds of prefixes, so listing them and maintaining that list would be a pain. The same goes for summarizing or aggregation.

 

Some things I've tried or looked into:

1. Raise AD of eBGP, or lower AD of OSPF. This creates a secondary issue, since it is basically 'all or nothing.' I can get the OSPF routes preferred over eBGP, but then it also prefers ALL routes learned through eBGP from the opposite data center over OSPF, and tries to route the long way around to get to WAN sites. 

2. Run iBGP across the two data centers rather than two separate ASNs. I actually got a config working for this, because I can use weight/path to prefer iBGP routes to eBGP, then OSPF will beat out iBGP routes. I'd like to stay away from this just thinking long term, separate ASNs for the two data centers seems to be a better way to go, especially as we might look at converting our current point-to-point links over to MPLS. 

3. Use a VRF for MPLS/WAN BGP. I was able to get this working in a lab, on newer code versions using the VRF import/export commands. I was hoping that the VRF BGP routes would show up differently in the global routing table, allowing me to control them more like OSPF routes with cost. However it looks like they are still importing into the GRT as BGP routes with AD 20. With BGP running in a VRF, and OSPF running in the GRT, is there any way to get those VRF  routes into the GRT as OSPF E1 routes, or similar? 

4. Use a VRF with GRE tunnels and OSPF neighbors between VRF and global. I haven't tested this one, but it seems like it would work, just a bit hacky and probably not great for production. 

5. Use a separate set of WAN routers to terminate WAN/MPLS BGP, then connect those to core via OSPF. I'm sure this would resolve all my issues, but getting the extra hardware may not happen. Trying to do this via VRFs is my best chance as of right now. 

6. Route tagging/communities/filters. I can't find a good way to get any of this working, since both data centers are in OSPF area 0, I can't find any unique values to do matching on (without having to list specific prefixes, or advertising routers, etc).

 

This 'seems' like a fairly straightforward setup; hopefully I am just missing something and there is a better way to do this. Any advice is appreciated.

 

Capture.PNG

11 Replies

Francesco Molino
VIP Alumni
Hi

Before giving you any advice or configs, can you please share the routing (BGP and OSPF) configs on the DC routers?
Another question: are all the subnets in the DCs contiguous or not?

Thanks
Francesco
PS: Please don't forget to rate and select as validated answer if this answered your question

Current configs are simple, just standard OSPF, everything relevant is in area 0. There are two 10g point to point links between the DCs, so we are just doing ECMP between them. The new WAN, MPLS and BGP do not exist yet. 

 

Our route table has about 1000 subnets. DC1 has quite a mix of subnets and masks in the 10/8 and 192.168/16 ranges, probably a couple hundred, plus random VPN routes. DC2 is much simpler, only about 50 or so route table entries, limited to a couple of 10.x/16s plus public DMZs and B2B VPNs. The remaining route table entries are all WAN sites via DMVPN on OSPF, all in the 172.16/12 range. That is why I say doing prefix lists or summaries is probably possible; I'm just looking for as maintenance-free a setup as possible.

Hi

 

OK, if you don't want to use any ACLs or prefix lists, I guess you don't want to use network statements either (I was thinking about the BGP backdoor command).

 

You can accomplish this by changing the AD.

Let me explain:

 - By default, eBGP has an AD of 20 and OSPF has 110

 - In OSPF, you can change the default distance per route type.

 - In BGP, you can change the AD per neighbor; let's say you set it to 115

 - In OSPF, you can define 110 for inter-area and intra-area routes and 200 for external routes

 

Here is an example (sorry, I did it using IOU as I'm not in the office and not able to do a real lab):

 

image.png

 

IOU3 advertises 192.168.10.0/24 subnet

IOU2 advertises 10.10.0.0/16 (few /24 subnets)

IOU1 advertises 10.1.0.0/16 (few /24 subnets)

 

I assume you have network statements in OSPF to advertise your local subnets. Am I right? (That's why I asked for your routing config.)

 

On DC1 and DC2:

 

router ospf 10
 distance ospf external 200

 

 

Here, if you aren't filtering anything in OSPF, routes that DC2 learns through BGP are redistributed into OSPF and reach DC1 as external routes, while DC2's locally connected subnets are learned as intra-area routes (and vice versa).

 

Now, in terms of BGP:

On DC1, for example, my BGP peer (IOU3) has IP 172.16.2.3. The command below says that every subnet I learn from this peer should have AD 115 instead of 20:

 

distance 115 172.16.2.3 0.0.0.0
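
Taken together, the two changes might sit in the configuration roughly as follows. This is only a sketch: the BGP AS number 65001 is a placeholder that does not come from the lab, and whether `distance` goes directly under `router bgp` or under `address-family ipv4` depends on how the BGP process is configured.

```
router ospf 10
 ! OSPF intra-area and inter-area routes keep the default AD of 110;
 ! only external (E1/E2) routes are raised to 200
 distance ospf external 200
!
router bgp 65001
 ! 65001 is a placeholder AS number
 ! routes learned from peer 172.16.2.3 get AD 115 instead of the eBGP default of 20
 distance 115 172.16.2.3 0.0.0.0
```

The resulting order of preference is OSPF internal (110), then BGP from this peer (115), then OSPF external (200).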

 

 

In terms of routing (RIB) on IOU1, you can see that the subnets from DC2 are in the RIB via OSPF, while all other routes external to DC2 are in the RIB via BGP:

 

O    10.10.0.0/24 [110/20] via 172.16.1.2, 00:09:53, Ethernet0/2
O    10.10.1.0/24 [110/20] via 172.16.1.2, 00:09:53, Ethernet0/2
O    10.10.2.0/24 [110/20] via 172.16.1.2, 00:09:53, Ethernet0/2

B     192.168.10.0/24 [115/0] via 172.16.2.3, 00:09:22

 

And if I shut down the BGP neighbor on DC1, 192.168.10.0/24 is learned as an OSPF E2 route:

O E2  192.168.10.0/24 [200/1] via 172.16.1.2, 00:00:05, Ethernet0/2

 

This is a quick example showing that routes learned from remote sites have an AD of 115 instead of 20, but are still preferred via BGP rather than OSPF, and fall back to OSPF (AD 200) if BGP is down on one router.

 

 


Thanks
Francesco

Thanks for that explanation, I knew there had to be some workaround I wasn't seeing. I tested in my lab and it appears to be working, so that solved about 90% of the issue. Unfortunately I also have a lot of External routes already in OSPF, mostly due to static route redistribution.

 

So now how do I handle the existing external routes? Let's say DC1 has a redistributed static route to a non-OSPF router. DC2 learns that through OSPF as an E2 route with AD 110. If I change the OSPF external AD to 200, that route is now an E2 route with AD 200. I add BGP to DC1, and it advertises that same route to MPLS. MPLS then advertises it over to DC2. I've increased the BGP AD to 190, so DC2 installs that BGP route with AD 190, overriding the OSPF E2 route at AD 200. It's the same issue as before, but now only for the existing external routes (internal routes are fixed, since AD 110 beats the BGP AD of 190).

 

 

I can set a BGP Community when redistributing External OSPF routes to MPLS (route-map match route-type external). Then on the opposite DC, filter that community out of BGP. I will lose those external routes in the event of a DC-DC failure, but keep all the internal routes.. Or in cases where I have the same E2 or E1 route out of both data centers (VPN failover, etc.) I will split-brain those routes.
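
A minimal sketch of that tagging idea, for illustration only. The route-map names, the community-list name, the community value 65501:100, the AS numbers, and the neighbor addresses (192.0.2.x) are all placeholders, and it assumes the MPLS provider passes communities through unchanged.

```
! --- DC1: tag OSPF external routes as they are redistributed into BGP ---
ip bgp-community new-format
!
route-map OSPF-TO-BGP permit 10
 match route-type external
 set community 65501:100
route-map OSPF-TO-BGP permit 20
!
router bgp 65501
 redistribute ospf 1 match internal external 1 external 2 route-map OSPF-TO-BGP
 neighbor 192.0.2.1 remote-as 10000
 neighbor 192.0.2.1 send-community
!
! --- DC2: drop anything carrying that tag, accept the rest ---
ip bgp-community new-format
ip community-list standard DC1-EXTERNAL permit 65501:100
!
route-map MPLS-IN deny 10
 match community DC1-EXTERNAL
route-map MPLS-IN permit 20
!
router bgp 65502
 neighbor 192.0.2.5 remote-as 10000
 neighbor 192.0.2.5 route-map MPLS-IN in
```

The empty `permit 20` entries let untagged routes through, so only the tagged externals are dropped on the far side, which is exactly why they are lost if the DC-to-DC links fail.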

 

 

For this purpose, are you OK with using prefix lists or ACLs?

Without that it's going to be difficult.

 

I'm thinking about 2 solutions:

- Use an ACL to match all these subnets and increase the AD for that ACL

OR

- Use a prefix list and conditional route injection. For example, you check whether DC1 receives a DC2 route via OSPF and, if it does not, advertise those subnets into BGP. The goal is to validate that the OSPF peering is up.

 

 


Thanks
Francesco

I was thinking along the same lines as you. I was playing with conditional advertisement using advertise-map; I just don't have it completely working yet. My plan was to match an OSPF route as you mention, using two advertise maps: one with an exist-map for normal operations that would advertise all routes and set a community like 999, and one with a non-exist-map that would advertise all routes with a community like 888. Then on DC2, I could set my inbound route-map to deny 999 routes but permit 888 routes, which would only be sent when OSPF is down. As I said, I don't have this completely working yet. Right now it seems like one of the maps works and advertises all the routes, but during a failover it stops advertising any routes at all, so I need to figure that out. I'm not using a prefix list or ACL for the advertisement, though; I'm just trying to match a route-map that does a 'permit all' on all the existing routes. Maybe this is not supported with advertise-map?

Can you share the config you're trying to push?

Thanks
Francesco

Here is one of the configs. It's still inconsistent, and I'm not yet seeing where the issue is. I have two routers set up this way; one will advertise correctly and the other one won't.

 

router ospf 1
 router-id 20.20.99.2
 redistribute bgp 65502 metric 1000 subnets route-map BGPtoOSPF
 passive-interface default
 no passive-interface FastEthernet1/0
 no passive-interface FastEthernet2/0
 no passive-interface FastEthernet3/0
 no passive-interface FastEthernet4/0
 distance ospf external 200
!
router bgp 65502
 bgp router-id 20.20.99.2
 bgp log-neighbor-changes
 timers bgp 10 30
 neighbor 172.16.22.1 remote-as 10000
 neighbor 172.16.22.1 description MPLS2
 !
 address-family ipv4
  redistribute ospf 1 match internal external 1 external 2 route-map OSPFtoBGP
  neighbor 172.16.22.1 activate
  neighbor 172.16.22.1 send-community
  neighbor 172.16.22.1 advertise-map MPLS-BGP-OUT-NORMAL exist-map DC1-ROUTE-CHECK1
  neighbor 172.16.22.1 advertise-map MPLS-BGP-OUT-FAILOVER non-exist-map DC1-ROUTE-CHECK2
  neighbor 172.16.22.1 soft-reconfiguration inbound
  neighbor 172.16.22.1 route-map MPLS-BGP-IN in
  distance 150 172.16.22.1 0.0.0.0
 exit-address-family
!
ip forward-protocol nd
no ip http server
no ip http secure-server
!
ip bgp-community new-format
ip community-list standard DC1-ALLOW permit 65501:777
ip community-list standard DC1-BLOCK permit 65501:666
!
!
!
ip prefix-list DC1-PREFIX-LIST seq 10 permit 10.10.99.10/32
!
route-map DC1-ROUTE-CHECK1 permit 10
 match ip address DC1-PREFIX-LIST
 match source-protocol ospf 1
!
route-map DC1-ROUTE-CHECK2 permit 10
 match ip address DC1-PREFIX-LIST
 match source-protocol ospf 1
!
route-map MPLS-BGP-OUT-NORMAL permit 10
 set community 65502:666
!
route-map MPLS-BGP-OUT-FAILOVER permit 10
 set community 65502:777
!
route-map BGPtoOSPF permit 10
 set tag 10000777
!
route-map OSPFtoBGP permit 10
!
route-map MPLS-BGP-IN deny 5
 match community DC1-BLOCK
!
route-map MPLS-BGP-IN permit 10
 match community DC1-ALLOW
!
route-map MPLS-BGP-IN permit 20
 set community 10000:777
!
!

The route I'm watching for is in the route table:

O        10.10.99.10/32 [110/52] via 10.99.2.1, 00:10:28, FastEthernet1/0

 

But it's still using the 'failover' map:
B-CORE2#sh ip bgp neighbors 172.16.22.1 | inc Condition
  Condition-map DC1-ROUTE-CHECK1, Advertise-map MPLS-BGP-OUT-NORMAL, status: Withdraw
  Condition-map DC1-ROUTE-CHECK2, Advertise-map MPLS-BGP-OUT-FAILOVER, status: Advertise
But it's not advertising ANY routes:
B-CORE2#sh ip bgp neigh 172.16.22.1 advertised-routes

Total number of prefixes 0
Here is a full dump of the other router. This one IS matching the advertise-map and advertising routes. There are some differences in this config (I was doing more tweaking on the first one, trying different things), but the first one started out this same way, still with inconsistent results. So I must have something wrong somewhere...

router ospf 1
 router-id 10.10.99.2
 redistribute bgp 65501 metric 1000 subnets route-map BGPtoOSPF
 passive-interface default
 no passive-interface FastEthernet1/0
 no passive-interface FastEthernet2/0
 no passive-interface FastEthernet3/0
 no passive-interface FastEthernet4/0
 distance ospf external 200
!
router bgp 65501
 bgp router-id 10.10.99.2
 bgp log-neighbor-changes
 timers bgp 10 30
 neighbor 172.16.12.1 remote-as 10000
 neighbor 172.16.12.1 description MPLS1
 !
 address-family ipv4
  redistribute ospf 1 match internal external 1 external 2 route-map OSPFtoBGP
  neighbor 172.16.12.1 activate
  neighbor 172.16.12.1 send-community
  neighbor 172.16.12.1 advertise-map MPLS-BGP-OUT-NORMAL exist-map DC2-ROUTE-CHECK
  neighbor 172.16.12.1 advertise-map MPLS-BGP-OUT-FAILOVER non-exist-map DC2-ROUTE-CHECK
  neighbor 172.16.12.1 soft-reconfiguration inbound
  neighbor 172.16.12.1 route-map MPLS-BGP-IN in
  distance 150 172.16.12.1 0.0.0.0
 exit-address-family
!
ip forward-protocol nd
no ip http server
no ip http secure-server
!
ip bgp-community new-format
ip community-list standard DC2-BLOCK permit 65502:666
ip community-list standard DC2-ALLOW permit 65502:777
!
!
ip prefix-list DC2-PREFIX-LIST seq 10 permit 20.20.99.10/32
no cdp log mismatch duplex
!
route-map MPLS-BGP-OUT-NORMAL permit 10
 set community 65501:666
!
route-map DC2-ROUTE-CHECK permit 10
 match ip address prefix-list DC2-PREFIX-LIST
 match source-protocol ospf 1
!
route-map MPLS-BGP-OUT-FAILOVER permit 10
 set community 65501:777
!
route-map BGPtoOSPF permit 10
 set tag 10000777
!
route-map OSPFtoBGP permit 10
!
route-map MPLS-BGP-IN deny 5
 match community DC2-BLOCK
!
route-map MPLS-BGP-IN permit 10
 match community DC2-ALLOW
!
route-map MPLS-BGP-IN permit 20
 set community 10000:777
!
!








O*E1  0.0.0.0/0 [200/2] via 10.5.1.2, 00:14:08, FastEthernet4/0
      10.0.0.0/8 is variably subnetted, 12 subnets, 2 masks
C        10.1.1.0/24 is directly connected, FastEthernet3/0
L        10.1.1.2/32 is directly connected, FastEthernet3/0
C        10.1.3.0/24 is directly connected, FastEthernet2/0
L        10.1.3.1/32 is directly connected, FastEthernet2/0
C        10.5.1.0/24 is directly connected, FastEthernet4/0
L        10.5.1.1/32 is directly connected, FastEthernet4/0
C        10.10.99.2/32 is directly connected, Loopback0
O        10.10.99.10/32 [110/2] via 10.5.1.2, 00:14:08, FastEthernet4/0
O        10.10.99.11/32 [110/2] via 10.5.1.2, 00:14:08, FastEthernet4/0
O        10.10.102.1/32 [110/2] via 10.5.1.2, 00:14:08, FastEthernet4/0
C        10.99.2.0/24 is directly connected, FastEthernet1/0
L        10.99.2.1/32 is directly connected, FastEthernet1/0
      20.0.0.0/8 is variably subnetted, 7 subnets, 2 masks
O        20.1.1.0/24 [110/41] via 10.99.2.2, 00:14:08, FastEthernet1/0
O        20.1.3.0/24 [110/41] via 10.99.2.2, 00:14:08, FastEthernet1/0
O        20.5.1.0/24 [110/41] via 10.99.2.2, 00:14:08, FastEthernet1/0
O        20.20.99.2/32 [110/41] via 10.99.2.2, 00:14:08, FastEthernet1/0
O        20.20.99.10/32 [110/42] via 10.99.2.2, 00:14:08, FastEthernet1/0
O        20.20.99.11/32 [110/42] via 10.99.2.2, 00:14:08, FastEthernet1/0
O        20.20.102.1/32 [110/42] via 10.99.2.2, 00:14:08, FastEthernet1/0
      172.16.0.0/16 is variably subnetted, 5 subnets, 2 masks
C        172.16.12.0/24 is directly connected, FastEthernet0/0
L        172.16.12.2/32 is directly connected, FastEthernet0/0
O E2     172.16.100.0/24 [200/1000] via 10.99.2.2, 00:14:08, FastEthernet1/0
O E2     172.16.101.0/24 [200/1000] via 10.99.2.2, 00:14:08, FastEthernet1/0
O E2     172.16.102.0/24 [200/1000] via 10.99.2.2, 00:14:08, FastEthernet1/0
A-CORE2#sh ip bgp neighbors 172.16.12.1 | inc Condition
  Condition-map DC2-ROUTE-CHECK, Advertise-map MPLS-BGP-OUT-NORMAL, status: Advertise
  Condition-map DC2-ROUTE-CHECK, Advertise-map MPLS-BGP-OUT-FAILOVER, status: Withdraw
A-CORE2#sh ip bgp neighbors 172.16.12.1 advertised-routes
BGP table version is 69, local router ID is 10.10.99.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
              x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
 *>  10.1.1.0/24      0.0.0.0                  0         32768 ?
 *>  10.1.3.0/24      0.0.0.0                  0         32768 ?
 *>  10.5.1.0/24      0.0.0.0                  0         32768 ?
 *>  10.10.99.2/32    0.0.0.0                  0         32768 ?
 *>  10.10.99.10/32   10.5.1.2                 2         32768 ?
 *>  10.10.99.11/32   10.5.1.2                 2         32768 ?
 *>  10.10.102.1/32   10.5.1.2                 2         32768 ?
 *>  10.99.2.0/24     0.0.0.0                  0         32768 ?
 *>  20.1.1.0/24      10.99.2.2               41         32768 ?
 *>  20.1.3.0/24      10.99.2.2               41         32768 ?
 *>  20.5.1.0/24      10.99.2.2               41         32768 ?
 *>  20.20.99.2/32    10.99.2.2               41         32768 ?
 *>  20.20.99.10/32   10.99.2.2               42         32768 ?
 *>  20.20.99.11/32   10.99.2.2               42         32768 ?
 *>  20.20.102.1/32   10.99.2.2               42         32768 ?
 *>  172.16.100.0/24  10.99.2.2             1000         32768 ?
 *>  172.16.101.0/24  10.99.2.2             1000         32768 ?
 *>  172.16.102.0/24  10.99.2.2             1000         32768 ?

Total number of prefixes 18
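
One difference between the two posted configs that may be worth checking: on A-CORE2 the condition route-map uses `match ip address prefix-list DC2-PREFIX-LIST`, while on B-CORE2 both condition route-maps use `match ip address DC1-PREFIX-LIST` without the `prefix-list` keyword. In IOS, `match ip address <name>` without that keyword refers to an access list, and no ACL of that name appears in the posted B-CORE2 config. A sketch of the B-CORE2 condition route-maps written the same way as on the working router, offered as something to try rather than a confirmed fix:

```
route-map DC1-ROUTE-CHECK1 permit 10
 match ip address prefix-list DC1-PREFIX-LIST
 match source-protocol ospf 1
!
route-map DC1-ROUTE-CHECK2 permit 10
 match ip address prefix-list DC1-PREFIX-LIST
 match source-protocol ospf 1
```

Whether this accounts for the Withdraw/Advertise status seen on B-CORE2 is an assumption; a `clear ip bgp 172.16.22.1 soft out` after the change and a fresh look at the `Condition-map` status lines would show if it did.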

In your advertise map, can you add a match on a prefix list containing 0.0.0.0/0 and run a BGP debug to see what the router is doing?

Have you tried shutting down the peer and bringing it back up?

Thanks
Francesco

I haven't been able to get my lab back online to try this, but I did at one point have a 'match' statement with a prefix list of 0.0.0.0/0, and it did not help, so I took it back off (a route-map permit entry without a match statement should match all routes by default).

 

I haven't tried shutting the peer down and bringing it back, but I've shut down and rebooted the entire lab a couple of times between testing sessions.

Let me know when your lab is back up to do more tests.
If you have a topology design with your configs, I can build the lab myself and come back with answers.

Thanks
Francesco