OSPF default-information originate head scratcher

markah · ‎12-10-2020

Hi,

I'm looking at a current issue where we have two ASBR routers that also peer with BGP neighbors, from which the default route is received. Local preference configuration on the ASBR routers influences the preferred outbound path, and ASBR-1 installs the eBGP default into its routing table. ASBR-2 has an iBGP entry for the default route, as the two ASBRs peer with each other.

The same two ASBR routers are configured with the 'default-information originate' command in OSPF. Both ASBRs then connect to a downstream switch configured to run OSPF (everything is in area 0).

What appears to be happening is that ASBR-1, with the eBGP default in its routing table, starts advertising the default to its OSPF neighbors (ASBR-2 and the downstream switch). ASBR-2 then installs this OSPF-learnt default in its routing table, as it has a better AD than the iBGP-learnt route.
Now that ASBR-2 has both the iBGP-learnt and the OSPF-learnt default entries, it prefers the OSPF route learnt from ASBR-1. At this point ASBR-2 starts advertising the default route itself, because it has the default-information originate command and a valid default route in its routing table.

Is this normal behaviour?

This behaviour is leading to a routing loop because both ASBR-1 and ASBR-2 are now advertising the default route to the downstream switch. The downstream switch now has equal-cost default routes via either ASBR. Due to link costs, ASBR-2's best path to the default (advertised by ASBR-1) is via the downstream switch. This results in some traffic from ASBR-2 following the default to the downstream switch, which then forwards some of that traffic back to ASBR-2, creating a loop.

I'd like to add that this issue isn't my own creation and I have a couple of solutions in mind. I'm more curious whether this is a bug or just down to misconfiguration.

One more point is that ASBR-2 will at times stop advertising the default. When I issue the show ip database external 0.0.0.0 detail (or something similar, not at a device to check syntax) at this time there is only one Type-5 LSA which is via ASBR-1.

Thanks

pigallo · ‎12-10-2020

Hello,

Before suspecting a bug: since you noticed the default flapping on ASBR2, enable IP routing debugging to check whether the route is being installed and withdrawn periodically.

If yes then possibly you have a control plane loop.

Richard Burts · ‎12-10-2020

The original poster asks "Is this normal behaviour?" There are things that we do not know about this environment and that might change my response but based on what we know so far I believe that this is indeed normal behavior. The important parts are that there are 2 ASBR and that one is preferred, and that the path from one ASBR to the other is through the switch. So if ASBR2 has a packet for outside it will attempt to forward it to ASBR1. But the path to ASBR1 is through the switch and the switch may have a default route pointing back to ASBR2. This results in the switch sometimes sending the packet back to ASBR2. I do not see anything that looks like a bug, but perhaps a bit of an awkward design.
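To make the forwarding behaviour described above concrete, here is a minimal Python sketch that enumerates the paths a default-routed packet could take. The next-hop tables are an assumption reconstructed from the description in this thread (no diagram or routing table output was posted), and the names `trace_paths`, `BOTH_ORIGINATE` and `ONLY_ASBR1_ORIGINATES` are made up for illustration.

```python
# Hypothetical default-route next hops, reconstructed from the thread's description.
BOTH_ORIGINATE = {
    "ASBR-1": ["ISP"],               # eBGP default, traffic leaves the network
    "ASBR-2": ["SWITCH"],            # OSPF default, best path to ASBR-1 is via the switch
    "SWITCH": ["ASBR-1", "ASBR-2"],  # two equal-cost Type-5 defaults
}

# Same topology, but only ASBR-1 advertises the default into OSPF.
ONLY_ASBR1_ORIGINATES = dict(BOTH_ORIGINATE, SWITCH=["ASBR-1"])


def trace_paths(tables, start):
    """Follow every possible ECMP choice and report how each path ends."""
    results = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == "ISP":
            results.append(("delivered", path))
        elif node in path[:-1]:
            # The packet has come back to a router it already visited.
            results.append(("loop", path))
        else:
            for next_hop in tables[node]:
                stack.append(path + [next_hop])
    return sorted(results)


if __name__ == "__main__":
    for outcome, path in trace_paths(BOTH_ORIGINATE, "ASBR-2"):
        print(outcome, " -> ".join(path))
```

With both ASBRs originating, one of the two possible paths from ASBR-2 returns to ASBR-2. With only ASBR-1 originating, the sketch finds a single delivered path.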

It seems to me that there are several things that you could do to resolve this issue:

- manipulate the OSPF cost for the interfaces connecting the switch and ASBR2 and make this route less attractive than the route from ASBR1.

- configure a GRE tunnel running between ASBR1 and ASBR2 and run OSPF on that tunnel. With this the ASBRs will believe that they are communicating directly. ASBR2 would send its traffic for ASBR1 via GRE. Physically it goes through the switch, but the switch would no longer see a packet which needs the default route.

- remove the local preference. Let each ASBR learn its own default route. Then ASBR2 would no longer try to forward to ASBR1 when it needs to forward to outside.

HTH

Rick

markah · ‎12-10-2020

Thanks both for your responses and recommendations, I will possibly attempt the debug suggestion but may have to go with the quickest solution with reconfiguration.

For additional background, ASBR-1 connects to ASBR-2 with a 1G link, while both ASBR-1 and ASBR-2 connect to the downstream switch with 10G links, so it is probably preferable to keep this as the path.

I was thinking of using a route-map on the default-information originate command on ASBR-1 to set the metric-type to type-1 (N7K switch). I think this would resolve the issue: ASBR-2 is advertising the default as a Type-2, so the downstream switch would prefer the default via ASBR-1.
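The reasoning rests on the OSPF rule that a Type-1 external path is preferred over a Type-2 external path for the same prefix, whatever the metrics are. A minimal Python sketch of that comparison follows. The metric and cost values are made-up illustrations, not values from this network, and `ExternalRoute`, `preference_key` and `best_external` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalRoute:
    asbr: str
    metric_type: int      # 1 (E1) or 2 (E2)
    external_metric: int  # metric carried in the Type-5 LSA
    cost_to_asbr: int     # internal OSPF cost to reach the advertising ASBR


def preference_key(route):
    """Lower tuples win: E1 before E2, then the metric that type compares."""
    if route.metric_type == 1:
        # E1 compares internal cost plus external metric.
        return (1, route.external_metric + route.cost_to_asbr, 0)
    # E2 compares the external metric first, internal cost only as a tie-break.
    return (2, route.external_metric, route.cost_to_asbr)


def best_external(routes):
    return min(routes, key=preference_key)


if __name__ == "__main__":
    # Illustrative values only (assumptions, not taken from the thread).
    candidates = [
        ExternalRoute("ASBR-1", metric_type=1, external_metric=1, cost_to_asbr=4),
        ExternalRoute("ASBR-2", metric_type=2, external_metric=1, cost_to_asbr=4),
    ]
    print(best_external(candidates).asbr)
```

In this sketch the switch would pick the default via ASBR-1 even if ASBR-2 keeps originating its own Type-2 default, which is the effect the route-map is meant to achieve.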

Georg Pauwen · ‎12-10-2020

Hello,

whether it is a bug or not, what you want is to have only one, the active, ASBR advertise the default route into OSPF. You could do this with EEM scripts.

Let's say the ASBRs are configured as below:

ASBR1 (Primary ISP)

router bgp 1
neighbor 192.168.12.2 remote-as 1
neighbor 192.168.12.2 next-hop-self
neighbor 1.1.1.2 remote-as 2
neighbor 1.1.1.2 route-map LOCAL_PREF_RM in
!
route-map LOCAL_PREF_RM permit 10
set local-preference 150
!
router ospf 1
default-information originate

ASBR2 (Secondary ISP)

router bgp 1
neighbor 192.168.12.1 remote-as 1
neighbor 192.168.12.1 next-hop-self
neighbor 2.2.2.2 remote-as 3
!
router ospf 1
default-information originate

This configuration leads to the problem you describe. Now, the EEM scripts below would check the origin of the default route in BGP. If ISP1 is up, ASBR2 does not advertise the default route into OSPF, and vice versa. The scripts would look like below (I set them to run every 30 seconds, you can change that value, or trigger the script based on a syslog message that occurs when the eBGP neighbor goes down):

ASBR1

event manager applet NO_DEFAULT_ROUTE
event timer watchdog time 30
action 1.0 cli command "enable"
action 2.0 cli command "show ip bgp topology * | inc 0.0.0.0"
action 3.0 string match "*192.168.12.2*" "$_cli_result"
action 4.0 if $_string_result eq "1"
action 5.0 cli command "conf t"
action 6.0 cli command "router ospf 1"
action 7.0 cli command "no default-information originate"
action 8.0 cli command "end"
action 9.0 end
!
event manager applet DEFAULT_ROUTE
event timer watchdog time 30
action 1.0 cli command "enable"
action 2.0 cli command "show ip bgp topology * | inc 0.0.0.0"
action 3.0 string match "*1.1.1.2*" "$_cli_result"
action 4.0 if $_string_result eq "1"
action 5.0 cli command "conf t"
action 6.0 cli command "router ospf 1"
action 7.0 cli command "default-information originate"
action 8.0 cli command "end"
action 9.0 end

ASBR2

event manager applet NO_DEFAULT_ROUTE
event timer watchdog time 30
action 1.0 cli command "enable"
action 2.0 cli command "show ip bgp topology * | inc 0.0.0.0"
action 3.0 string match "*192.168.12.1*" "$_cli_result"
action 4.0 if $_string_result eq "1"
action 5.0 cli command "conf t"
action 6.0 cli command "router ospf 1"
action 7.0 cli command "no default-information originate"
action 8.0 cli command "end"
action 9.0 end
!
event manager applet DEFAULT_ROUTE
event timer watchdog time 30
action 1.0 cli command "enable"
action 2.0 cli command "show ip bgp topology * | inc 0.0.0.0"
action 3.0 string match "*2.2.2.2*" "$_cli_result"
action 4.0 if $_string_result eq "1"
action 5.0 cli command "conf t"
action 6.0 cli command "router ospf 1"
action 7.0 cli command "default-information originate"
action 8.0 cli command "end"
action 9.0 end
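For anyone new to EEM, the decision the two applets make on each timer tick can be sketched in Python. This is only a model of the logic, not something that runs on the router: `applet_actions` is a hypothetical name, the sample CLI lines are an assumed approximation of BGP table output, and `fnmatchcase` stands in for the glob-style matching of the EEM `string match` action.

```python
from fnmatch import fnmatchcase


def applet_actions(cli_result, ibgp_peer, ebgp_peer):
    """Return the OSPF commands the two applets would issue for one timer tick."""
    commands = []
    # NO_DEFAULT_ROUTE: the default is seen via the iBGP peer, so stop originating.
    if fnmatchcase(cli_result, f"*{ibgp_peer}*"):
        commands.append("no default-information originate")
    # DEFAULT_ROUTE: the default is seen via the local eBGP peer, so originate.
    if fnmatchcase(cli_result, f"*{ebgp_peer}*"):
        commands.append("default-information originate")
    return commands


if __name__ == "__main__":
    # Assumed sample lines as seen on ASBR2; the real output format may differ.
    via_ibgp = " *>i 0.0.0.0          192.168.12.1      0    150      0 2 i"
    via_ebgp = " *>  0.0.0.0          2.2.2.2           0             0 3 i"
    print(applet_actions(via_ibgp, "192.168.12.1", "2.2.2.2"))
    print(applet_actions(via_ebgp, "192.168.12.1", "2.2.2.2"))
```

The sketch also shows that the two applets are independent: if the filtered output ever contained both next hops, both commands would be issued in the same cycle, so it is worth checking what the filtered show command actually returns on your platform.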

paul driver · ‎12-10-2020

Hello

@markah wrote:

One more point is that ASBR-2 will at times stop advertising the default. When I issue the show ip database external 0.0.0.0 detail (or something similar, not at a device to check syntax) at this time there is only one Type-5 LSA which is via ASBR-1

Are you performing mutual redistribution? It's possible you're hitting a race condition between OSPF and BGP: when the BGP route is withdrawn and readvertised, the OSPF route is still preferred.

Added to that, with BGP/OSPF mutual redistribution, when BGP is redistributed into OSPF the BGP ASN is set as a route tag in the OSPF LSA. However, when OSPF is redistributed back into BGP, that route tag (the original AS-path information) and the origin of the route are lost.
Can you post a topology diagram please to show how your rtrs are physically connected.

Please rate and mark as an accepted solution if you have found any of the information provided useful.
This then could assist others on these forums to find a valuable answer and broadens the community’s global network.

Kind Regards
Paul

markah · ‎12-12-2020

Sorry for the delay in response to your help. I couldn't access the sign in page yesterday.

Georg, your suggestion looks interesting. Unfortunately I've never configured EEM scripts before, so it would be a bit of a learning curve to implement. One of the downsides of having to support so many different products is finding the time to try something different. It's one I will try to lab. Thanks for the suggestion.

Paul, there is no mutual redistribution in this solution; subnets are advertised into BGP using network statements. I will take a look for the network diagram next week. I've not had much chance the last couple of days as I've been bombarded with incidents.

The issue actually presented itself when one of the 7Ks was replaced with a newer model. The old configuration didn't have default-information originate on either ASBR, but I think it was mistakenly assumed that redistribution of the default from BGP into OSPF was possible and was working before the switch replacement (from all I've read, it has never been possible to redistribute the default into OSPF). After the replacement, another part of the business connected to the downstream switches claimed they couldn't access several networks. I think that, rather than determining what they couldn't access and then advertising the relevant networks accordingly, the quick fix was to add the default-information originate command, in the belief that the default route was being advertised before the change. I never got to see the routing tables before the switch replacement, so I can't be sure. I only got involved when the incident was raised a few weeks later (the issue is intermittent).

Life as a support engineer...

Georg Pauwen · ‎12-12-2020

Hello,

the EEM scripts are fairly easy to implement actually. If you can post the running configs of your ASBRs, I can customize the scripts, you can then pretty much cut & paste them into your configuration.

paul driver · ‎12-12-2020

Hello
Okay, no mutual redistribution is occurring. So if I visualize your topology correctly, the two WAN rtrs are running BGP/OSPF, advertising OSPF networks into BGP and advertising default routes via OSPF into your LAN?
If so, you can simply advertise one of the OSPF defaults from the preferred WAN rtr with OSPF metric-type 1, so that the LAN will always take this path. If the BGP default route on that WAN rtr is lost, its OSPF default will be withdrawn from the LAN and the default from the other WAN rtr will be used instead.

WAN rtr 1 <preferred path>
router ospf x
default-information originate metric-type 1

WAN rtr 2
router ospf x
default-information originate


Kind Regards
Paul

markah · ‎12-14-2020

Hi Paul, I was thinking that your recommendation above would be the quickest and simplest fix, although the Nexus 7706 doesn't allow me to set the metric type in the default-information originate command, so I will need to configure a route-map to set type-1.

DC1-XXX-DS1-GI%GI(config-router-vrf)# default-information originate ?
<CR>
always Always advertise default route
route-map Policy to control distribution of default route

MHM Cisco World · ‎12-14-2020

..

MHM Cisco World · ‎12-12-2020

If this is your topology, then have the edge router that runs OSPF inject the default route.

This prevents the loop.

That is the best solution.

markah · ‎12-14-2020

Hopefully this woeful diagram might paint the pic. This design was implemented years before my time I'd like to add.

The firewalls separate 2 different business units.

ASBR-1 has an eBGP default route from Switch-1 (best local-pref), which it advertises into OSPF with the default-information originate command. ASBR-2 prefers and installs the OSPF default route advertised by ASBR-1. The 10Gig links make the best path from ASBR-2 to ASBR-1 go via the downstream switch. Because ASBR-2 is also configured with default-information originate, ASBR-2 also 'sometimes' starts advertising a default route. The downstream switch at this point has two equal-cost default routes, one with a next hop of ASBR-1 and the other with a next hop of ASBR-2. This leads to some traffic from ASBR-2 routing to the downstream switch, and some of that traffic gets routed back, forming the loop. We also have vPCs and HSRP on the ASBRs, so either switch can be the forwarding switch for the traffic coming from the L3 SVIs.

MHM Cisco World · ‎12-14-2020

eBGP with OSPF is OK because the eBGP AD is 20 and the OSPF AD is 110.

The issue is with iBGP and OSPF, because the iBGP AD is 200.
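Using the AD values quoted above, the route source each ASBR installs for the default can be sketched in a few lines of Python. The function name `installed_source` and the candidate lists are illustrative assumptions based on the behaviour described earlier in the thread.

```python
# Administrative distances as quoted in this post (lower is preferred).
ADMIN_DISTANCE = {"eBGP": 20, "OSPF": 110, "iBGP": 200}


def installed_source(candidates):
    """Pick the route source with the lowest administrative distance."""
    return min(candidates, key=lambda source: ADMIN_DISTANCE[source])


if __name__ == "__main__":
    # ASBR-1 holds its own eBGP default; an OSPF default from ASBR-2 cannot displace it.
    print("ASBR-1 installs:", installed_source(["eBGP", "OSPF"]))
    # ASBR-2 holds the iBGP default plus the OSPF default originated by ASBR-1.
    print("ASBR-2 installs:", installed_source(["iBGP", "OSPF"]))
```

This matches what the original poster observed: ASBR-2 ends up with an OSPF default in its routing table even though it also has the iBGP one.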

The ASR learns the default from the ASA through eBGP, so it has it in its routing table.

The ASR is now the ASBR for OSPF.

The ASR will inject the default route into OSPF.

The downstream switch now learns the default through OSPF from the ASR.

No default-information originate is needed downstream any more; all downstream devices learn the default from the ASR.

MHM Cisco World · ‎12-14-2020

I found the issue here

ASR-1 gets the default from the ASA because it connects to the active ASA failover peer.

ASR-2 does not get the default from the ASA because it connects to the standby failover peer.

You need a switch connecting the two ASRs to the ASA failover pair.