Null0 static routes for BGP aggregates cause redistribution into OSPF

robert.gillen
Level 1

Hey all,


We had an issue that caused a large outage. We logged a TAC case but didn't really get an answer as to why; essentially they confirmed that yes, it does happen, as we had told them after we were able to replicate the issue in the lab. So I'm hoping someone has some insight, mainly so we can understand what is happening. We have since resolved it.

To try to keep it simple, here is a summary of the issue:

  • Designs and simulations were all modeled on NXOS software
  • Temporary hardware (using IOS-XE) was installed due to chip shortages and delays in manufacturing
  • Advertising BGP aggregate subnets to new SDWAN VPN concentrators
  • BGP aggregates were advertised using "network command" under bgp
  • Network command routes were defined by Null0 static routes
    • ip route vrf BLUE 169.254.244.0 255.255.255.0 Null0 252
  • Routes received from BGP peers are redistributed into the local core network OSPF process
  • Issue: Null0 routes with network advertisements got automatically redistributed into OSPF - causing issues
  • Issue: doesn't occur on NXOS and investigation revealed issue could be replicated on IOS-XE devices (routers and switches)
  • Issue: doesn't occur with any other static routes and the network command (only those with destination null0)
  • it's not the redistribute connected statement

Fix:

  • tag static null0 routes
    • ip route vrf BLUE 169.254.244.0 255.255.255.0 Null0 252 tag 1234
  • deny tag 1234 in the route-map for BGP to OSPF redistribution

I'm stumped as to why, and I only have 2 assumptions: 1 - it's a bug, 2 - is it because Null0 routes are automatically created when using "aggregate-address x.x.x.x" in BGP?

 

Configs and example debug below.
BGP:
router bgp 65003
 bgp router-id 22.22.22.22
 bgp log-neighbor-changes
 timers bgp 30 90
 !
 address-family ipv4 vrf BLUE
  network 10.0.0.0 mask 255.0.0.0
  network 10.112.0.0 mask 255.255.0.0
  network 169.254.244.0 mask 255.255.255.0
  network 172.16.0.0 mask 255.240.0.0
  network 192.168.0.0 mask 255.255.0.0
  neighbor 10.66.253.5 remote-as 65001
  neighbor 10.66.253.5 activate
  neighbor 10.66.253.5 route-map all-DC-subnets-in in
  neighbor 10.66.253.5 route-map prepend out
 exit-address-family
!

OSPF:

!
router ospf 100 vrf WILCORP
 router-id 22.22.22.22
 capability vrf-lite
 area 0 authentication message-digest
 redistribute connected
 redistribute bgp 65003 route-map all
 passive-interface default
 no passive-interface Vlan1997
 no passive-interface TenGigabitEthernet1/1/1.1
 no passive-interface TenGigabitEthernet1/1/2.1
 network 10.66.253.33 0.0.0.0 area 0
 network 10.66.254.1 0.0.0.0 area 0
 network 10.66.254.5 0.0.0.0 area 0
!

route-map:

!
route-map all deny 1
 match ip address prefix-list all-DC-subnets
!
route-map all deny 2
 match tag 1234
!
route-map all permit 10
 match ip address prefix-list all
 set tag 1996
!

 

debug when the issue occurred:

  1. add null0 route
  2. add the network command
  3. LSA is generated


*Aug 4 04:22:24.500: BGP: Applying map to find origin for 169.254.244.0/24
*Aug 4 04:22:24.501: BGP: Applying map to find origin for 169.254.244.0/24
*Aug 4 04:22:24.501: BGP: Applying map to find origin for 169.254.244.0/24
*Aug 4 04:22:24.501: OSPF-100 LSGEN: Build external LSA 169.254.244.0, mask 255.255.255.0, type 5, age 0, options 0x20, seq 0x80000001
*Aug 4 04:22:24.501: OSPF-100 LSGEN:   MTID   Metric   Metric-type   FA        Tag    Topology Name
*Aug 4 04:22:24.501: OSPF-100 LSGEN:   0      1        2             0.0.0.0   1996   Base
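
A rough way to reproduce and check this in a lab, reusing the names from the configs in this post (VRF BLUE, OSPF process 100, BGP AS 65003). The show commands are standard IOS-XE commands, but their output differs by release, so none is reproduced here. One thing worth noting: the tag 1996 on the LSA in the debug output matches "set tag 1996" in route-map all, which suggests the LSA is built through "redistribute bgp 65003 route-map all" rather than through "redistribute connected".

```
! Step 1: add only the Null0 static route, then run the checks
ip route vrf BLUE 169.254.244.0 255.255.255.0 Null0 252
!
! Step 2: add the matching network statement, then run the checks again
router bgp 65003
 address-family ipv4 vrf BLUE
  network 169.254.244.0 mask 255.255.255.0
 exit-address-family
!
! Checks, run from exec mode after each step:
!   show ip route vrf BLUE 169.254.244.0
!   show ip bgp vpnv4 vrf BLUE 169.254.244.0/24
!   show ip ospf 100 database external 169.254.244.0
```

If the external LSA only appears after step 2, the network statement, and not the static route on its own, is what triggers the redistribution.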

 

2 Accepted Solutions


Richard Burts
Hall of Fame

There is much we do not know about this situation and perhaps some of that unknown information might change the suggestion that I have. But based on what is described in the original post I suggest that this is what is going on:

- the static route to null0 associates the subnet of the route with an interface (as opposed to a static specifying a next hop which is independent of any interface) and is treated as a locally connected subnet. There are many discussions in the community about this behavior when a static route specifies only an outbound interface and not a next hop.

- now that the subnet is associated with an interface the redistribute connected in OSPF causes the subnet to be redistributed into OSPF.

- The original post says "its not the redistribute connected statement". I wonder on what basis they determined this?

- The original post says "doesn't occur with any other static routes". I wonder if they tested with other static routes which specify only the outbound interface or tested just with "normal" static routes which specify a next hop? 

- the original post says that this behavior occurs on some platforms and does not occur on other platforms. I am surprised at this but accept that it is possible.

- the important thing is that IF you need the static route to null0 and IF you need to redistribute static then you may need a route map to filter out the null0 route.
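
If redistribute connected does turn out to be the source on a given platform, the filter suggested above might look like the sketch below. The prefix-list and route-map names (NULL0-AGGREGATES, CONNECTED-TO-OSPF) are made up for illustration, and this has not been tested against the setup in the original post.

```
! Hypothetical names: NULL0-AGGREGATES, CONNECTED-TO-OSPF
ip prefix-list NULL0-AGGREGATES seq 5 permit 169.254.244.0/24
!
! Drop the Null0 aggregate, let every other connected subnet through
route-map CONNECTED-TO-OSPF deny 10
 match ip address prefix-list NULL0-AGGREGATES
!
route-map CONNECTED-TO-OSPF permit 20
!
router ospf 100 vrf WILCORP
 redistribute connected route-map CONNECTED-TO-OSPF
```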

HTH

Rick


Hello @robert.gillen ,

you have provided an initial post with many details that describes a different behaviour between the current production devices running IOS XE and the devices used in the design and simulation phase, which were based on NX-OS (Nexus switches of some type).

This is not the first or only thread where differences in routing protocols and redistribution between IOS / IOS XE and NX-OS devices are described.

You have already found a workaround that uses a modified version of the route-map used to filter BGP routes redistribution into OSPF using a route-tag in the static routes and then denying routes having that route tag value.

 

If we review the more current documentation about redistribution in IPv4 unicast between two dynamic protocols we find that the conditions for prefixes to be passed from protocol A to protocol B are:

a) the prefix is installed in the IP routing table by protocol A ( BGP in your case)

b) a special case exists for connected routes that appear in the routing table as connected routes but they are also matching a network statement.

 

Now, you have static routes to null0 that match a network statement under router bgp. These network statements inject the corresponding prefixes into the BGP table as locally originated routes with next-hop 0.0.0.0 and weight 32768.

As noted by @Richard Burts  this type of configuration leads these static routes to be treated as similar to connected routes.

 

What is really interesting in your scenario is that in router OSPF configuration you have also a redistribute connected , but you have fixed the route leakage by changing the route-map applied to BGP into OSPF redistribution with no route-map applied to redistribute connected.

 

On the other hand, we know that BGP has its own BGP tables or RIB and the locally injected routes / prefixes are part of this and they are considered best path and advertised to BGP peer(s).

So this is probably the "grey zone of implementation":  the corresponding routes are installed in IP routing table as static routes with exit interface null0 ( it would be interesting to test using a static route to a physical interface to see if there is any change) but they are also prefixes in the BGP table locally injected and best path advertised to BGP peer(s).

 

From the BGP configuration point of view you could use aggregate-address .....  summary-only instead of static to null0 + network command, but you would need network commands in BGP for component subnets to trigger the advertising.

 

I personally prefer to use aggregate-address now instead of the combo static route to null0 + network command.

So I would suggest to try this way to see if  you see any changes.

Warning : when using aggregate-address you will need BGP network commands for some component subnets so you may need to filter them in redistribution of BGP into OSPF, in other words the problem can just shift to the component subnets of each aggregate.
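
As a sketch of the aggregate-address alternative described above: the component subnet 10.112.1.0/24 is a hypothetical example and is not taken from the original post. At least one more-specific route covered by the aggregate has to be in the BGP table before the aggregate is originated.

```
router bgp 65003
 address-family ipv4 vrf BLUE
  ! Hypothetical component subnet; it must exist in the VRF routing table
  network 10.112.1.0 mask 255.255.255.0
  ! Advertise only the aggregate and suppress the more-specific routes
  aggregate-address 10.112.0.0 255.255.0.0 summary-only
 exit-address-family
```

Because 10.112.1.0/24 is now a locally originated BGP prefix, it may itself need a deny entry in the route-map used for BGP into OSPF redistribution, which is the shift of the problem mentioned in the warning.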

 

Hope to help

Giuseppe

 


7 Replies

Hey Rick, cheers for the reply.

Interesting points, and to add some info:

- the static route to null0 associates the subnet of the route with an interface (There are many discussion in the community about this behavior when a static route specifies only an outbound interface and not a next hop):
I thought the same, though I tested this using 169.254 addresses (when there were no matching interfaces) and had the same result occur.

- The original post says "its not the redistribute connected statement". I wonder on what basis did they determine this?
This was my original assumption too, and I logged a TAC case regarding this. I was able to prove it wasn't (I'm pretty sure) by:
1 - added a single null0 route without adding advertisement into BGP - no OSPF propagation.
2 - added new loopback interface with 169.254 address - OSPF propagation occurs.

 

- The original post says "doesn't occur with any other static routes". I wonder if they tested with other static routes which specify only the outbound interface or tested just with "normal" static routes which specify a next hop? 

We tested with both a static route to a normal IP address and a static route with a loopback interface as the destination.
Our current WAN IPVPN configs all have this too with no issue.


- the important thing is that IF you need the static route to null0 and IF you need to redistribute static then you may need a route map to filter out the null0 route.

Yeah - no redistribute static has been configured or needed.

Rob

 

 


Rob

A couple of points:

- I should have proofread my reply better before I hit send. I said "if you need to redistribute static". You correctly point out that you have no redistribution of static. What I should have said was "if you need to redistribute connected". It is not important that there was a static route. I believe that it is important that the route is treated as a connected subnet.

@Giuseppe Larosa provides an interesting suggestion about using aggregate address. But even that approach might lead to the same issue.

- you mention a couple of times using 169.254 addresses. In terms of this behavior I believe that it does not matter whether the address is considered as routable or not. It matters whether a subnet is considered as locally connected or not.

- I am a bit surprised that you find that the behavior is not consistent over various platforms. But I accept that this may very well be the case. As the Cisco product line includes multiple OSes we observe that there are differences in details of the implementations. Giuseppe has described this as a gray area and I believe this is a valuable insight and I agree. I believe that engineers for those product lines would not regard what you report as a "bug". I believe that as users of multiple Cisco product lines we need to adapt to differences in implementation. On some products you do not need to worry about redistribution of null0 routes and on other products you do.

HTH

Rick

robert.gillen
Level 1

Very interesting thanks, @Giuseppe Larosa 

Now, you have static routes to null0 that match a network statement under router bgp. These network statements inject the corresponding prefixes in BGP table as locally originated routes with next-hop 0.0.0.0 and weight 32768.
I've re-read your b) comment here https://www.cisco.com/c/en/us/support/docs/ip/enhanced-interior-gateway-routing-protocol-eigrp/8606-redist.html


I see what @Richard Burts means now about the routes being treated as locally connected.
This would seem to be why and can see that from the lab as well looking at the BGP table.

*> 172.16.4.0/24   0.0.0.0         0     32768 i   Directly connected interface via network command
*> 192.168.1.0     10.100.254.19   41    32768 i   OSPF route via network command
*> 192.168.9.0     10.100.254.19   100   32768 i   Static route to IP address via network command (couldn't try one to a physical interface in a VRF)
*> 192.168.22.0    0.0.0.0         0     32768 i   Static route to Null0 via network command

For context, normally we would summarise these routes via OSPF and then just use those in the network statement. But due to legacy networks we haven't been able to do that in this instance, especially during the transition from a traditional WAN to an SDWAN.

Thanks both for the input,
Rob.

Rob

You are welcome. I am glad that our explanations have been helpful. Thank you for marking this question as solved. This will help other participants in the community to identify discussions which have helpful information. This community is an excellent place to ask questions and to learn about networking. I hope to see you continue to be active in the community.

HTH

Rick

Use the aggregate with the suppress feature.

This way you advertise only the DC subnet aggregate.
Then use the aggregate with a tag,
and deny routes with that tag when redistributing connected into OSPF.
