Re: BGP local route removal issue

stephon.a.it · ‎09-05-2024

Hey everybody,

I searched for a satisfactory answer but didn't find anything, so I figured I'd post the question for some feedback.

I have a BGP configuration on a Nexus 7k. The 7k also has an EIGRP neighbor and an OSPF neighbor. In the BGP configuration, rather than using redistribution, I added network statements for the EIGRP and OSPF routes. There was only one route from each, and I preferred that the route show as locally originated in the BGP table (origin "i") rather than with "?" as it would for a redistributed route without route-map influence.

An unusual issue occurred when I rebooted the EIGRP and OSPF neighbors: the routes were removed from the 7k's routing table, but the 7k's BGP process kept advertising them because of the network statements. This created a loop because, with the redundancy in the network, the 7k was receiving the route from its eBGP neighbor even though the route originated from the 7k itself!

Once the EIGRP and OSPF routers had rebooted, the routes were installed in their respective processes, but because eBGP has a better admin distance (20), the BGP process never preferred the local route, and the neighbor session had to be hard cleared to fix it. From my understanding, BGP won't advertise a network that it doesn't have in its route table, so I'm confused on two points:

1) Why did the 7k continue to advertise the route when it had been removed?

2) Once the route came back, why didn't the local route take precedence through normal BGP convergence, rather than requiring me to hard reset the session?

I'd appreciate any feedback on this.

Thanks!

MHM Cisco World · ‎09-05-2024

Can you elaborate more?

Also, please share the topology.

MHM

Pavel Tarakanov · ‎09-05-2024

The best option here is to collect tac-pacs during the issue and open a case with TAC. It could be a design issue (routing loops, etc.), a software defect, or something else.

A topology along with the contents of the tables (routing, BGP, OSPF/EIGRP databases, event-history) will also help to find the root cause.

paul driver · ‎09-05-2024

Hello
It sounds like the loop occurred because BGP took longer to converge than EIGRP/OSPF, so the route in BGP was still being exchanged between the BGP peers and was not removed.

By the way, you could have redistributed the IGP into BGP via a route-map and set the origin that way, so that the admin distance would have been less preferred than the IGP seed metrics.
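
A minimal sketch of that redistribution approach, in IOS syntax, could look like the following. The prefixes, the name IGP-TO-BGP, BGP AS 65000, OSPF process 1 and EIGRP AS 100 are placeholders and not values taken from the network in question. NX-OS on the 7k uses somewhat different syntax (for example `address-family ipv4 unicast`), so treat this as an illustration rather than a tested configuration.

```
! Placeholder prefixes learned from the IGPs (hypothetical values)
ip prefix-list IGP-TO-BGP seq 5 permit 192.0.2.0/24
ip prefix-list IGP-TO-BGP seq 10 permit 198.51.100.0/24
!
! Permit only those prefixes and mark them with origin IGP ("i") instead of "?"
route-map IGP-TO-BGP permit 10
 match ip address prefix-list IGP-TO-BGP
 set origin igp
!
router bgp 65000
 address-family ipv4
  ! OSPF process 1 and EIGRP AS 100 are assumed identifiers
  redistribute ospf 1 route-map IGP-TO-BGP
  redistribute eigrp 100 route-map IGP-TO-BGP
 exit-address-family
```

With something like this in place, the matched prefixes should appear in `show ip bgp` with origin code `i`, and only prefixes permitted by the prefix-list are redistributed.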

Can you post a topology diagram for this, please, if applicable?

Please rate and mark as an accepted solution if you have found any of the information provided useful.
This then could assist others on these forums to find a valuable answer and broadens the community’s global network.

Kind Regards
Paul

stephon.a.it · ‎09-11-2024

I recreated the topology and added some show command output. In my topology, I reproduced the reloads of the OSPF and EIGRP routers, and the routes were removed from BGP immediately.

R1#sh ip route ospf
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
a - application route
+ - replicated route, % - next hop override, p - overrides from PfR

Gateway of last resort is not set

5.0.0.0/32 is subnetted, 1 subnets
O 5.0.0.1 [110/2] via 1.0.5.5, 00:04:41, GigabitEthernet0/1
R1#
R1#
R1#
R1#sh ip route eigrp
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
a - application route
+ - replicated route, % - next hop override, p - overrides from PfR

Gateway of last resort is not set

6.0.0.0/29 is subnetted, 1 subnets
D 6.0.0.0 [90/130816] via 1.0.6.6, 00:17:24, GigabitEthernet0/2
R1#

R1#sh run part router bgp 65001
Building configuration...

Current configuration : 392 bytes
!
! Last configuration change at 14:52:18 UTC Wed Sep 11 2024
!
!
!
!
router bgp 65001
 bgp router-id 1.1.1.1
 bgp log-neighbor-changes
 neighbor 1.0.2.2 remote-as 65002
 neighbor 1.0.3.3 remote-as 65003
 !
 address-family ipv4
  network 5.0.0.1 mask 255.255.255.255
  network 6.0.0.0 mask 255.255.255.248
  neighbor 1.0.2.2 activate
  neighbor 1.0.3.3 activate
 exit-address-family
!
!
end

R1#

****Reload*****

R1#sh ip route
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
a - application route
+ - replicated route, % - next hop override, p - overrides from PfR

Gateway of last resort is not set

1.0.0.0/8 is variably subnetted, 8 subnets, 2 masks
C 1.0.2.0/24 is directly connected, GigabitEthernet0/0
L 1.0.2.1/32 is directly connected, GigabitEthernet0/0
C 1.0.3.0/24 is directly connected, GigabitEthernet0/3
L 1.0.3.1/32 is directly connected, GigabitEthernet0/3
C 1.0.5.0/24 is directly connected, GigabitEthernet0/1
L 1.0.5.1/32 is directly connected, GigabitEthernet0/1
C 1.0.6.0/24 is directly connected, GigabitEthernet0/2
L 1.0.6.1/32 is directly connected, GigabitEthernet0/2
R1#

stephon.a.it · ‎09-11-2024

***Before reload****

R1#sh ip bgp
BGP table version is 11, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
 *>  5.0.0.1/32       1.0.5.5                  2         32768 i
 *>  6.0.0.0/29       1.0.6.6             130816         32768 i
R1#

***After Reload***

R1#sh ip bgp

***IGPs re-established***

R1#sh ip bgp
BGP table version is 15, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
x best-external, a additional-path, c RIB-compressed,
t secondary path,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
 *>  5.0.0.1/32       1.0.5.5                  2         32768 i
 *>  6.0.0.0/29       1.0.6.6             130816         32768 i
R1#

MHM Cisco World · ‎09-11-2024

BGP receiving a route from itself? That is not correct: the router sees its own AS in the AS_PATH and rejects the prefix.

What is happening here is that BGP is slow compared to both EIGRP and OSPF, so BGP waits for a specific time before removing the prefix.

You can run EEM with IP SLA tracking to detect the OSPF/EIGRP neighbor going down and, as the action, remove the network statement.
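
A rough sketch of that idea for the OSPF-learned prefix, using the lab values from R1 (neighbor R5 at 1.0.5.5 on GigabitEthernet0/1, BGP AS 65001), might look like this. The SLA and track number 10, the 5-second frequency and the applet names are arbitrary choices. This is IOS syntax as used in the lab; tracking and EEM are configured differently on NX-OS, and none of this has been tested against the customer network.

```
! Probe R5 (1.0.5.5) from R1; SLA number and frequency are arbitrary
ip sla 10
 icmp-echo 1.0.5.5 source-interface GigabitEthernet0/1
 frequency 5
ip sla schedule 10 life forever start-time now
!
! Track object 10 follows the reachability result of SLA 10
track 10 ip sla 10 reachability
!
! When the probe fails, remove the network statement for the OSPF prefix
event manager applet BGP-NET-5-DOWN
 event track 10 state down
 action 1.0 cli command "enable"
 action 2.0 cli command "configure terminal"
 action 3.0 cli command "router bgp 65001"
 action 4.0 cli command "address-family ipv4"
 action 5.0 cli command "no network 5.0.0.1 mask 255.255.255.255"
 action 6.0 cli command "end"
!
! When the probe succeeds again, restore the network statement
event manager applet BGP-NET-5-UP
 event track 10 state up
 action 1.0 cli command "enable"
 action 2.0 cli command "configure terminal"
 action 3.0 cli command "router bgp 65001"
 action 4.0 cli command "address-family ipv4"
 action 5.0 cli command "network 5.0.0.1 mask 255.255.255.255"
 action 6.0 cli command "end"
```

The same pattern would be repeated with a second SLA, track object and applet pair for the EIGRP neighbor (1.0.6.6) and the 6.0.0.0/29 prefix.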

Or try using redistribution instead (I am not sure this will work).

MHM

stephon.a.it · ‎09-11-2024

In the topology above, BGP worked as I expected. The OSPF and EIGRP routes were added to R1's BGP table via network statements. Once I reloaded those routers, the OSPF and EIGRP routes were removed from the routing table and therefore from R1's BGP table. In the customer environment, the routes were not withdrawn: R3 still received the route from R1 even though R1 no longer had it, and it was advertised all the way to R2, which in turn advertised it back to R1. Because R1 didn't have the route, it accepted it as an eBGP route. My understanding is that it should have behaved the way the lab above did, so I'm thinking it may be a code issue. Note also that the customer is using Nexus switches, whereas the lab routers are IOSv.

MHM Cisco World · ‎09-11-2024

Which devices do the IPs 1.0.5.5 and 1.0.6.6 belong to?

MHM

stephon.a.it · ‎09-11-2024

1.0.5.5 is R5 and 1.0.6.6 is R6

MHM Cisco World · ‎09-11-2024

So both before and after, R1 still sees the prefixes from the R5/R6 OSPF/EIGRP routers.

When you reload R5/R6, do you see any log message on R1 for the OSPF/EIGRP neighbor going down?

MHM

stephon.a.it · ‎09-11-2024

No, the prefixes were removed when R5 and R6 were coming back online. Yes, both neighbors logged as down during the reload.

MHM Cisco World · ‎09-11-2024

""No the prefixes were removed when R5 and R6 were coming back online""

Can you elaborate more? I don't understand this.

MHM

MHM Cisco World · ‎09-12-2024

Can you check the next hop of the prefix in the BGP table in the real network?
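
For reference, these are the kinds of commands that would show it, using the lab prefix 5.0.0.1/32 as a stand-in for the affected prefix in the real network:

```
! IOS / IOSv (as in the lab)
show ip bgp 5.0.0.1/32
show ip route 5.0.0.1 255.255.255.255
!
! NX-OS (as on the customer Nexus 7k)
show bgp ipv4 unicast 5.0.0.1/32
show ip route 5.0.0.1/32
```

In the lab output, a locally originated path shows the IGP next hop (1.0.5.5) with weight 32768 and an empty AS path. A path learned from an eBGP neighbor would instead show that neighbor as the next hop along with an AS path.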

MHM

paul driver · ‎09-11-2024

Hello
Thanks for sharing the topology.
What I can see clearly is that you do not seem to be advertising any transit subnets between the IGP and BGP domains. What isn't clear is whether R5/R6 have default routes to reach the external routers, or whether any of the external routers (including R1) have or are advertising any default routes.

Based on what you have shared, logically you should not have experienced any loops when the IGP routers reloaded: the BGP RIB will have withdrawn those prefixes when the EIGRP/OSPF peerings were torn down by the reload.

Kind Regards
Paul