03-26-2019 09:31 PM
One of my ISPs reached out to me the other day to notify me that I had unadvertised both of my /24 prefixes towards them for a millisecond. Fortunately, this didn't cause any routing issues out on the Internet towards our prefixes. I checked the logs on the router and found nothing that would explain this behavior, and no engineer made changes to un-advertise those prefixes towards this BGP peer. The BGP session had also been established and up for over 1 year. There weren't any CPU spikes around this time that could make BGP glitch or cause other strange behavior.
They also notified me that this happened on March 6th for a longer period of time, and that the prefixes were pulled from being advertised towards the Internet. I have no evidence of this on my side either, and there were no BGP changes on my ASR that would cause it.
Can anyone explain what may have caused this? I'm running dual ASR1002s in a multihomed scenario with dual ISPs, both of which advertise our /24 prefixes.
Thanks
===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
1.1.1.0/24                                    Remote  BGP       06h36m21s  170
       55.44.66.56                                                  0
-------------------------------------------------------------------------------
No. of Routes: 1
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================
*A:bear1.phx1# show router route-table 68.106.64.0
===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
2.2.2.0/24                                    Remote  BGP       06h40m20s  170
       55.44.66.56                                                  0
-------------------------------------------------------------------------------
No. of Routes: 1
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================
03-27-2019 12:50 AM
Hello,
It is going to be difficult to track down anything in the millisecond range. Even if you have BFD configured, the minimum configurable value is 50 milliseconds, so if the event is shorter than that, nothing will be registered.
I guess it would be useful to see the logs from your ISP(s) if you can get them. They must have something logged, otherwise they would not have notified you.
03-27-2019 02:22 AM
Hello
First question would be: how are you advertising this subnet to your ISP? Could the prefix have been withdrawn because its related interface was temporarily shut down or flapped?
You say the peering has been up for a while, but your post shows (I assume this was provided by the ISP) that the age for that prefix was 6+ hours. What do the ages show for other routes, and have you checked your logs for around this time?
Going forward, you can run a debug on only that prefix and see if it produces any results that you can work with in the future.
access-list 100 permit ip host 1.1.1.0 host 255.255.255.0
debug ip bgp updates 100
03-27-2019 06:35 AM
I believe that Paul has identified an important detail. About 6 hours 36 minutes (or 40 minutes) before that output was taken, something happened in BGP. Maybe it was some internal event or perhaps some external event. Can you post the output of show ip bgp neighbor?
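To compare that age against the router logs, it helps to convert it into a wall-clock time. The following is only a minimal Python sketch of the arithmetic. The capture time is a hypothetical placeholder, because the thread does not say when the ISP ran the command, and parse_age and installed_at are illustrative helper names, not part of any router tooling.

```python
import re
from datetime import datetime, timedelta

# Matches ages such as "06h36m21s"; every unit is optional.
# Assumption: the age uses d/h/m/s suffixes like the ISP output above.
AGE_RE = re.compile(r"^(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$")


def parse_age(age: str) -> timedelta:
    """Convert a route age string into a timedelta."""
    match = AGE_RE.match(age)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognized age format: {age!r}")
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def installed_at(observed: datetime, age: str) -> datetime:
    """Return the time the route was last installed or refreshed."""
    return observed - parse_age(age)


if __name__ == "__main__":
    # Hypothetical capture time; replace with when the command was really run.
    observed = datetime(2019, 3, 26, 15, 0, 0)
    for age in ("06h36m21s", "06h40m20s"):
        print(age, "->", installed_at(observed, age))
```

The resulting timestamp is the window to search in the router logs, once the real capture time and time zone are known.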
HTH
Rick
03-28-2019 02:42 PM
The only logs I have from the ISP are those 06h36m21s timestamps on the route age that I posted in the first post. I'll find out if they can provide any more.
I scrubbed the public IPs and posted the output of sh ip bgp neighbor:
ASR1002#sh ip bgp neighbors 55.44.66.56
BGP neighbor is 55.44.66.56, remote AS 5555, external link
 Description: ISP
  BGP version 4, remote router ID 55.44.66.56
  BGP state = Established, up for 21w3d
  Last read 00:00:04, last write 00:00:26, hold time is 90, keepalive interval is 30 seconds
  Neighbor sessions:
    1 active, is not multisession capable (disabled)
  Neighbor capabilities:
    Route refresh: advertised and received(new)
    Four-octets ASN Capability: advertised and received
    Address family IPv4 Unicast: advertised and received
    Enhanced Refresh Capability: advertised
    Multisession Capability:
    Stateful switchover support enabled: NO for session 1
  Message statistics:
    InQ depth is 0
    OutQ depth is 0
                         Sent       Rcvd
    Opens:                  1          1
    Notifications:          0          0
    Updates:                2   48009330
    Keepalives:        473147     433742
    Route Refresh:          0          0
    Total:             473150   48443073
  Default minimum time between advertisement runs is 30 seconds
 For address family: IPv4 Unicast
  Session: 55.44.66.56
  BGP table version 1900650436, neighbor version 1900650359/1900650436
  Output queue size : 0
  Index 117, Advertise bit 1
  117 update-group member
  My AS number is allowed for 5 number of times
  Outbound path policy configured
  Route map for outgoing advertisements is LOCAL_AS
  Slow-peer detection is disabled
  Slow-peer split-update-group dynamic is disabled
                                 Sent       Rcvd
  Prefix activity:               ----       ----
    Prefixes Current:               2     731432 (Consumes 46811648 bytes)
    Prefixes Total:                 2  190657176
    Implicit Withdraw:              0  182926233
    Explicit Withdraw:              0    6999511
    Used as bestpath:             n/a     669958
    Used as multipath:            n/a          0
                                   Outbound    Inbound
  Local Policy Denied Prefixes:    --------    -------
    Invalid Path:                   4383188        n/a
    Other Policies:                63555254        n/a
    Total:                         67938442          0
  Number of NLRIs in the update sent: max 3, min 0
  Last detected as dynamic slow peer: never
  Dynamic slow peer recovered: never
  Refresh Epoch: 1
  Last Sent Refresh Start-of-rib: never
  Last Sent Refresh End-of-rib: never
  Last Received Refresh Start-of-rib: never
  Last Received Refresh End-of-rib: never
                                       Sent       Rcvd
        Refresh activity:              ----       ----
          Refresh Start-of-RIB          0          0
          Refresh End-of-RIB            0          0
  Address tracking is enabled, the RIB does have a route to 55.44.66.56
  Connections established 47; dropped 46
  Last reset 21w3d, due to Admin. shutdown of session 1
  Transport(tcp) path-mtu-discovery is enabled
  Graceful-Restart is disabled
Connection state is ESTAB, I/O status: 1, unread input bytes: 0
Connection is ECN Disabled
Mininum incoming TTL 0, Outgoing TTL 1
Local host: 55.44.66.55, Local port: 24079
Foreign host: 55.44.66.56, Foreign port: 179
Connection tableid (VRF): 0
Enqueued packets for retransmit: 0, input: 0 mis-ordered: 0 (0 bytes)
Event Timers (current time is 0x34B92D6571):
Timer          Starts    Wakeups            Next
Retrans        473626        476             0x0
TimeWait            0          0             0x0
AckHold       5479230    3071178             0x0
SendWnd             0          0             0x0
KeepAlive           0          0             0x0
GiveUp              0          0             0x0
PmtuAger            1          1             0x0
DeadWait            0          0             0x0
Linger              0          0             0x0
iss:  921970404  snduna:  930960336  sndnxt:  930960336     sndwnd:  32768
irs: 1336347572  rcvnxt: 2425078609  rcvwnd:      15985  delrcvwnd:    399
SRTT: 300 ms, RTTO: 303 ms, RTV: 3 ms, KRTT: 0 ms
minRTT: 1 ms, maxRTT: 709 ms, ACK hold: 200 ms
Status Flags: none
Option Flags: higher precendence, nagle, path mtu capable, md5
Datagrams (max data segment is 1024 bytes):
Rcvd: 9252656 (out of order: 1), with data: 8781418, total data bytes: 1088731036
Sent: 6450220 (retransmit: 476 fastretransmit: 0),with data: 473149, total data bytes: 8989931
ASR1002#
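One thing the Sent column above can be checked for is whether this router sent any withdrawals to the peer during the current session. Here is a minimal Python sketch, assuming the output has been saved as text and that the Sent counters cover only the current session (they would not reflect anything before the last reset or a counter clear). The helper names sent_rcvd and sent_any_withdraw are illustrative only.

```python
import re
from typing import Tuple


def sent_rcvd(output: str, label: str) -> Tuple[int, int]:
    """Return the (sent, received) counters printed after 'label:'."""
    match = re.search(rf"{re.escape(label)}:\s+(\d+)\s+(\d+)", output)
    if match is None:
        raise ValueError(f"counter not found: {label!r}")
    return int(match.group(1)), int(match.group(2))


def sent_any_withdraw(output: str) -> bool:
    """True if the router sent an implicit or explicit withdraw to the peer."""
    labels = ("Implicit Withdraw", "Explicit Withdraw")
    return any(sent_rcvd(output, label)[0] > 0 for label in labels)


# Counter lines copied from the neighbor output above.
sample = """
    Updates:                2   48009330
    Implicit Withdraw:      0  182926233
    Explicit Withdraw:      0    6999511
"""

if __name__ == "__main__":
    print("Updates sent/rcvd:", sent_rcvd(sample, "Updates"))
    print("Any withdraw sent:", sent_any_withdraw(sample))
```

With the values posted here, both withdraw counters in the Sent column are 0 and only 2 updates were sent, which is consistent with the ASR not having withdrawn the prefixes during this session.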
I will begin running these commands against my prefixes to see if I can capture any BGP events the next time this occurs. Thanks, Paul.
access-list 100 permit ip host 1.1.1.0 host 255.255.255.0
debug ip bgp updates 100
03-28-2019 07:38 PM
Thanks for the output of show ip bgp neighbor. Unfortunately I do not see anything there that sheds light on this issue. Keep monitoring and let us know if it happens again.
HTH
Rick
03-29-2019 06:53 AM
I got a response from the ISP (below). It looks like there was no issue here, so I'll close the discussion. Thanks, everyone.
After working through the route age discrepancy at our peer with engineering, I can provide clarification. For the more recent and recurring route age refreshes, where we do not see our peer drop and also do not see a route age change from the neighbor router/AS network, this local age refresh is a result of new installs on that chassis. When completing the new internet service turnups on the ALU gateways, the local route table is refreshed. This is only observed from the local chassis and wouldn’t be affecting active routing or any other router/AS. So we don’t need to hunt for any communications associated with those age refreshes; there aren’t any. I was also able to validate the same behavior/age looking at any prefix received on any peer at the same gateway router locally. I was also able to verify that the age refresh was not transient to any other routers/networks among any of them. I’m sorry we got off track accounting for that point.