
BGP Stops Advertising Prefixes for No apparent Reason

Jesse Shumaker
Level 1

One of my ISPs reached out to me the other day to notify me that I had stopped advertising both of my /24 prefixes to them for about a millisecond. Fortunately, this didn't cause any routing issues for traffic from the Internet towards our prefixes. I checked the logs on the router and found nothing that would explain this behavior, and no engineer made changes to un-advertise those prefixes towards this BGP peer. The BGP session was also established and up for over 1 year. There weren't any CPU spikes around this time that could make BGP glitch or cause other strange behavior.

 

They also notified me that this happened on March 6th for a longer period of time and that the prefixes were withdrawn from the Internet. I have no evidence of this either, and there were no BGP changes on my ASR that would cause it.

 

Can anyone explain what may have caused this? I'm running dual ASR1002s in a multihomed scenario with dual ISPs who both advertise our /24 prefixes.

 

thanks

 

===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
1.1.1.0/24                                    Remote  BGP       06h36m21s  170
       55.44.66.56                                                 0
-------------------------------------------------------------------------------
No. of Routes: 1
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================

*A:bear1.phx1# show router route-table 68.106.64.0

===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
2.2.2.0/24                                    Remote  BGP       06h40m20s  170
       55.44.66.56                                                 0
-------------------------------------------------------------------------------
No. of Routes: 1
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================
1 Accepted Solution

Accepted Solutions

I got a response from the ISP. It looks like there was no issue here, so I'll close the discussion. Thanks, everyone. Their explanation follows.

 

After working through the route age discrepancy at our peer with engineering, I can provide clarification. For the more recent and recurring route age refreshes, where we do not see our peer drop and also do not see a route age change from the neighbor router/AS network, this local age refresh is a result of new installs on that chassis. When completing the new internet service turnups on the ALU gateways, the local route table is refreshed. This is only observed from the local chassis and wouldn’t be affecting active routing or any other router/AS. So we don’t need to hunt for any communications associated to those age refreshes, there isn’t any. I was also able to validate the same behavior/age looking at any prefix received on any peer at the same gateway router locally. I was also able to verify that the age refresh was not transient to any other routers/networks among any of them. I’m sorry we got off track accounting for that point.
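
The ISP's reasoning is that the age reset showed up on every prefix on that gateway at once, which points to a local table refresh and not a withdrawal of two specific prefixes. As a rough sketch of that check, assuming you can collect the Age column for several routes in one snapshot, something like the following could flag the pattern. The prefixes and ages in `snapshot` are hypothetical illustrations, not data from this thread, and `age_seconds`, `ages_cluster`, and the 5-second tolerance are names and values made up for this example.

```python
import re

def age_seconds(age: str) -> int:
    """Convert an 'HHhMMmSSs' age string such as '06h36m21s' to seconds."""
    m = re.fullmatch(r"(\d+)h(\d+)m(\d+)s", age.strip())
    if m is None:
        raise ValueError(f"unrecognised age: {age!r}")
    hours, minutes, seconds = map(int, m.groups())
    return hours * 3600 + minutes * 60 + seconds

def ages_cluster(ages: dict, tolerance_s: int = 5) -> bool:
    """Return True if every route in one snapshot has nearly the same age."""
    secs = [age_seconds(a) for a in ages.values()]
    return max(secs) - min(secs) <= tolerance_s

# Hypothetical snapshot: documentation prefixes with illustrative ages.
snapshot = {
    "192.0.2.0/24": "06h36m21s",
    "198.51.100.0/24": "06h36m21s",
    "203.0.113.0/24": "06h36m20s",
}
print(ages_cluster(snapshot))
```

If unrelated prefixes from different peers all share the same age, a single withdraw and re-advertise from one customer is an unlikely explanation.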


Hello,

 

It is going to be difficult to track down anything in the millisecond range. Even if you have BFD configured, the minimum configurable value is 50 milliseconds, so nothing shorter than that will be registered.

 

I guess it would be useful to see the logs from your ISP(s) if you can get them...they must have something logged, otherwise they would not have notified you...

Hello

The first question would be how you are advertising this subnet to your ISP. Could the prefix have been withdrawn because its related interface was temporarily shut down or flapped?

 

You say the peering has been up for a while, but your post shows (I assume provided by the ISP) that the age for that prefix was 6+ hours. What do the ages show for other routes, and have you checked your logs for around that time?

 

Going forward, you can run a debug on only that prefix and see whether it produces any results in the future that you can work with.

 

access-list 100 permit ip host 1.1.1.0 host 255.255.255.0

debug ip bgp updates 100



Kind Regards
Paul

I believe that Paul has identified an important detail. About 6 hours 36 minutes (or 40 minutes) ago something happened in BGP. Maybe it was some internal event or perhaps some external event. Can you post the output of show ip bgp neighbor?
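
To line that event up with your own logs, it helps to turn the age into a wall-clock time. Here is a minimal sketch of the arithmetic. The capture time is hypothetical, since the thread does not say when the ISP ran the command, and the optional leading day field in the pattern is an assumption about how longer ages are printed.

```python
import re
from datetime import datetime, timedelta

# Optional day field is an assumption; the thread only shows the 'HHhMMmSSs' form.
AGE_RE = re.compile(r"^(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$")

def parse_age(age: str) -> timedelta:
    """Convert an age string such as '06h36m21s' into a timedelta."""
    m = AGE_RE.match(age.strip())
    if not m or not any(m.groups()):
        raise ValueError(f"unrecognised age: {age!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)

def installed_at(captured_at: datetime, age: str) -> datetime:
    """Time the route was (re)installed, given when the output was captured."""
    return captured_at - parse_age(age)

# Hypothetical capture time, for illustration only.
captured = datetime(2020, 1, 1, 12, 0, 0)
print(installed_at(captured, "06h36m21s"))  # 2020-01-01 05:23:39
```

With the real capture time from the ISP, the result is the timestamp to search for in the router logs.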

 

HTH

 

Rick


The only logs I have from the ISP are those 06h36m21s timestamps on the route age I posted in the first post. I'll find out if they can provide any more.

 

I scrubbed the public IPs from the configs and posted the output of sh ip bgp neighbor.

 

 

ASR1002#sh ip bgp neighbors 55.44.66.56
BGP neighbor is 55.44.66.56,  remote AS 5555, external link
 Description: ISP
  BGP version 4, remote router ID 55.44.66.56
  BGP state = Established, up for 21w3d
  Last read 00:00:04, last write 00:00:26, hold time is 90, keepalive interval is 30 seconds
  Neighbor sessions:
    1 active, is not multisession capable (disabled)
  Neighbor capabilities:
    Route refresh: advertised and received(new)
    Four-octets ASN Capability: advertised and received
    Address family IPv4 Unicast: advertised and received
    Enhanced Refresh Capability: advertised
    Multisession Capability:
    Stateful switchover support enabled: NO for session 1
  Message statistics:
    InQ depth is 0
    OutQ depth is 0

                         Sent       Rcvd
    Opens:                  1          1
    Notifications:          0          0
    Updates:                2   48009330
    Keepalives:        473147     433742
    Route Refresh:          0          0
    Total:             473150   48443073
  Default minimum time between advertisement runs is 30 seconds

 For address family: IPv4 Unicast
  Session: 55.44.66.56
  BGP table version 1900650436, neighbor version 1900650359/1900650436
  Output queue size : 0
  Index 117, Advertise bit 1
  117 update-group member
  My AS number is allowed for 5 number of times
  Outbound path policy configured
  Route map for outgoing advertisements is LOCAL_AS
  Slow-peer detection is disabled
  Slow-peer split-update-group dynamic is disabled
                                 Sent       Rcvd
  Prefix activity:               ----       ----
    Prefixes Current:               2     731432 (Consumes 46811648 bytes)
    Prefixes Total:                 2  190657176
    Implicit Withdraw:              0  182926233
    Explicit Withdraw:              0    6999511
    Used as bestpath:             n/a     669958
    Used as multipath:            n/a          0

                                   Outbound    Inbound
  Local Policy Denied Prefixes:    --------    -------
    Invalid Path:                   4383188        n/a
    Other Policies:                63555254        n/a
    Total:                         67938442          0
  Number of NLRIs in the update sent: max 3, min 0
  Last detected as dynamic slow peer: never
  Dynamic slow peer recovered: never
  Refresh Epoch: 1
  Last Sent Refresh Start-of-rib: never
  Last Sent Refresh End-of-rib: never
  Last Received Refresh Start-of-rib: never
  Last Received Refresh End-of-rib: never
                                       Sent       Rcvd
        Refresh activity:              ----       ----
          Refresh Start-of-RIB          0          0
          Refresh End-of-RIB            0          0

  Address tracking is enabled, the RIB does have a route to 55.44.66.56
  Connections established 47; dropped 46
  Last reset 21w3d, due to Admin. shutdown of session 1
  Transport(tcp) path-mtu-discovery is enabled
  Graceful-Restart is disabled
Connection state is ESTAB, I/O status: 1, unread input bytes: 0
Connection is ECN Disabled
Mininum incoming TTL 0, Outgoing TTL 1
Local host: 55.44.66.55, Local port: 24079
Foreign host: 55.44.66.56, Foreign port: 179
Connection tableid (VRF): 0

Enqueued packets for retransmit: 0, input: 0  mis-ordered: 0 (0 bytes)

Event Timers (current time is 0x34B92D6571):
Timer          Starts    Wakeups            Next
Retrans        473626        476             0x0
TimeWait            0          0             0x0
AckHold       5479230    3071178             0x0
SendWnd             0          0             0x0
KeepAlive           0          0             0x0
GiveUp              0          0             0x0
PmtuAger            1          1             0x0
DeadWait            0          0             0x0
Linger              0          0             0x0

iss:  921970404  snduna:  930960336  sndnxt:  930960336     sndwnd:  32768
irs: 1336347572  rcvnxt: 2425078609  rcvwnd:      15985  delrcvwnd:    399

SRTT: 300 ms, RTTO: 303 ms, RTV: 3 ms, KRTT: 0 ms
minRTT: 1 ms, maxRTT: 709 ms, ACK hold: 200 ms
Status Flags: none
Option Flags: higher precendence, nagle, path mtu capable, md5

Datagrams (max data segment is 1024 bytes):
Rcvd: 9252656 (out of order: 1), with data: 8781418, total data bytes: 1088731036
Sent: 6450220 (retransmit: 476 fastretransmit: 0),with data: 473149, total data bytes: 8989931

ASR1002#
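
One way to read the output above is to look only at the Sent column: 2 updates, 2 prefixes total, and 0 explicit withdraws for the current session. If those counters mean what they appear to mean, this router has not sent a withdrawal to this peer since the session came up. The snippet below is a small sketch for pulling those values out of saved output so they can be compared over time. `sent_counters` is a hypothetical helper, and the sample lines are copied from the output above.

```python
import re

def sent_counters(output: str) -> dict:
    """Extract the Sent column for a few counters from 'show ip bgp neighbors' text."""
    wanted = ("Updates", "Prefixes Current", "Prefixes Total",
              "Implicit Withdraw", "Explicit Withdraw")
    counters = {}
    for line in output.splitlines():
        # Matches lines of the form 'Name:   <sent>   <rcvd>'.
        m = re.match(r"\s*([A-Za-z ]+?):\s+(\d+)\s+(\d+)", line)
        if m and m.group(1) in wanted:
            counters[m.group(1)] = int(m.group(2))
    return counters

sample = """\
    Updates:                2   48009330
    Prefixes Current:               2     731432 (Consumes 46811648 bytes)
    Prefixes Total:                 2  190657176
    Implicit Withdraw:              0  182926233
    Explicit Withdraw:              0    6999511
"""
c = sent_counters(sample)
print(c)
if c["Explicit Withdraw"] == 0 and c["Prefixes Current"] == c["Prefixes Total"]:
    print("no withdrawals sent on this session, per these counters")
```

Keep in mind that these counters cover only the current session, so they say nothing about events before the last reset.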

I will begin running these commands against my prefixes to see if I can capture any BGP events the next time this occurs. Thanks, Paul.

 

access-list 100 permit ip host 1.1.1.0 host 255.255.255.0

debug ip bgp updates 100

 

 

Thanks for the output of show ip bgp neighbor. Unfortunately I do not see anything there that sheds light on this issue. Keep monitoring and let us know if it happens again.

 

HTH

 

Rick


