06-28-2009 05:27 AM - edited 03-04-2019 05:15 AM
All,
I thought that the default timers for bgp were 60 and holddown was 180. I may be wrong, but shouldn't a route that falls out of the table be at most put back into the table after 180 seconds (3 min.)?
We tested our failover this weekend, and we shut down our main router to watch our block roll over to our backup router. We lost two packets. I peer with the provider using the same AS on my end (both of my routers are using bgp 1 for instance, and I peer with bgp 2). I'm wondering if this is the reason the failover happened so quickly?
Thanks,
John
06-28-2009 11:34 AM
Hello John,
>> but shouldn't a route that falls out of the table be at most put back into the table after 180 seconds (3 min.)?
This is BGP, not RIP; there is no holddown timer here.
Hope to help
Giuseppe
06-29-2009 03:03 AM
Giuseppe,
Can you explain what the timers option is for on the neighbor statement? Also, when I do a "sh ip bgp neighbor x.x.x.x" I get holddown timers:
BGP neighbor is 15.15.15.2, remote AS 12, external link
BGP version 4, remote router ID 209.30.236.1
BGP state = Established, up for 00:01:08
Last read 00:00:08, hold time is 180, keepalive interval is 60 seconds
Neighbor capabilities:
Route refresh: advertised and received(old & new)
Address family IPv4 Unicast: advertised and received
What is it used for, and is there another way that we can keep it from failing over so quickly?
Thanks,
John
06-29-2009 06:19 AM
John,
Since you shut down the router, the BGP peering broke immediately, which triggered the convergence. The 60-second keepalive and 180-second hold time were not involved, because your interface went down. To test those timers, you need to keep the interface up but prevent the keepalives from reaching the peer router. You can use IP event dampening to keep a flapping interface from causing repeated convergences, but I am not aware of any user-defined parameter that delays the convergence itself.
06-29-2009 07:42 AM
>> since you shut down the router the BGP peer immediately broke causing the convergence.
How does the neighboring router know the interface went down without using the keepalives?
Thanks,
John
06-29-2009 10:22 AM
Interface goes down => routes via this interface are withdrawn => BGP session is torn down.
On the other hand, an ACL blocking TCP 179 would take a lot longer to be detected: about 3 minutes, as you expected.
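The "about 3 minutes" figure can be narrowed down a little. Here is a rough sketch in Python, assuming that keepalives are the only traffic on the session, that the hold timer restarts on every message received, and that there is no timer jitter. The function name is made up for illustration.

```python
def silent_failure_detection_window(keepalive_s, hold_s):
    """Return (earliest, latest) seconds until the hold timer expires
    after keepalives silently stop arriving (e.g. an ACL drops TCP 179).

    Assumption: the hold timer restarts whenever a keepalive arrives,
    and keepalives arrive exactly every keepalive_s seconds.
    """
    if hold_s == 0:
        # A hold time of 0 means keepalives are not used at all,
        # so this timer never detects a silent failure.
        return None
    if keepalive_s <= 0 or keepalive_s > hold_s:
        raise ValueError("keepalive must be positive and no larger than the hold time")
    # Slowest case: the blocking starts right after a keepalive arrived,
    # so the full hold time has to run out.
    latest = hold_s
    # Fastest case: the blocking starts just before the next keepalive
    # was due, so keepalive_s seconds of the hold time are already gone.
    earliest = hold_s - keepalive_s
    return earliest, latest


print(silent_failure_detection_window(60, 180))  # (120, 180)
```

With the default 60/180 timers, a silently blocked session would be declared dead somewhere between 2 and 3 minutes after the blocking starts, under the assumptions above.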
HTH
Sam
06-29-2009 10:29 AM
Sam,
I guess my main question is why my route failed over so quickly. If the interface goes down, how can I control the convergence time or is this impossible?
I'm really not grasping the point of having hold timers if they only come into play when an access list is blocking the port. I would think that if the peer missed a keepalive, whether it was blocked or the peer was down, the neighboring router should still wait out two more keepalive intervals before it flips to the other route, meaning 3 minutes by default.
Thanks Sam,
John
06-29-2009 10:45 AM
Hello John,
Sam has explained the probable reason for what you see.
Have you configured bgp fast-external-fallover or its successor, neighbor x.x.x.x fall-over?
Usually people complain about the slowness of failover when it relies on the default timers.
You can think of the timers as a way to detect indirect failures, such as the provider's staff shutting down the session.
Reaction to a link failure takes as long as interface link failure detection, which depends on the technology in use:
for example, if the link is a direct serial link and the provider router is also the DCE at OSI Layer 1, then after you shut down the interface the other side goes down/down.
Another example is POS, where detection can take as little as 50 msec.
Hope to help
Giuseppe
06-29-2009 10:46 AM
John,
This behaviour is due to "bgp fast-external-fallover" being enabled by default. This command bypasses the hold timer when the link to a directly connected external peer goes down.
Negate it and retest.
Sam
PS: Good post !!
06-29-2009 11:12 AM
Sam,
AH! Now, if I'm peering with my ISP and I try to negate it on my end, can it be done on one of the spoke routers or does it have to be done on the multihomed router?
Thank you for the compliment on "good post." =)
John
06-29-2009 11:18 AM
Removing it from your end should be enough to see the session take longer to tear down (I have never seen this being a requirement... but I can see why one would want that :-)
The other thing about timers is that they are negotiated, and the lowest wins. So if your peering router is using the defaults, you can only benefit from a longer hold time if you agree with your peer that their timers will match yours or exceed them.
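A minimal sketch of the "lowest wins" rule in Python. The hold time rule (the smaller of the two advertised values, and either 0 or at least 3 seconds) follows the BGP specification. The keepalive rule used here, the smaller of the configured keepalive and one third of the negotiated hold time, is an assumption about typical router behaviour and not something confirmed in this thread. The function name is hypothetical.

```python
def negotiate_timers(local_keepalive_s, local_hold_s, peer_hold_s):
    """Return the (keepalive, hold) pair, in seconds, that the local
    router would end up using on the session."""
    for hold in (local_hold_s, peer_hold_s):
        # A hold time must be 0 (timers disabled) or at least 3 seconds.
        if hold != 0 and hold < 3:
            raise ValueError("hold time must be 0 or at least 3 seconds")
    # Lowest wins: the smaller advertised hold time is the one in effect.
    hold = min(local_hold_s, peer_hold_s)
    if hold == 0:
        return 0, 0  # no keepalives are exchanged
    # Assumption: the keepalive never exceeds one third of the hold time.
    keepalive = min(local_keepalive_s, hold // 3)
    return keepalive, hold


# Local defaults against a peer advertising a 30-second hold time
print(negotiate_timers(60, 180, 30))   # (10, 30)
# Local defaults against a peer advertising a longer hold time
print(negotiate_timers(60, 180, 600))  # (60, 180)
```

In both calls the shorter hold time is the one that takes effect, which is why raising the timers on one side alone does not lengthen the session's hold time.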
I am not entirely sure, but I recall seeing a new feature which stops this. It is used as a security feature to keep attackers from melting down your CPU by reducing timers and thereby increasing BGP scans.
Sam
06-29-2009 11:19 AM
I found the command on Cisco's site, so now I have to ask:
How does the fast-external-fallover know that the peer went down if it's not using hello packets? Does it just see the route fall from the table, perform some kind of soft reconfig, and then fallover to the other peer?
Thanks,
John
06-29-2009 11:27 AM
Hello John,
fast-external-fallover tracks the state of the outgoing interface towards the eBGP peer.
If that interface is detected down, the session is torn down too, without having to wait for the hold timer to expire.
Hope to help
Giuseppe
06-29-2009 11:30 AM
Many thanks for the rating !
It's link status related:
Usage Guidelines
The bgp fast-external-fallover command is used to disable or enable fast external fallover for BGP peering sessions with directly connected external peers. The session is immediately reset if link goes down. Only directly connected peering sessions are supported.
If BGP fast external fallover is disabled, the BGP routing process will wait until the default hold timer expires (3 keepalives) to reset the peering session. BGP fast external fallover can also be configured on a per-interface basis using the ip bgp fast-external-fallover interface configuration command.
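The two paths described in these guidelines can be summarised as a small decision function. This is only an illustrative sketch in Python, not how the router implements it; the function and parameter names are invented for the example.

```python
def session_reset_trigger(link_up, directly_connected_ebgp,
                          fast_external_fallover,
                          seconds_since_last_message, hold_s):
    """Return a short description of why (or whether) the session resets."""
    # Fast path: only for directly connected external peers, and only
    # when fast external fallover is enabled.
    if not link_up and directly_connected_ebgp and fast_external_fallover:
        return "reset immediately (link down)"
    # Slow path: nothing has been heard from the peer for a full hold time.
    # A hold time of 0 means the hold timer is not in use.
    if hold_s > 0 and seconds_since_last_message >= hold_s:
        return "reset (hold timer expired)"
    return "session stays up"


# Link fails, fast external fallover enabled (the default)
print(session_reset_trigger(False, True, True, 5, 180))
# Same failure with fast external fallover disabled, 5 seconds in
print(session_reset_trigger(False, True, False, 5, 180))
# Same failure, once 180 seconds have passed without a message
print(session_reset_trigger(False, True, False, 180, 180))
```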
Sam
06-29-2009 11:34 AM
So, if I wanted the peer to wait for four hours before rolling my block over, I would need to disable fast-external-fallover, and then set my timers to 4800 14400 and have the provider do the same? Or should I leave my default keepalives at 60, and then set my holdtime for 14400?
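The arithmetic behind "4800 14400" can be checked with a short Python sketch. It assumes the conventional ratio of keepalive to hold time of one third, and it checks the result against the 16-bit limit of the hold time field in the BGP OPEN message (65535 seconds). The helper name is hypothetical.

```python
MAX_HOLD_S = 65535  # hold time is a 16-bit field in the BGP OPEN message


def timers_for_delay(hours):
    """Return (keepalive, hold) in seconds for a desired hold time in hours,
    using a keepalive of one third of the hold time."""
    hold = int(hours * 3600)
    if hold < 3 or hold > MAX_HOLD_S:
        raise ValueError("hold time must be between 3 and 65535 seconds")
    keepalive = hold // 3
    return keepalive, hold


print(timers_for_delay(4))  # (4800, 14400)
```

Four hours fits within the limit; anything beyond roughly 18 hours would not. Keeping the keepalive at 60 with a 14400-second hold time should also be a valid combination, since it only means more keepalives are sent per hold period.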