09-23-2015 12:21 AM - edited 03-05-2019 06:56 AM
Hi all,
I am seeing very slow BGP convergence. Could you give me some pointers on where to look?
Our problem is:
Our company has 2 NNI connections with our partner. The protocol used is BGP.
When we do the failover test (deactivating the primary link to check how long traffic takes to move to the secondary link), convergence takes 7 minutes.
Normally it should take about 90 seconds,
so the problem appears to be on our partner's side.
As a matter of policy, they cannot share their configuration with us.
Does anyone have any ideas about this problem?
Please suggest some keywords I can use to investigate it.
09-23-2015 03:26 AM
Hi Ivtquan1991,
It is difficult to know what the problem is without the configuration. Could you please share it?
Also, could you please describe how you are simulating the failure? Are you shutting down the interface or the BGP session?
Kind regards,
Ahmed Muhi
09-23-2015 08:36 PM
Hi Ahmed Muhi,
Thank you for your comment,
In the failover test, we deactivate the primary link (JUNOS) and monitor whether the traffic moves to the secondary link.
09-24-2015 09:09 AM
Hi Ivtquan1991,
Can you also run "debug ip bgp events" and "debug ip bgp updates" while you do the failover test, to get more detail about what really happens?
Another point: how do you measure the recovery time? For example, when you shut down the main link, do you start timing from that moment until you get a message that BGP is down? I guess the reason given should be "hold time expired", right?
The router then goes through the re-convergence process to find an alternate path, and once that is finished traffic starts to flow again. If that is the case, the delay is not in failure detection: BGP had already detected the problem when it logged that the session was down, and the remaining time is spent in re-convergence.
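To make that distinction concrete, here is a rough Python sketch that splits a measured outage into a detection phase and a re-convergence phase. The 7-minute total comes from the original post; treating the expected 90 seconds as the detection time (one hold-time expiry) is only an assumption for illustration, and `split_outage` is a hypothetical helper, not part of any tool.

```python
def split_outage(total_outage_s, detection_s):
    """Split a measured outage into detection and re-convergence phases.

    total_outage_s: time from link shutdown until traffic flows again.
    detection_s:    time from link shutdown until BGP reports the peer down
                    (assumed here to be one hold-time expiry).
    """
    if detection_s > total_outage_s:
        raise ValueError("detection time cannot exceed the total outage")
    return {
        "detection_s": detection_s,
        "reconvergence_s": total_outage_s - detection_s,
    }


# Figures from the thread: 7 minutes observed, 90 seconds assumed for detection.
phases = split_outage(total_outage_s=7 * 60, detection_s=90)
print(phases)
```

If the logs show the session going down after roughly the hold time, the rest of the outage is being spent after detection, which points away from the keepalive and hold timers.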
I have some other questions, but let us clear up the above first.
Final point: you can see the negotiated hold time with the command "show ip bgp neighbors x.x.x.x | include hold".
All the best!
09-23-2015 03:49 AM
By default, the hold time is 3 minutes (180 seconds). During session negotiation BGP uses the lower of the two peers' hold times, so you could lower your own keepalive and hold timers. That would force their side to negotiate down to your values.
You can change your timers on the neighbor line itself or globally:
Global:
timers bgp 7 21
Neighbor:
neighbor <peer address> timers 7 21
The above would send a keepalive every 7 seconds and set the hold time to 21 seconds. When 3 consecutive keepalives are lost, the hold time expires and you should fail over. Note that you will need to clear the peering after making this change. As far as I know, a soft reset of the peering will not renegotiate these timers, so there will be some downtime associated with this change. Once you make the change, clear the peering and then look at the neighbor to see whether the change took effect:
show ip bgp neighbor <neighbor address>
It will be at the top of the result from that command...
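As a rough illustration of the negotiation described above, here is a small Python sketch. The "lower hold time wins" rule and the "0 or at least 3 seconds" check follow the BGP specification; deriving the keepalive as the smaller of the configured value and one third of the hold time is an assumption modeled on common implementation behavior, and `negotiate_timers` is a hypothetical name, not a real API.

```python
def negotiate_timers(local_hold, peer_hold, local_keepalive):
    """Return (hold_time, keepalive) in seconds for a BGP session.

    The session uses the lower of the two proposed hold times.
    A hold time of 0 disables keepalives entirely.
    """
    for proposed in (local_hold, peer_hold):
        if proposed != 0 and proposed < 3:
            raise ValueError("hold time must be 0 or at least 3 seconds")

    hold = min(local_hold, peer_hold)
    if hold == 0:
        return 0, 0

    # Assumption: keepalive is capped at one third of the negotiated hold time.
    keepalive = min(local_keepalive, hold // 3)
    return hold, keepalive


# Our side configured with "timers 7 21", peer left at 180 seconds.
print(negotiate_timers(local_hold=21, peer_hold=180, local_keepalive=7))

# Our side at 60/180, peer proposing a 90-second hold time.
print(negotiate_timers(local_hold=180, peer_hold=90, local_keepalive=60))
```

The first call gives a 21-second hold time with 7-second keepalives; the second shows that a peer proposing a lower hold time pulls the session down to its value regardless of the local setting.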
HTH,
John
09-23-2015 08:34 PM
Hi John,
Thank you for your comment.
We have another connection with a different partner, and it is working fine.
So the trouble is not in our configuration, and it does not seem to be due to the keepalive and hold times (we use the same parameters for all circuits).
Do you have any keywords I can use to investigate further?