[BGP] Too long time to converge the fail over test.

lvtquan1991
Level 1

Hi all,

I am seeing a long BGP convergence time and would appreciate some pointers on where to look.

Our problem is:

Our company has two NNI connections with our partner. The protocol in use is BGP.

When we run a failover test (deactivating the primary link and checking how long traffic takes to move to the secondary link), convergence takes 7 minutes.

Normally it should take about 90 seconds,

so we believe the problem is on our partner's side.

Due to their policy, they cannot share their configuration with us.

Does anyone have any idea what could cause this?

Please suggest some keywords I can use to investigate the problem.

5 Replies

Ahmed Muhi
Level 1

Hi Ivtquan1991,

It is difficult to identify the problem without seeing the configuration. Could you please share it?

Also, could you please describe how you are simulating the failure: are you shutting down the interface or the BGP session?

 

Kind regards,

Ahmed Muhi

Hi Ahmed Muhi,

Thank you for your comment.

In the failover test, we deactivate the primary link (JUNOS) and check that traffic moves to the secondary link.

 

Hi Ivtquan1991,

Can you also run "debug ip bgp events" and "debug ip bgp updates" while you perform the failover test, to get more detail about what actually happens?

Another point: how do you measure the time the recovery takes? For example, when you shut down the main link, do you start timing from that moment until you get a message that your BGP session is down? I guess the reason given should be "hold time expired", right?

The router then goes through the re-convergence process to find an alternate path, and once that has finished, traffic starts to flow again. If that is the case, it does not mean that you are facing a problem with convergence: BGP had already detected the failure at the point it reported that the session was down.

I have some other questions to ask, but let us clear up the above first.

Final point: you can see the negotiated hold time with the command "show ip bgp neighbor x.x.x.x | include hold".

 

All the best!

 

 

John Blakley
VIP Alumni

By default, the hold time on Cisco IOS is 3 minutes (180 seconds). BGP uses the lower of the two peers' hold times when the session is negotiated, so you could set lower keepalive and hold timers on your side. That would force their side to negotiate down to your hold time.
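To make the negotiation rule concrete, here is a minimal Python sketch of how each side could arrive at the session timers. The function name and arguments are made up for illustration. Taking the lower of the two hold times follows the BGP specification; capping the keepalive at one third of the negotiated hold time is an assumption based on the commonly recommended ratio, and real implementations may differ in detail.

```python
def negotiate_timers(local_hold, peer_hold, local_keepalive):
    """Return (hold, keepalive) in seconds for one BGP session.

    Hypothetical helper, not a vendor API. Hold time is the lower of the
    two advertised values; the keepalive cap of hold/3 is an assumption.
    """
    for value in (local_hold, peer_hold):
        # A hold time must be either 0 (timers disabled) or at least 3 s.
        if value != 0 and value < 3:
            raise ValueError("hold time must be 0 or at least 3 seconds")

    hold = min(local_hold, peer_hold)
    if hold == 0:
        # A zero hold time means no keepalives are exchanged at all.
        return 0, 0

    keepalive = min(local_keepalive, hold // 3)
    return hold, keepalive


# Our side lowered to 7/21, partner left at a 180 s hold time.
print(negotiate_timers(21, 180, 7))    # (21, 7)
# Our side at 60/180, partner advertising a 90 s hold time.
print(negotiate_timers(180, 90, 60))   # (90, 30)
```

The second call shows why lowering timers on only one side is enough: whichever peer advertises the smaller hold time decides the value for the session.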

You can change your timers on the neighbor line itself or globally:

Global:

timers bgp 7 21

Neighbor:

neighbor <peer address> timers 7 21

The above would send a keepalive every 7 seconds and set the hold time to 21 seconds. When three consecutive keepalives are missed, the session drops and you should fail over. Note that you will need to clear the peering after making this change. As far as I know, a soft reset of the peering will not renegotiate these timers, so there will be some downtime associated with this change. Once you make the change, clear the peering and then look at the neighbor to see if the change took effect:

show ip bgp neighbor <neighbor address>

It will be near the top of the output from that command.
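As a rough sanity check on the numbers, the sketch below estimates how long a hold timer alone would take to notice a silent failure (one where the interface stays up and keepalives simply stop arriving). The function name is made up, and the model is simplified: it ignores keepalive jitter and assumes the hold timer is restarted only by keepalives.

```python
def detection_window(hold, keepalive):
    """Return (best, worst) seconds from a silent failure to hold-timer expiry.

    Hypothetical helper under a simplified model: the hold timer restarts on
    every keepalive received, and keepalives arrive exactly on schedule.
    """
    if hold <= 0 or keepalive <= 0 or keepalive > hold:
        raise ValueError("expected 0 < keepalive <= hold")
    # Best case: the link fails just before the next keepalive was due,
    # so one keepalive interval of the hold time has already elapsed.
    # Worst case: the link fails right after a keepalive was received.
    return hold - keepalive, hold


print(detection_window(21, 7))     # (14, 21)
print(detection_window(180, 60))   # (120, 180)
print(detection_window(90, 30))    # (60, 90)
```

Under these assumptions even a 180-second hold time gives a worst case of 3 minutes, so a 7-minute (420-second) outage is longer than hold-timer expiry by itself would account for. If the router sees the interface itself go down, the session may be torn down sooner than these figures.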

HTH,

John


Hi John,

Thank you for your comment.

We have another connection with a different partner, and it is working fine.

So the trouble is not in our configuration, and it does not seem to be due to the keepalive and hold times (we use the same parameters for all circuits).

Do you have any keywords I can use to find out more?

 
