08-08-2011 03:10 AM - edited 03-04-2019 01:12 PM
Hi Pro,
A customer has two last-mile links, one to each of two service providers, at every hub and spoke location. BGP is the last-mile protocol at all sites. One of the spoke sites connected to our network has last-mile problems that make BGP flap very badly. Since that link is the primary last mile, the customer is seeing disconnections in the applications they run. Ideally, if the primary last mile starts flapping, the link should not be used until it becomes stable or until some period of time has passed. How do we achieve this?
We tried the following options:
1. BGP dampening did not work, as it takes effect only when a prefix is advertised and withdrawn, not when the BGP neighbor itself flaps.
2. We thought of using BFD, but this will not work either, because when the last mile flaps heavily, BGP also flaps heavily.
3. We thought of using multihop BGP with a route to the BGP neighbor controlled by IP SLA. IP SLA monitors last-mile reachability with a lower frequency period, and the BGP neighbor is reached over a static route tied to object tracking. But whenever the last mile flaps heavily, BGP also flaps.
4. Interface dampening did not work because the switch port where the last mile is connected does not detect the flap, as we are using a media converter.
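For reference, the penalty mechanism behind BGP dampening (option 1) can be sketched roughly as follows. This is only an illustrative Python model, not router code. It assumes the commonly documented Cisco defaults (1000 penalty per flap, 15-minute half-life, suppress at 2000, reuse at 750), and the names `decayed` and `simulate` are made up for the example.

```python
# Assumed values: commonly documented Cisco "bgp dampening" defaults.
HALF_LIFE_S = 15 * 60      # penalty halves every 15 minutes
PENALTY_PER_FLAP = 1000    # added each time the prefix flaps
SUPPRESS_LIMIT = 2000      # prefix is suppressed above this penalty
REUSE_LIMIT = 750          # prefix is reused once penalty decays below this


def decayed(penalty, elapsed_s):
    """Exponentially decay a penalty over elapsed_s seconds."""
    return penalty * 0.5 ** (elapsed_s / HALF_LIFE_S)


def simulate(flap_times_s, now_s):
    """Return (penalty, suppressed) at now_s for flaps at the given times."""
    penalty, suppressed, last = 0.0, False, 0.0
    for t in sorted(flap_times_s):
        if t > now_s:
            break
        # Decay the penalty accumulated so far, then check the reuse limit.
        penalty = decayed(penalty, t - last)
        if penalty < REUSE_LIMIT:
            suppressed = False
        # Charge the new flap and check the suppress limit.
        penalty += PENALTY_PER_FLAP
        if penalty > SUPPRESS_LIMIT:
            suppressed = True
        last = t
    # Decay from the last flap up to the time we are asked about.
    penalty = decayed(penalty, now_s - last)
    if penalty < REUSE_LIMIT:
        suppressed = False
    return penalty, suppressed


if __name__ == "__main__":
    # Three flaps one minute apart, checked right after the third flap.
    print(simulate([0, 60, 120], now_s=120))
```

In this model the penalty is tracked per prefix, which matches the limitation described in option 1: the mechanism is driven by prefix advertisements and withdrawals.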
Please suggest:
1. Can we tweak the BGP retry timer, if that is possible? If BGP flaps, BGP should wait until the retry timer expires before establishing the next BGP connection. But I couldn't find the syntax for this.
2. Can this be done via EEM?
Please suggest any other options, or let me know if I missed anything basic here.
thanks in advance
Arun
08-08-2011 04:43 AM
Hello Narainarun,
I understand that the last mile is flapping very frequently. If you don't want to see the BGP flaps caused by very brief outages, you can tweak the BGP hold timers.
The hold time is the value used when negotiating a connection with the peer. It is advertised in open packets and indicates to the peer the length of time that it should consider the sender valid. If the peer does not receive a keepalive, update, or notification message within the specified hold time, the BGP connection to the peer is closed and routes through that peer become unavailable. The hold time is three times the interval at which keepalive messages are sent.
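As a rough illustration of the timer relationship described above, here is a small Python sketch. It assumes the usual BGP behaviour where the session uses the smaller of the two advertised hold times, and it takes the keepalive interval as one third of that value, as stated above. The function name `negotiate_timers` is made up for the example, and real routers may let you set the keepalive separately.

```python
def negotiate_timers(local_hold_s, peer_hold_s):
    """Return (hold_s, keepalive_s) that a session would use.

    Assumptions: the smaller advertised hold time wins, a hold time
    of 0 disables keepalives, and the keepalive interval is one third
    of the hold time (rounded down to whole seconds).
    """
    for hold in (local_hold_s, peer_hold_s):
        if hold != 0 and hold < 3:
            raise ValueError("hold time must be 0 or at least 3 seconds")
    hold_s = min(local_hold_s, peer_hold_s)
    keepalive_s = hold_s // 3
    return hold_s, keepalive_s


if __name__ == "__main__":
    # One side configured for 20 seconds, the other left at 180 seconds.
    print(negotiate_timers(20, 180))
```

Note that lowering the hold time on one side is enough to lower it for the session, since the smaller value is the one used.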
Although it is not advisable to tweak the hold timers, the benefit of doing so is that you won't see the BGP flaps in the log messages.
My preference would be to keep the flapping link shut and have it thoroughly loop/stress tested with the ISP.
Syntax: hold time
Range: 6 through 65,535 seconds
Default: 90 seconds
I am not really sure whether this can be done with EEM.
Thanks,
Ricky Micky
Pls rate if the content was useful
08-08-2011 05:46 AM
Hi Ricky,
We already reduced the BGP hold time to 20 seconds (from the default of 180 seconds), but the issue persists. Even if we reduce the hold time to 6 seconds (the lowest value), frequent flaps still impact the customer traffic because BGP also flaps. That's why I want to make the link logically shut (BGP inactive) for some time when the link flaps too badly.
To get the last mile tested, we are coordinating with the BSO, but it is stuck due to some issues on their side.
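To make the behaviour I am after concrete, here is a minimal Python sketch of the hold-down logic. It is only a model of the idea, not something that runs on the router. The class name `FlapGuard` and the thresholds (3 flaps within 300 seconds, 1800 seconds of hold-down) are assumptions chosen for illustration.

```python
from collections import deque


class FlapGuard:
    """Hold a link down after too many flaps within a sliding window."""

    def __init__(self, max_flaps=3, window_s=300, hold_down_s=1800):
        self.max_flaps = max_flaps      # flaps tolerated inside the window
        self.window_s = window_s        # sliding window length in seconds
        self.hold_down_s = hold_down_s  # how long to keep the link unused
        self._flaps = deque()           # timestamps of recent flaps
        self._held_until = None         # end of the current hold-down

    def record_flap(self, now_s):
        """Register a flap and start a hold-down if the limit is reached."""
        self._flaps.append(now_s)
        # Forget flaps that have aged out of the window.
        while self._flaps and now_s - self._flaps[0] > self.window_s:
            self._flaps.popleft()
        if len(self._flaps) >= self.max_flaps:
            self._held_until = now_s + self.hold_down_s
            self._flaps.clear()

    def usable(self, now_s):
        """Return True if the link may carry traffic at now_s."""
        return self._held_until is None or now_s >= self._held_until


if __name__ == "__main__":
    guard = FlapGuard()
    for t in (0, 10, 20):
        guard.record_flap(t)
    print(guard.usable(20), guard.usable(1820))
```

In this sketch, flaps that occur during the hold-down are counted again from zero, so a link that keeps flapping would have its hold-down extended rather than being brought back early.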
thanks in advance
Arun
08-08-2011 04:31 PM
Hi Arun,
I would suggest testing the link in question thoroughly before you set up any failover. Divert the traffic permanently to the secondary link using a route map (MED/local preference) and have the primary link tested. Otherwise the whole purpose is defeated, and production traffic will be impacted again if the primary fails over repeatedly.
Thanks,
Ricky Micky
08-09-2011 12:39 AM
Hi Ricky,
I am pushing the BSO for a last-mile test. Since there is a lag in their response, I thought of looking at other options. Thanks for your replies.
thanks
Arun