
Tuning the GRE keepalive or the routing protocol timers for fast convergence?

Carlos T
Level 1

Hi.

In a design with a hub router terminating ~100 GRE/IPsec tunnels (and still growing), we want high availability and fast convergence without overutilizing CPU or memory. Which is better to tune: the GRE keepalive timers or the routing protocol timers? Is there a best practice for this, and what values would be recommended?

The routing protocol is EBGP running over the GRE/IPsec tunnels, and the hub router is an ASR1001.

Thanks,

Carlos.

5 Replies

Richard Burts
Hall of Fame

Carlos

I faced a somewhat similar question with a customer who is running a pretty large hub-and-spoke network (with a bit more than 400 spokes). Our network differs in that there are two hub routers, each spoke has a tunnel to each hub, and our routing protocol was EIGRP rather than EBGP. We came to the conclusion that it was better to depend on the routing protocol for detecting failure and converging, and not on the tunnel keepalives. In fact, we concluded that with the routing protocol detecting failures there was little benefit in running GRE keepalives, so we did not enable the feature.

I recognize that EBGP will not converge as quickly as EIGRP. But I believe that you would benefit more from tuning the routing protocol than you would from tuning the GRE keepalives.

HTH

Rick

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

I agree with Rick. I would suggest tuning your IGP for faster convergence (such tuning might include more than loss-of-neighbor detection).

However, you might still want to use GRE keepalives, as a failed keepalive will often take the tunnel interface "down". Depending on your network monitoring (if any), an interface going down might be easier to detect as a path outage than the loss of an IGP path. That is, tune the IGP for fast convergence, and perhaps retain GRE keepalives for logical path failure monitoring.

Richard/Joseph,

Our design is also dual hub, and each spoke has a tunnel to each hub. As this is a service provider entry point to an MPLS network (for Internet routers), it may grow to 400 spokes (almost the same as your topology), or even more.

Because of the mix of vendors at the spokes and the possibility of a large number of prefixes, we decided to use EBGP instead of an IGP (standard or proprietary).

We want to achieve fast convergence without overwhelming router resources (memory/CPU), and to avoid routing flaps and instability due to aggressive timers.

Questions:

1. If we decide to fine-tune the EBGP timers, what would be the best values to start with for the keepalive (hello) and hold (dead) intervals, keeping in mind the large number of peerings (now and in the future)?

2. Of the two solutions (tuned GRE keepalives or tuned BGP timers), which is lighter on the hub router in terms of CPU/memory: fast GRE keepalives or more aggressive BGP keepalive/hold timers?

Thanks,

Carlos.

Posting

If I recall correctly, basic eBGP timers don't lend themselves to very fast detection of lost peers (eBGP is better suited to using the loss of a physical link as a trigger, which of course does not apply to a tunnel).

If supported, BFD might be the best "soft" way to detect neighbor loss.

You might try reducing the GRE keepalive interval incrementally while watching CPU load. I've often used 1-second keepalives across GRE tunnels, but with fewer than 100 tunnels on a hub device.
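As a rough way to compare the options mentioned here, the worst-case failure detection time of each mechanism can be estimated from its timers. The sketch below is only an approximation under stated assumptions: a GRE tunnel is declared down after `retries` consecutive missed keepalives, a BGP peer is dropped when the negotiated hold time (the lower of the two peers' configured values) expires, and BFD detects a loss after `multiplier` missed packets at a symmetric interval. The function names and the sample timer values are illustrative, not recommendations.

```python
def gre_keepalive_detection_s(interval_s: float, retries: int) -> float:
    """Approximate time for a GRE tunnel to go down: interval x retries."""
    return interval_s * retries


def bgp_hold_detection_s(local_hold_s: float, peer_hold_s: float) -> float:
    """Worst-case BGP detection: the negotiated hold time is the lower
    of the two hold times offered by the peers."""
    return min(local_hold_s, peer_hold_s)


def bfd_detection_s(interval_ms: float, multiplier: int) -> float:
    """Approximate BFD detection time, assuming both ends use the same
    interval (assumption: symmetric settings)."""
    return interval_ms * multiplier / 1000.0


if __name__ == "__main__":
    # Sample values only; tune against observed CPU load on the hub.
    print("GRE keepalive 1 s x 3 retries:", gre_keepalive_detection_s(1, 3), "s")
    print("BGP hold 30 s vs peer 180 s:", bgp_hold_detection_s(30, 180), "s")
    print("BFD 300 ms x 3:", bfd_detection_s(300, 3), "s")
```

The estimate ignores convergence work after detection (route withdrawal, best-path selection), so actual failover will take longer than these figures.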

Carlos

Joseph makes an interesting point about side benefits of using GRE keepalive in taking the tunnel line protocol down. This could trigger alerts which could be useful if you need network control staff to take some manual action to repair the connection. Since the original question was about speeding up convergence I had not considered the operational control potential as a point in favor of GRE keepalive. But it is an interesting point to consider in choosing which alternative is best.

One thought that occurs to me is that taking down the tunnel line protocol would be very effective where the routing protocol is an IGP like OSPF or EIGRP, which assumes that routing peers are on a connected subnet, so that if the interface goes down the peer relationship is taken down immediately. But BGP does not necessarily assume that peers are on a connected subnet, so I wonder whether GRE keepalives would be particularly effective in taking down the peer/neighbor relationship.

I am also of the opinion that tuning the BGP timer would probably be less overhead on the router than using GRE keepalive and tuning the keepalive timers. The BGP packets used to maintain the BGP neighbor relationship are already required in the router and tuning the timers just means that it will be performed a bit more frequently. But GRE keepalive is not something that is already being done. So using this feature will require CPU processing to generate the packets and bandwidth usage to send the packets that would not be required otherwise.
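To put rough numbers on the overhead question, the aggregate keepalive rate the hub has to generate scales with the number of tunnels divided by the timer interval. The sketch below is a back-of-the-envelope calculation only: the helper name, the tunnel counts, and the 1-second and 10-second intervals are assumptions chosen for illustration, and it counts messages sent without modelling the per-packet CPU cost, which differs between GRE keepalives and BGP keepalives.

```python
def keepalives_per_second(peers: int, interval_s: float) -> float:
    """Messages per second the hub must send for `peers` tunnels or
    sessions, each sending one keepalive every `interval_s` seconds."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return peers / interval_s


if __name__ == "__main__":
    for peers in (100, 400):
        gre_rate = keepalives_per_second(peers, 1)    # assumed 1 s GRE keepalive
        bgp_rate = keepalives_per_second(peers, 10)   # assumed 10 s BGP keepalive
        print(f"{peers} spokes: GRE {gre_rate:.0f}/s, BGP {bgp_rate:.0f}/s")
```

The same number of messages arrives from the spokes in the other direction, so the total handled by the hub is roughly double the figure shown.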

HTH

Rick
