08-25-2011 06:12 PM - edited 03-07-2019 01:54 AM
Hi all,
I have been testing an HSRP setup using HSRP v1 and have been wondering
why it takes so long to switch back to the original active router after it has recovered from a
failure.
With the default timers, I see packet forwarding lost for 28 secs
when moving to the Standby router, even though the routing protocol has converged,
and when the original active router is restored, packet forwarding is lost for 50 secs.
I've included a topology map and the standby debug.
Packet forwarding doesn't happen until the Active router is found. Why?
Ideas and views welcome.
TIA
08-26-2011 12:55 AM
HSRP is a pretty old protocol now and not much used anymore. For "today's network standards" I suggest you tweak its timers to smaller values.
08-26-2011 02:19 AM
Florin,
HSRP is a pretty old protocol now and not much used anymore. For "today's network standards" I suggest you tweak its timers to smaller values.
Whoa! This is quite a strong assumption, considering that HSRP was, for a long, long time, the only FHRP supported on lower-end Cisco multilayer switches, and is a routine part of many Cisco design documents... and of all current relevant certification exams, too. Not much used? I absolutely disagree with that assumption. I am by no means representative on a global level, but personally, I encounter HSRP all the time.
In fact, there are not many choices in the FHRP field. You have the "open" VRRP, which Cisco claims infringes its patents on HSRP (and which is only supported on 3560 and higher since 12.2(58)SE); you have HSRP, universally supported across Cisco product platforms; and, of course, GLBP, which is supported only on Cat4500 and higher. The 'ucarp' approach from BSD is kind of specific, and given the fact that Cisco patented the idea of the tuple
Old or not - the question is whether it is up to today's needs. Note that often, a protocol is deemed "old" just because it reacts slowly. However, that is a conceptual problem: the Hello protocol so often used with most of today's protocols is first and foremost intended to convey configuration data and locate neighbors. It is not so well designed to rapidly detect the loss of a neighbor - and it should not be. Neighbor loss detection is a specific requirement for which a separate, lightweight protocol can be used, and recently, such a protocol has indeed been introduced: BFD. It is now a matter of integrating the existing protocols with BFD detection (making them BFD clients) to react rapidly to a neighbor loss, with low CPU demands. Hence, combining HSRP with BFD can provide exceptionally fast convergence without tweaking HSRP's timers themselves.
Best regards,
Peter
08-26-2011 03:26 AM
I will be testing the setup using msec timers, but I was just surprised to see it
take so long to fail over, and then even longer to fail back, when using the
defaults. So my main point: is this normal behaviour?
08-26-2011 04:00 AM
No, 28 secs is not normal.
By default HSRP hellos are sent every 3 seconds. If a hello has not been received for 10 seconds by the standby then it becomes active.
50 seconds to resume sounds awfully like an STP (802.1d) timer being involved here.
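To make the arithmetic behind that suspicion explicit, here is a rough sketch. It assumes the standard 802.1D defaults (max age 20 s, forward delay 15 s, with the forward delay applied once for listening and once for learning) and the HSRP defaults quoted above. The function names are just illustrative, and this only shows which default timers add up to the observed figure; it does not prove STP is the cause in this topology.

```python
# Default timer values in seconds.
# HSRP values are the defaults quoted in the post above;
# STP values are the standard 802.1D defaults (an assumption about this setup).
HSRP_HELLO = 3
HSRP_HOLD = 10
STP_MAX_AGE = 20
STP_FORWARD_DELAY = 15  # used twice: listening state, then learning state


def hsrp_detection_time(hold=HSRP_HOLD):
    """Upper bound on how long the standby waits before taking over."""
    return hold


def stp_listen_learn_delay(forward_delay=STP_FORWARD_DELAY):
    """Time a port spends in listening + learning before it forwards."""
    return 2 * forward_delay


def stp_max_age_plus_listen_learn(max_age=STP_MAX_AGE,
                                  forward_delay=STP_FORWARD_DELAY):
    """Max age expiry followed by listening + learning."""
    return max_age + stp_listen_learn_delay(forward_delay)


if __name__ == "__main__":
    print("HSRP hold time:", hsrp_detection_time(), "s")
    print("STP listening + learning:", stp_listen_learn_delay(), "s")
    print("STP max age + listening + learning:",
          stp_max_age_plus_listen_learn(), "s")
```

With the defaults, the last figure comes to 20 + 15 + 15 = 50 seconds, which matches the fail-back outage reported, while HSRP on its own accounts for only about 10 seconds.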
Jon
08-26-2011 04:01 AM
Florin
Have to agree with Peter on this. That is an incredibly general statement, do you have any evidence to back it up ?
Jon
08-26-2011 04:10 AM
HSRP may be old but it's still in play. As many have said already, it's the most widely used FHRP solution. As for the failover, configure preempt and use msec timers.
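As a rough illustration of the preempt and msec timer suggestion, the sketch below builds the corresponding interface configuration lines. The helper name, the interface `Vlan10`, group `1` and the 250 ms / 750 ms values are all made up for the example and are not taken from the original poster's setup; only the `standby ... timers msec` and `standby ... preempt` command forms are standard IOS syntax. Check the supported timer ranges on your platform before using values like these.

```python
def hsrp_msec_config(interface, group, hello_ms, hold_ms, preempt=True):
    """Return IOS-style config lines for HSRP millisecond timers.

    Hypothetical helper for illustration only; it does not talk to a device.
    """
    if hello_ms <= 0 or hold_ms <= 0:
        raise ValueError("timers must be positive")
    if hold_ms <= hello_ms:
        # The hold time must exceed the hello time, otherwise the peer
        # would be declared dead between two consecutive hellos.
        raise ValueError("hold time must be greater than hello time")

    lines = [
        f"interface {interface}",
        f" standby {group} timers msec {hello_ms} msec {hold_ms}",
    ]
    if preempt:
        lines.append(f" standby {group} preempt")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only.
    print(hsrp_msec_config("Vlan10", 1, 250, 750))
```

A hold time of roughly three times the hello interval mirrors the ratio of the 3 s / 10 s defaults, so a single lost hello does not trigger a failover.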
08-26-2011 04:48 AM
HSRP is all over our big corporate network and it works fine. Quite frankly, what's the issue whether it fails back over in 5 or 40 seconds, as long as it still has a working path...
08-26-2011 04:50 AM
Glen
I think that's the point. During the 28 and 50 secs there is no path available, i.e. no packet forwarding.
Jon
08-26-2011 05:09 AM
I will collect more debugging output for STP, HSRP and EIGRP
to see the interactions and post it, but that will have to be later.
Thanks
08-26-2011 07:22 AM
You have a long failover time. I'm interested in what you find out. I had a similar problem while using HSRP on routers. HSRP worked fine during a power failure, but there was a longer delay if the interface just went down. I love HSRP, especially on layer three switches.
Jonathan,
08-26-2011 04:20 PM