09-27-2025 04:14 AM
Hi All,
I wanted to share an issue I faced recently with HSRP between two Cisco Catalyst 3850 switches in a distribution layer setup.
The Problem:
We noticed that the HSRP active/standby roles kept flapping every few minutes, causing intermittent packet loss for end users. Logs showed frequent changes in HSRP state and %HSRP-5-STATECHANGE messages.
Troubleshooting Steps:
The Root Cause:
After a deeper look, it turned out that HSRP timers (hello/hold) combined with some minor interface errors were contributing to the instability. When the default timers (hello 3 sec / hold 10 sec) were used in a high‑traffic environment, even small delays were enough to cause state changes.
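For anyone following along, HSRP timers are set per group on the interface. Below is a minimal sketch of the syntax, assuming a hypothetical SVI (Vlan10), group number (10), and virtual IP (10.10.10.1). The timer values only make the defaults explicit; they are not the values used in this case.

```
! Hypothetical interface, group, and address, for illustration only
interface Vlan10
 standby 10 ip 10.10.10.1
 ! hello 3 sec / hold 10 sec (the defaults, written out explicitly)
 standby 10 timers 3 10
 ! delay preemption so a recovering switch does not take over immediately
 standby 10 preempt delay minimum 60
```

Whatever values are chosen, the hold time is normally kept at three times the hello time or more, and both switches in the group should use the same timers. `show standby brief` shows the current state and active/standby peers after a change.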
Solution:
Result:
HSRP stabilized, no more flapping, and user packet drops stopped completely.
I’m sharing this here in case others face similar issues. Has anyone else experienced HSRP instability due to timers and interface errors? Would love to hear how you normally tune HSRP for high‑availability designs.
Thanks,
Md. Irshad Ansari
09-27-2025 06:47 AM - edited 09-27-2025 07:13 AM
Not really. It seems that your cabling issues caused all of the fuss.
Consider UDLD, and maybe BFD, to detect cable faults faster.
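A rough sketch of what that could look like, assuming a hypothetical uplink (TenGigabitEthernet1/1/1) and HSRP SVI (Vlan10). BFD support for HSRP and the allowed intervals depend on platform, license, and release, so treat the values as placeholders and check them against your software version:

```
! UDLD aggressive mode on the inter-switch link (hypothetical port)
interface TenGigabitEthernet1/1/1
 udld port aggressive
!
! BFD peering for HSRP on the SVI (hypothetical interface and timers)
interface Vlan10
 bfd interval 100 min_rx 100 multiplier 3
 standby bfd
```

UDLD has to be enabled on both ends of the link to be of any use, and BFD likewise needs to be configured on both HSRP peers.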
09-27-2025 07:07 AM
Hello @MD Irshad Ansari
-> pair HSRP with interface tracking or object tracking to make failovers smarter rather than purely timer-driven...
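As a minimal sketch of object tracking with HSRP, assuming a hypothetical uplink (TenGigabitEthernet1/1/1), track object 10, and HSRP group 10 on Vlan10, with the peer left at the default priority of 100:

```
! Track the line protocol of the uplink (hypothetical port)
track 10 interface TenGigabitEthernet1/1/1 line-protocol
!
interface Vlan10
 standby 10 priority 110
 standby 10 preempt
 ! if the uplink goes down, priority drops from 110 to 90
 standby 10 track 10 decrement 20
```

With these example numbers the decremented priority (90) falls below the peer's 100, so the peer takes over as long as it also has preempt configured.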