I'm new here, so please let me know if I posted in the wrong forum. Anyway, I have a strange problem that I have been unable to nail down. I have two 2800-series routers trunked to each other on physical ports, with a switch connected to each router, forming a square of sorts. The routers have primary and backup HSRP interfaces configured for each VLAN (there are 9 VLANs). Right now traffic flows through the primary router. At random, I lose connectivity to the primary HSRP gateway and cannot send traffic to other subnets (SVI VLANs) for several minutes (without changing the gateway). I also notice the router's CPU usage spike to nearly 99% while this is happening. The last time this happened I was remoted into 3 different machines and had a couple of context windows up in each session. Has anyone encountered anything like this? I was thinking perhaps some asymmetric routing is occurring, but I'm not sure.
Any help is greatly appreciated.
As a first step, set asymmetric routing aside.
Please check which process is affected while the problem is occurring:
sh processes cpu sorted | exclude 0.00%
Which process uses the most CPU?
Do all devices have the same VLANs?
Do you have spanning tree configured on all devices?
Do you see any spanning tree messages in sh log?
Do you use SVIs (int vlan, ip add..) or the physical interfaces (int fas.., ip add..) on the router?
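When reading that output, the first line is worth a look as well: IOS prints the five-second utilization as total/interrupt, so a high total with no busy process points at interrupt-level (packet switching) load rather than at one process. A minimal Python sketch of that split is below. The sample line and the function name `split_cpu_header` are made up for illustration and are not taken from the routers in this thread.

```python
import re

# Hypothetical sample of the first line printed by "sh processes cpu";
# the numbers are invented for illustration.
SAMPLE = "CPU utilization for five seconds: 99%/91%; one minute: 45%; five minutes: 20%"

# The five-second figure is printed as total%/interrupt%.
HEADER_RE = re.compile(r"five seconds:\s*(\d+)%/(\d+)%")

def split_cpu_header(line):
    """Return (total, interrupt, process-level) percentages from the header line."""
    match = HEADER_RE.search(line)
    if match is None:
        raise ValueError("not a CPU utilization header line")
    total = int(match.group(1))
    interrupt = int(match.group(2))
    # Whatever is not interrupt-level load is attributed to processes.
    return total, interrupt, total - interrupt

if __name__ == "__main__":
    total, interrupt, process = split_cpu_header(SAMPLE)
    print(f"total={total}% interrupt={interrupt}% process-level={process}%")
```

If the interrupt share makes up most of the total, sorting by process will not show the culprit, and the traffic hitting the router is the next thing to look at.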
Thanks for the reply.
1. I ran sh proc cpu | exclude 0.00%, but the highest process utilization was only 8%. That was Net Input.
2. All devices currently in question are on the same VLAN.
3. Spanning tree is not configured for the VLAN instances on the routers.
4. I don't see any.
5. SVI is being used for the interfaces.
I would suggest entering the following command on the router where you see the high CPU:
process cpu threshold type total rising 80 interval 5 falling 20 interval 5
This command generates a log message whenever total CPU utilization rises above 80% and lists the three process IDs that were consuming the most CPU during the spike. These process IDs can then be correlated with the "sh proc cpu" output.
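To make that correlation less manual, the PIDs can be pulled out of the log line with a short script. This is only a sketch: the sample message below is an assumption about the general shape of the rising-threshold log entry, with invented values, so compare it against what your router actually logs before relying on it. The function name `top_processes` is hypothetical.

```python
import re

# Assumed shape of the rising-threshold log message; all values are invented.
SAMPLE_LOG = (
    "%SYS-1-CPURISINGTHRESHOLD: Threshold: Total CPU Utilization(Total/Intr): "
    "92%/3%, Top 3 processes(Pid/Util): 193/80%, 2/5%, 84/4%"
)

# One "pid/util%" pair, for example "193/80%".
PAIR_RE = re.compile(r"(\d+)/(\d+)%")

def top_processes(log_line):
    """Return [(pid, util_percent), ...] from the 'Top 3 processes' part."""
    marker = "processes(Pid/Util):"
    _, found, tail = log_line.partition(marker)
    if not found:
        return []  # not a threshold message in the assumed format
    # Only the text after the marker is scanned, so the Total/Intr
    # figures earlier in the line are not mistaken for a PID pair.
    return [(int(pid), int(util)) for pid, util in PAIR_RE.findall(tail)]

if __name__ == "__main__":
    for pid, util in top_processes(SAMPLE_LOG):
        print(f"PID {pid}: {util}%")
```

Each PID returned can then be looked up in the PID column of "sh proc cpu" to get the process name.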
Hope that helps,
Perhaps you have a loop or a software bug.
So, can you break the cable ring to avoid a loop,
and test it for a while?
(And configure Talha's suggestion, of course.)
Without the redundant link, the problem should not occur; if that is the case, configure spanning tree.