01-15-2015 07:53 AM - edited 03-07-2019 10:14 PM
Recently we have had some disruptions where events that should be isolated ended up impacting the entire network.
I am trying to determine whether the load is simply too much for our 7600s to handle.
The first strange thing is that we lose a routing protocol adjacency on an SVI. The VLAN for that SVI is carried on a 10G interface, and utilization on it during the outage was about 8 Gbps. Is this enough to cause the adjacency to drop?
For example we lost an adjacency on this router last night:
Switch Fabric Resources
Bus utilization: current: 1%, peak was 17% at 18:47:55 CDT Mon Apr 9 2012
Fabric utilization:       Ingress                      Egress
  Module  Chanl  Speed    rate  peak                   rate  peak
  1       0      20G        4%   48% @03:51 08Dec14     11%   53% @22:07 14Jan15
  1       1      20G        4%   33% @15:57 18Aug12      0%   19% @17:49 18Aug12
  2       0      20G        0%    1% @17:01 29Jun11      0%    1% @02:37 16Jun11
  2       1      20G       10%   52% @22:07 14Jan15      3%   53% @03:52 08Dec14
  3       0      20G        0%    6% @13:21 25Jul13      4%   12% @18:43 14Dec14
  3       1      20G        2%   13% @07:45 10Dec14      1%   18% @14:39 09Jan15
  4       0      20G        0%    5% @23:39 04Mar13      0%    6% @02:32 11Dec12
  4       1      20G        0%    9% @10:55 29Apr14      0%    8% @11:14 21Aug11
  5       0      20G        0%    0%                     0%    2% @11:21 28Jul14
  6       0      20G        1%    5% @11:25 19Jan12      2%    6% @11:25 19Jan12
System Resources
PFC operating mode: PFC3B
Supervisor redundancy mode: administratively sso, operationally sso
Switching resources:
  Module  Part number      Series      CEF mode
  1       WS-X6708-10GE    CEF720      dCEF
  2       WS-X6704-10GE    CEF720      CEF
  3       WS-X6748-GE-TX   CEF720      dCEF
  4       WS-X6748-GE-TX   CEF720      dCEF
  5       RSP720-3CXL-GE   supervisor  CEF
  6       RSP720-3CXL-GE   supervisor  CEF
The interface that carries the VLAN, and therefore the SVI, is:
CORE-7600#sh int tenGigabitEthernet 1/6 | inc drops
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 1039261
Farther upstream toward the internet uplinks, we saw an RX spike from an average of ~3.5 Gbps to around 4.6 Gbps, but this was on an ASR 9010. This still caused an IS-IS adjacency to drop, and I don't understand why.
I have other problems with the design - our network is largely one big NSSA OSPF area, so we get Type 7 flooding as well. Basically, I am trying to find out where the weakest link is, whether there is any correlation between bandwidth usage/spikes and our routing problems, why, and what I can do about it.
Thanks in advance
04-08-2015 02:52 PM
Bump
04-08-2015 09:30 PM
Hi,
When you say utilization is around 8 Gbps, is that a 30 sec or 5min average utilization?
Since you are seeing a lot of output drops, it is very likely that the traffic rate at times is going above 10 Gig. You could check the utilization graph for this interface on a 1ms average to figure out the rate of the traffic bursts on this interface.
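To illustrate the point about averages hiding bursts, here is a minimal Python sketch. The per-second samples are hypothetical (they are not taken from the router in this thread), and `summarize` is just an illustrative helper name. It shows how an offered load that averages 8 Gbps over a 5-minute window can still exceed a 10 Gbps line rate for part of that window, which is when output drops would occur.

```python
LINE_RATE_GBPS = 10.0  # nominal line rate of a 10G interface


def summarize(samples_gbps, line_rate=LINE_RATE_GBPS):
    """Return (average offered load, number of samples above line rate)."""
    avg = sum(samples_gbps) / len(samples_gbps)
    over = sum(1 for s in samples_gbps if s > line_rate)
    return avg, over


# Hypothetical offered load, one sample per second over a 300-second window:
# 270 seconds at 7.6 Gbps plus 30 seconds of bursting at 11.6 Gbps.
samples = [7.6] * 270 + [11.6] * 30

avg, over = summarize(samples)
print(f"average offered load: {avg:.2f} Gbps")
print(f"seconds above line rate: {over}")
```

The shorter the averaging interval you can graph, the closer you get to seeing the individual samples rather than only the window average.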
Hope this helps and grade my post if it is useful.
Regards,
Madhu.
04-09-2015 04:22 AM
Disclaimer
The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.
Liability Disclaimer
In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.
Posting
What I believe Madhu is telling you is that your high utilization, and drops, might be impacting routing protocol keepalives. If so, a QoS policy that protects the routing protocol packets might correct the issue.
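As a rough illustration of why protecting the routing protocol packets can help, here is a toy Python model. It is not a model of the 7600's actual port queueing hardware; the tick-based link, the queue limit, the traffic rates, and the `simulate` function are all assumptions made for this sketch. It compares a single tail-drop FIFO with a setup where hellos get their own queue that is served first.

```python
from collections import deque


def simulate(priority, ticks=100, capacity=10, data_per_tick=12,
             queue_limit=20, hello_every=10):
    """Return (hellos_sent, hellos_delivered) for a congested toy link."""
    shared = deque()     # tail-drop FIFO (data only when priority=True)
    protected = deque()  # separate hello queue, used only when priority=True
    hellos_sent = 0
    hellos_delivered = 0
    for tick in range(ticks):
        # Data arrives first, then the hello: the worst case for the hello.
        arrivals = ["data"] * data_per_tick
        if tick % hello_every == 0:
            arrivals.append("hello")
            hellos_sent += 1
        for pkt in arrivals:
            if priority and pkt == "hello":
                protected.append(pkt)
            elif len(shared) < queue_limit:
                shared.append(pkt)
            # otherwise the packet is tail-dropped
        budget = capacity  # packets the link can send this tick
        while budget and protected:
            protected.popleft()
            hellos_delivered += 1
            budget -= 1
        while budget and shared:
            if shared.popleft() == "hello":
                hellos_delivered += 1
            budget -= 1
    return hellos_sent, hellos_delivered


print("FIFO only:     ", simulate(priority=False))
print("hello priority:", simulate(priority=True))
```

In this toy model the FIFO run loses most hellos once the queue fills, while the priority run delivers all of them. How this maps onto a real platform depends on its queueing architecture and how control traffic is classified.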
04-09-2015 08:07 AM
Thanks for the input. It is still good for me to know that I can analyze and truthfully speak about the cause of these problems. I would have to tune some kind of monitoring tool to check this closely, since we poll on a 5-minute interval, but this gives some indication of the traffic pattern:
04-09-2015 08:09 AM
Thank you to those who replied. I will look into QoS to maintain routing adjacency if we run into this again.
04-09-2015 12:12 PM
Edit - sorry, forget that.
Just noticed you posted the fabric utilisation figures and they suggest it is not being oversubscribed.
Jon
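For a rough sense of scale, the fabric peaks posted above can be converted from percentages to Gbps. This small Python sketch assumes the percentages are fractions of the nominal 20G channel speed shown in the output; `channel_gbps` is an illustrative helper, and the values are the highest peaks from the table in the original post.

```python
def channel_gbps(speed_gbps, percent):
    """Convert a fabric channel utilization percentage to Gbps."""
    return speed_gbps * percent / 100


# Highest peaks from the posted fabric utilization table (20G channels).
peaks_percent = {
    "module 1, channel 0, ingress": 48,
    "module 1, channel 0, egress": 53,
    "module 2, channel 1, ingress": 52,
    "module 2, channel 1, egress": 53,
}

for name, pct in peaks_percent.items():
    print(f"{name}: {pct}% of 20G is about {channel_gbps(20, pct):.1f} Gbps")
```

Under that assumption, even the busiest channels peaked at roughly half of their nominal speed.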