Network Outages - 7600 Exhausted?

Evan Roggenkamp
Level 1

Recently I have experienced some disruptions caused by things that should be isolated but impact the entire network.

I am trying to determine whether this is because the load is more than our 7600s can handle.

The first strange thing is that we lose a routing protocol adjacency on an SVI. The VLAN is carried on a 10G interface whose utilization during the outage was about 8 Gbps. Is that enough to cause the adjacency to drop?

For example we lost an adjacency on this router last night:

Switch Fabric Resources
  Bus utilization: current: 1%, peak was 17% at 18:47:55 CDT Mon Apr 9 2012
  Fabric utilization:     Ingress                    Egress
    Module  Chanl  Speed  rate  peak                 rate  peak               
    1       0        20G    4%   48% @03:51 08Dec14   11%   53% @22:07 14Jan15
    1       1        20G    4%   33% @15:57 18Aug12    0%   19% @17:49 18Aug12
    2       0        20G    0%    1% @17:01 29Jun11    0%    1% @02:37 16Jun11
    2       1        20G   10%   52% @22:07 14Jan15    3%   53% @03:52 08Dec14
    3       0        20G    0%    6% @13:21 25Jul13    4%   12% @18:43 14Dec14
    3       1        20G    2%   13% @07:45 10Dec14    1%   18% @14:39 09Jan15
    4       0        20G    0%    5% @23:39 04Mar13    0%    6% @02:32 11Dec12
    4       1        20G    0%    9% @10:55 29Apr14    0%    8% @11:14 21Aug11
    5       0        20G    0%    0%                   0%    2% @11:21 28Jul14
    6       0        20G    1%    5% @11:25 19Jan12    2%    6% @11:25 19Jan12

System Resources
  PFC operating mode: PFC3B
  Supervisor redundancy mode: administratively sso, operationally sso
  Switching resources: Module   Part number               Series      CEF mode
                       1        WS-X6708-10GE             CEF720          dCEF
                       2        WS-X6704-10GE             CEF720           CEF
                       3        WS-X6748-GE-TX            CEF720          dCEF
                       4        WS-X6748-GE-TX            CEF720          dCEF
                       5        RSP720-3CXL-GE        supervisor           CEF
                       6        RSP720-3CXL-GE        supervisor           CEF

The interface that has the VLAN and subsequently the SVI was:

CORE-7600#sh int tenGigabitEthernet 1/6 | inc drops
  Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 1039261


Farther upstream toward the internet uplinks, we saw an RX spike from an average of ~3.5 Gbps to around 4.6 Gbps, but this was on an ASR 9010. This still caused an IS-IS adjacency to drop, and I don't understand why.

I have other design problems as well: our network is largely one big OSPF NSSA, so we also get Type 7 LSA flooding. Basically, I am trying to find the weakest link, whether there is any correlation between bandwidth usage or spikes and our routing problems, why that would be, and what I can do about it.

Thanks in advance

6 Replies

Evan Roggenkamp
Level 1

Bump

Hi,

 

When you say utilization is around 8 Gbps, is that a 30-second or a 5-minute average utilization?

Since you are seeing a lot of output drops, it is very likely that the traffic rate at times goes above 10 Gbps. You could check the utilization graph for this interface at a 1 ms average to figure out the rate of the traffic bursts on this interface.
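To illustrate why the averaging interval matters, here is a minimal sketch with made-up numbers. The traffic profile (a steady 7.5 Gbps with a 12 Gbps burst for 10 seconds of every minute) and the helper name `window_averages` are assumptions for illustration only, not measurements from this network. It shows a 5-minute average sitting near 8 Gbps while the offered load is above the 10 Gbps line rate for part of the time.

```python
LINE_RATE_GBPS = 10.0

def window_averages(samples, window):
    """Average consecutive, non-overlapping windows of per-second rates."""
    return [sum(samples[i:i + window]) / window
            for i in range(0, len(samples) - window + 1, window)]

# Hypothetical offered load toward one egress port, in Gbps, one sample
# per second for 300 seconds: 7.5 Gbps steady, with a 12 Gbps burst
# during the first 10 seconds of every minute.
offered = [12.0 if (t % 60) < 10 else 7.5 for t in range(300)]

# What a poller with a 5-minute interval would report.
five_min_avg = window_averages(offered, 300)[0]

# Seconds in which the offered load exceeded what the port can send.
seconds_over_line_rate = sum(1 for r in offered if r > LINE_RATE_GBPS)

print(f"5-minute average: {five_min_avg:.2f} Gbps")
print(f"seconds above line rate: {seconds_over_line_rate} of {len(offered)}")
```

In this toy profile the 5-minute figure looks comfortable, yet for 50 of the 300 seconds the port is asked to send more than it can, which is when buffers fill and output drops accumulate.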

 

Hope this helps and grade my post if it is useful.

 

Regards,

Madhu.

What I believe Madhu is telling you is that your high utilization, and the resulting drops, might be impacting routing protocol keepalives. If so, a QoS policy that protects the routing protocol packets might correct the issue.
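As a rough illustration of that idea, the sketch below compares a single tail-drop FIFO with a two-queue strict-priority scheme on a congested port. It is a toy model only: the function name `control_drops`, the packet counts, and the queue depth are all assumptions, and it does not represent how the 7600 line card queues actually work or how a QoS policy would be configured on them.

```python
from collections import deque

def control_drops(arrivals, queue_limit, drain_per_tick, prioritize_control):
    """Count routing-protocol ('ctrl') packets dropped on a congested port.

    arrivals: one list per tick, holding 'data' / 'ctrl' in arrival order.
    Without prioritization, everything shares one tail-drop FIFO.
    With prioritization, 'ctrl' gets its own queue that is served first.
    """
    data_q, ctrl_q = deque(), deque()
    dropped_ctrl = 0
    for tick in arrivals:
        for pkt in tick:
            if prioritize_control and pkt == "ctrl":
                ctrl_q.append(pkt)        # protected queue
            elif len(data_q) < queue_limit:
                data_q.append(pkt)        # shared FIFO has room
            elif pkt == "ctrl":
                dropped_ctrl += 1         # tail drop hits a hello
            # dropped data packets are not counted in this sketch
        budget = drain_per_tick
        while budget and ctrl_q:          # strict priority: control first
            ctrl_q.popleft()
            budget -= 1
        while budget and data_q:
            data_q.popleft()
            budget -= 1
    return dropped_ctrl

# Hypothetical load: each tick offers 12 data packets and 1 hello,
# but the port can only send 10 packets per tick.
arrivals = [["data"] * 12 + ["ctrl"] for _ in range(10)]

fifo_drops = control_drops(arrivals, queue_limit=20, drain_per_tick=10,
                           prioritize_control=False)
prio_drops = control_drops(arrivals, queue_limit=20, drain_per_tick=10,
                           prioritize_control=True)
print("hellos dropped, shared FIFO:", fifo_drops)
print("hellos dropped, priority queue:", prio_drops)
```

Once the shared queue fills, every hello that arrives behind the data burst is tail-dropped, and enough consecutive losses would expire an adjacency. Giving control traffic its own queue that is served first keeps the hellos flowing even while data is still being dropped.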

Thanks for the input. It is still good for me to know that I can analyze these problems and speak truthfully about their cause. I would have to tune some kind of monitoring tool to check this closely, since we poll on a 5-minute interval, but this gives some indication of the traffic pattern:

Evan Roggenkamp
Level 1

Thank you to those who replied. I will look into QoS to maintain routing adjacency if we run into this again.

Edit - sorry, forget that.

Just noticed you posted the fabric utilisation figures and they suggest it is not being oversubscribed.

Jon
