
Output queue drops on Cisco 3020

p.caforio
Level 1

Dear All,

 

In one of my c-Class enclosures I have two Cisco 3020 switches.

On one of them I found output queue drops (the OQD column in the show interfaces summary output below) that increase every day, most probably during the night.

Drops occur on both uplinks and access ports (backplane and external).

 

  Interface                   IHQ       IQD       OHQ       OQD      RXBS      RXPS      TXBS      TXPS      TRTL
-----------------------------------------------------------------------------------------------------------------
* Vlan1                         1         0         0         0         0         0         0         0         0
  FastEthernet0                 0         0         0         0         0         0         0         0         0
* GigabitEthernet0/1            0         0         0       304    592000       258   5791000       580         0
* GigabitEthernet0/2            0         0         0         0    684000       114    236000        93         0
* GigabitEthernet0/3            0         0         0         0         0         0     11000         5         0
* GigabitEthernet0/4            0         0         0         0   1689000      1561   3150000      1639         0
* GigabitEthernet0/5            0         0         0         0    240000       189     11000         5         0
* GigabitEthernet0/6            0         0         0         0    827000       200     11000         5         0
* GigabitEthernet0/7            0         0         0         0   6372000       614    423000       354         0
* GigabitEthernet0/8            0         0         0         0    653000       173    492000       167         0
* GigabitEthernet0/9            0         0         0       139   2626000       284   1113000       241         0
* GigabitEthernet0/10           0         0         0         0  18381000      4211   8860000      4479         0
* GigabitEthernet0/11           0         0         0         0         0         0     11000         5         0
* GigabitEthernet0/12           0         0         0         0    683000       648   1388000       665         0
* GigabitEthernet0/13           0         0         0       135    639000        63    378000        66         0
* GigabitEthernet0/14           0         0         0      1393     18000        32    842000        69         0
* GigabitEthernet0/15           0         0         0         0    109000        84     11000         5         0
* GigabitEthernet0/16           0         0         0       665   1682000       362   2197000       393         0
* GigabitEthernet0/17           0         0         0       131   6318000      5103   1878000       425         0
* GigabitEthernet0/18           0         0         0      9476   8897000      2251  23711000      7011         0
* GigabitEthernet0/19           0         0         0     22052         0         0     11000         5         0
* GigabitEthernet0/20           0         0         0     17833         0         0     11000         5         0
* GigabitEthernet0/21           0         0         0         0         0         0     11000         5         0
* GigabitEthernet0/22           0         0         0         0         0         0     11000         5         0
  GigabitEthernet0/23           0         0         0         0         0         0         0         0         0
  GigabitEthernet0/24           0         0         0         0         0         0         0         0         0
* Port-channel1                 0         0         0      9607  15234000      7375  25604000      7452         0

 

What is strange is that I immediately directed my attention to the load but didn't find any excessive traffic (see attachment).

 

Port-channel 1 aggregates GigabitEthernet0/17 and 0/18.

The major drops are on interfaces 0/19 and 0/20, where a tape library is connected. If you look at the traffic, it is very low on these interfaces.

 

Cisco IOS Software, CBS30X0 Software (CBS30X0-IPBASE-M), Version 12.2(58)SE1, RELEASE SOFTWARE (fc1)

 

Any suggestions?

 

thanks,

Paolo

 

6 Replies

Leo Laohoo
Hall of Fame

1.  Unless you need to use IOS version 12.2(58)SE, I'd recommend staying away from it.  Use 12.2(55)SE8 or 12.2(55)SE9 instead.

 

2.  Post the output of the command "sh controller e <INTERFACE>" for the interfaces having problems.

p.caforio
Level 1

First, take a look at the OQD increase from last night:

 

  Interface                   IHQ       IQD       OHQ       OQD      RXBS      RXPS      TXBS      TXPS      TRTL
-----------------------------------------------------------------------------------------------------------------
* GigabitEthernet0/20           0         0         0     43116         0         0     12000         6         0

 

 

Then the output of sh controller e GigabitEthernet0/20:

 

     Transmit GigabitEthernet0/20             Receive
     80360099 Bytes                        263504487 Bytes
      6979753 Unicast frames                  636088 Unicast frames
     89232015 Multicast frames                     0 Multicast frames
     54095035 Broadcast frames                   111 Broadcast frames
            0 Too old frames               263459307 Unicast bytes
            0 Deferred frames                      0 Multicast bytes
            0 MTU exceeded frames              12412 Broadcast bytes
            0 1 collision frames                   0 Alignment errors
            0 2 collision frames                   0 FCS errors
            0 3 collision frames                   0 Oversize frames
            0 4 collision frames                   0 Undersize frames
            0 5 collision frames                   0 Collision fragments
            0 6 collision frames
            0 7 collision frames               28590 Minimum size frames
            0 8 collision frames              361047 65 to 127 byte frames
            0 9 collision frames               18804 128 to 255 byte frames
            0 10 collision frames              44184 256 to 511 byte frames
            0 11 collision frames              75306 512 to 1023 byte frames
            0 12 collision frames             108268 1024 to 1518 byte frames
            0 13 collision frames                  0 Overrun frames
            0 14 collision frames                  0 Pause frames
            0 15 collision frames
            0 Excessive collisions                 0 Symbol error frames
            0 Late collisions                      0 Invalid frames, too large
            0 VLAN discard frames                  4 Valid frames, too large
            0 Excess defer frames                  0 Invalid frames, too small
     42969093 64 byte frames                       0 Valid frames, too small
     15216507 127 byte frames
     16185256 255 byte frames                      0 Too old frames
      5427240 511 byte frames                      4 Valid oversize frames
      1434554 1023 byte frames                     0 System FCS error frames
     69074153 1518 byte frames                     0 RxPortFifoFull drop frame
            0 Too large frames
            0 Good (1 coll) frames
            0 Good (>1 coll) frames

 

Same thing for interfaces 0/1, 0/13, 0/14 and 0/16, and for uplinks 0/17 and 0/18.

 

The difference I see between access ports and uplinks is that on the uplinks I have:

 

        7 Symbol error frames
901196266 Valid frames, too large

The "Valid frames, too large" counter increases continuously.

 

thanks,

Paolo

Leo Laohoo
Hall of Fame

Ok, let me give you some pointers to this very, very useful output.

 

  1. Not everyone knows how to reset the counters for the command "sh controller e <port>" (see the sketch just after this list);
  2. Left-hand column: everything from "Too old frames" down to "Excess defer frames" must be "0";
  3. Right-hand column, first section: "Alignment errors" down to "Collision fragments" must be "0";
  4. Next section: "Overrun frames" and "Pause frames" are BEST kept at "0". You can have numbers here, but make sure they're not incrementing;
  5. Last section: "Symbol error frames" must be "0". The next four lines (too-large or too-small frame counters) are irrelevant; they increment when you have dot1Q trunking, so don't be alarmed. FCS, further down, should always be "0".
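On point 1: I don't have a CBS3020 in front of me, but on the 3750 family it is based on, the counters are reset like this (substitute your problem port):

Switch# clear controllers ethernet-controller GigabitEthernet0/20
Switch# show controllers ethernet-controller GigabitEthernet0/20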

 

Ok, your line card is dropping packets due to the known "issue" of micro-bursts.  This is to be expected.
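If you want to see which egress queue is actually taking the drops, the 3750 family exposes per-queue enqueue/drop counters; the output below assumes QoS statistics are available on your image:

Switch# show mls qos interface GigabitEthernet0/20 statistics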

p.caforio
Level 1

Thanks for your hints!

 

Ok, it's a microburst problem, but I have to confess that I currently can't determine:

- What the cause is

- Whether it is a problem or not (but I believe it is)

- How to fix it...

 

I found that a server receives frames destined for the MAC address of another machine.

This should happen only if the switch receives a frame for a MAC address that is unknown in its table; it then floods the frame out of every port in the VLAN until it learns the destination MAC on a specific port (unknown-unicast flooding).

The traffic is huge and causes the drops.

The problem happens about 10 times a week, not continuously and not at the same time.

It is CIFS traffic between a Citrix Presentation Server and an HP NAS.

The source server is directly connected to the 3020 and the HP NAS is on another switch. Both switches are connected to a 3750 stack through a port-channel.
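If it really is unknown-unicast flooding, the destination MAC should be missing from the 3020's table while the burst is running. Something like this should confirm it (the MAC address below is only a placeholder for the real one):

Switch# show mac address-table address 0011.2233.4455
Switch# show mac address-table aging-time

One mitigation I've seen suggested, assuming the MAC entry ages out before the corresponding ARP entry does, is to raise the MAC aging time to match the IOS default ARP timeout of four hours:

Switch(config)# mac address-table aging-time 14400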

 

Do you have any suggestions for this problem?

I have no ideas other than Wireshark and session monitoring.

thanks,

Paolo

Leo Laohoo
Hall of Fame

Server microbursts can be "rectified" in the short term by properly configuring QoS.

I say "properly" because QoS behaves differently on different switch models. I would recommend you open a TAC case so that TAC can craft a QoS policy for your requirements (a rough illustration of the knobs involved follows below). You will need a few days for the QoS to be tuned correctly.

p.caforio
Level 1

I have many updates on this matter.

I mirrored interface 0/20 to port 0/21, where a currently unused Windows server is installed, and I found the cause of the drops.
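For reference, the mirror is just a local SPAN session along these lines:

Switch(config)# monitor session 1 source interface GigabitEthernet0/20 both
Switch(config)# monitor session 1 destination interface GigabitEthernet0/21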

 

Traffic bursts! Less than one second of burst, during which the load reaches 100% (1 Gbps).

The same on all interfaces where there are drops.

 

Now the strange thing: using Wireshark, I found the source and destination of this traffic.

Consider 10.101.1.160 to be the server I'm mirroring.

The source of this burst is ALWAYS a server (10.101.1.183) in the same enclosure.

The destination is ALWAYS a server (10.101.1.6) in another enclosure, connected in the same way to the core switch.

It is CIFS traffic.

 

Check attachments.

Mirroring another interface gives the same result: a burst from 10.101.1.183 to 10.101.1.6.

 

How is it possible that a switch forwards unicast traffic to more than one port?! It looks like the unknown-unicast flooding I described above. Is the switch malfunctioning?!

 

Paolo
