8477 Views, 20 Helpful, 6 Replies

Output Error on N9k 10G interface

Arya81
Level 1

Hi to all,

I'm seeing the output error counter incrementing on a 10G interface:

 

Ethernet1/1 is up
admin state is up, Dedicated Interface
  Hardware: 100/1000/10000/25000 Ethernet, address: 2c4f.525c.ba07 (bia 2c4f.525c.ba08)
  Description:xxxxxxxx
  Internet Address is xxxxxxxxxxxxxxx
  MTU 9150 bytes, BW 10000000 Kbit, DLY 10 usec
  reliability 255/255, txload 13/255, rxload 7/255
  Encapsulation ARPA, medium is broadcast
  full-duplex, 10 Gb/s, media type is 10G
  Beacon is turned off
  Auto-Negotiation is turned on  FEC mode is Auto
  Input flow-control is off, output flow-control is off
  Auto-mdix is turned off
  Rate mode is dedicated
  Switchport monitor is off
  EtherType is 0x8100
  EEE (efficient-ethernet) : n/a
  Last link flapped 29week(s) 1day(s)
  Last clearing of "show interface" counters never
  17 interface resets
  Load-Interval #1: 30 seconds
    30 seconds input rate 301003400 bits/sec, 55200 packets/sec
    30 seconds output rate 532395640 bits/sec, 53879 packets/sec
    input rate 301.00 Mbps, 55.20 Kpps; output rate 532.40 Mbps, 53.88 Kpps
  Load-Interval #2: 5 minute (300 seconds)
    300 seconds input rate 297893384 bits/sec, 56022 packets/sec
    300 seconds output rate 424997712 bits/sec, 53892 packets/sec
    input rate 297.89 Mbps, 56.02 Kpps; output rate 425.00 Mbps, 53.89 Kpps
  Load-Interval #3: 5 seconds
    5 seconds input rate 367465200 bits/sec, 62072 packets/sec
    5 seconds output rate 632914320 bits/sec, 65310 packets/sec
    input rate 367.46 Mbps, 62.07 Kpps; output rate 632.91 Mbps, 65.31 Kpps
  RX
    1834246582655 unicast packets  283418861 multicast packets  45 broadcast packets
    1834530417310 input packets  1626688888225625 bytes
    659091065800 jumbo packets  0 storm suppression bytes
    0 runts  0 giants  35 CRC  0 no buffer
    35 input error  0 short frame  0 overrun   0 underrun  0 ignored
    0 watchdog  0 bad etype drop  0 bad proto drop  0 if down drop
    0 input with dribble  0 input discard
    0 Rx pause
  TX
    2051751641869 unicast packets  7338181449 multicast packets  41 broadcast packets
    2059093817881 output packets  2210037874377512 bytes
    1057043746858 jumbo packets
    3466622 output error  0 collision  0 deferred  0 late collision
    0 lost carrier  0 no carrier  0 babble  302895 output discard
    0 Tx pause

 

sh ver

Cisco Nexus Operating System (NX-OS) Software

Software
  BIOS: version 07.65
  NXOS: version 7.0(3)I7(6)
  BIOS compile time:  09/04/2018
  NXOS image file is: bootflash:///nxos.7.0.3.I7.6.bin
  NXOS compile time:  3/5/2019 13:00:00 [03/05/2019 23:04:55]


Hardware
  cisco Nexus9000 93180YC-EX chassis
  Intel(R) Xeon(R) CPU  @ 1.80GHz with 24633600 kB of memory.
  Processor Board ID FDO23190AP9

 

What might be the cause of this behaviour?

 

Many thanks

 

 

6 Replies

Christopher Hart
Cisco Employee

Hello!

Nexus 9000 series switches use cut-through switching by default. This means that if a malformed frame enters the switch, the switch cannot validate the FCS field of the Ethernet frame with a CRC check before portions of the frame have already been forwarded out of an egress interface. As a result, cut-through switches generally increment two counters when they receive a malformed frame:

  1. The input errors and/or CRC error counter on the ingress interface
  2. The output errors counter on the egress interface

In your scenario, you've provided output from Ethernet1/1 showing a non-zero (and presumably incrementing) output errors counter. Next, we need to check whether any other interfaces have non-zero input error counters. In NX-OS, the easiest way to identify these interfaces is the show interface counters errors non-zero command - can you provide its output?
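
(Side note: once you have that output captured to a text file, a quick script can pull out just the interfaces with receive-side errors. Below is a minimal Python sketch; the errors.txt filename and the assumption that the output keeps the standard tabular format are purely for illustration.)

# Minimal sketch: list interfaces with non-zero FCS/receive error counters from
# a saved "show interface counters errors non-zero" capture.
# The errors.txt filename is an assumption for illustration.
import re

COLUMNS = ["Align-Err", "FCS-Err", "Xmit-Err", "Rcv-Err", "UnderSize", "OutDiscards"]

def parse_error_table(text):
    """Return {port: {counter: value}} from rows of the first error table."""
    counters = {}
    for line in text.splitlines():
        # Matches rows like: "Eth1/10    0    1196    0    1196    0    0"
        m = re.match(r"^(Eth\S+|Po\d+)\s+((?:\d+\s+){5}\d+)\s*$", line)
        if m:
            values = [int(v) for v in m.group(2).split()]
            counters[m.group(1)] = dict(zip(COLUMNS, values))
    return counters

if __name__ == "__main__":
    with open("errors.txt") as f:   # assumed capture file
        table = parse_error_table(f.read())
    for port, vals in table.items():
        if vals["FCS-Err"] or vals["Rcv-Err"]:
            print(f"{port}: FCS-Err={vals['FCS-Err']} Rcv-Err={vals['Rcv-Err']}")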

As a side note, a detailed explanation of how Nexus 9000 series switches with the Cloud Scale ASIC (which is what the Nexus 93180YC-EX uses) perform cut-through switching and react to CRC errors can be found in the Nexus 9000 Cloud Scale ASIC CRC Identification & Tracing Procedure document.

I hope this helps - thank you!

-Christopher

Hi Christopher,

here is the output of the "show" command you suggested:

 

show interface counters errors non-zero

--------------------------------------------------------------------------------
Port          Align-Err    FCS-Err   Xmit-Err    Rcv-Err  UnderSize OutDiscards
--------------------------------------------------------------------------------
Eth1/1                0         35    3466730         35          0      302895
Eth1/10               0       1196          0       1196          0           0
Eth1/14               0          1          0          1          0           0
Eth1/47               0          9          0          9          0           0
Eth1/48               0         23          0         23          0           0
Eth1/53               0        258          0        258          0           0
Eth1/54               0      66817          0      66817          0           0
Po1                   0         32          0         32          0           0
Po501                 0          1          0          1          0           0

--------------------------------------------------------------------------------
Port         Single-Col  Multi-Col   Late-Col  Exces-Col  Carri-Sen       Runts
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Port          Giants SQETest-Err Deferred-Tx IntMacTx-Er IntMacRx-Er Symbol-Err
--------------------------------------------------------------------------------
Eth1/1             0          --           0     3466730           0          0

 

Many thanks for your feedback!

Hello!

Based upon this output, it appears that we have non-zero CRC error counters on a handful of interfaces:

  • Ethernet1/1
  • Ethernet1/10
  • Ethernet1/14
  • Ethernet1/47
  • Ethernet1/48
  • Ethernet1/53
  • Ethernet1/54

The total number of CRC errors does not add up to the total number of output errors on Ethernet1/1 - however, it's possible that counters were cleared on some interfaces in the recent past. To confirm that these CRC errors are incrementing and directly correlated with the output errors on Ethernet1/1, we need to observe non-zero counters multiple times within a short period of time.

To do this, can you provide the output of the command sequence below?

terminal width 511 ; terminal length 0 ; show interface counters errors non-zero ; sleep 30 ; show interface counters errors non-zero ; sleep 30 ; show interface counters errors non-zero ; sleep 30 ; show interface counters errors non-zero ; sleep 30 ; show interface counters errors non-zero
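
(Optionally, the same repeated check can be scripted: the rough Python sketch below polls the switch a few times, 30 seconds apart, and prints only the counters that incremented between polls. It assumes the netmiko library is available, the hostname and credentials are placeholders, and it only does a simple parse of the first error table - treat it as an illustration rather than a polished tool.)

# Rough sketch: poll "show interface counters errors non-zero" several times,
# 30 seconds apart, and print only the counters that incremented between polls.
# Assumes the netmiko library is installed; host/username/password are placeholders.
import re
import time
from netmiko import ConnectHandler

CMD = "show interface counters errors non-zero"
COLUMNS = ["Align-Err", "FCS-Err", "Xmit-Err", "Rcv-Err", "UnderSize", "OutDiscards"]

def parse(output):
    counters = {}
    for line in output.splitlines():
        m = re.match(r"^(Eth\S+|Po\d+)\s+((?:\d+\s+){5}\d+)\s*$", line)
        if m:
            counters[m.group(1)] = dict(zip(COLUMNS, map(int, m.group(2).split())))
    return counters

conn = ConnectHandler(device_type="cisco_nxos", host="switch1",   # placeholders
                      username="admin", password="password")
previous = parse(conn.send_command(CMD))
for _ in range(4):
    time.sleep(30)
    current = parse(conn.send_command(CMD))
    for port, vals in current.items():
        deltas = {c: vals[c] - previous.get(port, {}).get(c, 0) for c in COLUMNS}
        deltas = {c: d for c, d in deltas.items() if d > 0}
        if deltas:
            print(port, deltas)
    previous = current
conn.disconnect()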

Thank you!

-Christopher

Hi Christopher,

I hope you are doing well.

My Nexus 93180YC-FX is incrementing output errors on a port-channel and on a 10G interface.
Please see the attached file.

Could you please take a look? Are malformed frames the cause of the problem on my Nexus?

Thank you

 

f00z
Level 1

Eh, this is usually a bad transceiver, or something like using an unsupported DAC length. In my experience, seeing an xmit/internal MAC Tx error without any other related errors usually points to that. The switch sees the error at the transceiver, but the frame is already out of the switch, so it can't drop it at that point. The other side of the link should be incrementing FCS or CRC errors, if you have access to check it.

 

f00z
Level 1

Oh, I forgot to mention: if the switch is in cut-through switching mode, input errors from other interfaces whose traffic egresses out the erroring interface will increment this counter. To debug this, change the forwarding mode to store-and-forward and see if the counter still increments. Doing this will also give you more insight into which interfaces are taking CRC errors on ingress. It might not be the ideal solution depending on your traffic environment, but it's a way to diagnose. I've noticed the counters are not always correct in cut-through mode, possibly due to software bugs. Maybe there is a way to see real framing errors even in cut-through mode with the newer telemetry code, but I haven't seen one.

The issue is that the CRC/FCS is at the end of the Ethernet frame, while the header is at the beginning, and cut-through forwarding only looks at the header. The egress port then has to recalculate the FCS before it transmits so it can append it to the end of the frame; this is where it notices something is wrong and increments the counter you are seeing. There are mechanisms to 'stomp' the CRC, where the switch purposely mangles the FCS on egress so the next switch in the path drops the frame. The downside of cut-through is that your NMS will report errors on every interface along the path until the frame is dropped, and it's hard to track where they originate (that's why the stomp mechanism was put in place).
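
To make that concrete, here is a tiny Python sketch showing that the FCS is just a CRC-32 computed over the rest of the frame and appended at the end, and that a "stomp" can be as simple as writing a deliberately wrong value in its place. The exact stomp pattern is ASIC-specific; inverting the good FCS below is only an illustrative convention, and the sample frame bytes are made up.

# Tiny sketch: the Ethernet FCS is a CRC-32 over the frame contents, appended
# as the last 4 bytes. A "stomped" FCS is simply a deliberately wrong value so
# the next hop's CRC check fails and the frame is dropped; inverting the good
# FCS here is only an illustrative convention, not a specific ASIC's behaviour.
import zlib

def add_fcs(frame: bytes) -> bytes:
    fcs = zlib.crc32(frame) & 0xFFFFFFFF
    return frame + fcs.to_bytes(4, "little")

def fcs_ok(frame_with_fcs: bytes) -> bool:
    frame, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return zlib.crc32(frame) & 0xFFFFFFFF == int.from_bytes(fcs, "little")

def stomp_fcs(frame_with_fcs: bytes) -> bytes:
    """Corrupt the trailing FCS so a downstream switch discards the frame."""
    frame, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    bad = (~int.from_bytes(fcs, "little")) & 0xFFFFFFFF
    return frame + bad.to_bytes(4, "little")

# Made-up frame: broadcast destination, source MAC from the post, IPv4 ethertype.
frame = bytes.fromhex("ffffffffffff2c4f525cba070800") + b"payload-bytes-" * 4
good = add_fcs(frame)
print(fcs_ok(good))             # True  - receiver accepts the frame
print(fcs_ok(stomp_fcs(good)))  # False - receiver counts a CRC error and drops it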

 

So, TL;DR: 99% of the time it's either a transceiver error, or corrupt frames being passed through from other ports because of cut-through switching.

 
