08-13-2015 04:21 AM - edited 03-08-2019 01:21 AM
We are currently building our new network, and after a successful off-site test we are now seeing behavior that we cannot explain.
Situation :
We have 2 Cisco Nexus 56128P switches (running 7.1(1)N1(1)) as our core layer. Via the 10Gb interfaces on these switches we connect our WS-C2960X-48FPD-L access switches (running 15.0(2a)EX5) using SFP-10GBase-LR modules (both ends use the same type of SFP module).
All uplinks on the core (Nexus) side are configured as a vPC, and each access switch has 2 uplinks configured as a port channel (1 uplink to Core A and the other to Core B). The port channels have a basic trunk configuration:
Access switches :
interface TenGigabitEthernet1/0/1
 description Uplink to LEICSW001 eth 1/1
 switchport trunk native vlan 999
 switchport mode trunk
 channel-group 10 mode active
end
Core switches :
interface Ethernet1/1
  description Uplink to Leiasw001 ten1/0/1
  switchport mode trunk
  switchport trunk native vlan 999
  channel-group 201 mode active
This all works perfectly, BUT:
As soon as we connect our new core switches to our current network (the buildings are next to each other and we have one 10Gb fiber connection to each new core switch; the other end is 2 Cisco Nexus 3064 switches, and both sides are configured as a vPC), all the 10Gb uplinks on the C2960X-48FPD-L access switches start blinking amber/green and show errors on the 10Gb interfaces:
sh int ten 1/0/1 counters errors
Port         Align-Err     FCS-Err    Xmit-Err     Rcv-Err   UnderSize OutDiscards
Te1/0/1              0           0           0      222756           0           0
Port        Single-Col   Multi-Col    Late-Col  Excess-Col   Carri-Sen       Runts      Giants
Te1/0/1              0           0           0           0           0           0      222756
So we get Rcv-Err and Giants.
There are no errors on the new Nexus core switches.
Setting the system MTU jumbo size stopped the Giants counter from increasing but resulted in FCS-Err instead, so it did not do the trick.
So we know that this is probably caused by large layer 2 frames that originate from the other building, but those components will also be connected to our new network in the new situation.
So the million-dollar question is: how can we prevent the errors on the uplink ports of our access switches? We are puzzled.
Blocking all VLANs does the trick, but that is not a workable solution.
08-13-2015 04:48 AM
Post the complete output of the following commands:
1. Nexus: sh interface Eth 1/1;
2. 2960X: sh interface Ten 1/0/1; and
3. 2960X: sh controll e Ten 1/0/1
08-13-2015 05:17 AM
here you go :
sh interface Eth 1/1
Ethernet1/1 is up
 Dedicated Interface
  Belongs to Po201
  Hardware: 1000/10000 Ethernet, address: 8c60.4f77.fa68 (bia 8c60.4f77.fa68)
  Description: Uplink to Leiasw001 ten1/0/1
  MTU 1500 bytes, BW 10000000 Kbit,, BW 10000000 Kbit, DLY 10 usec
  reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, medium is broadcast
  Port mode is trunk
  full-duplex, 10 Gb/s, media type is 10G
  Beacon is turned off
  Input flow-control is off, output flow-control is off
  Rate mode is dedicated
  Switchport monitor is off
  EtherType is 0x8100
  Last link flapped 1d01h
  Last clearing of "show interface" counters never
  4 interface resets
  30 seconds input rate 864 bits/sec, 1 packets/sec
  30 seconds output rate 153368 bits/sec, 100 packets/sec
  Load-Interval #2: 5 minute (300 seconds)
    input rate 1.05 Kbps, 1 pps; output rate 149.67 Kbps, 91 pps
  RX
    464335 unicast packets  102773 multicast packets  7 broadcast packets
    568122 input packets  58907096 bytes
    0 jumbo packets  0 storm suppression bytes
    0 runts  0 giants  0 CRC  0 no buffer
    0 input error  0 short frame  0 overrun  0 underrun  0 ignored
    0 watchdog  0 bad etype drop  0 bad proto drop  0 if down drop
    0 input with dribble  0 input discard
    0 Rx pause
  TX
    1012196 unicast packets  23715512 multicast packets  36133483 broadcast packets
    62727212 output packets  10366138471 bytes
    1897427 jumbo packets
    0 output error  0 collision  0 deferred  0 late collision
    0 lost carrier  0 no carrier  0 babble  0 output discard
    0 Tx pause

sh interface Ten 1/0/1
TenGigabitEthernet1/0/1 is up, line protocol is up (connected)
  Hardware is Ten Gigabit Ethernet, address is 40a6.e850.6933 (bia 40a6.e850.6933)
  Description: Uplink to LEICSW001 eth 1/1
  MTU 1500 bytes, BW 10000000 Kbit/sec, DLY 10 usec,
     reliability 250/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive not set
  Full-duplex, 10Gb/s, link type is auto, media type is SFP-10GBase-LR
  input flow-control is off, output flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:00, output 00:00:01, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 145000 bits/sec, 103 packets/sec
  5 minute output rate 1000 bits/sec, 1 packets/sec
     7228539 packets input, 1373671330 bytes, 0 no buffer
     Received 6957959 broadcasts (3075110 multicasts)
     0 runts, 232512 giants, 0 throttles
     232512 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 3075110 multicast, 0 pause input
     0 input packets with dribble condition detected
     192496 packets output, 16463802 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     3000 unknown protocol drops
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 pause output
     0 output buffer failures, 0 output buffers swapped out

sh controll e Ten 1/0/1
     Transmit TenGigabitEthernet1/0/1          Receive
     16469789 Bytes                        1373868924 Bytes
       141288 Unicast frames                   270612 Unicast frames
        51248 Multicast frames                3075481 Multicast frames
            2 Broadcast frames                3883410 Broadcast frames
            0 Too old frames                164766628 Unicast bytes
            0 Deferred frames               324082746 Multicast bytes
            0 MTU exceeded frames           500789070 Broadcast bytes
            0 1 collision frames                    0 Alignment errors
            0 2 collision frames                    0 FCS errors
            0 3 collision frames               232548 Oversize frames
            0 4 collision frames                    0 Undersize frames
            0 5 collision frames                    0 Collision fragments
            0 6 collision frames
            0 7 collision frames               113685 Minimum size frames
            0 8 collision frames              6028384 65 to 127 byte frames
            0 9 collision frames               456604 128 to 255 byte frames
            0 10 collision frames              518866 256 to 511 byte frames
            0 11 collision frames               10706 512 to 1023 byte frames
            0 12 collision frames               87202 1024 to 1518 byte frames
            0 13 collision frames                   0 Overrun frames
            0 14 collision frames                   0 Pause frames
            0 15 collision frames
            0 Excessive collisions                  0 Symbol error frames
            0 Late collisions                  232548 Invalid frames, too large
            0 VLAN discard frames               14056 Valid frames, too large
            0 Excess defer frames                   0 Invalid frames, too small
        12566 64 byte frames                        0 Valid frames, too small
       167527 127 byte frames
         6289 255 byte frames                       0 Too old frames
         5211 511 byte frames                       0 Valid oversize frames
          941 1023 byte frames                      0 System FCS error frames
            4 1518 byte frames                      0 RxPortFifoFull drop frame
            0 Too large frames
            0 Good (1 coll) frames
            0 Good (>1 coll) frames
08-13-2015 05:54 AM
reliability 250/255, txload 1/255, rxload 1/255
Layer 1 issue. Check the SFP+ on the 2960X or the fibre optic cable.
08-13-2015 06:05 AM
I am sure it's not a layer 1 issue: as soon as I remove the uplink to my "old" network, or even just block all VLANs on the uplink trunk to the old network, everything is fine.
So we might have an issue on this switch on this uplink, but as I mentioned in my initial post we have redundant links to each access switch (ten 1/0/1 and ten 5/0/1).
And this is not just 1 switch: I see the behavior on ALL 19 access switches, and they are all fine instantly when the uplink to the old network is removed.
Here's the output of another switch, same behavior, no reliability issues:
LEIASW002#sh interface po10
Port-channel10 is up, line protocol is up (connected)
  Hardware is EtherChannel, address is 8890.8d00.29b3 (bia 8890.8d00.29b3)
  Description: Uplink to LEICSW001 po202
  MTU 1500 bytes, BW 20000000 Kbit/sec, DLY 10 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 10Gb/s, link type is auto, media type is unknown
  input flow-control is off, output flow-control is unsupported
  Members in this channel: Te1/0/1 Te4/0/1
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:00, output 02:17:56, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 2662000 bits/sec, 887 packets/sec
  5 minute output rate 51823000 bits/sec, 4462 packets/sec
     208377051 packets input, 127810547727 bytes, 0 no buffer
     Received 54526170 broadcasts (20294194 multicasts)
     0 runts, 984049 giants, 0 throttles
     984049 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 20294194 multicast, 0 pause input
     0 input packets with dribble condition detected
     716719368 packets output, 989628384074 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     0 unknown protocol drops
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 pause output
     0 output buffer failures, 0 output buffers swapped out
LEIASW002#sh interface 1/0/1
                       ^
% Invalid input detected at '^' marker.
LEIASW002#sh interface ten 1/0/1
TenGigabitEthernet1/0/1 is up, line protocol is up (connected)
  Hardware is Ten Gigabit Ethernet, address is 8890.8d00.29b3 (bia 8890.8d00.29b3)
  Description: Uplink to LEICSW001 eth 1/2
  MTU 1500 bytes, BW 10000000 Kbit/sec, DLY 10 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive not set
  Full-duplex, 10Gb/s, link type is auto, media type is SFP-10GBase-LR
  input flow-control is off, output flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:00:06, output 00:00:00, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 354000 bits/sec, 331 packets/sec
  5 minute output rate 23150000 bits/sec, 2030 packets/sec
     83754497 packets input, 55074043419 bytes, 0 no buffer
     Received 14875622 broadcasts (6756409 multicasts)
     0 runts, 492159 giants, 0 throttles
     492159 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 6756409 multicast, 0 pause input
     0 input packets with dribble condition detected
     380112456 packets output, 502514199787 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     6114 unknown protocol drops
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 pause output
     0 output buffer failures, 0 output buffers swapped out
LEIASW002#sh interface ten 5/0/1
                           ^
% Invalid input detected at '^' marker.
LEIASW002#sh interface ten 2/0/1
TenGigabitEthernet2/0/1 is down, line protocol is down (notconnect)
  Hardware is Ten Gigabit Ethernet, address is 8890.8df7.c633 (bia 8890.8df7.c633)
  MTU 1500 bytes, BW 10000000 Kbit/sec, DLY 10 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive not set
  Full-duplex, 10Gb/s, link type is auto, media type is Not Present
  input flow-control is off, output flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input never, output never, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     0 packets input, 0 bytes, 0 no buffer
     Received 0 broadcasts (0 multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 0 multicast, 0 pause input
     0 input packets with dribble condition detected
     0 packets output, 0 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     0 unknown protocol drops
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 pause output
     0 output buffer failures, 0 output buffers swapped out
LEIASW002#
08-13-2015 06:14 AM
984049 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
492159 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
I beg to differ. If there was no issue there would be "0" input errors.
However, you could also potentially be correct since you're using 2960X. And the IOS for the 2960X is one of the buggiest I've ever seen to come out of Cisco.
08-13-2015 06:38 AM
The last output is from a different access switch than the first output I posted, and it has a full 255 reliability. Besides that, it's a new building: all fibers are brand new and tested OK just a few weeks ago.
And how can a layer 1 issue be solved by shutting down the uplink interface to my old network (which is 3 switches further away)?
From my perspective, these facts combined rule out a layer 1 issue.
I do agree the input errors are an issue, but that is the exact question I asked in my initial post: what is causing these errors?
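One way to test this reasoning on a single access switch is to clear the counters and watch which ones move while the link to the old network is up. The commands below are only a sketch, assuming the interface names used in this thread; exact output differs per IOS release. If Giants/Oversize keep climbing while CRC/FCS stay at 0, that would point at frame size (MTU) rather than at the fiber or SFP:

```
! 2960X access switch: reset the counters, then re-check after a minute or so
clear counters tenGigabitEthernet 1/0/1
show interfaces tenGigabitEthernet 1/0/1 counters errors
! only the oversize-related lines of the controller statistics
show controllers ethernet-controller tenGigabitEthernet 1/0/1 | include Oversize|too large
! the MTU values the switch is currently running with
show system mtu
```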
08-13-2015 08:24 AM
Problem is solved. The root cause was that jumbo frames were not implemented correctly on the Cisco Nexus 56128P core switches.
After adding :
policy-map type network-qos jumbo
  class type network-qos class-default
    mtu 9216
system qos
  service-policy type network-qos jumbo
to the config of both core switches, the Rcv errors on the 2960X access switches were solved, but the Giants were still increasing.
Adding this line to the config of the 2960X access switches solved the Giants issue:
system mtu jumbo 9198
So the issue was caused by jumbo frames not being implemented correctly on all affected switches.
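For anyone who wants to verify the same change, a few show commands should confirm that the MTU settings are actually active. This is a sketch using the interface names from this thread, not output captured from these switches. On the Nexus 5000 series the queuing output normally reports the hardware MTU per interface, and on the 2960 family a system MTU change generally only takes effect after a reload, so it is worth checking the running values after rebooting:

```
! Nexus 56128P (NX-OS): is the network-qos policy applied, and what MTU does the port use?
show policy-map system type network-qos
show queuing interface ethernet 1/1

! 2960X (IOS): which system MTU values are in effect, and are the counters still moving?
show system mtu
show interfaces tenGigabitEthernet 1/0/1 counters errors
```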