08-09-2016 08:47 AM
Hello all,
I am seeing a large number of "Link failure" events on our Cisco N7706 chassis with 2K FEX modules. We have the following hardware:
Hardware
cisco Nexus7700 C7706 (6 Slot)
Mod Ports Module-Type Model Status
--- ----- ----------------------------------- ------------------ ----------
1 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
2 48 1/10 Gbps Ethernet Module N77-F348XP-23 ok
3 0 Supervisor Module-2 N77-SUP2E active *
FEX FEX FEX FEX
Number Description State Model
------------------------------------------------------------------------
121 FEX_121 Online N2K-C2232TM-E-10GE
122 FEX_122 Online N2K-C2232TM-E-10GE
123 FEX_123 Online N2K-C2232TM-E-10GE
Software
BIOS: version 3.1.0
kickstart: version 6.2(10)
system: version 6.2(10)
The ESX servers are configured with "Route based on physical NIC load", so our Distributed Switch should choose an uplink based on the current load of the physical NICs.
We have 2 NICs on each ESX server connecting to separate FEX modules: NIC 1 to Core 1, FEX 1, and NIC 2 to Core 2, FEX 2. The two links don't seem to be carrying the same amount of traffic, for either input or output packets.
Core 1
Ethernet111/1/18 is up
admin state is up
Hardware: 1000/10000 Ethernet, address: 2c3e.cf7d.7693 (bia 2c3e.cf7d.7693)
Description: cal-esx6 c1p1-e1c202
MTU 9216 bytes, BW 10000000 Kbit, DLY 10 usec
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, medium is broadcast
Port mode is trunk
full-duplex, 10 Gb/s
Beacon is turned off
Auto-Negotiation is turned on
Input flow-control is off, output flow-control is on
Auto-mdix is turned off
Switchport monitor is off
EtherType is 0x8100
Last link flapped 02:25:13
Last clearing of "show interface" counters 05:53:56
1 interface resets
Load-Interval #1: 30 seconds
30 seconds input rate 21280 bits/sec, 4 packets/sec
30 seconds output rate 383176 bits/sec, 223 packets/sec
input rate 21.28 Kbps, 4 pps; output rate 383.18 Kbps, 223 pps
Load-Interval #2: 5 minute (300 seconds)
300 seconds input rate 330152 bits/sec, 32 packets/sec
300 seconds output rate 486960 bits/sec, 237 packets/sec
input rate 330.15 Kbps, 32 pps; output rate 486.96 Kbps, 237 pps
RX
3783916 unicast packets 1494 multicast packets 4903 broadcast packets
3790313 input packets 2458593711 bytes
1162504 jumbo packets 0 storm suppression packets
0 runts 0 giants 0 CRC/FCS 0 no buffer
0 input error 0 short frame 0 overrun 0 underrun 0 ignored
0 watchdog 0 bad etype drop 0 bad proto drop 0 if down drop
0 input with dribble 0 input discard
0 Rx pause
TX
11111972 unicast packets 3444058 multicast packets 1214589 broadcast packets
15770619 output packets 13591515811 bytes
7692555 jumbo packets
0 output error 0 collision 0 deferred 0 late collision
0 lost carrier 0 no carrier 0 babble 0 output discard
1524 Tx pause
Core 2
Ethernet123/1/18 is up
admin state is up
Hardware: 1000/10000 Ethernet, address: 34db.fdd3.bc93 (bia 34db.fdd3.bc93)
Description: cal-esx6-c1p2-e10c201
MTU 9216 bytes, BW 10000000 Kbit, DLY 10 usec
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, medium is broadcast
Port mode is trunk
full-duplex, 10 Gb/s
Beacon is turned off
Auto-Negotiation is turned on
Input flow-control is off, output flow-control is on
Auto-mdix is turned off
Switchport monitor is off
EtherType is 0x8100
Last link flapped 3week(s) 4day(s)
Last clearing of "show interface" counters 05:56:01
0 interface resets
Load-Interval #1: 30 seconds
30 seconds input rate 4568592 bits/sec, 1552 packets/sec
30 seconds output rate 19436200 bits/sec, 2775 packets/sec
input rate 4.57 Mbps, 1.55 Kpps; output rate 19.44 Mbps, 2.78 Kpps
Load-Interval #2: 5 minute (300 seconds)
300 seconds input rate 1922448 bits/sec, 1249 packets/sec
300 seconds output rate 20060832 bits/sec, 2533 packets/sec
input rate 1.92 Mbps, 1.25 Kpps; output rate 20.06 Mbps, 2.53 Kpps
RX
16055862 unicast packets 3837 multicast packets 7117 broadcast packets
16066816 input packets 5049918886 bytes
1673929 jumbo packets 0 storm suppression packets
0 runts 0 giants 0 CRC/FCS 0 no buffer
0 input error 0 short frame 0 overrun 0 underrun 0 ignored
0 watchdog 0 bad etype drop 0 bad proto drop 0 if down drop
0 input with dribble 0 input discard
0 Rx pause
TX
24561735 unicast packets 3459152 multicast packets 1213680 broadcast packets
29234567 output packets 20079770544 bytes
10328879 jumbo packets
0 output error 0 collision 0 deferred 0 late collision
0 lost carrier 0 no carrier 0 babble 0 output discard
15710 Tx pause
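Because the counters on the two ports were cleared at slightly different times, the raw Tx pause counts are easier to compare as rates. The following is only a rough sketch in Python with the values copied by hand from the two outputs above; the helper names `to_seconds` and `pause_per_hour` are made up for illustration and are not part of any Cisco tooling.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' interval, as printed by NX-OS, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def pause_per_hour(since_clear: str, tx_pause: int) -> float:
    """Average Tx pause frames per hour since the counters were cleared."""
    return tx_pause * 3600 / to_seconds(since_clear)


# Values copied from the two interface outputs above.
core1 = pause_per_hour("05:53:56", 1524)   # Ethernet111/1/18
core2 = pause_per_hour("05:56:01", 15710)  # Ethernet123/1/18

print(f"Core 1: {core1:.0f} Tx pause frames/hour")
print(f"Core 2: {core2:.0f} Tx pause frames/hour")
print(f"Ratio Core 2 / Core 1: {core2 / core1:.1f}")
```

On these numbers the Core 2 port logs about ten times as many Tx pause frames per hour as the Core 1 port. That matches the uneven traffic split, but on its own it does not explain the link failures.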
Cables have been checked, and we are currently checking the NIC OS/error logs. Has anyone had similar issues on their Nexus platforms, and if so, did you manage to resolve the error message?
2016 Aug 9 14:03:02.005 cal-n7k-01-core %ETHPORT-5-IF_TRUNK_DOWN: Interface Ethernet111/1/18, vlan 1,176-178,400-402,405-407,412-413,430,432-433,442,603,605-606,610,900-903,905,908,910,921-923,930-931,950-955 down
2016 Aug 9 14:03:02.006 cal-n7k-01-core %ETHPORT-5-IF_DOWN_LINK_FAILURE: Interface Ethernet111/1/18 is down (Link failure)
2016 Aug 9 14:03:07.845 cal-n7k-01-core %ETHPORT-5-SPEED: Interface Ethernet111/1/18, operational speed changed to 10 Gbps
2016 Aug 9 14:03:07.845 cal-n7k-01-core %ETHPORT-5-IF_DUPLEX: Interface Ethernet111/1/18, operational duplex mode changed to Full
2016 Aug 9 14:03:07.845 cal-n7k-01-core %ETHPORT-5-IF_RX_FLOW_CONTROL: Interface Ethernet111/1/18, operational Receive Flow Control state changed to off
2016 Aug 9 14:03:07.845 cal-n7k-01-core %ETHPORT-5-IF_TX_FLOW_CONTROL: Interface Ethernet111/1/18, operational Transmit Flow Control state changed to on
2016 Aug 9 14:03:08.561 cal-n7k-01-core %ETHPORT-5-IF_TRUNK_UP: Interface Ethernet111/1/18, vlan 1,176-178,400-402,405-407,412-413,430,432-433,442,603,605-606,610,900-903,905,908,910,921-923,930-931,950-955 up
2016 Aug 9 14:03:08.564 cal-n7k-01-core %ETHPORT-3-IF_UP: Interface Ethernet111/1/18 is up in mode trunk
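To see how often and for how long a port drops, the down and up messages can be paired per interface. This is a minimal sketch in Python, assuming the syslog lines keep exactly the format shown above; the function name `flap_durations` and the sample `LOG` string are illustrative only, and only the `IF_DOWN_LINK_FAILURE` and `IF_UP` messages are considered.

```python
import re
from datetime import datetime

# Two of the log lines from above; in practice this would be the full log.
LOG = """\
2016 Aug 9 14:03:02.006 cal-n7k-01-core %ETHPORT-5-IF_DOWN_LINK_FAILURE: Interface Ethernet111/1/18 is down (Link failure)
2016 Aug 9 14:03:08.564 cal-n7k-01-core %ETHPORT-3-IF_UP: Interface Ethernet111/1/18 is up in mode trunk
"""

LINE = re.compile(
    r"^(\d{4} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}\.\d{3}) \S+ "
    r"%ETHPORT-\d-(IF_DOWN_LINK_FAILURE|IF_UP): Interface (\S+) "
)


def flap_durations(log_text):
    """Return (interface, seconds_down) for each link failure that recovered."""
    went_down = {}  # interface -> time of the last unmatched link failure
    flaps = []
    for line in log_text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # ignore speed, duplex, flow-control and trunk messages
        stamp, event, interface = match.groups()
        when = datetime.strptime(stamp, "%Y %b %d %H:%M:%S.%f")
        if event == "IF_DOWN_LINK_FAILURE":
            went_down[interface] = when
        elif interface in went_down:
            seconds = (when - went_down.pop(interface)).total_seconds()
            flaps.append((interface, seconds))
    return flaps


flaps = flap_durations(LOG)
for interface, seconds in flaps:
    print(f"{interface}: down for {seconds:.3f} s")
```

Run against a longer log, the length of the returned list gives the number of flaps per port, which makes it easier to compare against the timestamps in the ESX host logs.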
Thanks and best regards,
James
08-26-2016 01:38 AM
Hi James,
You could try to set the speed and duplex on the host interfaces manually and check if the problem persists. There is a hardware problem with some 2232 FEXs that results in auto-negotiation issues:
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCux14029
Not sure if that is the problem you are facing here, but I think it is worth trying.
If that does not solve the issue you might want to think about configuring the uplink as a vPC.
Hope that helps.
Best regards,
Tim