
Nexus switch issues with Intel 10GbE cards and bonding/teaming?

lars.hecking
Level 1

Hi, is anyone aware of any quirks or special configurations required when connecting Intel 10GbE NICs to Nexus switches? We've run into a number of problems. Have I missed anything obvious? Details below.

Setup no. 1: a Catalyst 4900M, two Dell R510s with dual-port Intel 82599EB (8086:151c), and two Dell R720xds with 2x Intel X540-AT2 (8086:1528). The servers are running CentOS 6.4 and the 10GbE interfaces are bonded in mode 4 (802.3ad/LACP).
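For reference, the bonds are set up in the usual CentOS 6 initscripts way; the snippet below is only an illustrative sketch (bond0, eth2 and eth3 are placeholder names, not the exact files from these hosts):

  # /etc/sysconfig/network-scripts/ifcfg-bond0
  DEVICE=bond0
  ONBOOT=yes
  BOOTPROTO=none
  # mode 4 = 802.3ad/LACP; miimon polls link state every 100 ms
  BONDING_OPTS="mode=4 miimon=100"

  # /etc/sysconfig/network-scripts/ifcfg-eth2 (likewise for eth3)
  DEVICE=eth2
  ONBOOT=yes
  BOOTPROTO=none
  MASTER=bond0
  SLAVE=yes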

Problem no. 1: The R720xd boxes occasionally lose network connection. One of the interfaces in the bond goes down for 4 seconds, then recovers. This happens at random. Sometimes, the other interface in the bond goes down as well, before the first one has recovered, and that's when the machine loses network connection. The R510 boxes, however, do not exhibit this behaviour.
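The flaps can be tracked on the server side with the standard bonding status and the kernel log; bond0 and the ixgbe driver name below are just the obvious candidates for this hardware, not output from these machines:

  # per-slave MII status, LACP partner details and link failure counts
  cat /proc/net/bonding/bond0

  # bonding/ixgbe driver messages around the time of a flap
  dmesg | grep -iE 'bond0|ixgbe'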

Setup no. 2: The Catalyst 4900M has been replaced with a Nexus 3064T switch.

Problem no. 2: The R510s and R720xds have swapped roles. About a week after the switch was replaced, the R510 NICs started flapping at an alarming rate, going down several times every minute.

Workaround: I have unretired the 4900 and moved the R510 machines back to it. Since then, no interface flapping has been observed. That is, the R510s are now on the 4900 and the R720xds on the 3064.

 

Interface/teaming config on the 4900:

interface Port-channel1
 switchport
 switchport trunk allowed vlan 1,32
 switchport mode trunk
!
interface TenGigabitEthernet2/3
 switchport trunk allowed vlan 1,32
 switchport mode trunk
 channel-group 1 mode active
 spanning-tree bpduguard enable
!
interface TenGigabitEthernet3/5
 switchport trunk allowed vlan 1,32
 switchport mode trunk
 channel-group 1 mode active
 spanning-tree bpduguard enable
!

The corresponding config on the 3064:

interface port-channel1
  switchport mode trunk
  switchport trunk allowed vlan 1,32
  no negotiate auto
!
interface Ethernet1/1
  switchport mode trunk
  switchport trunk allowed vlan 1,32
  spanning-tree port type edge
  spanning-tree bpduguard enable
  channel-group 1 mode active
!
interface Ethernet1/17
  switchport mode trunk
  switchport trunk allowed vlan 1,32
  spanning-tree port type edge
  spanning-tree bpduguard enable
  channel-group 1 mode active
!

6 Replies

richbarb
Cisco Employee

Hi Lars,

 

Did you update the firmware of all your hardware components (NIC/BIOS)?

 

Yes, all Dell machines had their firmware and BIOS updated to the latest available through OMSA. I also applied Intel's preboot updates, admittedly not the latest version (v19 vs. 19.3).

I now have hard evidence that the problem has nothing to do with teaming as such. We have other machines here with the Intel 82599EB card (ProLiant 360p g8), with only a single port in use, and they show exactly the same problem on the 3064. As another data point, a further group of (custom-built) servers with Intel X540-AT2 NICs does not show the problem. This exactly mirrors the behaviour of the Dells in the original post.

Hi Lars,

 

Please check this out:

https://supportforums.cisco.com/discussion/12291366/intel-corporation-82599eb-10-gigabit-receive-missed-errors

I don't think this applies. We observe the problem across driver versions, ranging from 3.9.15-k (the latest shipped with CentOS 6.4) through 3.17.3 to 3.18.7. The only correlation I have right now is the combination of NIC and switch.

82599EB + 4900 is good.

X540-AT2 + 3064 is good.

Other combinations are not.

Thanks.

 

I'm currently thinking along the lines of an "interesting"/buggy driver combined with behavioural differences between the two switch models. Does anyone know whether the RX/TX flow control settings on the switches must, or should, match the NIC settings? I found that the 3064 defaults to RX/TX flow control off on all interfaces, whereas the 4900 enables it on connected ports with link up (all NICs have it enabled, but I don't know whether that is how the 4900 decides to enable it). Nothing related to flow control has been explicitly configured.
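In case anyone wants to compare their own setup, the two ends can be checked with the following (the interface name is just an example):

  # server side: current pause (flow control) settings of the NIC
  ethtool -a eth2

  # 3064 side (NX-OS): per-port receive/send flow control state
  show interface flowcontrol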

 

While I still don't understand the cause of this problem, or the specifics of flow control negotiation, some experimentation shows that I can work around it either by configuring all ports on the 3064 that link to 82599EB cards with

  flowcontrol receive on
  flowcontrol send on

or by configuring those interfaces on the server side to turn the pause options (autoneg/rx/tx) off with ethtool.
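That is, something along these lines, with eth2 as a placeholder name (to make it persistent it needs to go into the ifcfg scripts or a local ifup hook):

  # disable pause autonegotiation and RX/TX pause frames on the NIC
  ethtool -A eth2 autoneg off rx off tx off

  # verify the result
  ethtool -a eth2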

 
