
C220 M4s standalone / vSphere 6.5a-cisco / N5k VPC == failover non-functional and strange behaviour

dsw_cisco
Level 1

Hi all,

We are facing what seems to be a very weird problem.  We have a TAC case and a separate VMware case open, but so far neither has identified the cause.

The problem

Our system will only come online when one side of the vPC is shut down; this brings the link up.  At this point, bringing the downed side of the vPC back up again keeps the box externally reachable, but we are then no longer able to ping the default gateway from the box itself.  Also, external packets into the box show a ~60% success rate, as if we were occasionally going down a "bad" path, or as if there were some sort of layer-2 asymmetric routing.

vSphere does not fail over correctly.  All links are shown as up, the cables have been tested, as have the SFPs.

The MAC is seen on both N5k switches - the links are never suspended, and the compatibility checks all pass.  The config looks like it should "just work".

The setup

This is a new install, and has never been functional - the vSphere host itself is empty, and not even in vCenter yet.

  • 2x C220 M4S w/ VIC1225 - 2 uplinks plugged directly into the N5ks (one to the A N5k, one to the B N5k) with Cisco-branded SFPs.
  • Standalone mode - no UCSM
  • Firmware 3.0(1c) - latest Cisco "starred" release
  • vSphere 6.5a-Cisco

VIC1225:

 vSwitch0 [ eth0 ( uplink0 ( vmnic0 ) ), eth1 ( uplink1 ( vmnic1 ) ) ] == Management - vmk0 - 192.168.10.10
 vSwitch1 [ eth2 ( uplink0 ( vmnic2 ) ), eth3 ( uplink1 ( vmnic3 ) ) ] == vMotion - vmk1 - x.x.x.x
 vSwitch2 [ eth4 ( uplink0 ( vmnic4 ) ), eth5 ( uplink1 ( vmnic5 ) ) ] == NFS - vmk2 - x.x.x.x
 vSwitch3 [ eth6 ( uplink0 ( vmnic6 ) ), eth7 ( uplink1 ( vmnic7 ) ) ] == VM Portgroup

The VIC defaults are used.  We don't do anything esoteric.  Each VIC is in TRUNK mode, with the untagging being done at the vSwitch level.
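
For example, the untagging for the management portgroup is simply set on the standard vSwitch portgroup, roughly like this (VLAN 111 as per the vSwitch0 output further down; the exact command form may vary slightly by ESXi build):

 esxcli network vswitch standard portgroup set -p "Management Network" -v 111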

N5k (both A and B switches)

interface port-channel200
description ** esxi-test-1 **
switchport mode trunk
switchport trunk allowed vlan 111-115,500,476
vpc 200

As you can see, the port channel/vpc config is bog-standard.
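
For reference, a bog-standard member-port config to go with that would be along these lines (the interface number is just a placeholder, and channel-group mode on is shown because we are not running LACP):

 interface Ethernet1/10
   description ** esxi-test-1 vmnic0 **
   switchport mode trunk
   switchport trunk allowed vlan 111-115,500,476
   channel-group 200 mode on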

vSwitch0

[root@esxi-test-1:~] esxcfg-vswitch -l
Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch0         7802        6           128               1500    vmnic0,vmnic1

  PortGroup Name        VLAN ID  Used Ports  Uplinks
  Management Network    111      1           vmnic0,vmnic1

[root@esxi-test-1:~] esxcli network vswitch standard policy failover get -v vSwitch0
Load Balancing: iphash
Network Failure Detection: link
Notify Switches: false
Failback: false
Active Adapters: vmnic0, vmnic1
Standby Adapters:
Unused Adapters:

Our problem is focused on management access - we haven't even gotten the box into vCenter yet (well, not stable anyway) - so for the scope of this problem, we focus on vSwitch0.

We are using vPC port channels and not LACP.  We have tried every failover policy and every active/active and active/standby combination going...  we've tried it all.

We can reproduce this problem on another C220 M4S with VIC1225, and even on vSphere 5.5.  We therefore think it's either a bug in the VIC firmware, or a problem with our approach / understanding of the architecture.

Any advice appreciated

thanks

9 Replies

Wes Austin
Cisco Employee

Hello,

This sounds like a switching issue based on your description.

Our system will only come online when one side of the vPC is shut down; this brings the link up.  At this point, bringing the downed side of the vPC back up again keeps the box externally reachable, but we are then no longer able to ping the default gateway from the box itself.  Also, external packets into the box show a ~60% success rate, as if we were occasionally going down a "bad" path, or as if there were some sort of layer-2 asymmetric routing.

While this is occurring, you need to understand the path of the packet. You need to work out whether you are losing connectivity while staying in the same L2 domain, or only when traffic has to be routed between VLANs or traverse the upstream network.

If you can isolate two VMs on the same layer 2 domain that stay on the same fabric interconnect and they don't have problems, you will need to focus on your layer 3 gateway and northbound to understand where the packet loss occurs.
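
For example, from the ESXi shell you could compare a ping that stays inside the management VLAN with one that has to cross the L3 gateway (both addresses below are placeholders, not taken from your config):

 vmkping -I vmk0 192.168.10.20    # another host in the same VLAN
 vmkping -I vmk0 192.168.10.1     # the default gateway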

The fact that this is reproducible on a different VIC firmware doesn't really prove that it's a software issue; it actually leads me to believe even more that it's an upstream switching problem. Also, the fact that this is a new install leads me to think there is a misconfiguration on the N5K or wherever the L3 gateway is.

-Wes

Hi Wes,

Thanks for your reply.

You make good points, especially about understanding the path of the packets.  I just wanted to add a couple of points which I maybe did not make clear:

1) This is a standalone C-series.  It is not managed by UCSM.

2) There are no VMs running - we simply want to get a stable mgmt network out of it.  We cannot even get this far.

3) This is reproducible on another C220 M4S - the firmware versions are 4.1(2d).

4) We currently have a large UCS Blade infrastructure (with Gen2 FIs, 6296) running on the same N5k pair, with VPC, and no problems with the main vSphere environment. (failover works, etc..).

The only difference here is that we are using these C-series "direct" to the N5k.  I am wondering if there is some sort of limitation when using VIC interfaces direct to an N5k?  (i.e., it's not VM-FEX - it's just "vanilla" VIC cards carved into multiple interfaces).

Cheers

Thanks.

There is no limitation on the VIC connecting directly to the N5K. This should work without issues.

What teaming policy are you using for your vSwitches? We typically recommend:

  1. Route Based on Originating Port ID
  2. Route Based on Source MAC Hash
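
For example, switching vSwitch0 to the first option and then verifying it could look roughly like this (option names are from recent esxcli builds, so double-check with --help on your version):

 esxcli network vswitch standard policy failover set -v vSwitch0 --load-balancing portid
 esxcli network vswitch standard policy failover get -v vSwitch0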

Do you only see issues when your server has both VIC adapters active? If this is the case, you should definitely investigate the upstream switch configuration to understand what is missing.

Wes

Hi,

We have used all of the failover policies... portid, iphash, mac, explicit... all with the same behaviour.

I will have to go back to our network team... a hard sell though, as this N5k has been serving our UCS blade infrastructure (which runs our main vSphere environment) with VPC for a good 5 years now, without problems.

Can you recommend any debugging guides for this sort of problem?  I do have access to the N5Ks with a lower-priv role (san admin).

Could it also be a problem on the upstream N7K vpc?  On the N5k itself, the vpc never enters a failed state - it's always up/success/success

cheers

I would recommend taking packet captures on the hosts and switches to understand where you are losing traffic.

http://www.cisco.com/c/en/us/support/docs/switches/nexus-7000-series-switches/113038-span-nexus-config.html

http://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus5000/sw/configuration/guide/cli_rel_4_0_1a/CLIConfigurationGuide/Span.pdf

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2051814

To simplify, you could set up a Port-ACL on the N5K and N7K interfaces to count ICMP packets, then send a ping from your hosts to the gateway, etc. As the ICMP requests and replies traverse the network, they will be counted on the Port-ACL. The ACL should have a 'permit ip any any' at the end so it doesn't drop packets, but still counts them. This way you can understand where the traffic is being lost (on the way to the destination or on the way back). If you see 4 ping requests make it out of the host, but only receive 3 ICMP replies, you know there is packet loss occurring in the upstream network.
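
A rough NX-OS sketch of that Port-ACL idea (the ACL name and interface are placeholders; 'statistics per-entry' is what gives you the per-line hit counts):

 ip access-list COUNT-ICMP
   statistics per-entry
   10 permit icmp any any echo
   20 permit icmp any any echo-reply
   30 permit ip any any

 interface Ethernet1/10
   ip port access-group COUNT-ICMP in

 show ip access-lists COUNT-ICMP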

HTH,

Wes

Hi Wes,

We followed some of your suggestions and set up a SPAN to watch the ARPs, etc...
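
The SPAN session was along these lines (interface numbers here are placeholders; on the N5k the destination port has to be configured with switchport monitor):

 interface Ethernet1/20
   switchport monitor

 monitor session 1
   source interface Ethernet1/10 both
   destination interface Ethernet1/20
   no shut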

What we found when we executed a ping from another subnet to the ESXi host:

1) When the A and B side were both up, an ARP request came from ESXi to the N5k on the A side (vmnic0).

2) The reply came from the N5k down the B side (vmnic1).

3) The ARP info was simply ignored, as if it were an unsolicited reply.

vSwitch0 is configured as active(vmnic0, vmnic1), load-balance(iphash), notify(false), failback(true).

I'm proceeding with a box swap now - an HP server with SFPs - and will try the same vPC with it.

cheers

Hi,

I have seen similar behavior where the MGMT PG did not inherit the vSwitch load balancing.

Try checking the MGMT port group configuration and check whether it matches vSwitch0.
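
For example, comparing the portgroup-level override against the vSwitch-level policy on the host, roughly:

 esxcli network vswitch standard portgroup policy failover get -p "Management Network"
 esxcli network vswitch standard policy failover get -v vSwitch0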

Hi Yasser,

That's very interesting - actually, yesterday we put a VM PG on vSwitch0 too (without a vmk, obviously), and failover worked without a problem.

This problem definitely seems to be isolated to MGMT traffic, and perhaps only when a vmk is attached (which, for MGMT, is likely to be every time).

Do you have any additional info about what you have seen?

cheers

For step 3, where you mention the ARP info was ignored, did you see the ARP response make it back to the vmk, or did you have an ESXi-level capture going in parallel?

# pktcap-uw --uplink vmnicX
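
For instance, roughly like this to watch ARP (EtherType 0x0806) on both uplinks and the management vmk in parallel (output paths and filter flags may vary by ESXi build):

 pktcap-uw --uplink vmnic0 --ethtype 0x0806 -o /tmp/vmnic0-arp.pcap &
 pktcap-uw --uplink vmnic1 --ethtype 0x0806 -o /tmp/vmnic1-arp.pcap &
 pktcap-uw --vmk vmk0 --ethtype 0x0806 -o /tmp/vmk0-arp.pcap &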

Thanks

Kirk...
