UCS FI to MDS FC port channel failover causes I/O Pauses

CloudySky · 12-19-2012

I am conducting testing on our Fibre Channel infrastructure before we bring some new systems online. During the FC testing, I noticed some odd behavior that I don't think should be happening. Configuration:

2x UCS 6248 FIs, 2.0(4a)

2x MDS9148 5.2(2a)

B200 M3 blades with mLOM

Each FI has a 4x 8Gb port channel in trunk mode to its respective MDS switch, for a 32Gb port channel per fabric. Our storage array is of course connected to the MDS9148 switches. From both a physical UCS blade and an ESXi server, I have IOmeter running, slamming the array with traffic. The array is active/active (3PAR), and each LUN has four concurrently active paths. Round robin I/O is configured in VMware and Windows, and all storage ports show equally balanced traffic.
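For reference, the nominal port-channel math can be sketched in a few lines of Python. The function name `port_channel_capacity_gbps` is made up for illustration, and the numbers are nominal link speeds from the configuration above, not measured throughput.

```python
# Nominal port-channel capacity, using the link count and speed listed above.
# The helper name is hypothetical; values are nominal, not measured.
LINKS_PER_PORT_CHANNEL = 4
LINK_SPEED_GBPS = 8


def port_channel_capacity_gbps(active_links: int,
                               link_speed_gbps: int = LINK_SPEED_GBPS) -> int:
    """Return the nominal aggregate speed for the given number of active links."""
    if active_links < 0:
        raise ValueError("active_links must be non-negative")
    return active_links * link_speed_gbps


if __name__ == "__main__":
    full = port_channel_capacity_gbps(LINKS_PER_PORT_CHANNEL)
    degraded = port_channel_capacity_gbps(LINKS_PER_PORT_CHANNEL - 1)
    print(f"all links up: {full} Gb, one link down: {degraded} Gb")
```

So even with one member link pulled, three quarters of the nominal capacity on that fabric is still there.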

The problem presents itself when I physically disconnect a *single cable* in the port channel on just one fabric. At this point I would expect the FIs to detect the lost link and re-route traffic over the three remaining port-channel links in less than a second. But what I see when monitoring the storage ports is a drop to practically 0 KB/s for 10-60 seconds for one of the test hosts (it happens to either ESXi or Windows), across both fabrics. Neither VMware nor Windows logs any path failures, since no paths are down: three port-channel links are still up and I/Os can reach the array. If I look at storage performance in vCenter, it shows a large disturbance in I/Os when the cable is pulled.
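To put a number on the pause rather than eyeballing the graphs, something like the following could be run against exported throughput samples. This is a minimal sketch, assuming the samples are available as (seconds, KB/s) pairs; the `find_stalls` helper, the 100 KB/s threshold, and the sample data are all hypothetical, not output from the actual test.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, throughput in KB/s)


def find_stalls(samples: List[Sample],
                threshold_kbps: float = 100.0,
                min_duration_s: float = 1.0) -> List[Tuple[float, float]]:
    """Return (start, end) windows where throughput stayed below the threshold.

    A window starts at the first low sample and ends at the first sample
    back at or above the threshold (or at the last low sample if the data
    ends while throughput is still low).
    """
    stalls = []
    start = None
    last_low = None
    for t, kbps in samples:
        if kbps < threshold_kbps:
            if start is None:
                start = t
            last_low = t
        elif start is not None:
            if t - start >= min_duration_s:
                stalls.append((start, t))
            start = None
    # Data ended while throughput was still below the threshold.
    if start is not None and last_low - start >= min_duration_s:
        stalls.append((start, last_low))
    return stalls


if __name__ == "__main__":
    # Made-up samples: cable pulled around t=2, traffic back at t=5.
    demo = [(0, 50000), (1, 48000), (2, 10), (3, 0), (4, 5), (5, 47000), (6, 49000)]
    for begin, end in find_stalls(demo):
        print(f"stall from t={begin}s to t={end}s ({end - begin}s)")
```

Running it per host and per storage port would show whether the pause lines up across both fabrics at the same timestamps.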

Now if the host had to fail over because of an entirely failed fabric, then I would expect the host MPIO software to take <30 seconds to reconfigure around the failure. But pulling one of four port channel links is transparent to the hosts and storage array, so I can't understand the big drop in I/Os.

Something seems misconfigured to me... ideas?

CloudySky · 12-27-2012

Bump..any ideas? Normal? Not normal?

kg6itcraig · 12-29-2012

Are you running FCoE? Are you running a vSAN for each vHBA?

I like to make a vSAN for each FI to its storage. That way each vHBA can have at least 2 paths per FI in the same vSAN. UCS and MDS will handle load balancing across multiple links per FI.

I wrote something on my UCS blog that might help (even though it is for Brocade setups).

http://realworlducs.com/cisco-ucs-fc-uplinks-brocade-fabric/

Craig

My UCS Blog http://realworlducs.com

CloudySky · 01-21-2013

We are not running FCoE upstream from the FIs; those are traditional Cisco MDS FC switches. We have two vSANs, one for each FI. Each server is provisioned with two vHBAs, one for each vSAN.