One of my customers faced a really odd situation a few days ago, and I'd like to ask if anyone has seen something similar.
They have a Vblock, meaning VMware on top of UCS with redundant FIs, Nexus and MDS switches, and an EMC VMAX array.
They reported they were unable to access the VMs on the ESXi hosts. Communication between VMs on the same host was working fine, but anything that had to leave the host didn't work, so they suspected something was wrong with the FIs. Communication to and from the VMs was restored only after they failed over from FI-B to FI-A (the customer executed the "cluster lead" command). We have been told by several support engineers that this command should not affect data plane traffic, only the management plane. This could be just a coincidence, but the customer believes there was something wrong with FI-B.
Support didn't find any evidence of failure in the FIs, the Nexus switches, or even the virtualization layer, so I just want to ask the community: has anyone seen the "cluster lead" command have an effect on FI traffic flow?
Any input will be appreciated.
Generally, the only times I have seen cluster lead changes impact dataplane:
FI mgmt lead changes simply change which FI owns the VIP, runs the active web services providing the UCSM GUI, etc.
If mgmt plane changes impact the data plane, then there was already something unhealthy with the data plane services and mgmt.
Yes, I experienced an issue where I performed a lead change and the VLAN Groups and VLANs became unassociated from the port channels. I was running UCS Central 2.0(1c) and UCS Manager 3.2.2(b). We found out that a bug in UCS Central caused it. We upgraded to UCS Central 2.0(1h) and no longer have the issue.