01-30-2017 07:04 AM - edited 03-01-2019 01:03 PM
Hi,
We are running Cisco UCS (5108 chassis, IOM-2208XP, with M3/M4 blades connected to an FI 6296UP).
We want to go from 2 links to 4 links per IOM.
Once we add the 2 additional cables, we will re-ack one IOM at a time to make everything aware of the extra links (we don't want to re-ack the whole chassis at once). Doing it this way should prevent any downtime from the blades' perspective, assuming a re-ack is enough to see the 2 extra links without rebooting the IOM.
The question is: since the VM traffic might be running through IOM-A or IOM-B, what will the "ping timeout" be from a virtual machine's perspective when we re-acknowledge an IOM?
In other words, how long does it take before the VM traffic fails over from IOM-A to IOM-B?
Many thanks!
01-30-2017 12:14 PM
Hello,
Typically, when a failover event occurs, we drop a few pings while the gratuitous ARP informs the upstream switch of the MAC move. I have seen instances where a VM drops only a single ping before the failover to the other fabric completes. It is fairly instantaneous.
http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/sw/gui/config/guide/2-0/b_UCSM_GUI_Configuration_Guide_2_0/b_UCSM_GUI_Configuration_Guide_2_0_chapter_0100.html
In addition, a cluster configuration actively enhances failover recovery time for redundant virtual interface (VIF) connections. When an adapter has an active VIF connection to one fabric interconnect and a standby VIF connection to the second, the learned MAC addresses of the active VIF are replicated but not installed on the second fabric interconnect. If the active VIF fails, the second fabric interconnect installs the replicated MAC addresses and broadcasts them to the network through gratuitous ARP messages, shortening the switchover time.
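If you want to put a number on the gap during your own re-ack, one option is to log timestamped ping results from a VM and compute the longest stretch without a reply. The following is only a minimal sketch, not a Cisco or VMware tool: the function name `longest_outage` and the `(timestamp, reply_received)` sample format are assumptions made for illustration, and you would have to collect the samples yourself (for example from a continuous ping with timestamps).

```python
from typing import Iterable, Optional, Tuple


def longest_outage(samples: Iterable[Tuple[float, bool]]) -> float:
    """Return the longest outage in seconds seen in a ping log.

    Each sample is (timestamp_in_seconds, reply_received). An outage is
    measured from the last successful reply before a loss to the next
    successful reply, so it is an upper bound on the real gap. Losses
    before the first reply or after the last reply are not counted,
    because they have no bounding reply on one side.
    """
    longest = 0.0
    last_ok: Optional[float] = None  # timestamp of the most recent reply
    in_outage = False                # True once a loss follows a reply

    for timestamp, reply_received in samples:
        if reply_received:
            if in_outage and last_ok is not None:
                longest = max(longest, timestamp - last_ok)
            last_ok = timestamp
            in_outage = False
        else:
            in_outage = True
    return longest


if __name__ == "__main__":
    # Hypothetical log: one ping per second, two consecutive losses.
    log = [(0.0, True), (1.0, True), (2.0, False),
           (3.0, False), (4.0, True), (5.0, True)]
    print(longest_outage(log))  # prints 3.0
```

With a 1-second ping interval the result can only be resolved to about a second, so a shorter interval gives a tighter estimate of the actual switchover time.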
HTH,
Wes
01-31-2017 01:42 AM
Hi Wes,
Thank you for the quick answer!
The failover you describe is the case where UCS takes care of failover, correct? (vNIC hardware failover)
A colleague of mine pointed out that in our case, this should be handled by VMware instead of UCS.
Since we have 2 active adapters in VMware processing the VM traffic (over the A and B sides), my guess now is that we mostly depend on VMware's failover capability when we re-ack an IOM (which is hopefully as fast as the UCS failover).
01-31-2017 04:44 AM
You are correct. If you are using failover in the host operating system, then the OS is responsible for moving traffic to the surviving path in the event of a failure. In my experience, it is just as fast as fabric failover.
As a best practice, do not enable both fabric failover (the vNIC failover you discussed above) and OS failover at the same time, as it can actually cause issues with data traffic when software and hardware are both attempting to do the same thing.
https://pubs.vmware.com/vsphere-65/index.jsp?topic=%2Fcom.vmware.vsphere.networking.doc%2FGUID-D34B1ADD-B8A7-43CD-AA7E-2832A0F7EE76.html
-Wes
02-01-2017 11:00 PM
Thanks, Wes!