11-30-2010 02:11 PM
I was wondering if anybody could help me with this one.
I'm staging a new virtual infrastructure architecture for my company, hosted on VMware ESX 4.1 with the Nexus 1000V (N1KV) 4.0(4)SV1(3b) on top.
While testing the final design, I noticed that a VMware FT-protected VM hosted under the N1K loses connectivity for approximately 8 seconds (actually losing TCP state!) upon an FT failover event, which defeats the whole purpose of VMware FT.
However, when I run the very same operation with the VM hosted on the standard vSwitch on the same server, the failover event is completely transparent, i.e., no state loss.
Are there any known compatibility issues between VMware FT and the Nexus 1000V?
12-22-2010 03:33 AM
Roger,
This is currently a known issue, and Cisco and VMware are actively working towards a fix. The problem is that the 1000v cannot bring up the FT VM's network interface until it receives a detach notification from the primary.
I will update this thread as soon as there is a permanent fix.
The only workaround for this is to use a "system vlan" on the FT VM's vEth interface.
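As a rough sketch of what that workaround looks like, assuming an access-mode vEthernet port-profile (the profile name FT-VM-PROFILE and VLAN 100 are placeholders, not taken from any real setup), the system vlan line is added for the same VLAN the profile carries:

```
! Hypothetical names: FT-VM-PROFILE and VLAN 100 are placeholders.
port-profile type vethernet FT-VM-PROFILE
  vmware port-group
  switchport mode access
  switchport access vlan 100
  no shutdown
  ! Workaround: mark the access VLAN as a system VLAN
  system vlan 100
  state enabled
```

Keep in mind that a system VLAN is normally meant for traffic that must keep forwarding even when the VEM cannot reach the VSM, so applying it to VM data ports is a trade-off rather than a fix.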
Regards,
Robert
12-08-2011 08:40 AM
Dear Robert,
It has been more than a year now. Any news on a permanent fix, perhaps?
Rgrds,
Roger
12-08-2011 01:17 PM
Roger,
The bug we opened is still open, because VMware and Cisco need to address this jointly.
I did find out the following:
When using the FT failover test button, there is a delay of up to 9 seconds. This comes from a 9-second delay in the FT test code before it brings down the primary VM.
When simulating an actual ESX host outage for the primary VM (pulling the power cord on the ESX host where the primary VM runs), there is a delay of up to 6 seconds, because the VSM relies on heartbeats to detect that the VEM is down and only then removes the attach for the primary VM.
So it does appear that a "test" FT failover event does not reproduce the behavior of an actual host failure.
I've bumped my Dev team again to see if there have been any improvements to this since 1.3b.
The related bug is CSCtl04574. So far your TAC SR is the only case linked to this bug. I'll let you know what Development comes back with by end of the week.
Regards,
Robert
12-09-2011 12:41 AM
Yes, I find it very odd, to say the least, that we are apparently the only customer facing this problem. At the time I thought we were perhaps one of the early adopters running VMware FT in production, but by now, who isn't?
The workaround we have had in place for more than a year (using system vlan on non-system-vlan port groups that require FT functionality) is getting a little silly, as basic switchport functionality such as shutdown on a vEth interface is not possible because of it:
N1KV01(config)# int veth59
N1KV01(config-if)# sh
ERROR: Cannot set port admin status to 'shutdown' for interface inheriting a system port-profile
We are currently running version 4.2.1.SV1.4.
12-12-2011 05:21 AM
Roger,
In your instance, do you have any other features enabled in your FT profile (QoS, ACLs, etc.)?
We may have a solution, but those features may not be available with it. We're also still working on a permanent fix.
Regards,
Robert
12-12-2011 05:38 AM
Hi Robert,
Thank you for chasing this, much appreciated!
None of our VM-facing port-profiles have any feature-specific configuration, e.g.:
port-profile type vethernet DATABASE-TIER
  vmware port-group
  switchport mode access
  switchport access vlan 831
  no shutdown
  system vlan 831
  max-ports 32
  state enabled
12-12-2011 05:40 AM
Thanks Roger.
This is hot on my radar - let me circle back with the dev team with this info and see what we can do here.
I WILL be back!
Regards,
Robert