ESX VMs Failover not working with dynamic pinning

lorenzobexer
Level 1

Hello,

We have a problem with virtual machines running on ESXi 4.1 on UCS 1.4(i) blades.

Configuration:

2 Fabric Interconnects

2 VNICs per Service Profile for VM traffic, NO pinning groups

2 Uplink ports per FI, one to each Core Switch


When we run failover tests by disabling uplink ports on the FI, we lose connectivity to the VMs that are being repinned.

We are unable to ping the VMs from outside until we send a ping from the VM itself to the outside network, which causes an ARP refresh.

Shouldn't UCS initiate the ARP refresh when it repins the uplink, or does this not work with virtual machines running under a hypervisor? Or do we need to connect UCS Manager to our vCenter?
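
For reference, the kind of "ARP refresh" frame in question is a gratuitous ARP: a broadcast ARP whose sender and target IP are the same, so switches along the path relearn which port the source MAC lives on. Below is a minimal sketch of how such a frame is laid out. It only builds the bytes and does not send anything; the function name, the MAC address and the IP address are made up for illustration and are not taken from our setup.

```python
import socket
import struct


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Return a 42-byte Ethernet frame carrying a gratuitous ARP request.

    Hypothetical helper for illustration only; it does not transmit the frame.
    """
    sender_mac = bytes.fromhex(mac.replace(":", ""))
    sender_ip = socket.inet_aton(ip)
    broadcast = b"\xff" * 6

    # Ethernet header: destination, source, EtherType 0x0806 (ARP)
    ethernet = broadcast + sender_mac + struct.pack("!H", 0x0806)

    # ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # address lengths 6 and 4, opcode 1 (request)
    arp_header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)

    # Gratuitous ARP: sender IP and target IP are the same address
    arp_body = sender_mac + sender_ip + b"\x00" * 6 + sender_ip

    return ethernet + arp_header + arp_body


# Example values only (assumed, not from the real environment)
frame = build_gratuitous_arp("00:50:56:aa:bb:cc", "192.0.2.10")
print(len(frame), frame.hex())
```

Any frame sourced from the VM's MAC would have the same relearning effect on the upstream switches, which is presumably why the ping from inside the VM restores connectivity.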

Any hints would be helpful, thanks in advance.

Regards.

2 Replies

Robert Burns
Cisco Employee

Lorenz,

[Updated]

Have you checked the "Enable Failover" option on the vNICs in the ESX host's service profile? With failover enabled, only the host vNIC is configured as a hot standby on the alternate fabric. Since the host NIC's own MAC is never used for sourcing traffic (each VM and VMkernel interface has its own virtual MAC address), traffic originating from outside may be dropped until traffic is originated from the inside outward. This behavior was seen in versions of UCS code prior to 1.4.

To fix this problem, a MAC-SYNC function was introduced in the 1.4.1 code of UCSM. Each Fabric Interconnect now notifies the other of its MAC table. Only one FI will have the active MAC, but keeping a list of active MACs on both FIs allows for fast failover. If an upstream failure is detected, the MAC is made active on the other FI without needing to be learned from the VM itself. The FI also now sends a GARP to the upstream switches to let them know about the change.

There's a great video by Brad Hedlund here illustrating Inter-FI & Intra-FI failover. See videos 2 & 3:
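
To make the MAC-SYNC idea concrete, here is a toy model of the behavior described above. It is only a sketch of the logic, not how UCSM implements it: the class, the function names, the `mac_sync` flag and the MAC addresses are all invented for illustration. A MAC that was replicated to the peer FI becomes active there on failover and gets a GARP; a MAC that was never replicated stays unreachable until the VM sends traffic itself.

```python
from dataclasses import dataclass, field


@dataclass
class FabricInterconnect:
    name: str
    active: set = field(default_factory=set)   # MACs this FI currently forwards for
    standby: set = field(default_factory=set)  # MACs replicated from the peer FI
    garps: list = field(default_factory=list)  # MACs announced upstream after failover


def learn(fi, peer, mac, mac_sync=True):
    """FI learns a MAC; with MAC-SYNC the peer keeps a standby copy."""
    fi.active.add(mac)
    if mac_sync:
        peer.standby.add(mac)


def uplink_failure(failed, survivor):
    """Move replicated MACs to the survivor and return the stranded ones."""
    known = failed.active & survivor.standby
    for mac in sorted(known):
        survivor.active.add(mac)      # made active without relearning from the VM
        survivor.garps.append(mac)    # GARP sent to the upstream switches
    survivor.standby -= known
    stranded = failed.active - known  # would need traffic from the VM to recover
    failed.active.clear()
    return stranded


fi_a, fi_b = FabricInterconnect("A"), FabricInterconnect("B")
learn(fi_a, fi_b, "00:50:56:aa:bb:01")                  # replicated to FI B
learn(fi_a, fi_b, "00:50:56:aa:bb:02", mac_sync=False)  # pre-1.4 style, not replicated
stranded = uplink_failure(fi_a, fi_b)
print("GARPs sent by B:", fi_b.garps)
print("Still unreachable:", stranded)
```

The second MAC models the pre-1.4 symptom: nothing upstream learns its new location until the VM itself transmits.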

http://bradhedlund.com/2010/06/22/cisco-ucs-networking-best-practices/

1. Can you describe the vNIC configuration of your vSphere Service Profile within UCS:

    Connect to the CLI of UCSM and capture

  show configuration

    Please attach the config output and identify the service profile name of your affected host.

3. From the NXOS context, provide the output of:

  show platform fwm info replmac 

   We're looking to ensure CFS is enabled and the handshake is complete.


Also, rather than disabling the upstream ports, can you instead "shut" the ports from the upstream device or even "unplug" them? This would simulate a real-world failover scenario. I'll have to test the behavior with the ports disabled as in your tests to confirm the expected behavior.

Regards,

Robert

Robert,

Is it correct that a vNIC template is not applied to a service profile if the vNICs under that service profile have "Unbind from a Template" grayed out (see attachment)? 

If a vNIC template is only created but not applied AND fabric failover isn't selected when the vNIC template is created, does that mean dynamic pinning of vNICs is still occurring? Furthermore, would the vNICs dynamically move to Fabric Interconnect B's uplinks if A's uplinks failed?