
Nexus 1000v Failover Issue

Hi all,

We are in the process of deploying Nexus 1000v on a VMware vSphere 5.1 cluster. vSphere is installed on four Cisco UCS blades, and UCS is configured in end-host mode. The upstream switches are two standalone Cisco 4500s, not configured as a VSS cluster.

Two port-channels from each Fabric Interconnect go to each 4500 switch.

I have used VLAN 82, VLAN 83, and VLAN 8 as the Nexus 1000v packet, control, and management VLANs respectively. All of these VLANs are defined on the Cisco Fabric Interconnects and the upstream switches.
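For reference, this is roughly how those VLANs are defined on the VSM (the VLAN names are my own labels, not taken from the attached configs):

```
vlan 8
  name n1k-management
vlan 82
  name n1k-packet
vlan 83
  name n1k-control
```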

I have created a separate vmkernel interface on each ESXi server in the control VLAN, and changed the SVS mode on the Nexus to use the control0 interface. The Nexus 1000v VSMs (primary and secondary) and vCenter sit on the standard DVS, whereas the vmkernel interfaces from each ESXi server are connected to the control VLAN on the Nexus DVS.
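The vethernet port-profile carrying those vmkernel interfaces looks roughly like the following (the profile name is illustrative; `capability l3control` marks the port-group that carries VSM-to-VEM control traffic in L3 mode):

```
port-profile type vethernet L3-CONTROL
  capability l3control
  vmware port-group
  switchport mode access
  switchport access vlan 83
  system vlan 83
  no shutdown
  state enabled
```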

Each ESXi server has two uplinks from the Nexus 1000v, one to each Fabric Interconnect. The system works fine when the UPLINKS Ethernet port-profile does not contain any channel-group configuration; I have tested VSM failover, vMotion, etc. in this state and everything works.

But as per Cisco best practices, the UPLINKS port-profile should be configured with a channel group.

So when I use either of the following port-channel configurations:

1. channel-group auto mode on sub-group cdp

2. channel-group auto mode on mac-pinning
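For clarity, option 2 applied to the uplink profile would look roughly like this (the allowed-VLAN list is illustrative; the `system vlan` line is what lets control and management traffic forward before the VEM is fully programmed):

```
port-profile type ethernet UPLINKS
  vmware port-group
  switchport mode trunk
  switchport trunk allowed vlan 8,82,83
  channel-group auto mode on mac-pinning
  no shutdown
  system vlan 8,83
  state enabled
```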

On failover to the secondary VSM (when the primary VSM is powered off), the VEM modules from each ESXi server get removed with the error "%VEM_MGR-2-VEM_MGR_REMOVE_NO_HB: Removing VEM 3 (heartbeats lost)".

When the primary boots back up, it gets stuck during boot. I had to remove the port-channel configuration and re-add the ESXi servers to the Nexus DVS to get the setup working again; the primary needed to be rebooted as well.
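For anyone trying to reproduce this, the module and channel state after the failover can be checked with the standard commands:

```
! On the VSM:
show module                  ! which VEM slots are present/absent
show svs connections         ! connection state to vCenter
show port-channel summary    ! channel state per uplink

! On each ESXi host:
vem status                   ! is the VEM loaded and are ports attached
vemcmd show card             ! domain id and control mode as seen by the VEM
```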

Any help would be appreciated.

Configurations, Diagram & Errors are attached.
