03-12-2010 09:01 PM
As part of our testing, I published two new uplink port-profiles for one of our vSphere hosts to use. The first carries our VMotion, SC/Mgmt, Control, Packet, and Guest traffic. The second carries only the iSCSI network. The objective was to split them up temporarily to do some iSCSI performance testing.
We removed the first VMNIC from the previously applied uplink port-profile and moved it onto the new one.
We then put the second VMNIC on the new iSCSI port-profile, and the VEM dropped off the face of the earth and never came back. These are the errors I saw, and I am wondering if anyone has any idea what is going on (more detail towards the bottom). Did we do something horribly stupid by breaking the port-channel and putting the uplinks on separate uplink port-profiles?
2010 Mar 12 15:19:31 N1KV-VSM1 %VIM-5-IF_DETACHED: Interface Ethernet3/5 is detached
2010 Mar 12 15:19:31 N1KV-VSM1 %ETH_PORT_CHANNEL-5-PORT_DOWN: port-channel1: Ethernet3/5 is down
2010 Mar 12 15:19:31 N1KV-VSM1 %ETHPORT-5-IF_DOWN_MODULE_REMOVED: Interface Ethernet3/5 is down (module removed)
2010 Mar 12 15:19:32 N1KV-VSM1 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet3/5 is down (Interface removed)
2010 Mar 12 15:19:32 N1KV-VSM1 %VIM-5-IF_DETACHED: Interface Ethernet3/8 is detached
2010 Mar 12 15:19:32 N1KV-VSM1 %ETH_PORT_CHANNEL-5-PORT_DOWN: port-channel1: Ethernet3/8 is down
2010 Mar 12 15:19:32 N1KV-VSM1 %ETH_PORT_CHANNEL-5-FOP_CHANGED: port-channel1: first operational port changed from Ethernet3/8 to none
2010 Mar 12 15:19:32 N1KV-VSM1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface port-channel1 is down (No operational members)
2010 Mar 12 15:19:32 N1KV-VSM1 %ETHPORT-5-IF_DOWN_MODULE_REMOVED: Interface Ethernet3/8 is down (module removed)
2010 Mar 12 15:19:32 N1KV-VSM1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface port-channel1 is down (No operational members)
2010 Mar 12 15:19:33 N1KV-VSM1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface port-channel1 is down (No operational members)
2010 Mar 12 15:19:33 N1KV-VSM1 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet3/8 is down (Interface removed)
2010 Mar 12 15:19:33 N1KV-VSM1 %VIM-5-IF_ATTACHED: Interface Ethernet3/5 is attached to vmnic4 on module 3
2010 Mar 12 15:19:33 N1KV-VSM1 %VIM-5-IF_ATTACHED: Interface Ethernet3/8 is attached to vmnic7 on module 3
2010 Mar 12 15:19:38 N1KV-VSM1 %PLATFORM-2-PFM_VEM_REMOVE_NO_HB: Removing VEM 3 (heartbeats lost)
2010 Mar 12 15:19:38 N1KV-VSM1 %PLATFORM-2-MOD_REMOVE: Module 3 removed (Serial number )
2010 Mar 12 15:19:47 N1KV-VSM1 %ETHPORT-5-IF_SEQ_ERROR: Error (0x6e) while communicating with component MTS_SAP_PORT_CLIENT opcode:MTS_OPC_LC_PORT_CLIENT_CONFIG (for:RID_MODULE: 2)
2010 Mar 12 15:19:47 N1KV-VSM1 %PORTPROFILE-3-PORT_PROFILE_CHANGE_VERIFY_REQ_FAILURE: Process (SAP=175) has returned failure while processing update for port-profile iSCSI-Uplink
2010 Mar 12 15:19:56 N1KV-VSM1 %ETHPORT-5-IF_SEQ_ERROR: Error (0x6e) while communicating with component MTS_SAP_PORT_CLIENT opcode:MTS_OPC_LC_PORT_CLIENT_CONFIG (for:RID_MODULE: 2)
2010 Mar 12 15:19:56 N1KV-VSM1 %PORTPROFILE-3-PORT_PROFILE_CHANGE_VERIFY_REQ_FAILURE: Process (SAP=175) has returned failure while processing update for port-profile SLIVS01-Uplink.
-
N1KV-VSM1# sh svs neighbors
Active Domain ID: 91
AIPC Interface MAC: 0050-56ba-2b03
Inband Interface MAC: 0050-56ba-7f80
Src MAC           Type   Domain-id   Node-id   Last learnt (Sec. ago)
------------------------------------------------------------------------
0002-3d40-5b02    VEM    91          0302      31606.30
0002-3d40-5b03    VEM    91          0402      196707.16
0002-3d40-5b04    VEM    91          0502      196707.16
Here are the actual port-profiles (the physical switch configuration matches):
Note: VLAN91 is VSM management/Service Console. 95 & 96 are control/packet.
port-profile type ethernet iSCSI-Uplink
  vmware port-group
  switchport mode trunk
  switchport trunk allowed vlan 93
  no shutdown
  system vlan 93
  state enabled
port-profile type ethernet VS01-Uplink
  vmware port-group
  switchport mode trunk
  switchport trunk allowed vlan 91-92,94-96
  no shutdown
  system vlan 91-92,95-96
  state enabled
Thanks,
Ryan
03-15-2010 08:59 AM
Resolution to this:
Created vSwitch1 and added vmnic0 to it (we were using vmnic4 and vmnic7 for the 1KV), then changed the Service Console IP to something else. We then jumped into vCenter, since we were locked out on the old IP (the SC was on the 1KV), and moved vmnic4/7 back to the original port-profiles. Rebooted, deleted vSwitch1, rebooted again, and it was fixed.
03-22-2010 09:13 AM
We seem to be able to replicate this at will...
Anyone know if there is an actual bug filed for this, or is this a procedural error on our part? Really all I am trying to do is break the port-channel by using different port-profiles, and use the uplinks for separate purposes to do some testing.
If I ever needed to juggle the uplinks for something (rare, but possible I suppose) outside of doing manual subgroup pinning, I'd cripple one of my hosts attempting this.
03-22-2010 09:29 AM
Hi Ryan,
Your port-profile configuration is missing the port-channeling option.
Are your upstream switches clustered or not?
Anyway, I am not sure which code release you are using, but you should use SV1(2) and then configure the mac-pinning option under the uplink port-profile.
Right now you don't have any kind of HA between your uplinks, so if you remove a port, yes, the chances that the VEM never comes back up are really high.
So under both port-profiles, please add the config channel-group auto mode on mac-pinning.
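For illustration, here is roughly what that would look like on one of the uplink port-profiles posted earlier in the thread. This is only a sketch: it reuses the VS01-Uplink profile and VLAN list from the original post, and adds the single channel-group line suggested above. It applies only if the uplinks are meant to form a port-channel.

```
! Sketch only: VS01-Uplink as posted, plus the mac-pinning line
port-profile type ethernet VS01-Uplink
  vmware port-group
  switchport mode trunk
  switchport trunk allowed vlan 91-92,94-96
  channel-group auto mode on mac-pinning
  no shutdown
  system vlan 91-92,95-96
  state enabled
```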
Please let me know if it helps.
Cheers
03-22-2010 10:03 AM
Hi there,
Hopefully I can clarify.
Under normal circumstances I run a port-channel with mac pinning configured. That works OK.
What I was trying to do was "break" that port-channel, using each uplink for a separate purpose (temporarily, with no redundancy). When I did that, despite having all of my appropriate VLANs (packet, control, mgmt) on one of the two uplinks, I could not communicate with the VEM and saw those errors.
The port-profiles I posted are not supposed to be part of a port-channel, which is why I left the channel-group commands out.
Backend switches are a pair of Nexus 5020s without vPC configured (just trunks into the physical NIC uplinks). I am on SV1(2).
Hopefully Jason doesn't mind that I link to his blog in this thread, but it seems mighty similar to the 2nd of the two bugs here:
http://jasonnash.wordpress.com/2010/03/10/two-annoying-bugs-in-the-cisco-nexus-1000v/
03-22-2010 03:03 PM
Ryan,
Did you configure your upstream N5K ports with "Portfast"?
Robert
03-23-2010 05:31 AM
Hi Robert,
I do have "spanning-tree port type edge trunk" on both of the switch ports that physically connect to the VM Host, in addition to the VLAN that is in the port-profile.
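As a rough sketch, the upstream N5K port facing the first uplink would look something like the following. The interface number is hypothetical, and the VLAN list is assumed to mirror the VS01-Uplink port-profile above; only the spanning-tree line is taken directly from the description.

```
! Hypothetical interface number; VLAN list assumed to match VS01-Uplink
interface Ethernet1/1
  switchport mode trunk
  switchport trunk allowed vlan 91-92,94-96
  spanning-tree port type edge trunk
```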
03-23-2010 12:56 PM
Hey Ryan -
This is a known bug. You should be able to find the information on it here: http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCtc18601
Basically this bug is hit when the port-group is changed in a single step, meaning the vmnic is removed from the current port-group and added to the new port-group in one operation (without clicking "OK" in between). This can be avoided if the port-group change is broken up into multiple steps: from vCenter, go to Manage Physical Adapters, remove the vmnic from the current port-group, and click OK; then go back to Manage Physical Adapters and add the vmnic to the new port-group.
This bug will be fixed in the next release of the Nexus 1000V.
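When using the two-step workaround, it may be worth checking from the VSM that the VEM is still attached after each step before moving on. A minimal sketch, assuming the iSCSI-Uplink profile name from earlier in this thread; `show svs neighbors` is the command used in the original post, and output will differ per environment.

```
! Run on the VSM after removing the vmnic, and again after adding it
show module
show svs neighbors
show port-profile name iSCSI-Uplink
```

If the VEM module is missing from `show module` after the first step, stop there rather than adding the vmnic to the new port-group.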
Thanks,
Liz
03-23-2010 01:18 PM
Okay, thank you!
Wanted to make sure this was in fact what I was running into. It didn't seem like I was doing anything "wrong", per se, although it appears the 1000v didn't appreciate me trying to consolidate steps.
Thanks again for the follow-up.