03-16-2011 11:16 AM
Hi,
This is a long and boring post, but please bear with me.
I have a vSphere environment with around 20 ESXi hosts. All of them have 2 Intel 10Gb NICs connected to a Nexus 1000V, except for 2 servers that have 4 NICs connected to the 1000V instead of 2. A few days ago I implemented QoS (1.4, CBWFQ) on the uplink, got this strange message on those 2 servers, and lost them (NB: their VMkernel is held by the 1K, but the VSM and VC are on another ESXi with a VSS):
2011 Mar 10 16:02:45 N1K-VDI %VEM_MGR-2-VEM_MGR_DETECTED: Host FRCP00ESX0030 detected as module 30
2011 Mar 10 16:02:45 N1K-VDI %VIM-5-IF_ATTACHED: Interface Ethernet30/6 is attached to vmnic5 on module 30
2011 Mar 10 16:02:45 N1K-VDI %VIM-5-IF_ATTACHED: Interface Ethernet30/7 is attached to vmnic6 on module 30
2011 Mar 10 16:02:45 N1K-VDI %VIM-5-IF_ATTACHED: Interface Ethernet30/8 is attached to vmnic7 on module 30
2011 Mar 10 16:02:45 N1K-VDI %VEM_MGR-2-MOD_ONLINE: Module 30 is online
2011 Mar 10 16:02:45 N1K-VDI %IPQOSMGR-SLOT30-3-QOSMGR_DPA_MSG: DPA returned error message - QoS Agent: Only one queuing policy instance (per VEM) is supported
2011 Mar 10 16:02:45 N1K-VDI %ETH_PORT_CHANNEL-5-CREATED: port-channel3 created
2011 Mar 10 16:02:46 N1K-VDI %IPQOSMGR-SLOT30-3-QOSMGR_DPA_MSG: DPA returned error message - QoS Agent: Only one queuing policy instance (per VEM) is supported
2011 Mar 10 16:02:46 N1K-VDI %PORT-PROFILE-2-INTERFACE_QUARANTINED: Interface Ethernet30/7 has been quarantined due to Cmd Failure
2011 Mar 10 16:02:46 N1K-VDI %PORT-PROFILE-2-INTERFACE_QUARANTINED: Interface Ethernet30/7 has been quarantined due to Cmd Failure
2011 Mar 10 16:02:46 N1K-VDI %ETHPORT-5-IF_DOWN_PORT_PROFILE_INHERIT_ERR: Interface Ethernet30/7 is down (port-profile inherit error)
2011 Mar 10 16:02:46 N1K-VDI %IPQOSMGR-SLOT30-3-QOSMGR_DPA_MSG: DPA returned error message - QoS Agent: Only one queuing policy instance (per VEM) is supported
2011 Mar 10 16:02:46 N1K-VDI %PORT-PROFILE-2-INTERFACE_QUARANTINED: Interface Ethernet30/6 has been quarantined due to Cmd Failure
2011 Mar 10 16:02:46 N1K-VDI %PORT-PROFILE-2-INTERFACE_QUARANTINED: Interface Ethernet30/6 has been quarantined due to Cmd Failure
2011 Mar 10 16:02:46 N1K-VDI %ETHPORT-5-IF_DOWN_PORT_PROFILE_INHERIT_ERR: Interface Ethernet30/6 is down (port-profile inherit error)
And at the same time, in the accounting log:
Thu Mar 10 16:02:45 2011:update:ppm.18743:admin:configure terminal ; interface Ethernet30/6-8 (SUCCESS)
Thu Mar 10 16:02:45 2011:update:ppm.18743:admin:configure terminal ; interface Ethernet30/6-8 ; switchport mode trunk (SUCCESS)
Thu Mar 10 16:02:45 2011:update:ppm.18743:admin:configure terminal ; interface Ethernet30/6-8 ; switchport trunk allowed vlan 1-3967, 4048-4093 (SUCCESS)
Thu Mar 10 16:02:45 2011:update:ppm.18743:admin:configure terminal ; interface Ethernet30/6-8 ; service-policy type queuing output policy-queuing (FAILURE)
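For anyone who wants to pull the failed commands out of a longer accounting log, here is a minimal Python sketch. It assumes every line follows the format shown above (timestamp, "update", process, user, command chain, then SUCCESS or FAILURE in parentheses). The function name failed_commands is my own and is not part of NX-OS.

```python
import re

# Assumed line format, taken from the accounting entries above:
#   <timestamp>:update:<process>:<user>:<cmd> ; <cmd> ; ... (SUCCESS|FAILURE)
LINE_RE = re.compile(
    r"^(?P<ts>.+?):update:(?P<proc>[^:]+):(?P<user>[^:]+):"
    r"(?P<cmd>.*) \((?P<status>SUCCESS|FAILURE)\)$"
)


def failed_commands(lines):
    """Return (timestamp, last command in the chain) for each FAILURE line."""
    failures = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match.group("status") != "FAILURE":
            continue
        # The chain is separated by " ; "; the last element is the
        # command that was actually being applied when it failed.
        last_step = match.group("cmd").split(" ; ")[-1]
        failures.append((match.group("ts"), last_step))
    return failures


if __name__ == "__main__":
    sample = [
        "Thu Mar 10 16:02:45 2011:update:ppm.18743:admin:configure terminal ; "
        "interface Ethernet30/6-8 ; switchport mode trunk (SUCCESS)",
        "Thu Mar 10 16:02:45 2011:update:ppm.18743:admin:configure terminal ; "
        "interface Ethernet30/6-8 ; service-policy type queuing output "
        "policy-queuing (FAILURE)",
    ]
    for ts, cmd in failed_commands(sample):
        print(ts, "->", cmd)
```

On the lines above, this singles out the service-policy command as the only one that failed on Ethernet30/6-8.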
All servers with 2 NICs seem to behave well, though.
I eventually recovered those servers, removed them from the N1K, rebooted both and reinserted them. Since then I have had connectivity problems with them: they become unreachable a few times a day. The status right now is:
1 was reachable but not visible in the N1K (neither module nor interface)
1 was permanently unreachable until I shut all ports on the upstream Nexus 2K and unshut them.
Each time I lose one of them, I see only this kind of message in the N1K event log:
2011 Mar 16 10:30:34 N1K-VDI %VEM_MGR-2-VEM_MGR_REMOVE_NO_HB: Removing VEM 30 (heartbeats lost)
2011 Mar 16 10:30:34 N1K-VDI %ETHPORT-5-IF_DOWN_VEM_UNLICENSED: Interface Vethernet82 is down (VEM unlicensed)
Here is the uplink configuration:
port-profile type ethernet uplink
vmware port-group
switchport mode trunk
switchport trunk allowed vlan 1-3967,4048-4093
service-policy type queuing output policy-queuing
channel-group auto mode on
no shutdown
system vlan 998-999
state enabled
NB: Meanwhile, nothing shows up in the 5K logs...
Does this kind of problem ring a bell for any of you?
Any help will be greatly appreciated,
Cheers,
Vincent.