
Problem with MAC Pinning and new VLAN

simon.geary
Level 1

[Cross posting to Nexus 1000V and UCS forums]

Hi all, I have a working setup of UCS 1.4 (3i) and Nexus 1000V 1.4 and, as per the best practice guide, am using 'channel-group auto mode on mac-pinning' on the Nexus uplinks. I am having trouble adding a new VLAN to this environment, and the problem is reproducible across two different installations.

I go through the usual VLAN creation process on the Nexus, on the upstream network and within UCS itself. I create the new vethernet port profile and set it as an access port in the new VLAN. However, when I attach a VM (either existing or new) to this new vethernet port profile within vCenter, the VM cannot communicate with anything. If I disable MAC pinning with 'no channel-group auto mode on mac-pinning', the VM instantly starts to talk to the outside world and the new VLAN is up and running. I can then turn MAC pinning back on again and everything continues to work.

So the question is: is this behaviour normal, or is there a problem? Disabling MAC pinning briefly interrupts the uplink, so it is not a viable long-term solution for the customer when they want to add new VLANs. Is there a way to add new VLANs in this scenario without any network downtime, however brief?

Thanks

Accepted Solutions

Closing the loop on this. You're hitting bug CSCto00715.

Symptom:
New MAC addresses are not learned in the L2 table on the VEM even though the MAC address table has not overflowed.
vemcmd show l2-emergency-aging-stats | grep "Number of entries that could not be inserted:" shows an extremely large number.

Conditions:
Nexus 1000V VEM running the SV1.4 release.
The host has two CPU cores.
The issue is triggered by a race condition, so it may not occur every time.

Workaround:
Reboot the ESX/ESXi host.

This is fixed in the 1.4a release.
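A quick way to check a host against the symptom above is sketched here. The first command is the one quoted in the bug description. The second assumes you want to confirm from the VSM which software version each VEM module is running, before and after the upgrade. Treat this as a sketch rather than an official diagnostic procedure.

```text
# On the ESX/ESXi host running the VEM: an extremely large counter matches the symptom
vemcmd show l2-emergency-aging-stats | grep "Number of entries that could not be inserted:"

! On the VSM: lists each module along with the software version it is running
show module
```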

Regards,

Robert


Robert Burns
Cisco Employee

Simon,

Let's gather some further outputs if you're able to recreate this (if this is prod, let me know and I'll whip this up in my lab).

Go through the process of setting up the new VLAN. Once you've confirmed that your VM has no connectivity when using a port profile (PP) tied to the new VLAN, gather the following outputs into a text file and attach it to this thread:

From the VSM:

show vlan

show int trunk

show log last 15

module vem x execute vemcmd show vlan y (x = module of VEM where VM is running, y is the VLAN ID)

module vem x execute vemcmd show bd y (y = VLAN ID)

module vem x execute vemcmd show pc

module vem x execute vemcmd show port

module vem x execute vemcmd show pinning

module vem x execute vemcmd show trunk
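As a worked example, this is how the VSM commands above look with the placeholders filled in. It assumes module 3 and VLAN 501, the values reported for the test VM later in this thread; substitute your own module number and VLAN ID.

```text
show vlan
show int trunk
show log last 15
module vem 3 execute vemcmd show vlan 501
module vem 3 execute vemcmd show bd 501
module vem 3 execute vemcmd show pc
module vem 3 execute vemcmd show port
module vem 3 execute vemcmd show pinning
module vem 3 execute vemcmd show trunk
```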

From the CLI of UCS Interconnect:

connect nxos

show vlan

show int trunk

Also, please highlight or identify:

- Problem VLAN ID #

- VM veth # in question (1000v)

- vEth # for each Service Profile vNIC used for your VEM uplinks (UCS)

We'll start with this and work from here. Once I have these outputs I'll let you know if I need anything else.

Regards,

Robert

(I'll keep this as the main thread, and update the 1000v forum once we resolve)

Simon,

I just whipped this up in my lab with no issues. Here's my procedure:

1. Create new VLAN 20 on upstream N7K core switches

2. Create SVI on VLAN 20 for testing 192.168.20.1

3. Create VLAN 20 in UCS

4. Add VLAN 20 to my two vNICs for my ESX Service Profile (make sure you don't miss this step!)

5. Create VLAN 20 on my 1000v VSM & add to allowed VLAN list on MAC Pinning uplink PP

6. Create 1000v vEth Port Profile using access vlan 20.

7. Attach my Windows VM to the test vEth port profile and assign it an appropriate IP, 192.168.20.2

8. Test ping - success.

If you followed the same procedure and you're still having issues, provide the outputs requested.
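For reference, steps 5 and 6 might look roughly like this on the VSM in configuration mode. This is a minimal sketch: the profile names UPLINK-MACPIN and VLAN20-TEST and the VLAN name are hypothetical, and the uplink profile is assumed to exist already and to carry 'channel-group auto mode on mac-pinning'.

```text
! Step 5: create the VLAN and allow it on the existing MAC pinning uplink profile
vlan 20
  name Test-VLAN-20

port-profile type ethernet UPLINK-MACPIN
  switchport trunk allowed vlan add 20

! Step 6: create the vEth port profile as an access port in VLAN 20
port-profile type vethernet VLAN20-TEST
  vmware port-group
  switchport mode access
  switchport access vlan 20
  no shutdown
  state enabled
```

Using 'add' on the allowed VLAN list appends VLAN 20 to the trunk rather than replacing the VLANs already allowed on it.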

Regards,

Robert

Thanks for the help Robert. I have attached the requested output.

These outputs were taken with MAC pinning enabled, which immediately caused the test VM to fall off the network. Once I had finished taking the outputs, MAC pinning was disabled on the uplink again and the VM came straight back onto the network. This is consistent with what I've been seeing: MAC pinning just seems to be causing problems somehow.

The VM used for the testing is called SD1M-VC01 and runs vCenter. It is on VLAN 501 using veth11 on VSM module 3.

The upstream switches are Cat 4900M. Each Fabric Interconnect connects to two different 4900Ms in a port channel.

FI A > 4900M A using eth17 and eth18 in Port Channel 11

FI A > 4900M B using eth19 and eth20 in Port Channel 12

FI B > 4900M A using eth17 and eth18 in Port Channel 13

FI B > 4900M B using eth19 and eth20 in Port Channel 14

Thanks

I looked at that output and the first thing that stood out was that you have a lot of port channels for each VEM. Why do you have so many connections to each VEM? I saw a VM name that looks like you might be using a VM based firewall.

When you say it works when you turn off MAC pinning, that tells me that one side of the fabric probably does not see the VLAN in question. When two NICs are connected to the same uplink with no channel-group method, broadcasts will be received by both NICs. This is bad, but it also indicates that one side of the fabric is not learning or is not connected to the VLAN in question.

Are you following step 4 in Rob's steps? There is currently no "all VLANs" setting for network adapters.

Also how are you pinning the upstream traffic from the FI? Are you manually pinning or letting UCS do its own balancing? Are you using switching mode or end host mode?

Simon,

Can you please send the output of the following commands on VEM?

vemcmd show pc

vemcmd show pinning

vemcmd show l2

vemcmd show l2-emergency-aging-stats

thanks,

Naren


Thank you everyone for your input to this. An upgrade to 1.4a has indeed fixed the problem.

Not sure this is totally fixed. I'm running a UCS system connected to dual 7Ks as well, and I'm also running the SV1.4a release. My issue is with DHCP. After many hours of troubleshooting I've been able to isolate the problem to MAC pinning as well. When I leave MAC pinning enabled and disable one of the two uplinks in my port channel (it doesn't matter which one), the VM is able to pull an IP from the DHCP server just fine. Also, when I disable MAC pinning on the DATA-Uplink port profile, VMs pull IPs with no problem. Ideas?

Thanks.

Jason,

A few questions:

How are you "disabling one of the two uplinks" exactly?

Where is the DHCP server located: within UCS or external?

What version of UCS?

What is your uplink topology between UCS and the N7Ks (straight-through port channel, single links, vPC)?

Regards,

Robert

Also forgot to mention that it all works fine if I put the physical uplinks on a regular vSwitch in VLAN 800 and move the VMs to it. I did that to validate the UCS and 7K config.

Thanks,

Jason

Hi Robert,

Thanks for the quick response. I just opened a TAC case as well.

* To disable one of the links in the port channel, I'm removing one of the uplinks from the DVS section of the host configuration within vSphere. As I mentioned, I tried removing each of them in turn to ensure it's not an issue with one side of the fabric.

* The DHCP server is also located on the UCS, just in a different VLAN. We're using DHCP option 82.

* We're running version 2.0(s) of UCS.

* We're running vPCs to the 7Ks.

I should also mention this problem seems to be isolated to our Win2k8 VMs. The Win2k3 VMs are not having a problem getting an IP when they're provisioned. Strange indeed.

Thanks,

Jason
