Solved: Re: %FWM-6-MAC_MOVE_NOTIFICATION: MAC flapping between vPC host

mahbvh · 01-10-2012

Hi,

This has been bugging me for some time. We have VMware ESXi hosts connected in vPC mode to a pair of N5Ks (through FEX). Dozens of times per day we were seeing the following error:

2011 Nov 18 16:24:34 Canal_auber_5548_6258 %FWM-2-STM_LOOP_DETECT: Loops detected in the network among ports Po100 and Po40 vlan 395 - Disabling dynamic learn notifications for 180 seconds

This used to happen only on 2 ESXi hosts running a VDI payload (where a lot of VMs are instantiated). Since this was causing a lot of disruption to the other servers connected to the N5Ks, we decided to take both ESXi hosts out of service until we knew why this was happening.

Then we enabled mac-move notification to see whether the problem was still there. Although we no longer get the LOOP message, we still see this (again on an ESXi host running a VDI payload):

Nov 20 07:07:08 canal_auber_5548-6258 : 2011 Nov 20 07:07:08 CET: %FWM-6-MAC_MOVE_NOTIFICATION: Host 0050.5693.0416 in vlan 395 is flapping between port Po100 and port Po31

What I don't get is why the N5K would complain about seeing a MAC address flapping between a vPC member port and the vPC peer link. (I expect to see virtual machine MACs on both sides, since the ESXi is load balancing based on IP hash across both sides of the vPC.)
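To see how often a given MAC moves, and between which ports, the notifications can be tallied from a syslog export. This is a minimal Python sketch, assuming the message wording matches the line quoted above; the name `count_mac_moves` is purely illustrative.

```python
import re
from collections import Counter

# Pattern follows the wording of the MAC_MOVE_NOTIFICATION line quoted above.
MAC_MOVE = re.compile(
    r"%FWM-6-MAC_MOVE_NOTIFICATION: Host (?P<mac>[0-9a-fA-F.]+) "
    r"in vlan (?P<vlan>\d+) is flapping between port (?P<a>\S+) and port (?P<b>\S+)"
)

def count_mac_moves(lines):
    """Count notifications per (mac, vlan, unordered port pair)."""
    counts = Counter()
    for line in lines:
        m = MAC_MOVE.search(line)
        if m:
            ports = tuple(sorted((m["a"], m["b"])))
            counts[(m["mac"], int(m["vlan"]), ports)] += 1
    return counts

sample = [
    "Nov 20 07:07:08 canal_auber_5548-6258 : 2011 Nov 20 07:07:08 CET: "
    "%FWM-6-MAC_MOVE_NOTIFICATION: Host 0050.5693.0416 in vlan 395 "
    "is flapping between port Po100 and port Po31",
]

for key, n in count_mac_moves(sample).items():
    print(key, n)
```

Grouping on the unordered port pair means a move from Po100 to Po31 and one from Po31 to Po100 land in the same bucket.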

Here is part of the configuration (same on both N5Ks; po100 is the vPC peer link, po40 is the vPC to one of the ESXi hosts, and all ESXi hosts have the same configuration):

interface Ethernet104/1/1
  description Slot40-A1 ESX-vmnic
  switchport mode trunk
  switchport trunk allowed vlan 15,18,65,71,200,312-314,317-321,325-326,328,330,332-341,343,349-350,352-357,363,369,374,376-381,383-385,390-401,411-412,440,460,462,468-469,475,996-999,2024,2026,2701,2801
  spanning-tree port type edge trunk
  channel-group 40

interface port-channel40
  description Slot40 ESX
  switchport mode trunk
  vpc 40
  switchport trunk allowed vlan 15,18,65,71,200,312-314,317-321,325-326,328,330,332-341,343,349-350,352-357,363,369,374,376-381,383-385,390-401,411-412,440,460,462,468-469,475,996-999,2024,2026,2701,2801
  spanning-tree port type edge trunk
  speed 10000

interface port-channel100
  description VPC Link
  switchport mode trunk
  vpc peer-link
  spanning-tree port type network
  speed 10000
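With allowed-VLAN lists this long, it is easy to misread whether the VLAN named in the log messages (395) is actually carried on the host-facing port channel. This is a small Python sketch, not part of the original post, that expands the list from the configuration above (with the wrapped 390-401 range joined) and tests membership; `expand_vlans` is a hypothetical helper name.

```python
def expand_vlans(spec):
    """Expand a list such as '15,18,312-314' into a set of VLAN IDs."""
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans

# Allowed-VLAN list of port-channel40, copied from the configuration above.
allowed = (
    "15,18,65,71,200,312-314,317-321,325-326,328,330,332-341,343,349-350,"
    "352-357,363,369,374,376-381,383-385,390-401,411-412,440,460,462,"
    "468-469,475,996-999,2024,2026,2701,2801"
)

print(395 in expand_vlans(allowed))
```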

And some command output:

N5K# sho vpc brief

vPC Peer-link status
---------------------------------------------------------------------
id   Port   Status Active vlans
--   ----   ------ --------------------------------------------------
1    Po100  up     1,13,15,18,65,71,200,312-314,317-321,325-326,328,3
                   30,332-341,343,349-350,352-357,363,369,374,376-386
                   ,390-401,411-412,440,460,462,468-469,475,996-999,1
                   002-1005,2024,2026,2701,2801

vPC status
----------------------------------------------------------------------------
id     Port        Status Consistency Reason                     Active vlans
------ ----------- ------ ----------- -------------------------- -----------
--- snip ---
40     Po40        up     success     success                    15,18,65,71
                                                                 ,200,312-31
                                                                 4,317-321,3
                                                                 25-326,328,
                                                                 330,332....

Any idea would be greatly appreciated.

Regards,

Vincent.

mahbvh · 01-16-2012

Hello Prashanth,

This certainly looks promising, especially the related bug information:

N5K Bcast Packets flooded out of ingress vPC, vPCM out of sync with FWM.

Symptoms: On the impacted device, the port-channel belonging to the vPC is considered non-vPC internally, causing unknown unicast traffic arriving from the vPC peer to be forwarded towards the local port-channel. Typical symptoms include:

- Broadcast traffic is seen to be flooded back out of the ingress vPC (but by the peer device).
- MAC address tables incorrectly point towards a north- or eastbound vPC for southbound attached hosts/devices.

Conditions: The specific triggers are not currently known.

Workaround: Currently the only way to recover from this is via a reload.

Thanks for the hint!

Vincent.

mahbvh · 04-16-2012

Hello Prashanth,

FYI, we encountered another variant of this bug which impacted our platform even more (massive flooding because the vPC wouldn't learn MAC addresses), so we finally decided to upgrade. Since the upgrade, the MAC flapping no longer seems to occur.

Thank you for your help on this case.

Cheers,

Vincent.

mahbvh · 06-14-2012

Hi,

A little more on this. The root cause of the problem was that although the port channels were up, the vPC status was down due to an inconsistent state as shown below.

Canal_auber_5548_6258# show vpc brief | inc Po30
30     Po30        up     success     success                    15,18,200,3

Canal_auber_5548_6258# sho int po30
port-channel30 is up
  vPC Status: Down, vPC number: 30 [packets forwarded via vPC peer-link]
  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec,

Canal_auber_5548_6258# show vpc consistency-parameters vpc 30

    Legend:
        Type 1 : vPC will be suspended in case of mismatch

Name                        Type  Local Value            Peer Value
-------------               ----  ---------------------- -----------------------
STP Port Type               1     Edge Trunk Port        Edge Trunk Port
STP Port Guard              1     None                   None
STP MST Simulate PVST       1     Default                Default
Shut Lan                    1     No                     No
VTP trunk status            2     Enabled                Enabled
mode                        1     -                      on
Native Vlan                 1     -                      1
Port Mode                   1     -                      trunk
MTU                         1     -                      1500
Duplex                      1     -                      full
Speed                       1     -                      10 Gb/s

Canal_auber_5548_6258# show int po30 switchport
  Operational Mode: trunk
  Access Mode VLAN: 1 (default)
  Trunking Native Mode VLAN: 1 (default)

Canal_auber_5548_6258# show port-channel summary | inc Po30
30    Po30(SU)    Eth      NONE      Eth103/1/1(P)
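The telltale sign in the output above is a port channel that reports "is up" while its "vPC Status" line reports "Down". For anyone wanting to scan many port channels for the same condition, here is a minimal Python sketch that checks saved "show interface" text for that combination. It assumes the output format matches the excerpt above; `vpc_forwarding_mismatch` is an illustrative name, not a Cisco tool.

```python
import re

def vpc_forwarding_mismatch(show_int_output):
    """Return True when the port channel is up but its vPC status is Down."""
    link_up = re.search(
        r"^\s*port-channel\d+ is up", show_int_output, re.MULTILINE
    ) is not None
    vpc = re.search(r"vPC Status:\s*(\w+)", show_int_output)
    return link_up and vpc is not None and vpc.group(1).lower() == "down"

# Text copied from the "sho int po30" excerpt above.
sample = """port-channel30 is up
  vPC Status: Down, vPC number: 30 [packets forwarded via vPC peer-link]
  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec,
"""

print(vpc_forwarding_mismatch(sample))
```

The check only reads text that has already been collected; how the output is gathered from the switches is left open.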

After the upgrade to 5.0(3)N1(1c) the problem remained, I guess because the vPCs were not reset during the ISSU. However, a reboot of the hosts attached to the faulty vPCs solved it.

Since then our vPCs have been stable and all is well!

Cheers,

Vincent.

Nikona20 · 10-30-2018

Do you know if this solution also applies in the case where you have trunk ports instead of Ethernet Channels?

interface Ethernet100/1/12
  switchport mode trunk
  switchport access vlan 408
  switchport trunk allowed vlan 408, 435, 472, 484-485
  duplex full
