01-10-2012 07:41 AM - edited 03-07-2019 04:15 AM
Hi,
This has been bugging me for some time. We have VMware ESXi connected in vPC mode to a pair of N5Ks (through FEX). Dozens of times per day we were seeing the following error:
2011 Nov 18 16:24:34 Canal_auber_5548_6258 %FWM-2-STM_LOOP_DETECT: Loops detected in the network among ports Po100 and Po40 vlan 395 - Disabling dynamic learn notifications for 180 seconds
This used to happen only on 2 ESXi hosts running a VDI payload (where a lot of VMs are instantiated). Since this was causing a lot of disruption to other servers connected to the N5Ks, we decided to take both ESXi hosts out of production until we understood why this happens.
Then we enabled mac-move notification to see whether the problem was still there. Although we no longer get the LOOP message, we still see this (again on an ESXi host running a VDI payload):
Nov 20 07:07:08 canal_auber_5548-6258 : 2011 Nov 20 07:07:08 CET: %FWM-6-MAC_MOVE_NOTIFICATION: Host 0050.5693.0416 in vlan 395 is flapping between port Po100 and port Po31
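For reference, the mac-move notification mentioned above can be enabled globally on the N5K with something like the following (a sketch from memory; check the exact syntax for your NX-OS release):

```
conf t
 mac address-table notification mac-move
```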
What I don't get is why the N5K would complain about seeing a MAC address flapping between a vPC member port and the vPC peer-link (I expect to see virtual machine MACs on both sides, since the ESXi host load balances based on IP hash across both sides of the vPC).
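One way to narrow this down is to check where each peer has actually learned the flapping MAC (0050.5693.0416 is taken from the log above). Running the same lookup on both N5Ks should show whether they disagree about the address sitting behind the local vPC member or the peer-link:

```
N5K# show mac address-table address 0050.5693.0416 vlan 395
```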
Here is part of the configuration (identical on both N5Ks; Po100 is the vPC peer-link, Po40 is the vPC to one of the ESXi hosts, and all ESXi hosts have the same configuration):
interface Ethernet104/1/1
description Slot40-A1 ESX-vmnic
switchport mode trunk
switchport trunk allowed vlan 15,18,65,71,200,312-314,317-321,325-326,328,330,332-341,343,349-350,352-357,363,369,374,376-381,383-385,390-401,411-412,440,460,462,468-469,475,996-999,2024,2026,2701,2801
spanning-tree port type edge trunk
channel-group 40
interface port-channel40
description Slot40 ESX
switchport mode trunk
vpc 40
switchport trunk allowed vlan 15,18,65,71,200,312-314,317-321,325-326,328,330,332-341,343,349-350,352-357,363,369,374,376-381,383-385,390-401,411-412,440,460,462,468-469,475,996-999,2024,2026,2701,2801
spanning-tree port type edge trunk
speed 10000
interface port-channel100
description VPC Link
switchport mode trunk
vpc peer-link
spanning-tree port type network
speed 10000
And some show output:
N5K# sho vpc brief
vPC Peer-link status
---------------------------------------------------------------------
id Port Status Active vlans
-- ---- ------ --------------------------------------------------
1 Po100 up 1,13,15,18,65,71,200,312-314,317-321,325-326,328,3
30,332-341,343,349-350,352-357,363,369,374,376-386
,390-401,411-412,440,460,462,468-469,475,996-999,1
002-1005,2024,2026,2701,2801
vPC status
----------------------------------------------------------------------------
id Port Status Consistency Reason Active vlans
------ ----------- ------ ----------- -------------------------- -----------
--- snip ---
40 Po40 up success success 15,18,65,71
,200,312-31
4,317-321,3
25-326,328,
330,332....
Any idea would be greatly appreciated.
Regards,
Vincent.
01-16-2012 07:09 AM
Hello Prashanth,
This certainly looks promising, especially the related bug information:
N5K Bcast Packets flooded out of ingress vPC, vPCM out of sync with FWM.
Symptoms: On the impacted device, the port-channel belonging to the vPC is considered non-vPC internally, causing unknown unicast traffic arriving from the vPC peer to be forwarded towards the local port-channel. Typical symptoms include:
- Broadcast traffic is seen to be flooded back out of the ingress vPC (but by the peer device).
- MAC address tables incorrectly point towards a north- or eastbound vPC for southbound attached hosts/devices.
Conditions: The specific triggers are not currently known.
Workaround: Currently the only way to recover from this is via a reload.
Thanks for the hint !
Vincent.
04-16-2012 02:51 AM
Hello Prashanth,
FYI, we encountered another variant of this bug which impacted our platform even more (massive flooding because the vPC wouldn't learn MAC addresses), so we finally decided to upgrade. Since then, the MAC flapping no longer seems to occur.
Thank you for your help on this case.
Cheers,
Vincent.
06-14-2012 04:57 AM
Hi,
A little more on this. The root cause of the problem was that, although the port-channels were up, the vPC status was down due to an inconsistent state, as shown below.
Canal_auber_5548_6258# show vpc brief | inc Po30
30 Po30 up success success 15,18,200,3
Canal_auber_5548_6258# sho int po30
port-channel30 is up
vPC Status: Down, vPC number: 30 [packets forwarded via vPC peer-link]
Canal_auber_5548_6258# show vpc consistency-parameters vpc 30
Legend:
Type 1 : vPC will be suspended in case of mismatch
Name Type Local Value Peer Value
------------- ---- ---------------------- -----------------------
STP Port Type 1 Edge Trunk Port Edge Trunk Port
STP Port Guard 1 None None
STP MST Simulate PVST 1 Default Default
Shut Lan 1 No No
VTP trunk status 2 Enabled Enabled
mode 1 - on
Native Vlan 1 - 1
Port Mode 1 - trunk
MTU 1 - 1500
Duplex 1 - full
Speed 1 - 10 Gb/s
Canal_auber_5548_6258# show int po30 switchport
Operational Mode: trunk
Access Mode VLAN: 1 (default)
Trunking Native Mode VLAN: 1 (default)
30 Po30(SU) Eth NONE Eth103/1/1(P)
After the upgrade to 5.0(3)N1(1c) the problem remained, presumably because the vPCs were not reset during the ISSU; however, rebooting the hosts attached to the faulty vPCs solved it.
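For anyone hitting the same state, an alternative to rebooting the attached hosts might be to bounce the affected port-channel so that the vPC re-initializes. We did not try this ourselves, so treat it as an untested assumption:

```
conf t
 interface port-channel30
  shutdown
  no shutdown
```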
Since then our vPCs have been stable and all is well!
Cheers,
Vincent.
10-30-2018 10:39 AM
Do you know if this solution also applies in the case where you have plain trunk ports instead of port-channels?
interface Ethernet100/1/12
switchport mode trunk
switchport access vlan 408
switchport trunk allowed vlan 408, 435, 472, 484-485
duplex full