
Puzzle - Impossible MAC Flap and Random Forward/Blocking SPANTREE

mriksman
Level 1

Hi,

 

Firstly, please bear with me. I am certainly no expert, but I am the custodian of our beastly network, which covers a large, remote area using fiber and Wi-Max.

 

The (simplified) diagram of the parts I am trying to understand is as follows:

MacFlap_Architecture.JPG

 

Randomly, 3 or 4 times a day, I get the following events across the above switches:

SAEQL2501  26/03/2019 6:02:46  %SPANTREE-6-PORT_STATE: Port Gi1/0/4 instance 0 moving from forwarding to blocking  Khepri Link
SAEQL2501  26/03/2019 6:02:46  %SPANTREE-6-PORT_STATE: Port Gi1/0/4 instance 0 moving from blocking to forwarding  Khepri Link
SUCRL201   26/03/2019 6:02:46  %SPANTREE-6-PORT_STATE: Port Gi0/1 instance 0 moving from forwarding to blocking  Crossover SUCRL201 to SUCRL202
SUCRL201   26/03/2019 6:02:46  %SPANTREE-6-PORT_STATE: Port Gi0/1 instance 0 moving from blocking to forwarding  Crossover SUCRL201 to SUCRL202
SAEQL2501  26/03/2019 6:02:47  %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a6.cc65 in vlan 195 is flapping between port Gi1/0/6 and port Gi1/0/1  SA-ESVT-SRV1A FTE Green
SACRL201   26/03/2019 6:02:47  %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a6.cc65 in vlan 195 is flapping between port Fa0/5 and port Gi0/1  SA-ESVT-SRV1A FTE Green
SACRL201   26/03/2019 6:02:47  %SW_MATM-4-MACFLAP_NOTIF: Host 000c.2994.c8e2 in vlan 195 is flapping between port Fa0/5 and port Fa0/23  Zabbix
SAEQL2501  26/03/2019 6:02:49  %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a6.cc65 in vlan 195 is flapping between port Gi1/0/6 and port Gi1/0/1  SA-ESVT-SRV1A FTE Green
SAEQL2502  26/03/2019 6:02:49  %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a6.cc65 in vlan 195 is flapping between port Gi1/0/1 and port Gi1/0/6  SA-ESVT-SRV1A FTE Green
SACRL201   26/03/2019 6:02:51  %SW_MATM-4-MACFLAP_NOTIF: Host 000c.2994.c8e2 in vlan 195 is flapping between port Fa0/23 and port Fa0/5  Zabbix
SACRL201   26/03/2019 6:02:51  %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a6.cc65 in vlan 195 is flapping between port Fa0/5 and port Gi0/1  SA-ESVT-SRV1A FTE Green
           26/03/2019 6:02:51  Timeout on PLC Comms (Alarmed on SA-ESVT-SRV1A)
SACRL201   26/03/2019 6:03:07  %SW_MATM-4-MACFLAP_NOTIF: Host 000c.2994.c8e2 in vlan 195 is flapping between port Fa0/5 and port Fa0/23  Zabbix
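
To see at a glance which hosts and port pairs recur across many of these events, the MACFLAP lines can be tallied. The following is only a minimal Python sketch under stated assumptions: `summarize_flaps`, `PREFIX_RE` and `FLAP_RE` are hypothetical names written for this thread, and the patterns assume the exported layout shown above (switch name, then a dd/mm/yyyy date, with the port description possibly running straight on from the interface name).

```python
import re
from collections import Counter

# Switch name followed (with or without a space) by a dd/mm/yyyy date.
PREFIX_RE = re.compile(r"^(?P<switch>\S+?)\s*(?P<date>\d{2}/\d{2}/\d{4})")

# The port pattern stops at the interface number, so a description that
# follows the interface name without a space is not captured.
FLAP_RE = re.compile(
    r"%SW_MATM-4-MACFLAP_NOTIF: Host (?P<mac>[0-9a-f.]{14}) in vlan (?P<vlan>\d+) "
    r"is flapping between port (?P<a>[A-Za-z]+\d+(?:/\d+)*) "
    r"and port (?P<b>[A-Za-z]+\d+(?:/\d+)*)"
)


def summarize_flaps(lines):
    """Count flap notifications per (switch, MAC, VLAN, unordered port pair)."""
    counts = Counter()
    for line in lines:
        flap = FLAP_RE.search(line)
        if not flap:
            continue  # not a MACFLAP line (e.g. SPANTREE or PLC timeout)
        prefix = PREFIX_RE.match(line)
        switch = prefix.group("switch") if prefix else "unknown"
        # Sort the two ports so A<->B and B<->A land in the same bucket.
        ports = tuple(sorted((flap.group("a"), flap.group("b"))))
        counts[(switch, flap.group("mac"), int(flap.group("vlan")), ports)] += 1
    return counts
```

Feeding it the exported lines and printing `counts.most_common()` gives one row per switch, host and port pair, which makes it easier to compare events from different days.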

 

I enabled debugging on SAEQL2501, and gathered the following during one of these 'events':

Mar 27 16:46:27: STP: MST0 rx BPDU: config protocol = mstp, packet from GigabitEthernet1/0/4 , linktype IEEE_SPANNING , enctype 2, encsize 17
Mar 27 16:46:27: STP: enc 01 80 C2 00 00 00 D8 24 BD A3 33 01 00 69 42 42 03
Mar 27 16:46:27: STP: Data 000003027C8000D824BDA33300000000008000D824BDA3330080010000140001000F00
Mar 27 16:46:27: STP: MST0 Gi1/0/4:0000 03 02 7C 8000D824BDA33300 00000000 8000D824BDA33300 8001 0000 1400 0100 0F00
Mar 27 16:46:27: MST[0]: Gi1/0/4 state change forwarding -> blocking
Mar 27 16:46:27: STP SW: Gi1/0/4 new blocking req for 1 vlans
Mar 27 16:46:27: Found no corresponding dummy port for instance 0, port_id 128.4
Mar 27 16:46:27: %SPANTREE-6-PORT_STATE: Port Gi1/0/4 instance 0 moving from forwarding to blocking
Mar 27 16:46:27: STP: MST0 rx BPDU: config protocol = mstp, packet from GigabitEthernet1/0/4 , linktype IEEE_SPANNING , enctype 2, encsize 17
Mar 27 16:46:27: STP: enc 01 80 C2 00 00 00 D8 24 BD A3 33 01 00 69 42 42 03
Mar 27 16:46:27: STP: Data 000003027D8000001A2FD6B380000000008000001A2FD6B38080010000140001000F00
Mar 27 16:46:27: STP: MST0 Gi1/0/4:0000 03 02 7D 8000001A2FD6B380 00000000 8000001A2FD6B380 8001 0000 1400 0100 0F00
Mar 27 16:46:28: STP: MST0 rx BPDU: config protocol = mstp, packet from GigabitEthernet1/0/4 , linktype IEEE_SPANNING , enctype 2, encsize 17
Mar 27 16:46:28: STP: enc 01 80 C2 00 00 00 D8 24 BD A3 33 01 00 69 42 42 03
Mar 27 16:46:28: STP: Data 00000302791000000AB84E4600000000001000000AB84E460080010000140001000F00
Mar 27 16:46:28: STP: MST0 Gi1/0/4:0000 03 02 79 1000000AB84E4600 00000000 1000000AB84E4600 8001 0000 1400 0100 0F00
Mar 27 16:46:28: MST[0]: Gi1/0/4 state change blocking -> forwarding
Mar 27 16:46:28: STP SW: Gi1/0/4 new forwarding req for 1 vlans
Mar 27 16:46:28: Found no corresponding dummy port for instance 0, port_id 128.4
Mar 27 16:46:28: %SPANTREE-6-PORT_STATE: Port Gi1/0/4 instance 0 moving from blocking to forwarding
Mar 27 16:46:28: MST[0]: port Gi1/0/4 received internal tc
Mar 27 16:46:28: MST[0]: port Gi1/0/4 received internal tc
Mar 27 16:46:28: STP SW: VLAN1: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN8: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN16: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN24: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN192: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN195: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN210: topology change over - this bridge is root
Mar 27 16:46:28: MST[0]: flushing Gi1/0/1
Mar 27 16:46:28: MST[0]: flushing Gi1/0/2
Mar 27 16:46:28: MST[0]: flushing Gi1/0/5
Mar 27 16:46:28: MST[0]: flushing Gi1/0/6
Mar 27 16:46:28: MST[0]: flushing Gi1/0/8
Mar 27 16:46:28: MST[0]: flushing Gi1/0/9
Mar 27 16:46:28: MST[0]: flushing Gi1/0/10
Mar 27 16:46:28: MST[0]: flushing Gi1/0/12
Mar 27 16:46:28: STP: MST0 rx BPDU: config protocol = mstp, packet from GigabitEthernet1/0/4 , linktype IEEE_SPANNING , enctype 2, encsize 17
Mar 27 16:46:28: STP: enc 01 80 C2 00 00 00 D8 24 BD A3 33 01 00 69 42 42 03
Mar 27 16:46:28: STP: Data 00000302791000000AB84E4600000000001000000AB84E460080010000140001000F00
Mar 27 16:46:28: STP: MST0 Gi1/0/4:0000 03 02 79 1000000AB84E4600 00000000 1000000AB84E4600 8001 0000 1400 0100 0F00
Mar 27 16:46:28: MST[0]: port Gi1/0/4 received internal tc
Mar 27 16:46:28: MST[0]: port Gi1/0/4 received internal tc
Mar 27 16:46:29: STP: MST0 rx BPDU: config protocol = mstp, packet from GigabitEthernet1/0/4 , linktype IEEE_SPANNING , enctype 2, encsize 17
Mar 27 16:46:29: STP: enc 01 80 C2 00 00 00 D8 24 BD A3 33 01 00 69 42 42 03
Mar 27 16:46:29: STP: Data 00000302791000000AB84E4600000000001000000AB84E460080010000140001000F00
Mar 27 16:46:29: STP: MST0 Gi1/0/4:0000 03 02 79 1000000AB84E4600 00000000 1000000AB84E4600 8001 0000 1400 0100 0F00
Mar 27 16:46:29: MST[0]: port Gi1/0/4 received internal tc
Mar 27 16:46:29: MST[0]: port Gi1/0/4 received internal tc
Mar 27 16:46:29: MST[0]: flushing Gi1/0/1
Mar 27 16:46:29: MST[0]: flushing Gi1/0/2
Mar 27 16:46:29: MST[0]: flushing Gi1/0/5
Mar 27 16:46:29: MST[0]: flushing Gi1/0/6
Mar 27 16:46:29: MST[0]: flushing Gi1/0/8
Mar 27 16:46:29: MST[0]: flushing Gi1/0/9
Mar 27 16:46:29: MST[0]: flushing Gi1/0/10
Mar 27 16:46:29: MST[0]: flushing Gi1/0/12
Mar 27 16:46:30: %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a4.f63d in vlan 195 is flapping between port Gi1/0/6 and port Gi1/0/1
Mar 27 16:46:44: %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a4.f63d in vlan 195 is flapping between port Gi1/0/1 and port Gi1/0/6
Mar 27 16:47:03: MST[0]: tc timer expired
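
The `Data` lines in the debug output are the first 35 bytes of each BPDU, so they can be decoded to see which bridge each BPDU names as root. The following is a minimal Python sketch, assuming the standard 802.1D/802.1Q field layout (protocol ID, version, type, flags, root ID, root path cost, an 8-byte identifier that is the regional root in an MST BPDU, port ID, then four timers in units of 1/256 second). `decode_bpdu` and `parse_bridge_id` are hypothetical helper names, not part of any Cisco tool.

```python
import struct

# Port role is carried in bits 2-3 of the flags byte.
PORT_ROLES = {0: "unknown/master", 1: "alternate/backup", 2: "root", 3: "designated"}


def parse_bridge_id(raw):
    """Split an 8-byte bridge identifier into priority, system ID extension and MAC."""
    prio_field = int.from_bytes(raw[:2], "big")
    mac = raw[2:].hex()
    return {
        "priority": prio_field & 0xF000,    # upper 4 bits, in steps of 4096
        "sys_id_ext": prio_field & 0x0FFF,  # lower 12 bits
        "mac": ".".join(mac[i:i + 4] for i in range(0, 12, 4)),
    }


def decode_bpdu(hex_data):
    """Decode the first 35 bytes of an RST/MST BPDU given as a hex string."""
    raw = bytes.fromhex(hex_data)
    (proto, version, bpdu_type, flags, root, cost, regional_root, port_id,
     msg_age, max_age, hello, fwd_delay) = struct.unpack("!HBBB8sI8sHHHHH", raw[:35])
    return {
        "protocol_id": proto,
        "version": version,                 # 3 = MSTP
        "type": bpdu_type,                  # 2 = RST/MST BPDU
        "topology_change": bool(flags & 0x01),
        "port_role": PORT_ROLES[(flags >> 2) & 0x03],
        "root": parse_bridge_id(root),
        "root_path_cost": cost,
        "regional_root_or_bridge": parse_bridge_id(regional_root),
        "port_id": port_id,
        # Timers are transmitted in 1/256 s units.
        "message_age": msg_age / 256,
        "max_age": max_age / 256,
        "hello_time": hello / 256,
        "forward_delay": fwd_delay / 256,
    }
```

Decoded this way, the three distinct `Data` payloads in the capture name three different roots: 32768/d824.bda3.3300, then 32768/001a.2fd6.b380, then 4096/000a.b84e.4600. Comparing those addresses with the bridge ID of each switch may help narrow down who claimed root during the event.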

 

The two servers (SA-ESVT-SRV1A and SA-ESVT-SRV1B) are redundant, so only one is communicating with the external PLCs at any one time. The PLC communications get disrupted during these 'events', only briefly, but they log an alarm. When I fail over to server SA-ESVT-SRV1B, I receive the same set of logs, but with the MAC address of SA-ESVT-SRV1B instead of SA-ESVT-SRV1A:

SAEQL2501  27/03/2019 16:46:27  %SPANTREE-6-PORT_STATE: Port Gi1/0/4 instance 0 moving from forwarding to blocking  Khepri Link
SAEQL2501  27/03/2019 16:46:27  %SPANTREE-6-PORT_STATE: Port Gi1/0/4 instance 0 moving from blocking to forwarding  Khepri Link
SUCRL201   27/03/2019 16:46:27  %SPANTREE-6-PORT_STATE: Port Gi0/1 instance 0 moving from forwarding to blocking  Crossover SUCRL201 to SUCRL202
SUCRL201   27/03/2019 16:46:28  %SPANTREE-6-PORT_STATE: Port Gi0/1 instance 0 moving from blocking to forwarding  Crossover SUCRL201 to SUCRL202
SACRL201   27/03/2019 16:46:28  %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a4.f63d in vlan 195 is flapping between port Fa0/5 and port Gi0/1  SA-ESVT-SRV1B FTE Green
SAEQL2502  27/03/2019 16:46:32  %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a4.f63d in vlan 195 is flapping between port Gi1/0/1 and port Gi1/0/6  SA-ESVT-SRV1B FTE Green
           27/03/2019 16:46:34  Timeout on PLC Comms (Note there was none on the events below)
SACRL201   27/03/2019 16:46:37  %SW_MATM-4-MACFLAP_NOTIF: Host 000c.2994.c8e2 in vlan 195 is flapping between port Fa0/5 and port Fa0/23  Zabbix
SAEQL2502  27/03/2019 16:46:44  %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a4.f63d in vlan 195 is flapping between port Gi1/0/1 and port Gi1/0/6  SA-ESVT-SRV1B FTE Green
SACRL201   27/03/2019 16:46:44  %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a4.f63d in vlan 195 is flapping between port Gi0/1 and port Fa0/5  SA-ESVT-SRV1B FTE Green
SACRL201   27/03/2019 16:46:50  %SW_MATM-4-MACFLAP_NOTIF: Host 000c.2994.c8e2 in vlan 195 is flapping between port Fa0/5 and port Fa0/23  Zabbix

 

The PLCs are located on port Fa0/4 of SACRL201 (not shown in the architecture diagram attached).

 

 

I have spent days looking at this, but I just cannot work out the following:

  • Why is there a MAC flap at all? There are definitely no loops. I can't see how it is physically possible for any of these devices to flap between these ports.
    • On the VMware ESXi server, we have many VMs; two run Linux and were built from the same template. Why does only one of them ("Zabbix") appear to have a MAC flap issue?
    • Why would having a server set as master (SA-ESVT-SRV1A/B) have any impact on whether it gets caught MAC flapping?
  • Why does it all start with the fiber crossover link between SUCRL201 and SUCRL202 changing from forwarding to blocking and back again? And why does that cause SAEQL2501's port Gi1/0/4 to do the same? All switches run MST, and the Wi-Max "Khepri Link" has STP turned off. Only our Cisco switches have STP turned on.

 

I'm hoping someone will read this and have an 'Ah hah!' moment. Because I sure haven't...

 

Thanks. Let me know which config files you need (and which command to run to get them), and from which switches.

27 Replies

Deepak Kumar
VIP Alumni

Hi,

From the details provided, port Gi1/0/4 appears to be where the problem shows up. A switch connected at the remote end seems to be causing the STP issue:

 

Mar 27 16:46:28: MST[0]: Gi1/0/4 state change blocking -> forwarding
Mar 27 16:46:28: STP SW: Gi1/0/4 new forwarding req for 1 vlans
Mar 27 16:46:28: Found no corresponding dummy port for instance 0, port_id 128.4
Mar 27 16:46:28: %SPANTREE-6-PORT_STATE: Port Gi1/0/4 instance 0 moving from blocking to forwarding
Mar 27 16:46:28: MST[0]: port Gi1/0/4 received internal tc
Mar 27 16:46:28: MST[0]: port Gi1/0/4 received internal tc
Mar 27 16:46:28: STP SW: VLAN1: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN8: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN16: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN24: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN192: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN195: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN210: topology change over - this bridge is root

Have you verified the bridge priority of the switch at the remote end, and checked for any loop in the network design over the Wi-Fi link?

 

I would suggest the following commands:

interface Gi1/0/4
spanning-tree guard root

At the same time, we need some more details to identify the current issue:

 

show spanning-tree mst 0

show spanning-tree mst configuration

show spanning-tree active

 

Please share the output of the above commands from switches 201, 202, 2501 and 2502.

 

Regards,

Deepak Kumar

Regards,
Deepak Kumar,
Don't forget to vote and accept the solution if this comment will help you!

Hi Deepak,

 

The Wi-Fi (Wi-MAX) link, the 'Khepri link', is made up of:

KhepriLink.png

They have no other wireless clients connecting to them, and are just linked together, point-to-point. 

 

The 'remote' switch (SUCRL201) shows the root switch to be SAEQL2501:

Spanning tree enabled protocol mstp
  Root ID    Priority    4096
             Address     000a.b84e.4600
             Cost        0
             Port        1 (FastEthernet0/1)
             Hello Time  1 sec  Max Age 20 sec  Forward Delay 15 sec

  Bridge ID  Priority    32768 (priority 32768 sys-id-ext 0)
             Address     d824.bda3.3300
             Hello Time  1 sec  Max Age 20 sec  Forward Delay 15 sec

 

Regarding the 'root' switch: what happens if the link between SUCRL201 and SAEQL2501 (the 'Khepri' Wi-Max link) is disrupted? If SUCRL201 can no longer see SAEQL2501 (which is the root), what does SUCRL201 do? Does it become the root? And if the link is re-established just moments later, what happens then?

 

I have attached the output of the commands for each of those switches.

 

Thanks for taking the time to look at this for me; it's appreciated!

 

Mike

I hate being 'that' guy, but... *Bump*

 

@Deepak Kumar - any further ideas based on the information I provided?

 

Thanks.

Hi,

I see two issues in your network:

1. Mac address flapping

Mar 27 16:46:30: %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a4.f63d in vlan 195 is flapping between port Gi1/0/6 and port Gi1/0/1
Mar 27 16:46:44: %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a4.f63d in vlan 195 is flapping between port Gi1/0/1 and port Gi1/0/6

The MAC address 2880.23a4.f63d is flapping between Gi1/0/6 and Gi1/0/1 on switch 2501, which suggests that the "FTE Green" and "FTE Yellow" ports are not configured correctly on the server host.

How are those ports configured on the server? Are they in an EtherChannel?

 

2. Wireless link flapping. This may be the wireless link actually flapping, or BPDU loss due to a timeout. Check the logs on all wireless devices and on both switches for any disconnections.

 

 

 

Regards,
Deepak Kumar,

I'll take a look.
Is there any issue with the root configuration? If the Khepri link goes down, will SUCRL201 try to become root?

Hi,

 If the Khepri link goes down, will SUCRL201 try and become root?

Yes. If the current root bridge is not available, then whichever of SUCRL201 or SUCRL202 wins the election will become the root bridge. As soon as the link is restored, the topology will revert to its current state, because the current root bridge, 2501, has a lower priority value (4096) than the others.
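
The election itself is just a comparison of bridge IDs: the lowest priority value wins, and the lowest MAC address breaks a tie. The following is a minimal Python sketch of that comparison only, not of the full protocol. `elect_root` and `bridge_sort_key` are hypothetical names. The priorities and addresses are the ones that appear in this thread: 4096/000a.b84e.4600 (the current root), 32768/d824.bda3.3300 (SUCRL201), and 32768/001a.2fd6.b380, which is taken from the root field of a BPDU in the debug output and whose owner is not named here.

```python
def bridge_sort_key(bridge):
    """Order bridges by (priority, MAC as an integer); lowest wins the election."""
    priority, mac = bridge
    return (priority, int(mac.replace(".", ""), 16))


def elect_root(bridges):
    """Return the (priority, mac) pair that would become root among `bridges`."""
    return min(bridges, key=bridge_sort_key)


# Bridge IDs seen in this thread (see the assumptions in the text above).
all_bridges = [
    (4096, "000a.b84e.4600"),
    (32768, "d824.bda3.3300"),
    (32768, "001a.2fd6.b380"),
]

# With the 4096 bridge reachable, it wins on priority alone.
root_when_link_up = elect_root(all_bridges)

# Without it, the two 32768 bridges tie on priority and the lower MAC wins.
root_when_link_down = elect_root(all_bridges[1:])
```

Under these assumptions the remote side would settle on 001a.2fd6.b380 while the link is down, and fall back to 000a.b84e.4600 once its BPDUs are heard again.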

Regards,
Deepak Kumar,

Could this cause what we see at the start of the issue/event:

SAEQL2501  26/03/2019 6:02:46  %SPANTREE-6-PORT_STATE: Port Gi1/0/4 instance 0 moving from forwarding to blocking
SAEQL2501  26/03/2019 6:02:46  %SPANTREE-6-PORT_STATE: Port Gi1/0/4 instance 0 moving from blocking to forwarding
SUCRL201   26/03/2019 6:02:46  %SPANTREE-6-PORT_STATE: Port Gi0/1 instance 0 moving from forwarding to blocking
SUCRL201   26/03/2019 6:02:46  %SPANTREE-6-PORT_STATE: Port Gi0/1 instance 0 moving from blocking to forwarding

 

And if so, how do we prevent that from happening? Being a wireless link, the connection may get disrupted briefly. 

 

Should SUCRL201 be made its own Root of that remote network? I think I remember reading about Regions...? The remote network is on its own VLAN 16.

Hi,

Please share the "show logging" output from the 2501 switch so we can look into the details.

 

Regards,
Deepak Kumar,

Attached show logging

Hello

 


@Deepak Kumar wrote:

Hi,

I see two issues in your network:

1. Mac address flapping

Mar 27 16:46:30: %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a4.f63d in vlan 195 is flapping between port Gi1/0/6 and port Gi1/0/1
Mar 27 16:46:44: %SW_MATM-4-MACFLAP_NOTIF: Host 2880.23a4.f63d in vlan 195 is flapping between port Gi1/0/1 and port Gi1/0/6

The MAC address 2880.23a4.f63d is flapping between Gi1/0/6 and Gi1/0/1 on switch 2501, which suggests that the "FTE Green" and "FTE Yellow" ports are not configured correctly on the server host.

How are those ports configured on the server? Are they in an EtherChannel?

 

2. Wireless link flapping. This may be the wireless link actually flapping, or BPDU loss due to a timeout. Check the logs on all wireless devices and on both switches for any disconnections.

 


Note: this may not be an issue. It could just mean that a Wi-Fi client has roamed over to another AP, and as such its MAC is being seen on another switchport.


Please rate and mark as an accepted solution if you have found any of the information provided useful.
This then could assist others on these forums to find a valuable answer and broadens the community’s global network.

Kind Regards
Paul

Hi @paul driver 

This is a server, and it is connected to two different switches. The diagram suggests that it is working with two different IP addresses, but I am not sure.

 


server.jpg

Regards,
Deepak Kumar,

Hello

SUCRL201
Fa0/1 Root FWD 200000 128.1 P2p  <----- this is a non-edge root port that is forwarding

Fa0/6 Desg FWD 200000 128.6 P2p Edge
Fa0/8 Desg FWD 200000 128.8 P2p Edge
Fa0/10 Desg FWD 200000 128.10 P2p Edge
Fa0/14 Desg FWD 200000 128.14 P2p Edge
Fa0/15 Desg FWD 200000 128.15 P2p Edge
Fa0/16 Desg FWD 200000 128.16 P2p Edge
Gi0/1 Desg FWD 20000 128.25 P2p <----- this is a non-edge port that is forwarding

 

I would suggest at least checking why Fa0/1 would be a root port, and checking what is connected to these ports.



Kind Regards
Paul

Hi @paul driver 

I would suggest to at least check FA0/1 why that would be a root port,  check the what connected to these ports

Port Fa0/1 on SUCRL201 is the root port, and that is correct according to his high-level network diagram and logs. This port works as the uplink (over Wi-Fi) and is connected to the root bridge switch SAEQL2501. It seems that his Wi-Fi connection dropped, or that BPDUs timed out due to network delay or congestion.

Regards,
Deepak Kumar,

Hello


@Deepak Kumar wrote:

Hi,

From the details provided, port Gi1/0/4 appears to be where the problem shows up. A switch connected at the remote end seems to be causing the STP issue:

 

Mar 27 16:46:28: MST[0]: Gi1/0/4 state change blocking -> forwarding
Mar 27 16:46:28: STP SW: Gi1/0/4 new forwarding req for 1 vlans
Mar 27 16:46:28: Found no corresponding dummy port for instance 0, port_id 128.4
Mar 27 16:46:28: %SPANTREE-6-PORT_STATE: Port Gi1/0/4 instance 0 moving from blocking to forwarding
Mar 27 16:46:28: MST[0]: port Gi1/0/4 received internal tc
Mar 27 16:46:28: MST[0]: port Gi1/0/4 received internal tc
Mar 27 16:46:28: STP SW: VLAN1: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN8: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN16: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN24: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN192: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN195: topology change over - this bridge is root
Mar 27 16:46:28: STP SW: VLAN210: topology change over - this bridge is root

Have you verified the bridge priority of the switch at the remote end, and checked for any loop in the network design over the Wi-Fi link?

 

I would suggest the following commands:

interface Gi1/0/4
spanning-tree guard root

 


I suggest refraining from applying any STP root guard at this time; it could cause further problems and cut off connectivity.

Are you using MST throughout the estate? Is your MST region connected to any non-MST regions (switches not running in MSTP mode)?

Make sure you have STP PortFast/BPDU guard on all edge ports.
Make sure you don't have any BPDU filtering on access or trunk ports, or any access ports that are set to dynamic desirable.

On your root switch, is the bridge priority manually set, and is it lower than on any other switch in the estate?



Kind Regards
Paul