Solved: Re: unicast flooding and network outages !!!

mohamed louhab · ‎01-10-2011

Hi,

I'm having an issue with my network, where we are experiencing random, brief network outages. They happen a couple of times a day and last 5-10 seconds. When I check my two backbone switches (4506; Supervisor: WS-X4516-10GE; IOS: cat4500-ipbase-mz.122-31.SGA8.bin), STP remains normal and no topology change occurs.

I installed a network sniffer (Wireshark), and according to the capture, broadcast and multicast traffic remain normal.

To diagnose the problem, I started Wireshark on my PC, which is connected to an access switch (4503).
In the capture I saw three types of packets:
Broadcast and multicast packets
Packets with my IP address (10.2.240.59/22) as either source or destination
Hundreds of unicast packets whose source and destination addresses are completely different from mine (10.2.240.146/22 <-> 10.2.240.225/22)

I know that the first two types are normal, but I do not understand why I am receiving the third type.

Notes:

There are multiple VLANs, but most users are in VLAN 1 (10.2.240.0/22), together with the servers. We are currently moving the users from VLAN 1 to the appropriate VLANs.
There are two 4506s in the core and eight 4503s in the access layer.
The CPU is normal during the slowness.
There is a lot of ping loss during the slowness.

Thanks

glen.grant · ‎01-10-2011

I'll be watching this thread with interest, as we are currently dealing with the same type of issue. It's probably due to some asymmetric routing, and the switch is operating as designed when it floods the packets from other users. There are a couple of things you can check. If it is caused by asymmetric routing, the recommendation is to make the MAC address aging time equal to or longer than 4 hours; this is what we are going to try in order to minimize the flooding.

http://www.cisco.com/en/US/partner/products/hw/switches/ps700/products_tech_note09186a00801d0808.shtml
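
To make the timer mismatch behind that recommendation concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the usual Cisco IOS defaults of 300 seconds for MAC address aging and 4 hours for the ARP timeout, and a hypothetical worst case in which a silent host transmits only once per ARP refresh. The function name is illustrative only.

```python
# Assumed Cisco IOS defaults; verify them on your own platform.
DEFAULT_MAC_AGING_S = 300          # CAM aging time: 5 minutes
DEFAULT_ARP_TIMEOUT_S = 4 * 3600   # ARP timeout: 4 hours


def flooding_window_s(mac_aging_s, arp_timeout_s):
    """Seconds per ARP cycle during which a silent host's MAC address is
    missing from the CAM table, so frames sent to it are flooded.

    Hypothetical worst case: the host transmits only once per ARP refresh.
    """
    return max(0, arp_timeout_s - mac_aging_s)


print(flooding_window_s(DEFAULT_MAC_AGING_S, DEFAULT_ARP_TIMEOUT_S))  # 14100
print(flooding_window_s(4 * 3600, DEFAULT_ARP_TIMEOUT_S))             # 0
```

Under those assumptions, raising the MAC aging time to match the ARP timeout closes the window completely, which is the idea behind the 4-hour figure.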

mohamed louhab · ‎01-10-2011

Hi glen.grant,

Thank you for your reply.

Firstly, I would like to let you know that the link you sent does not work for me.
Secondly, I read the contents of this link:
http://www.cisco.com/en/US/products/hw/switches/ps700/products_tech_note09186a00801d0808.shtml# topic1

That document explains that asymmetric routing occurs when the two hosts are in two different VLANs, but that is not my case.
My PC and both machines, 10.2.240.146 and 10.2.240.225, are in the same VLAN.
Can asymmetric routing still be the cause of the unicast flooding even though the three machines are in the same VLAN?

Thank you for your reply.

aljaloudi · ‎01-10-2011

Hi Mohamed, please provide us with a topology diagram. Have you tried to verify your Spanning Tree setup?

Kevin Dorrell · ‎01-11-2011

Hi Mohamed.

As you rightly say, it is quite normal that you see broadcast and multicast packets, as well as packets destined for your own MAC address. Unicast packets are a different matter; normally you would expect to see very few of these, if any. You would see someone else's unicast packet only if it is transmitted in that brief interval between the CAM entry aging out and the next packet sourced from the particular MAC address.

I am interested to know whether you see only the packets destined for 10.2.240.225, or whether you are seeing all the packets on the VLAN for all destinations.

One thing that can cause unicast packets to flood is a machine (e.g. 10.2.240.225) sourcing packets from a MAC address that is not the same as the one it gives out in its ARP responses. So, 10.2.240.146 would be sending its packets to the MAC address it gets from the ARP response, but 10.2.240.225 is sourcing packets from a different address. That means the switches never learn where to send those packets, so they flood them. This can happen in some virtualisation scenarios. However, I would discount this hypothesis for several reasons: (1) I think you are seeing all the VLAN 1 traffic and not just the unicasts to 10.2.240.225; (2) you would be seeing the problem all the time and not just in bursts of 5-10 seconds; (3) the scenario is quite a rare one; (4) even if the traffic were quite heavy, it is unlikely a single machine would generate enough unicast traffic to slow down your network significantly. This scenario is not so much asymmetric routing as asymmetric layer-2 forwarding.

No, this problem has the air of Spanning Tree not doing its job properly. So, what I would be looking for are things like:

Do you have any dumb ("supermarket") switches in the topology, intentionally or unintentionally?
If you have IP phones, has someone plugged both interfaces into the network?
Do you have bpdu-filter configured anywhere? (I still mistrust bpdu-filter, despite the safeguards that were introduced recently).
Do all your switches see the same root for VLAN 1?
Do you have portfast or "portfast trunk" configured on any inter-switch links - highly dangerous.
Do you have any EtherChannels configured as on/on rather than LACP or PAgP?

BTW, which Spanning Tree are you using: traditional 802.1D, PVST+, or MST?

Kevin Dorrell

Luxembourg

mohamed louhab · ‎01-11-2011

Hi Kevin, zzgao and Osamah Aljaloudi,

Thank you for your replies.

Please note that the machine 10.2.240.225 is a VMware virtual machine. Is it normal to see this type of traffic (unicast flooding) when the destination MAC address belongs to a VMware machine?

On my PC I saw only this pair (10.2.240.146 and 10.2.240.225), but it is a lot of traffic.

Do you have any dumb ("supermarket") switches in the topology, intentionally or unintentionally?

There are two switches in the backbone (2 x 4506, Supervisor: WS-X4516-10GE).

If you have IP phones, has someone plugged both interfaces into the network?

I did not understand what you mean, but most PCs are connected to the LAN through an IP phone (with a voice VLAN on the interfaces).

Do you have bpdu-filter configured anywhere? (I still mistrust bpdu-filter, despite the safeguards that were introduced recently).

No, never.

Do all your switches see the same root for VLAN 1?

Yes, all switches see the same root for VLAN 1 and for the other VLANs (and all the access switches have the highest priority).

Do you have portfast or "portfast trunk" configured on any inter-switch links - highly dangerous.

No, the uplinks are all in trunk mode with ISL encapsulation (something I inherited).

Do you have any EtherChannels configured as on/on rather than LACP or PAgP?

no

BTW, which Spanning Tree are you using: traditional 802.1D, PVST+, or MST?

Traditional 802.1D.

Thanks

Zizhen Gao · ‎01-12-2011

VMware itself would not cause unicast flooding, but there could be something special about its MAC address. Is it a virtual MAC address? Is the MAC address shared by multiple IPs? What type of traffic are you seeing and, mainly, is it one-way traffic?

zz

Kevin Dorrell · ‎01-12-2011

Hi Mohamed,

That is interesting, and can point to complications. VMware can be configured in various ways, I believe, but in my particular installation each virtual machine has its own MAC address, always starting with 00:50:56:... Is that the type of MAC address you have for 10.2.240.225? Check that the MAC address you see in the flooded traffic is the same as the MAC address you have in your PC's ARP cache when you ping the server. What are the MAC addresses concerned?

The interesting thing about this is that the virtual machine keeps its MAC address, even if it is moved from one physical server to another. So, during a vMotion, it is quite possible that some unicast flooding will occur while the switches are updating their forwarding tables for the new location of the VM.

If you are using Microsoft NLB (Network Load Balancing), then things can get much more complicated. This can use unicast addresses at layer-3, but multicast addresses at layer-2, and that can cause a lot of "unicast flooding". Have a look for addresses starting 01:00:5e. In fact, if your server IP address is 10.2.240.225, and it is using Microsoft NLB, it will seem to have a MAC address 01:00:5e:7f:f0:e1. I don't think that is what you have, but just check it out.
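
As a sketch of where that example address comes from: it follows the pattern 01:00:5e:7f followed by the last two octets of the IP address. The helper below is hypothetical and simply reproduces that pattern; it is assumed here to correspond to NLB's IGMP multicast mode, and other NLB modes use different MAC prefixes.

```python
import ipaddress


def nlb_igmp_multicast_mac(cluster_ip):
    """Build 01:00:5e:7f:xx:yy from the last two octets of an IPv4 address.

    Assumption: this mirrors the pattern in the example above; it is not
    taken from any NLB API.
    """
    octets = ipaddress.IPv4Address(cluster_ip).packed
    mac = bytes([0x01, 0x00, 0x5E, 0x7F, octets[2], octets[3]])
    return ":".join(f"{b:02x}" for b in mac)


print(nlb_igmp_multicast_mac("10.2.240.225"))  # 01:00:5e:7f:f0:e1
```

Filtering a capture on that destination MAC address is then a quick way to confirm or rule out this case.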

Kevin Dorrell

Luxembourg

mohamed louhab · ‎01-13-2011

Hi Kevin, hi Gao,

Thank you for your help again.

@Kevin: the MAC address of 10.2.240.225 starts with 00:50:56..., and when I checked today, the MAC address was still present in the CAM and ARP tables on the switch.

SWITCH#sh mac address-table interface gigabitEthernet 6/14
Unicast Entries
 vlan   mac address      type      protocols             port
-------+---------------+--------+---------------------+--------------------
   1    000c.292d.ab21   dynamic   ip                    GigabitEthernet6/14
   1    0050.56a6.0002   dynamic   ip                    GigabitEthernet6/14
   1    e41f.1366.6398   dynamic   ip                    GigabitEthernet6/14

Federateur-B#sh mac address-table add 0050.56a6.0002
Unicast Entries
 vlan   mac address      type      protocols             port
-------+---------------+--------+---------------------+--------------------
   1    0050.56a6.0002   dynamic   ip                    GigabitEthernet6/14

SWITCH#sh ip arp 10.2.240.225
Protocol  Address        Age (min)  Hardware Addr   Type  Interface
Internet  10.2.240.225   5          0050.56a6.0002  ARPA  Vlan1
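
To compare the switch's dotted MAC notation with the colon-separated form shown in Wireshark or a PC's ARP cache, a small helper like the following can be used. This is only a sketch: the function names are illustrative, and the prefix set is an assumed subset of the prefixes VMware uses, not a complete list.

```python
# Assumed subset of VMware MAC prefixes; not an exhaustive list.
VMWARE_OUIS = {"00:50:56", "00:0c:29"}


def normalize_mac(mac):
    """Convert Cisco dotted notation (0050.56a6.0002) or other common
    notations to lowercase colon-separated form."""
    digits = "".join(c for c in mac.lower() if c in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def is_vmware_mac(mac):
    """True if the first three octets match one of the assumed prefixes."""
    return normalize_mac(mac)[:8] in VMWARE_OUIS


for mac in ("000c.292d.ab21", "0050.56a6.0002", "e41f.1366.6398"):
    print(normalize_mac(mac), is_vmware_mac(mac))
```

With the three entries learned on GigabitEthernet6/14, the first two match the assumed VMware prefixes and the third does not.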

@Gao: the communication between 10.2.240.146 and 10.2.240.225 is one-way (it looks like a backup application). Is that normal? Can a backup application cause unicast flooding?

Thank you

Zizhen Gao · ‎01-11-2011

Do you know what those IP addresses are and who they belong to? Are you seeing bidirectional traffic between .146 and .225 when capturing the sniffer trace from your PC? Unicast flooding happens when the destination MAC address is not in the MAC table, which could be due to an STP change, an aged-out MAC entry (e.g. asymmetric routing or one-way traffic), or a MAC address that is never installed (e.g. 1-to-many MAC/IP mapping).

In this case, if STP is stable and the flooded traffic is within the same VLAN, I'd check whether the flooded traffic is mostly one-way (e.g. backup applications).

mohamed louhab · ‎01-13-2011

Hi Gao,

The communication between 10.2.240.146 and 10.2.240.225 is one-way (it looks like a backup application). Is that normal? Can a backup application cause unicast flooding?

Thank you

Zizhen Gao · ‎01-13-2011

Mohamed,

Yes, one-way traffic could trigger unicast flooding, since the MAC address would time out for the device that does not send traffic. A quick test would be to start a constant ping between the two servers and see if the flooding stops.
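
A minimal sketch of such a test in Python is shown below. It shells out to the operating system's own ping command; the function names, the 60-second interval and the target address are assumptions for illustration, chosen so that the interval stays well under a 300-second MAC aging time.

```python
import platform
import subprocess
import time


def build_ping_command(host, count=1):
    """Return the ping command line for the local OS (the count flag differs)."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


def keepalive(host, interval_s=60, rounds=None):
    """Ping `host` every `interval_s` seconds; run forever if rounds is None."""
    done = 0
    while rounds is None or done < rounds:
        result = subprocess.run(build_ping_command(host),
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        print(host, "replied" if result.returncode == 0 else "no reply")
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_s)


# Example (not run here): keepalive("10.2.240.225", interval_s=60)
```

Each echo reply from the otherwise silent server is a frame sourced from its MAC address, which is what refreshes the entry on the switches along the path.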

zz

mohamed louhab · ‎01-13-2011

Hi Gao,

I do not understand why one-way traffic causes unicast flooding.
In theory, whether two hosts are on the same switch or on separate switches, the communication between them is point to point.
For example:

H(A)----> fa0/1(SwitchA)Fa0/2----->H(B)

Could you explain?

Thank you

Zizhen Gao · ‎01-13-2011

Mohamed,

When a switch receives a packet, it normally learns the source MAC address and puts it in its MAC table. If the destination MAC address is not in the table, the default action is to flood the packet.

Now when there's one-way traffic say from A-->B only:

H(A)----> fa0/1(SwitchA)Fa0/2----->H(B)

The switch learns the MAC address of A, but not the MAC address of B if there is no traffic from B. When the switch receives traffic from A and does not find B in the table, flooding happens. This is fairly common with backup applications, which usually generate one-way traffic. I've seen customers run little scripts that ping between the hosts to make sure the MAC table on the switch gets refreshed before the entries time out.
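
The learn-and-flood behaviour described above can be illustrated with a toy model in Python. This is a simplified sketch, not how any real switch is implemented: the class name, the port names and the 300-second aging time are assumptions, and time is passed in by hand.

```python
class LearningSwitch:
    """Toy model of one VLAN on a switch with MAC learning and aging."""

    def __init__(self, ports, aging_s=300):
        self.ports = list(ports)
        self.aging_s = aging_s
        self.table = {}  # MAC -> (port, time last seen as a source)

    def receive(self, now, in_port, src, dst):
        """Process one frame; return the list of ports it is sent out of."""
        self.table[src] = (in_port, now)            # learn or refresh the source
        entry = self.table.get(dst)
        if entry and now - entry[1] <= self.aging_s:
            return [entry[0]]                       # known: forward to one port
        self.table.pop(dst, None)                   # aged out or never learned
        return [p for p in self.ports if p != in_port]  # unknown: flood


sw = LearningSwitch(ports=["fa0/1", "fa0/2", "fa0/3"])
print(sw.receive(0, "fa0/2", "B", "A"))    # B speaks once; A unknown, so flooded
print(sw.receive(10, "fa0/1", "A", "B"))   # B is known: ['fa0/2']
print(sw.receive(400, "fa0/1", "A", "B"))  # B silent too long: ['fa0/2', 'fa0/3']
print(sw.receive(401, "fa0/2", "B", "A"))  # B replies (e.g. to a ping): ['fa0/1']
print(sw.receive(402, "fa0/1", "A", "B"))  # B is known again: ['fa0/2']
```

The third call is the situation in this thread: A keeps sending, B's entry has aged out, and every other port in the VLAN receives a copy until B transmits again.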

zz

mohamed louhab · ‎01-13-2011

Hi, thank you for your explanation.

It explains many of the things I have noticed in this network.

In this case, host H(A) makes no ARP request for host H(B). (That is the job of the ping script, yes?)

If the traffic between host H(A) and host H(B) is very large (we are talking about 1 TB), can that traffic impact the network (e.g. cause outages)?

Thank you for your help.