Packet Drop Issue - storm-control

pcclonescisco · ‎01-09-2012

Our Network Setup:

---------------------------

Internet----L3 Switch-------L2 Switch-1----L2 Switch-2-------Servers.

Internet---(SVI VLAN 50)----VLAN 50--------Default VLAN------Servers

--If I enable storm-control on the port where L2 Switch-2 is connected, I get packet drops from the server to the gateway (the SVI configured on the L3 switch). Here are the commands I have configured for storm control:

storm-control broadcast level 0.01

storm-control multicast level 0.01

--If I remove these commands, there is no packet drop to the GW.

Please help me find the root cause.

rsimoni · ‎01-09-2012

The root cause is the presence of those commands.

You basically configured your switch to drop bcast and mcast traffic received on that port beyond 0.01% of the link bandwidth, which is next to nothing! You have been quite strict.
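To put 0.01% in perspective, here is a rough back-of-the-envelope sketch. It assumes the level is a percentage of the port's nominal bandwidth and that each frame carries 20 bytes of preamble and inter-frame gap on the wire; the link speeds are examples only (the thread does not state the port speed), and how a given platform actually counts traffic is hardware-specific. On a 1 Gbps port the threshold works out to 100 kbps, roughly 150 minimum-size frames per second.

```python
# Rough arithmetic only: how much traffic does a percentage threshold allow?
# Assumption: the level is a percentage of the port's nominal bandwidth.

WIRE_OVERHEAD_BYTES = 20  # Ethernet preamble (8) + inter-frame gap (12)


def allowed_bps(link_bps: float, level_percent: float) -> float:
    """Bits per second permitted by a percentage threshold."""
    return link_bps * level_percent / 100.0


def allowed_fps(link_bps: float, level_percent: float, frame_bytes: int = 64) -> float:
    """Approximate frames per second of a given size that fit under the threshold."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return allowed_bps(link_bps, level_percent) / bits_on_wire


if __name__ == "__main__":
    # Example link speeds; the real port speed is not given in the thread.
    for name, speed in (("100 Mbps", 100e6), ("1 Gbps", 1e9)):
        bps = allowed_bps(speed, 0.01)
        fps = allowed_fps(speed, 0.01)
        print(f"{name}: {bps / 1000:.0f} kbps, about {fps:.0f} minimum-size frames/s")
```

A handful of hosts sending ARP requests at the same moment can already approach a budget that small.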

Consider that in normal networks a lot of legitimate traffic is bcast or mcast (ARP requests, protocols, etc.), so by doing that you block a lot of traffic. Unicast traffic is likely affected too: the bcast ARP requests that are normally sent and received within the L2 segment do not reach the recipient, so the ARP table ends up with many incomplete entries, which prevents unicast connectivity as well.

Please read the following guide on Traffic storm-control to get a better understanding of how the feature works.

http://www.cisco.com/en/US/docs/switches/lan/catalyst6500/ios/12.2SXF/native/configuration/guide/storm.html

please rate question accordingly

Riccardo

pcclonescisco · ‎01-09-2012

Thank you for the reply. Let me go through the Doc and update you.

pcclonescisco · ‎01-12-2012

I have checked the doc.

If I am limiting only BC and MC, why are unicast packets getting dropped? Even a simple ping to the GW is getting dropped.

rsimoni · ‎01-13-2012

Hi there,

As I wrote above, if you are too strict with storm-control you risk dropping legitimate BC packets. For instance, you might drop the ARP requests that a host sends as BC to discover an L2 (MAC) address. If they get dropped, the recipient host will never reply and the sender will have an incomplete ARP entry for that IP, which affects unicast connectivity too.

Usually it is not a permanent loss of connectivity but a temporary one, lasting just until some ARP request makes it through. However, if yours is a big L2 domain you might see the connectivity issue quite often, as statistically you will have lots of BC.
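The effect can be illustrated with a small simulation. This is a simplified model, not the hardware's actual algorithm: it assumes a 1-second measurement interval (as described in the Catalyst 6500 guide linked earlier), a byte budget per interval, and that once the budget is exceeded every further broadcast frame is dropped until the interval ends. The names `Frame`, `suppress` and `example_frames`, the 100 Mbps link speed and the traffic pattern are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    time_s: float    # arrival time in seconds
    size_bytes: int  # frame size in bytes
    label: str       # free-form tag, used only for reporting


def suppress(frames, link_bps, level_percent, interval_s=1.0):
    """Return (forwarded, dropped) under a per-interval broadcast byte budget.

    Simplified model: once a frame would exceed the budget, the interval is
    marked as blocked and all later frames in that interval are dropped.
    """
    budget_bytes = link_bps * level_percent / 100.0 * interval_s / 8
    used = {}        # interval index -> bytes forwarded so far
    blocked = set()  # interval indexes where suppression has kicked in
    forwarded, dropped = [], []
    for frame in sorted(frames, key=lambda f: f.time_s):
        idx = int(frame.time_s // interval_s)
        total = used.get(idx, 0) + frame.size_bytes
        if idx in blocked or total > budget_bytes:
            blocked.add(idx)
            dropped.append(frame)
        else:
            used[idx] = total
            forwarded.append(frame)
    return forwarded, dropped


def example_frames():
    """Hypothetical traffic: a busy first second, a quiet second one."""
    frames = [Frame(i * 0.02, 64, "background") for i in range(25)]
    frames += [Frame(1.0 + i * 0.1, 64, "background") for i in range(5)]
    frames.append(Frame(0.9, 64, "arp-1"))  # ARP request late in a busy interval
    frames.append(Frame(1.9, 64, "arp-2"))  # retry during a quiet interval
    return frames


if __name__ == "__main__":
    fwd, drp = suppress(example_frames(), link_bps=100e6, level_percent=0.01)
    print("forwarded:", sorted({f.label for f in fwd}))
    print("dropped:  ", sorted({f.label for f in drp}))
```

In this model the first ARP request is lost simply because other broadcasts used up the budget earlier in the same interval, while the retry gets through in a quieter interval. That matches the intermittent, rather than permanent, loss of connectivity.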

Bottom line: your configuration is too strict and you should change it.

Riccardo

pcclonescisco · ‎01-13-2012

Any recommended value?

rsimoni · ‎01-13-2012

There is no recommended value, as it all depends on your actual traffic pattern.

There are networks where 30-40% of traffic is BC or MC due to the way the local applications work, and others where a value above 10% is considered abnormal.

You should assess your network first by capturing samples of traffic on different VLANs at different times of the day (and night) and by consulting the application designers/engineers to get a good understanding of the way their applications work.

A thorough understanding of the bandwidth actually utilized is also needed to determine the ratio between unicast and bcast/mcast L2 traffic.

Once you have all of this info you can work out what the percentage of legitimate bcast and mcast traffic is supposed to be. You add some 'buffer' percentage to be sure you don't starve some flows during peaks or emergencies (the value depends on how tolerant or strict you want to be) and then, very important, you start testing and tuning until you find a value which is good for your network.
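As a sketch of that procedure, the snippet below takes a set of measured broadcast rates, finds the peak share of link bandwidth, and adds relative headroom. The function name `suggest_level`, the sample values, the 1 Gbps link speed and the 50% headroom are all hypothetical; the result is only a starting point for testing and tuning, not a recommendation.

```python
def suggest_level(samples_bps, link_bps, headroom=0.5):
    """Peak observed share of link bandwidth (percent) plus relative headroom.

    samples_bps: measured broadcast (or multicast) rates in bits per second
    headroom:    extra margin relative to the peak, e.g. 0.5 means +50%
    """
    if not samples_bps:
        raise ValueError("need at least one traffic sample")
    peak_percent = max(samples_bps) / link_bps * 100.0
    return round(peak_percent * (1.0 + headroom), 2)


if __name__ == "__main__":
    # Hypothetical broadcast samples (bps) taken at different times of day.
    broadcast_samples = [120e3, 450e3, 2e6, 800e3]
    level = suggest_level(broadcast_samples, link_bps=1e9)
    print(f"storm-control broadcast level {level:.2f}")
```

The same calculation would be repeated with multicast samples for the multicast threshold, and rerun whenever the traffic profile changes.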

So after all the only valid rule is to apply common sense.

Please rate and close the question if happy/helpful.

Riccardo

Peter Paluch · ‎01-14-2012

Riccardo,

A nice reply. Please allow me to somewhat "steal" the thread: With storm control, are the multicast and broadcast traffic classes totally disjoint, i.e. does the storm-control multicast consider only multicast traffic without being influenced by broadcasts (as broadcast can be seen as a special case of multicast)? If they are truly independent, has it always been that way? I faintly remember that some controls related to multicast also implicitly included broadcast in the same traffic class as multicast, but I am not sure if I just imagined it or if it really existed.

Thanks!

Best regards,

Peter

rsimoni · ‎01-14-2012

Hi Peter,

thank you.

mcast and bcast storm control are fully independent now, but you remember correctly: they were once handled by the same feature. In the past the bcast storm control feature took care of L2 mcast and bcast suppression at the same time (as indeed L2 mcast can be considered a subset of L2 bcast).

I found a guide that (partially) shows this:

http://www.cisco.com/en/US/customer/docs/routers/7600/ios/12.1E/configuration/guide/bcastsup.html#wp1020384

The question is ... how do you remember that? You made me do a lot of research to find out that the two types of suppression were split back in 2002, in the days of the 12.1E software. After that the CLI started to differentiate between mcast and bcast storms, reflecting the capability that was already available in the CatOS implementation (please don't ask me when it was introduced there, as the vast majority of CatOS-related docs are now archived).

However, since the feature works at the port ASIC level, we need to take into account the specifics of each LC to see which kind of storm control can be implemented.

From an architecture point of view, whether an L2 frame is unicast, bcast or mcast is determined at the ASIC level. Differentiating the type of suppression requires the capability to configure a particular ASIC register to take a different action for ucast, bcast or mcast. Not all LCs have this capability.

This led to the fact that on some old LCs bcast storm control also filtered mcast, while on others it did not.

These ASIC differences are also why you can have full mcast suppression only on some (newer) LCs, while on others you risk dropping BPDUs as well, which are instead skipped by the storm control logic on some cards. The ideal behavior is that BPDUs are 'punted' to the SP before any other logic is applied (before suppression), but not all LCs are able to do so, and on those suppression kicks in for BPDUs too.
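For reference, the frame classification itself follows from the destination MAC address: all ones is broadcast, and any other address with the group bit (the least significant bit of the first octet) set is multicast. The sketch below shows that rule in software. It says nothing about how a particular ASIC implements it, and the function name `classify_dest_mac` is made up for illustration. Note that the IEEE spanning-tree BPDU destination, 01:80:c2:00:00:00, classifies as multicast, which is why a card that cannot exempt BPDUs may suppress them along with other multicast.

```python
def classify_dest_mac(mac: str) -> str:
    """Classify a destination MAC as 'broadcast', 'multicast' or 'unicast'."""
    octets = bytes(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}: {mac!r}")
    if octets == b"\xff" * 6:
        return "broadcast"   # ff:ff:ff:ff:ff:ff
    if octets[0] & 0x01:
        return "multicast"   # group (I/G) bit set in the first octet
    return "unicast"


if __name__ == "__main__":
    for mac in ("ff:ff:ff:ff:ff:ff",   # e.g. an ARP request
                "01:80:c2:00:00:00",   # IEEE STP BPDU destination
                "01:00:5e:00:00:01",   # IPv4 multicast 224.0.0.1
                "00:1a:2b:3c:4d:5e"):  # arbitrary unicast example
        print(mac, "->", classify_dest_mac(mac))
```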

Disclaimer: my reasoning above applies to the cat6500 and 7600 only, the two platforms I am most familiar with. I don't actually know the genesis of this feature on other switches, where the behavior could be different.

Hope it helps

Riccardo

Peter Paluch · ‎01-14-2012

Hi Riccardo,

Thank you very much for your extensive response!

"The question is ... how do you remember that?"

I think I stumbled across it when I was preparing for my CCIE R&S. Either some preparatory materials mentioned that, or Peter Mesjar made me aware of this gotcha. As it was quite some time ago, I really do not recall any details. But thank you for confirming that!

What does the acronym LC mean please?

Best regards,

Peter

rsimoni · ‎01-15-2012

Hey Peter,

LC stands for Line Card

Riccardo

Peter Paluch · ‎01-15-2012

Hi Riccardo,

Thank you!

Best regards,

Peter

Joseph W. Doherty · ‎01-14-2012

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

Just wanted to add to what Riccardo has described, that if you're doing something like IGMP or PIM snooping, multicast flooding is unlikely to become an issue as the switch should suppress forwarding undesired multicast.

As Riccardo further describes, what's "normal" for multicast or broadcast volume is specific to each network. Even with analysis, you might miss a high-water burst that is also "normal"; for that, Riccardo describes providing a "buffer" allowance. This is all correct, but what was not mentioned is that "normal" can change over time, i.e. a new app might define a new normal.

What this means in practice is that it's difficult to tune the ideal settings, and the ongoing "care and feeding" can be maintenance intensive. However, another approach, besides not implementing any storm control at all, is to configure storm control only to prevent levels so extreme that the network becomes unusable due to this type of packet storm.

Oh, and I guess I should add that the size of your L2 domains is important, both because it somewhat influences the likelihood of a storm and because it contains any storm that does happen.

rsimoni · ‎01-15-2012

I agree with Joseph's addendum, thanks

Joseph W. Doherty · ‎01-15-2012

Thank you Riccardo, but have one more addendum . . .

Riccardo has already described this, but I want to re-emphasize that when something like storm control engages, it doesn't distinguish between "good" and "bad" broadcast/multicast packets. So, when I said storm control might be set to keep a network usable, often usable will not be business-as-usual. However, it might be usable enough that you can track down the source of the flood and do something to mitigate it, perhaps shutting a port.

Also keep in mind, "normal" traffic is very unlikely to create a broadcast storm, which is one of the reasons so many networks are not configured with this feature. You're more likely to see this happen as a denial of service attack, intentional or not. In other words, you might want this feature on a high risk network, such as one that allows uncontrolled hosts on it, but for a network where all hosts are managed (application installation is controlled, up-to-date anti-virus, etc.), the risk of this happening can be very low.

Lastly, there are other Cisco features that are similar, e.g. CoPP, where the actual risk of the problem occurring is low but misapplication of the feature can be almost as damaging to the functioning of your network; often you might be surprised, "oh, didn't think of that". (The reason I specifically mention CoPP: I was experimenting with it and, after much analysis and testing, thought I had all bases covered until the day another engineer said, "Hey, what's wrong with this 6500? I'm trying to upload a new IOS and it's crawling [as in minutes to get one ! response]." "Oh, didn't consider tftp to the device itself - oops.")