Directed broadcast packet loss with high traffic volume

m.kafka
Level 4

We have an exotic network issue: a source sends high-volume directed broadcast traffic towards a destination network, where one or more receivers process the data. We experience high packet loss for the directed broadcasts, while the same traffic volume sent as unicast is forwarded over the same path without issues.

The industry is energy and the application is some sort of monitoring, which means the chances of changing the application's network configuration or behavior are next to zero (most likely, changes to the source code would be required).

Simplified diagram:

unicast--------------------------------------------->  OK

                                     |
dir.BC------------------------------>| --> loc.BC--->  Issues
                                     |
                                              /---------\
+-----+     /-------------\     +-------+    |           |--> BC listener
| src |--->| transit netw. |--->| L3-sw |--->| dst.netw. |--> BC listener
+-----+     \-------------/  |  +-------+ |  |           |--> BC listener
                             |            |   \---------/
                             V            |
                         capture1         V 
                                      capture2

L3-switch: Cat3560
capture1:  all unicasts and bursts of directed broadcasts are seen.
capture2:  all unicasts but only small/moderate bursts of 
           directed broadcasts are seen, excess is dropped.
transit:   redundant paths including MPLS-VPN connectivity

Situation:

(Information extracted from packet captures and traffic generator/receiver)

  1. Small bursts (back-to-back, zero delay) of directed broadcasts are delivered end-to-end:
    • 8192 byte packets: bursts of around 15 packets are delivered
    • 4096 byte packets: bursts of around 30 packets are delivered
    • 1400 byte packets: bursts of 50 packets and longer are delivered.
  2. Traffic exceeding these limits experiences 100% packet loss and L3-switch CPU goes to 100%
  3. Same traffic but unicast is delivered end-to-end without packet loss (tested with traffic generator/receiver).
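The burst limits above can be compared by total bytes per burst. The following is only a rough sketch of that arithmetic: it uses the packet sizes and approximate packet counts quoted in the list, ignores header overhead and any IP fragmentation on the path, and treats the 1400-byte figure as a lower bound ("50 packets and longer"). The helper name `burst_bytes` is made up for illustration.

```python
# Approximate burst sizes taken from the observations above:
# (packet size in bytes, packets delivered per burst)
OBSERVED_BURSTS = [(8192, 15), (4096, 30), (1400, 50)]


def burst_bytes(packet_size: int, packet_count: int) -> int:
    """Total bytes in one back-to-back burst (headers/fragmentation ignored)."""
    return packet_size * packet_count


for size, count in OBSERVED_BURSTS:
    total = burst_bytes(size, count)
    print(f"{size:5d} B x {count:2d} packets = {total:6d} B ({total / 1024:.1f} KiB)")
```

The 8192-byte and 4096-byte cases both come out at 122880 bytes (120 KiB), which would be consistent with a limit expressed in bytes rather than in packets, although the measurements are too coarse to prove that.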

According to the Cisco Switch Guide, only a few platforms support directed broadcast in hardware: Cat6k5, Nexus5k, Nexus3k, etc.

So the behavior is not surprising: on a Cat3560, directed broadcasts are "CEF exception/control plane transit" traffic. As long as the burst fits in the buffers of the control plane, the traffic is eventually delivered (most likely with a noticeable delay, but delivered) and the CPU load goes up.

Does anyone see a solution other than spending some 100k USD list or more (we need 4 devices because of redundant paths in the transit network)?

Best regards,

MiKa

2 Replies

Peter Paluch
Cisco Employee

Hi MiKa,

This is a tough problem.

Ideally, the broadcast should be replaced by a multicast. However, is it possible for the BC listeners to move over to a particular multicast group?

If converting the BC listeners to multicast is not feasible or possible, it would perhaps be possible to create a software application running on a sufficiently powerful server directly connected to the destination network, and this software application would act as a unicast-to-broadcast converter. The source would be sending its stream to that server's unicast address, and it would simply hairpin the traffic out to its network's broadcast address. Effectively, it would offload the task of "exploding" the broadcast from the Catalyst switch. It would perhaps be even possible to do this trick using Linux iptables and NAT - although I have never tried to NAT an incoming stream to the same network's broadcast address - but it could still work. What is the volume of the stream from the source? Is it a UDP-based communication, or perhaps raw IP? Do the BC listeners only listen, or do they also respond?
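A minimal sketch of such a unicast-to-broadcast converter follows, assuming the stream is UDP (the thread leaves that open) and that the listeners do not care about the source address, since re-sent packets carry the relay server's address rather than the original source's. The function names, addresses and port in the usage note are hypothetical. The receive socket is bound to the server's own unicast address so that it is not meant to pick up the broadcasts it sends itself.

```python
import socket


def make_relay_sockets(listen_addr: str, listen_port: int):
    """Create the receive socket (unicast in) and the send socket (broadcast out)."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    rx.bind((listen_addr, listen_port))

    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Without SO_BROADCAST the OS refuses to send to a broadcast address.
    tx.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return rx, tx


def relay_once(rx: socket.socket, tx: socket.socket, dest, bufsize: int = 65535) -> int:
    """Receive one datagram and re-send its payload unchanged to dest."""
    payload, _sender = rx.recvfrom(bufsize)
    tx.sendto(payload, dest)
    return len(payload)


def relay_forever(listen_addr: str, listen_port: int, bcast_addr: str, bcast_port: int):
    """Hairpin every datagram received on the unicast address to the local broadcast."""
    rx, tx = make_relay_sockets(listen_addr, listen_port)
    while True:
        relay_once(rx, tx, (bcast_addr, bcast_port))
```

With placeholder values, a server at 192.0.2.10 in a /24 destination network might run `relay_forever("192.0.2.10", 5000, "192.0.2.255", 5000)`. A single-threaded Python loop may not keep up with large back-to-back bursts, so the socket receive buffer size and the implementation language would need checking against the real traffic volume.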

Best regards,
Peter

Howdy, "neighbor"

thank you for your thoughts!

The BC listeners are indeed the toughest problem. I was thinking along this line: configure an ip-helper for the UDP port in use and "explode" the unicast in software close to the listeners (thanks for that expression, I haven't heard it for a long time). But then I had doubts: most likely the ip-helper function is also "CEF exception/control plane transit" and I would only shift the networking issue to the originating end...

I'm not sure about switching to an MC group. Configuring PIM-sparse shouldn't be an issue. Next week there is a conf-call with all involved parties, and hopefully we will get more details about the networking functions of the system and whether it can be reconfigured or reprogrammed.

I'm not quite sure whether iptables is capable of "NAT"-ing in a broadcast/multicast style, but even if that's not possible, a small daemon running on a minimal Linux kernel should do the trick.

We might need two software-based components: one on the originating side, picking up the directed broadcast and sending it as a unicast to the destination application, and one on the destination side, re-sending it as a local broadcast.

I like the idea because modern data center equipment should be capable of handling quite a traffic load, and we wouldn't need to bother the relatively weak Catalyst CPUs.

Best regards, MiKa

 
