Participant

6509 transmit discards every 15 minutes impacting UDP video traffic on VLAN

Hi,

I am getting many transmit discards on 6509 interfaces on a specific VLAN every 15 minutes, generally within a minute of 3:00, 3:15, 3:30, etc. I have confirmed these discards are impacting video traffic on that VLAN: we get a burst of CC (continuity counter) errors at the times the discards increment. Wondering if anyone has had this issue before, or knows what type of service runs every 15 minutes that could be bursting? It just doesn't make sense. The discards are equal across trunk interfaces, and no other errors occur. These interfaces are connected to 3750/3560 switches with no errors or discards on their 1Gb interfaces.

 

#6509 s int g4/26
GigabitEthernet4/26 is up, line protocol is up (connected)
Hardware is C6k 1000Mb 802.3, address is 0015.2bfc.ceb1 (bia 0015.2bfc.ceb1)
Description: Multiviewer Video Switch
MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
reliability 255/255, txload 9/255, rxload 46/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 1000Mb/s, media type is 10/100/1000BaseT
input flow-control is off, output flow-control is off
Clock mode is auto
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:34, output 00:00:43, output hang never
Last clearing of "show interface" counters 01:49:59
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 899171
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 181054000 bits/sec, 16566 packets/sec
5 minute output rate 38911000 bits/sec, 3697 packets/sec
109682946 packets input, 149806385250 bytes, 0 no buffer
Received 109672306 broadcasts (109668625 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 0 multicast, 0 pause input
0 input packets with dribble condition detected
24110121 packets output, 31846536069 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 PAUSE output
0 output buffer failures, 0 output buffers swapped out

 

3560 GigabitEthernet0/50 is up, line protocol is up (connected)
Hardware is Gigabit Ethernet, address is 001d.703b.8e32 (bia 001d.703b.8e32)
Description: uplink to Optimus_65000
MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
reliability 255/255, txload 46/255, rxload 9/255
Encapsulation ARPA, loopback not set
Keepalive not set
Full-duplex, 1000Mb/s, link type is auto, media type is 10/100/1000BaseTX SFP
input flow-control is off, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:00, output 00:00:09, output hang never
Last clearing of "show interface" counters never
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 38743000 bits/sec, 3712 packets/sec
5 minute output rate 180945000 bits/sec, 16584 packets/sec
254853961789 packets input, 338401325653985 bytes, 0 no buffer
Received 254349424484 broadcasts (0 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 2748471090 multicast, 0 pause input
0 input packets with dribble condition detected
1041656017536 packets output, 1422460209826163 bytes, 0 underruns
0 output errors, 0 collisions, 1 interface resets
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 PAUSE output
0 output buffer failures, 0 output buffers swapped out

 

 

 

 

6509 s int g4/28
GigabitEthernet4/28 is up, line protocol is up (connected)
Hardware is C6k 1000Mb 802.3, address is 0015.2bfc.ceb3 (bia 0015.2bfc.ceb3)
Description: Multiviewer Video Switch
MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
reliability 255/255, txload 24/255, rxload 87/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 1000Mb/s, media type is 10/100/1000BaseT
input flow-control is off, output flow-control is off
Clock mode is auto
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:02, output 00:00:23, output hang never
Last clearing of "show interface" counters 01:50:30
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 899171
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 341675000 bits/sec, 31275 packets/sec
5 minute output rate 95133000 bits/sec, 8960 packets/sec
206425809 packets input, 281947121946 bytes, 0 no buffer
Received 206409333 broadcasts (206404946 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 0 multicast, 0 pause input
0 input packets with dribble condition detected
58713634 packets output, 78109010931 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 PAUSE output
0 output buffer failures, 0 output buffers swapped out

 

3750 GigabitEthernet1/0/48 is up, line protocol is up (connected)
Hardware is Gigabit Ethernet, address is f025.7250.bf30 (bia f025.7250.bf30)
Description: Uplink to Optimus_6500
MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
reliability 255/255, txload 87/255, rxload 24/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX
input flow-control is off, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:00, output 00:00:00, output hang never
Last clearing of "show interface" counters 3y32w
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 95469000 bits/sec, 8997 packets/sec
5 minute output rate 342085000 bits/sec, 31325 packets/sec
255704357662 packets input, 339898282855373 bytes, 0 no buffer
Received 241340027273 broadcasts (0 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 81415712 multicast, 0 pause input
0 input packets with dribble condition detected
2197561502342 packets output, 3001359042730041 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 PAUSE output
0 output buffer failures, 0 output buffers swapped out
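
As a rough sanity check on the counters above, the following Python sketch (not part of the original post) converts the 5-minute rates for Gi4/26 into a fraction of line rate and compares the output drops with the packets actually sent. The function names are illustrative, and treating "packets output" as excluding dropped packets is an assumption. The 5-minute rate is an average, so it cannot show short bursts; a low average load alongside a non-trivial drop ratio is merely consistent with bursty traffic and does not prove it.

```python
# Figures copied from the "show interface" output for Gi4/26 above.
BANDWIDTH_KBIT = 1_000_000      # "BW 1000000 Kbit"
OUTPUT_RATE_BPS = 38_911_000    # "5 minute output rate"
INPUT_RATE_BPS = 181_054_000    # "5 minute input rate"
OUTPUT_DROPS = 899_171          # "Total output drops"
PACKETS_OUTPUT = 24_110_121     # "packets output"


def load_fraction(rate_bps, bandwidth_kbit):
    """Average rate as a fraction of the configured interface bandwidth."""
    return rate_bps / (bandwidth_kbit * 1000)


def drops_per_sent_packet(drops, packets_output):
    """Output drops relative to transmitted packets.

    Assumption: the "packets output" counter does not include dropped packets.
    """
    return drops / packets_output


tx_load = load_fraction(OUTPUT_RATE_BPS, BANDWIDTH_KBIT)
rx_load = load_fraction(INPUT_RATE_BPS, BANDWIDTH_KBIT)
drop_ratio = drops_per_sent_packet(OUTPUT_DROPS, PACKETS_OUTPUT)

print(f"tx load {tx_load:.1%}, rx load {rx_load:.1%}, "
      f"drops per sent packet {drop_ratio:.1%}")
```

The same arithmetic can be applied to Gi4/28 by swapping in its counters.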

9 REPLIES 9
Highlighted
VIP Mentor

Hi,
It could be oversubscription of the line card if they're all connected to the same line card. Which line cards are you seeing this on, and what supervisor is in place? Some have high ratios like 8:1. Ports 26 and 28 are on the same ASIC, so it may be worth moving one of them to another line card; the problem could be in the blade itself.
Is QoS running on the ports, or are they just left at the default?
Highlighted

The fact that the drop count is identical on each port makes me think the blade is oversubscribed on the backplane.
Highlighted

Looks like you're right: there are many more interfaces with high discards, and they are all 4/x. Why it happens every 15 minutes on the dot is strange, though. No QoS, just the default.

Highlighted

TAC would probably have an answer for that. They may be able to see exactly what's causing issues on the backplane, if that is the cause, and pinpoint it; they have a lot of internal hardware commands, and some of the guys here may have them too. You could also run the show tech output through the CLI Analyzer tool to see if it gives anything useful back; it can point to errors or issues in hardware, even bugs.

 

I've seen something like this before on some of my 6509s. We stuck a load of servers in one ASIC block, the first few ports in a row, and we started having huge drops after a week. After some checks we realized the blade was not able to handle the load, as the ratio was 4:1, I think (it was a good while back). When we moved two of the servers to a free blade, the drops didn't disappear but slowed down greatly, and we were no longer seeing interruptions in critical app traffic.

 

That's why I find the Nexus switches great: with line-rate ports you generally don't see these issues as much, as they don't get oversubscribed as easily.

 

Different blades have different ratios; you can check the data sheets to see what yours are capable of.
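
To make ratios such as "8:1" or "4:1" concrete, here is a minimal Python sketch of how an oversubscription ratio is usually worked out: the combined front-panel speed of a port group divided by the bandwidth that group has towards the fabric. The port counts and uplink figures below are hypothetical illustrations, not values from any particular 6500 line card; the real numbers come from the card's data sheet.

```python
def oversubscription_ratio(port_count, port_speed_gbps, uplink_gbps):
    """Combined front-panel bandwidth divided by the bandwidth to the fabric."""
    if uplink_gbps <= 0:
        raise ValueError("uplink_gbps must be positive")
    return (port_count * port_speed_gbps) / uplink_gbps


# Hypothetical example: 8 x 1 Gbps ports sharing a 1 Gbps ASIC uplink.
eight_to_one = oversubscription_ratio(port_count=8, port_speed_gbps=1, uplink_gbps=1)

# Hypothetical example: 8 x 1 Gbps ports sharing a 2 Gbps uplink.
four_to_one = oversubscription_ratio(port_count=8, port_speed_gbps=1, uplink_gbps=2)

print(f"{eight_to_one:g}:1 and {four_to_one:g}:1")
```

A ratio above 1 only matters when the ports in the group burst at the same time, which is why spreading busy ports across port groups or blades can reduce drops.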

Highlighted
Hall of Fame Community Legend


@Larry Sullivan wrote:

there are many more interfaces with high discards and they are all 4/x.


What? 

Are the discard values the same on all ports? 

Move a few to different slots and observe if they're getting discards or not. 

What is the complete IOS version?

Highlighted
Hall of Fame Community Legend


@Larry Sullivan wrote:

I am getting many transmit discards on 6509 interfaces on a specific vlan every 15 minutes, generally within a minute of 3:00, 3:15, 3:30 ect... 

Total output drops: 899171


I am seeing the same Total Output Drops in both Gi 4/26 and Gi 4/28.  Are these two ports going to the same downstream device? 

The description of the problem, that it occurs every 15 minutes, sounds more like the downstream device cannot process the rate of the stream of data. 

Try configuring both ports to 100 Mbps (command: speed auto 100) and see if this improves things.

Highlighted

Different devices, and I can't configure 100 Mbps as the traffic on the interfaces is over 100 Mbps. 

Highlighted

I think I might have the same problem.

We have a 6509 through which we switch 5.5 Gbit of multicast video (over a 2x 10Gbit port channel, slot 3).

Then a DCM is downlinked with multiple 1Gbit cards: one card receives 800Mbps and sends 350Mbps, and the other receives 400Mbps and sends 900Mbps.

They are connected to port 3 and 21 on the same linecard in the 6509 (slot 1).

Fabric utilization is fine, see attachment.

 fabric utilization.png

Slots run in Crossbar mode.

Fabric channel counters are 0, except for rxErrors, which are 1 or 4 over an uptime of 3+ years, so that shouldn't be the problem.
Every 15 minutes on the dot we get discards on the two 1Gbit ports to the DCM: between 50 and 300 discards/second for a few seconds, and then it's fine again. This causes CC errors in the video streams. The total output drops counter increases during these moments.

All streams are CBR, so the bandwidth is very constant. None of the other ports on the switch have any discards.

 

The total bandwidth this switch is handling is around 20Gbit (10 in and 10 out), far below what it should be able to handle.

 

We have a few of these setups: a 6509 with 10Gbit NICs and 1Gbit NICs and one or more 1Gbit DCMs connected to it. Only one other DCM experiences a few discards from the switch.
The only difference between the 6509s is the amount of traffic going through them. One handles 5Gbit in and 10Gbit out, and the DCM connected to it experiences some discards from the switch (around 0.015 discards/second TX on the switch port): not enough to notice in the video, but more than all the others except one.

Another 6509 handles 2.5Gbit in and out, with the DCM connected here not experiencing errors.

All the 6509s and DCMs are connected to the same VLAN, which handles 5.5Gbit in total. Only the first-mentioned 6509 actually sees all this traffic because it switches it through; the others see less of it.

 

We can't figure out why it's every 15min. 

We also don't know why the switch doesn't seem to be able to handle it.

If this problem was tackled, we would love to know how.
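
One way to confirm the "every 15 minutes on the dot" pattern, and to see which offset within the quarter hour the drops land on, is to poll the total output drops counter and bucket the increases by minute. The Python sketch below assumes the counter has already been collected as (timestamp, value) pairs, for example from SNMP polling or repeated `show interface` captures; the function name and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def drop_offsets(samples, period_min=15):
    """Sum counter increases by minute offset within each period.

    samples: time-ordered list of (datetime, total_output_drops).
    Negative deltas (counter cleared or wrapped) are ignored.
    """
    histogram = Counter()
    for (_, previous), (when, current) in zip(samples, samples[1:]):
        delta = current - previous
        if delta > 0:
            histogram[when.minute % period_min] += delta
    return histogram


# Hypothetical data: one sample per minute, 200 drops added at :00, :15, :30.
start = datetime(2020, 1, 1, 2, 58)
samples = []
total = 0
for i in range(35):
    when = start + timedelta(minutes=i)
    if when.minute % 15 == 0:
        total += 200
    samples.append((when, total))

histogram = drop_offsets(samples)
print(dict(histogram))
```

If nearly all of the increase falls into a single offset, the next step is to look for anything scheduled at that offset on the sources or on the switch.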

Highlighted

We ended up using CoS to prioritize all video ingress ports as class 4, including on all attached access switches (mostly 2960s and 3000-series switches). We also did DSCP-to-CoS mapping for IP video coming into our core 6509s from remote sites. This was what TAC recommended, and it seems to have solved the issue. It was a lengthy project to implement, but it worked. We still don't know why it happened every 15 minutes or why it only seemed to impact multicast video.
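
The post does not say which DSCP values or which mapping table were used. As an illustration only, the Python sketch below shows the common default convention in which the CoS value is the top three bits of the 6-bit DSCP value, so DSCP 32 to 39 (including CS4 = 32 and AF41 = 34) land in CoS 4. The function name is hypothetical, and an actual switch may be configured with a different map.

```python
def dscp_to_cos(dscp):
    """Map a 6-bit DSCP value to a 3-bit CoS value using its top three bits.

    Assumption: the default-style mapping; a switch can be configured otherwise.
    """
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp >> 3


# Well-known code points: CS4 = 32, AF41 = 34, EF = 46.
cos_for_cs4 = dscp_to_cos(32)
cos_for_af41 = dscp_to_cos(34)
cos_for_ef = dscp_to_cos(46)

# All DSCP values that fall into CoS 4 under this convention.
dscp_in_cos4 = [d for d in range(64) if dscp_to_cos(d) == 4]
print(cos_for_cs4, cos_for_af41, cos_for_ef, dscp_in_cos4)
```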
