09-03-2009 07:16 AM
We have two DS3 connections via BGP to our DR site.
BGP is configured to allow the two paths, and the host to host load is being distributed pretty evenly across the two DS3s.
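For reference, a setup like this is typically done with BGP multipath; a minimal sketch, assuming hypothetical AS numbers and neighbor addresses (not taken from this post):

router bgp 65001
 neighbor 192.0.2.1 remote-as 65100
 neighbor 192.0.2.5 remote-as 65100
 ! install up to two equal eBGP paths so CEF can load-share across both DS3s
 maximum-paths 2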
We are using these links to sync our data from HQ to DR, and the data flow is bursty, ranging from about 15-20% utilization on both links up to 100%.
The source devices connect through gigabit ports, with up to two gigabit ports uplinking to the HQ 7206, across to DR, where a single FastE port links from the 7206 to the destination device.
The traffic is from HQ to DR 99% of the time.
I am seeing output drops accumulate on both serial links on the source side even when the utilization is low.
It seems to be bursty: the drops increment for a few seconds, then not for a couple of minutes, then the pattern repeats.
There are no drops on the ethernet links on either side.
There is a QoS policy outbound on the serial interfaces, with the default class configured for "fair-queue".
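The policy is along these lines (a sketch of its shape, not the exact config; the policy-map name is a placeholder):

policy-map WAN-EDGE-OUT
 class class-default
  ! flow-based fair queueing in the default class
  fair-queue
!
interface Serial1/0
 service-policy output WAN-EDGE-OUT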
I can understand drops during high bandwidth utilization, but not during low utilization. Is this something I should be concerned about?
It seems to be a small amount; could it just be nominal circuit issues?
#sh int summary
*: interface is up
IHQ: pkts in input hold queue IQD: pkts dropped from input queue
OHQ: pkts in output hold queue OQD: pkts dropped from output queue
RXBS: rx rate (bits/sec) RXPS: rx rate (pkts/sec)
TXBS: tx rate (bits/sec) TXPS: tx rate (pkts/sec)
TRTL: throttle count
  Interface            IHQ   IQD   OHQ      OQD      RXBS   RXPS      TXBS   TXPS   TRTL
-------------------------------------------------------------------------------------------
* GigabitEthernet0/1     0     0     0        0   9847000   2216   1517000   1178      0
* GigabitEthernet0/2     0     0     0        0      7000      1         0      0      0
* Serial1/0              0     0     0   168240    509000    455   6050000   2096      0
* Serial2/0              0     0     0   686957   1659000   1491   4572000    916      0
Serial1/0 is up, line protocol is up
Hardware is M2T-T3+ pa
Description: connected to MCI DS3 Disaster Recovery
MTU 4470 bytes, BW 44210 Kbit, DLY 200 usec,
reliability 255/255, txload 36/255, rxload 2/255
Encapsulation FRAME-RELAY IETF, crc 16, loopback not set
Keepalive set (10 sec)
Restart-Delay is 0 secs
LMI enq sent 0, LMI stat recvd 0, LMI upd recvd 0
LMI enq recvd 15917, LMI stat sent 15917, LMI upd sent 0, DCE LMI up
LMI DLCI 0 LMI type is ANSI Annex D frame relay DCE
FR SVC disabled, LAPF state down
Broadcast queue 0/256, broadcasts sent/dropped 82240/0, interface broadcasts 187718
Last input 00:00:00, output 00:00:00, output hang never
Last clearing of "show interface" counters 1d20h
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 173244
Queueing strategy: Class-based queueing
Output queue: 0/1000/64/173244 (size/max total/threshold/drops)
Conversations 0/45/256 (active/max active/max total)
Reserved Conversations 0/0 (allocated/max allocated)
Available Bandwidth 24157 kilobits/sec
30 second input rate 455000 bits/sec, 413 packets/sec
30 second output rate 6250000 bits/sec, 2253 packets/sec
147250390 packets input, 17700821862 bytes, 0 no buffer
Received 5646 broadcasts, 0 runts, 0 giants, 0 throttles
0 parity
178 input errors, 153 CRC, 0 frame, 24 overrun, 0 ignored, 1 abort
309803920 packets output, 112379404547 bytes, 0 underruns
0 output errors, 0 applique, 0 interface resets
0 unknown protocol drops
0 output buffer failures, 0 output buffers swapped out
0 carrier transitions
rxLOS inactive, rxLOF inactive, rxAIS inactive
txAIS inactive, rxRAI inactive, txRAI inactive
09-03-2009 08:35 AM
Looking at your interface statistics, you have CRC errors on the link:
153 CRC
It looks like you cleared your counters out almost two days ago. Do you have this same thing on the other side? You shouldn't have any, so I would get in touch with the provider to see if they are seeing errors on their end as well.
HTH,
John
09-03-2009 08:57 AM
Thanks for the reply.
I was thinking that, given the amount of traffic that has gone through that link, 153 CRC errors was pretty much nothing.
And there are many more drops than CRC errors in comparison.
You think it could be an issue?
09-03-2009 09:01 AM
153 isn't a lot, but you should have 0. I don't know if that's causing your problem, but it definitely could go hand-in-hand. I had this problem with a DS3 once, where I had to prove to AT&T that it wasn't my problem. I was getting 1 CRC every 5 minutes. We eventually moved loops farther and farther out through the network until they found a switch with a bad card, and they had to move us off of it. That resolved the issue. I also wasn't able to get faster than 15 Mbps on a 40 Mbps line.
I would talk to the provider and have them put up a loop to see where in their path the problem lies relative to your end.
P.S. Having them do a loop will kill your connection while they're testing, so if this is production I would have them do it after-hours.
HTH,
John
09-03-2009 10:22 AM
"I can understand drops during high bandwidth utilization, but not during low utilization. Is this something I should be concerened about?"
Perhaps; it's not so much a question of drops but whether there are too many drops. Type of traffic is an important consideration (TCP vs. non-TCP).
Low vs. high utilization, with regard to packet dropping, can be misleading. Utilization is based on average usage over some time period. Much can happen within milliseconds. If traffic is "bursty", as you note, it's possible the drops are due to transient congestion.
For many TCP implementations, with too many drops, "average" utilization can actually decrease, "hiding" a bandwidth oversubscription issue.
In other words, your expectations about drops being tied to low vs. high utilization don't always hold true.
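To put rough, purely illustrative numbers on that:

44,210,000 bit/s x 0.100 s = ~4.4 Mbit sent in a single 100 ms line-rate burst
4.4 Mbit / 30 s = ~147 kbit/s added to the 30-second average (about 0.3% of line rate)

So a burst long enough to overflow the output queue and drop packets barely moves the averaged utilization counters.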
"It seems to be a small amount, could it be just nominal circuit issues?"
Unlikely to be a "circuit issue", beyond transient congestion.
Packet dropping within TCP flows, on any network segment that can be oversubscribed, is a normal part of TCP bandwidth probing. (In fact, in theory it should be much more common than it is in practice; but also in practice, many host TCP implementations don't provide a large enough default RWIN, which often precludes the offered transfer rate from exceeding the available bandwidth.)
Depending on why you're seeing drops, it might be almost impossible to mitigate them on most Cisco devices, or you might be able to decrease them such that there are almost none, or none at all. For the latter, correct setting of queue depths and/or RED usage might impact your drops.
For your posted stats, the overall drop rate appears low enough that it might not be worth the time, or much time, to try to decrease it further. However, since you're already using CBWFQ FQ(?), for a T3 you might want to increase the queue depth from the default(?) of 64. You might allow enough packets to hold half to a full BDP (bandwidth-delay product).
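As a rough sizing example, assuming (purely for illustration) a 20 ms round-trip time to the DR site, which you would want to measure:

BDP = 44,210,000 bit/s x 0.020 s = ~884 kbit = ~110 kbytes
~110,000 bytes / 1,500-byte packets = ~74 packets

With that assumed RTT, half to a full BDP works out to roughly 37-74 packets, so the default of 64 is already near the top of that range; a longer measured RTT would push the target higher.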
09-03-2009 12:47 PM
Thanks Joseph,
How do I change the threshold?
"hold-queue" at the interface level changed the max and not the threshold.
09-04-2009 06:38 AM
See if your platform/IOS supports "queue-limit" under class-default, and if so, whether the FQ is modified by it.
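If it is supported, the change would be along these lines (the policy-map name is a placeholder and 128 packets is only an example value; verify on your platform whether queue-limit and fair-queue can coexist in class-default):

policy-map WAN-EDGE-OUT
 class class-default
  fair-queue
  ! raise the per-class queue depth from the default of 64 packets
  queue-limit 128

You can confirm the applied values afterwards with "show policy-map interface Serial1/0".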
09-04-2009 11:13 AM
You are like an encyclopedia, Joseph.
09-02-2010 10:16 AM
Hey there, ran across this in a Google search. I was seeing incrementing output drops on a FastE 0/0 interface on a 2811 router with only 137k/sec being output - obviously not a usage problem. I tracked it down to a policy map that limited a certain server to 128k/sec to prevent update traffic from clogging our WAN links - the exceeded traffic that was dropped correlated directly with the number of output drops we were seeing. So if you see output drops during low usage, check whether you have a policy-map applied.
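A quick way to confirm that case (the interface name here is a placeholder): compare the policer's drop counters against the interface's total output drops.

show policy-map interface FastEthernet0/0 output
show interfaces FastEthernet0/0 | include output drops

If a policed class shows its exceeded/violated drop count climbing in step with the interface's output drop counter, the policer is the source of the drops.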