Troubleshooting Output Queue drops

wkw · ‎02-22-2005

Greetings,

I'm troubleshooting a 2Mb WAN link connecting our main campus to our branch campus. The main campus router is a Cisco 3640 with a WIC-1T, and the branch campus router is a Cisco 3660 with another WIC-1T. We are running EIGRP over the WAN link, and I observed that the peer repeatedly goes down and comes back up, as in the log below:

Feb 23 12:31:37 GMT: %DUAL-5-NBRCHANGE: IP-EIGRP 1: Neighbor 172.19.x.3 (Serial0/0) is down: peer restarted

Feb 23 12:31:41 GMT: %DUAL-5-NBRCHANGE: IP-EIGRP 1: Neighbor 172.19.x.3 (Serial0/0) is up: new adjacency

Feb 23 12:31:51 GMT: %DUAL-5-NBRCHANGE: IP-EIGRP 1: Neighbor 172.19.x.3 (Serial0/0) is down: peer restarted

Feb 23 12:31:55 GMT: %DUAL-5-NBRCHANGE: IP-EIGRP 1: Neighbor 172.19.x.3 (Serial0/0) is up: new adjacency

Looking at the interface, I noted that the output queue drops are abnormally high and still increasing. show interface output:

Serial0/0 is up, line protocol is up

Hardware is QUICC Serial

Description: 2Mb Trunk Main-Branch

Interface is unnumbered. Using address of Ethernet0/0 (172.16.x.x)

MTU 1500 bytes, BW 2000 Kbit, DLY 20000 usec,

reliability 255/255, txload 214/255, rxload 91/255

Encapsulation HDLC, loopback not set

Keepalive set (10 sec)

Last input 00:00:00, output 00:00:00, output hang never

Last clearing of "show interface" counters 17:22:40

Input queue: 1/75/0/0 (size/max/drops/flushes); Total output drops: 190825

Queueing strategy: random early detection(RED)

5 minute input rate 717000 bits/sec, 255 packets/sec

5 minute output rate 1682000 bits/sec, 318 packets/sec

4881158 packets input, 1809150635 bytes, 0 no buffer

Received 7323 broadcasts, 0 runts, 0 giants, 0 throttles

4 input errors, 0 CRC, 1 frame, 0 overrun, 0 ignored, 3 abort

6128571 packets output, 4054447075 bytes, 0 underruns

0 output errors, 0 collisions, 2 interface resets

0 output buffer failures, 0 output buffers swapped out

0 carrier transitions

DCD=up DSR=up DTR=up RTS=up CTS=up

Traffic in both directions is mostly legitimate HTTP traffic. I tried the tips at http://www.cisco.com/warp/public/63/queue_drops.html#topic4 but to no avail. The link is not congested, yet it's dropping packets. Any ideas on how to troubleshoot further, or what else to look at?
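For a sense of scale, the counters in the show interface output can be turned into a drop ratio and an average drop rate. The following is a rough sketch in Python, not router output. It assumes that "packets output" counts only packets actually transmitted, so the offered load is transmitted plus dropped packets; the variable names are illustrative.

```python
# Counters copied from the show interface output in this post.
total_output_drops = 190825
packets_output = 6128571

# "Last clearing of show interface counters 17:22:40" converted to seconds.
seconds_since_clear = 17 * 3600 + 22 * 60 + 40

# Assumption: "packets output" excludes dropped packets, so the offered
# load is the sum of transmitted and dropped packets.
offered_packets = packets_output + total_output_drops

drop_ratio = total_output_drops / offered_packets
drops_per_second = total_output_drops / seconds_since_clear

print(f"output drop ratio: {drop_ratio:.2%}")
print(f"average output drops per second: {drops_per_second:.2f}")
```

Under that assumption, roughly 3% of the packets offered to the interface were dropped since the counters were last cleared, which averages out to about 3 drops per second.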

thanks

mhussein · ‎02-22-2005

There are frame and abort errors in the show interface output. Frame errors indicate CRC errors on the near side (this differs from the CRC counter in show interface, where CRC errors can indicate problems on the far end as well). Abort errors indicate clocking problems between the interface and the service provider's equipment.

Given all of the above, the problem could be a clocking or cabling issue at the main campus.

Check your cabling and, if possible, run a loopback test to the CSU/DSU or the service provider's SmartJack/NIU/NHRU.

If that fails, you may need to contact the service provider.

References

1. Troubleshooting Serial Lines

http://www.cisco.com/univercd/cc/td/doc/cisintwk/itg_v1/tr1915.htm

HTH

Mustafa

allan.thomas · ‎02-23-2005

From the show interface output above, it is apparent that you are running the RED queueing strategy. Random Early Detection (RED) is a congestion avoidance mechanism that takes advantage of TCP's congestion control mechanism.

By randomly dropping packets before periods of high congestion, RED signals the packet source to decrease its transmission rate.

Assuming the packet source is using TCP, which HTTP does, it will decrease its transmission rate until all the packets reach their destination, indicating that the congestion has cleared.
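To make the mechanism concrete, here is a minimal Python sketch of the textbook RED drop decision. It is an illustration only, not Cisco's implementation: the thresholds, the maximum drop probability and the averaging weight are hypothetical values, and the sketch leaves out the refinement in the original algorithm that spaces drops out evenly over time.

```python
def update_average(avg_queue, current_queue, weight):
    """Exponentially weighted moving average of the queue depth."""
    return (1 - weight) * avg_queue + weight * current_queue


def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Textbook RED: no drops below min_th, a linear ramp up to max_p
    between the thresholds, and every packet dropped at or above max_th."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


# Hypothetical parameters, chosen only for illustration.
MIN_TH, MAX_TH, MAX_P = 20, 40, 0.1

for depth in (10, 20, 30, 39, 40):
    p = red_drop_probability(depth, MIN_TH, MAX_TH, MAX_P)
    print(f"average queue depth {depth:2d} -> drop probability {p:.3f}")
```

The point of the ramp is that drops begin while the queue still has room, so some TCP senders back off before the queue overflows and forces every arriving packet to be dropped.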

Suffice to say, although the interface txload indicates it was approximately 84% utilised at the time you ran show interface, it is most probable that the load peaks above this, and hence RED comes into effect.
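The 84% figure comes from the txload value, which is expressed as a fraction of 255. A short Python sketch of the arithmetic, using the values from the show interface output; it assumes the configured BW of 2000 Kbit matches the real line rate, and the variable names are illustrative.

```python
# Values copied from the show interface output earlier in the thread.
txload, rxload = 214, 91          # reported as x/255
bandwidth_bps = 2000 * 1000       # "BW 2000 Kbit" (assumed to match the line rate)
output_rate_bps = 1_682_000       # 5 minute output rate
input_rate_bps = 717_000          # 5 minute input rate

tx_util = txload / 255
rx_util = rxload / 255

# Cross-check against the 5 minute average rates.
tx_util_from_rate = output_rate_bps / bandwidth_bps
rx_util_from_rate = input_rate_bps / bandwidth_bps

print(f"txload: {tx_util:.1%}  (from output rate: {tx_util_from_rate:.1%})")
print(f"rxload: {rx_util:.1%}  (from input rate: {rx_util_from_rate:.1%})")
```

Both methods give about 84% outbound and about 36% inbound. These are averages over the load interval, so short bursts can push the instantaneous load well above the reported figure.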

jroyster · ‎02-23-2005

My guess is you are filling the output queue by overutilizing the circuit, as indicated by your high txload.

If the circuit is clean (no errors), that is really the only explanation. Try to find out what is causing the high utilization.

wkw · ‎02-23-2005

Thanks for the answers. Previously I was using fair-queue on the interface, and it was dropping a lot of packets even when the load was not that heavy (around 50% according to the txload). I enabled ip route-cache flow and observed that the traffic was going to our proxy server (users going to the web); the branch campus accesses the Internet via our WAN link. Since this is TCP traffic, I changed to RED, hoping its congestion avoidance would work, and it seems to be better now.

One question: since the problem seems to cause our EIGRP peering to drop and re-form, would you recommend changing the EIGRP hold time and hello interval to something longer? Would this reduce how often the peer is lost?

thanks

woon