
Periodic bad response time on Cisco 2811 router

Ulrich Hansen
Level 1

Hello,

I'm currently involved in what has evolved into a regular task-force situation. One of my customers is experiencing bad response times on an almost unprecedented scale (at least according to them). My task is to verify whether the network can be dismissed as the root cause, though the evidence so far does not point toward the network.

The customer has an HQ and several remote branches. With a few exceptions, each branch has an almost identical infrastructure: a 2811 router for the WAN and a number of 29xx/35xx switches for the LAN. One switch is connected to the router by FastEthernet and any subsequent switches are interconnected in a daisy-chain fashion, though most sites have only one switch.

During my investigation, I've noticed periodic slow responses and lost packets on the router's LAN interface. If I do an extended ping, e.g. 500 packets at standard size, both the WAN and Loopback interfaces show no packet loss or slow responses, but the LAN interface is a completely different story. I'm sitting at the central datacenter and perform these tests across an MPLS provider network, where I usually don't see RTTs above 6-7 ms, with an average of around 5 ms. But the LAN interface produces RTTs above 200 ms. This is not consistent; it appears every now and then.
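To make intermittent spikes like these easier to compare between the WAN, Loopback and LAN interfaces, the per-packet results of each extended ping run could be reduced to a few numbers. The following is only a minimal sketch: the function name `summarize_rtts`, the 50 ms spike threshold and the sample values are assumptions for illustration, not measurements from this case.

```python
from statistics import mean


def summarize_rtts(rtts_ms, spike_threshold_ms=50.0):
    """Summarize one ping run. Each entry is an RTT in ms, or None for a lost packet."""
    answered = [r for r in rtts_ms if r is not None]
    spikes = [r for r in answered if r > spike_threshold_ms]
    return {
        "sent": len(rtts_ms),
        "lost": len(rtts_ms) - len(answered),
        "min_ms": min(answered) if answered else None,
        "avg_ms": mean(answered) if answered else None,
        "max_ms": max(answered) if answered else None,
        "spikes": len(spikes),  # replies slower than the (assumed) threshold
    }


# Hypothetical samples: mostly ~5 ms, one slow reply, one lost packet
samples = [5.0, 6.0, 5.0, 210.0, None, 4.0]
print(summarize_rtts(samples))
```

Running the same summary per target interface would show whether the spikes and losses really are confined to the LAN interface, since a single slow reply hardly moves the average but is obvious in the maximum and the spike count.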

I know that ICMP is not the most effective way to do real response-time measurements, and I'm planning on installing a router on the customer LAN to do a more comprehensive end-to-end IP SLA measurement, both to the WAN router and to the "IP SLA" router on the LAN, just to see if there's a significant difference.

It's a relatively small LAN installation, with only one 2900XL switch (I know, it should have been replaced ages ago) and 10-15 HP thin clients running embedded Win7. No servers or anything fancy, since all applications are executed in a XenApp environment. Ad-hoc ICMP RTT tests from a thin client also show bad response times.

I've checked the router for interface drops, CRC errors, you name it, on both the LAN and WAN interfaces, but not a single packet has been dropped, nor do I see anything that suggests a speed/duplex mismatch, a faulty cable, or other interface-related errors. The LAN shows no sign of being saturated with broadcast or multicast traffic, though we've seen an increase in multicast since the introduction of the Win7 clients. The router CPU log shows no sign of either short or longer-lasting peaks, running consistently at around 5-6%. The environment shows no sign of fan failure or high temperature. All in all, neither the router nor the switch appears to be in bad shape.

So my question is: can anyone think of any reason why such bad response times would be localised to the LAN interface only and never appear on the WAN or Loopback interface? I've tried removing any unnecessary configuration from the LAN interface, but with no improvement. Has anyone experienced problems with similar symptoms?

Thanks

/Ulrich

3 Replies

vmiller
Level 7

Fascinating. One would suspect the WAN first.

I would take a good hard look at the LAN config, and how routes are shipped.

In other words, are the edge routers getting big routing updates?

Stuff like that.

Is there a relationship between the increase in response time and the number of switches at a location? Do the sites with only one switch experience the same issue?


Hi,

Thanks for your response.

Well, there's actually no heavy routing in the works here. The provider PE has a static route towards the customer router (2811) and vice versa on the 2811 router. The LAN behind the router is strictly Layer 2; there is no internal route propagation going on at these sites.

As for any correlation between the number of switches at a given location and the severity of the problem, there's nothing to suggest that. Also, this is not a consistent scenario. Just this morning I did a series of RTT tests against the LAN interface and did not see any response time above 6 ms. So it's really difficult to see what would trigger this.

Given the simplicity of the design and the complete absence of anything that could cause a loop, it's really beyond me. Spanning tree is virtually never in use at these sites, as the customer installs the switches themselves and chooses to daisy-chain them, creating only a single L2 path across the LAN. I've tried to reproduce the same scenario at other customer sites, which all use similar setups, but no "luck" so far. This leads me to believe that this is a local problem. Still, there's nothing to suggest that the LAN is being saturated with broadcast or unknown-unicast flooding. The actual packet load on the router's LAN interface (collected at 30-second intervals) shows very low utilization. This is mostly due to the fact that 90% of all the traffic going in and out of the router is VoIP and ICA, with the occasional print job.
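For reference, a utilization figure collected at a fixed interval is normally an average derived from two readings of the interface byte counter. The sketch below shows that arithmetic; the function name `utilization_percent`, the 32-bit counter width and the sample readings are assumptions for illustration, not values taken from this router.

```python
def utilization_percent(bytes_start, bytes_end, interval_s, link_bps, counter_bits=32):
    """Average link utilization (%) between two byte-counter readings."""
    # The modulo tolerates a single counter wrap between the two readings.
    delta_bytes = (bytes_end - bytes_start) % (2 ** counter_bits)
    return delta_bytes * 8 / (interval_s * link_bps) * 100


# Hypothetical readings taken 30 s apart on a 100 Mbit/s FastEthernet port
print(utilization_percent(1_000_000, 4_750_000, 30, 100_000_000))
```

With these made-up readings, 3.75 MB in 30 seconds works out to roughly 1% of a FastEthernet link. The same 3.75 MB would produce the same figure if it arrived as a burst of about 0.3 seconds at line rate, so a 30-second average cannot by itself rule out short bursts.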

/Ulrich

Joseph W. Doherty
Hall of Fame

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

Ping tests, without supported IP SLA responders, can be misleading.

Since you mention MPLS, lots can go wrong there. I've had, although uncommonly, vendor provisioning errors of one kind or another. (I also had one case where the WAN vendor (a Tier 1), after my constant complaining over 3 months that it was not working right, found an internal MetroE line card with buggy firmware.) Sometimes running various stress tests, where results are deterministic, can identify such issues.
