Solved: Weird Bandwidth Issues

Dominique Demore · ‎06-27-2005

Greetings All,

This morning I received a call from a user complaining about slow performance. While investigating the problem, I noticed that the download speed varies drastically depending on where the user is connected (i.e. 150k/s ~ 5.0M/s).

If the user connects through a downstream switch they get ~4.5M/s; however, if connected to the cluster at the core, they get ~150k/s.

They are accessing a server farm on a separate VLAN (100). Downstream VLAN (101), port on cluster (103).

I don't see any errors in any logs or in sh int.

Any thoughts?

vincent.ritzer · ‎06-27-2005

Hi,

Is there some static information entered in the access-switches for this (these?) specific user? This might cause strange behaviour...

You might also want to check out what kind of route the packets take when going from the user to the cluster (or server farm? or is server farm=cluster?).

You can use tools like traceroute (tracert on windows ;-)) or even ping on the client to check that out.

Sometimes it might even help to use a packet sniffing tool like Ethereal to find out what's happening on the network layer, although based on your story I don't think you need it in this particular situation.

What kind of tools are you using to test the speeds? It might be that these counters are influenced by ACLs or something.

Hope it helps a bit (it's always hard to troubleshoot performance issues...)

Vincent

Dominique Demore · ‎06-28-2005

Hi Vincent.

Thanks for your reply. As for the static information, there isn't any, and ACLs aren't in place at the moment.

I guess I should clarify how the current setup is laid out.

Equipment:

3x3750-48TS-EMI (in a cluster)

6x2950G-48-E (Access switches)

Server VLAN (100)

Access ports on 3750 (VLAN 101,102,103,108)

Access switches (1 VLAN per switch - 104,109,110,111,112,113)

When a user on an access port on Access SwitchA (104) requests a file from a server on VLAN 100, which is attached to a gig port on the 3750, I get about 5Meg/s download, which is what I would expect from a gig backbone. I can repeat this from any port on any access layer switch.

However, when a user on the 3750 (fa1/0/10, VLAN 103) requests a file from a server on VLAN 100 connected to g3/0/4 on the 3750, I get about 150k/s - 200k/s.

The 3750's are performing L3 routing for all the vlans.

To verify the downloads, we have a 600Meg test file which resides on the server. A user can request that file through a browser, FTP, etc.

I have attached a copy of our configs for review.

Thanks.

-- Dominique

PS. All access layer switches are configured identically except for vlan information.

Dominique Demore · ‎06-28-2005

I guess I should attach the configs also :).... still too early.

-- Dominique

Dominique Demore · ‎07-05-2005

Well, I have managed to narrow the problem down even further. It seems that a user connected to a FastEthernet port on the 3750 who tries to access a server on a GBIC port experiences the performance hit.

Fa<->Fa 4-5Mb/s

Gb<->Gb 20-25Mb/s

Fa<->Gb 150-200Kb/s

Now that is something that I can't wrap my head around.

It also doesn't matter if they are in the same Vlan or not.

Thanks.

-- Dominique

davidamesz · ‎07-06-2005

I can confirm this problem, as I am investigating a similar problem at my site.

Equipment (all stacked with stackwise cables):

2 x 3750G-24TS-EMI

3 x 3750G-48TS-EMI

The 3750G-24-TS switches are used for the server farm with several channeled interfaces (VLAN50) and the 3 x 3750G-48TS are used as access switches (VLAN60). On the 3750G-48TS SFP ports there are two access stacks built with 3524-XL switches (VLAN61, VLAN62).

A client communicating with servers in VLAN50 from VLAN61 or VLAN62 doesn't seem to have any performance problem.

Any client with a physical connection on the 3750G-48TS suffers the performance problem. This is also true when a client is connected to a port on the 3750G-24TS (port in VLAN60).

If the client port on any of the 3750G switches is configured in VLAN50 then there are no performance issues (layer2 traffic).

davidamesz · ‎07-12-2005

We have probably identified the problem.

With a lot of sniffing and trial and error we found that only CIFS/SMB traffic is affected (NT4/XP), not normal TCP traffic (e.g. FTP/HTTP).

When mls qos is enabled the throughput is about 150 to 900 kbps.

Without mls qos the speed is more than 10 Mbps.

We tried various IOS versions (122-20, 122-25.sea and 122-20.seb2).

We noticed that with the older IOS (122-20) the throughput is double that of the newer IOS versions (122-25).

Disabling mls qos is not an option (VoIP).

We also tested this with a stand-alone switch with the same results.

We also did the same tests with a 3550 with no problems at all.
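
The mls qos on/off comparison described above can be sketched roughly as follows. This is only an illustration of the global commands involved, assuming a standard Catalyst 3750 CLI session; the "Switch" prompt is a placeholder, and the file-copy test itself is run from a client between the two states.

```
! Check whether QoS is currently enabled globally
Switch# show mls qos

! Disable QoS globally, then repeat the CIFS/SMB copy test
Switch# configure terminal
Switch(config)# no mls qos
Switch(config)# end

! Re-enable QoS afterwards (required once VoIP is in use)
Switch# configure terminal
Switch(config)# mls qos
Switch(config)# end
```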

Dominique Demore · ‎07-15-2005

Hi David,

Thanks. "mls qos" was the issue for us also. Once it was removed, the speeds returned to normal.

This will become an issue for us shortly as we start deploying VoIP. I'll follow up by opening a TAC case with Cisco.

Once again, thanks to you and everyone who worked on it.

-- Dominique

lnguye · ‎07-18-2005

See another post

http://forums.cisco.com/eforum/servlet/NetProf?page=netprof&forum=Network%20Infrastructure&topic=LAN%2C%20Switching%20and%20Routing&CommCmd=MB%3Fcmd%3Ddisplay_location%26location%3D.1dd8cc42

Have you tried the global command 'sdm prefer routing' and reloading the Cat3750?
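
A minimal sketch of that suggestion, assuming a standard Catalyst 3750 CLI session ("Switch" is a placeholder prompt). The new SDM template only takes effect after the reload, so the active template is checked before and after.

```
! Show the SDM template currently in use
Switch# show sdm prefer

! Select the routing template
Switch(config)# sdm prefer routing
Switch(config)# end

! The template change is applied at the next reload
Switch# reload

! After the switch comes back up, confirm the active template
Switch# show sdm prefer
```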

davidamesz · ‎07-20-2005

Hi,

Thanks for the response. It wasn't the answer, but it came close.

What we did was enter the following command on each interface which is connected to a phone or to another switch with phones connected to it:

auto qos voip trust

This results in some extra QoS commands being generated automatically by the switch.
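
For anyone wanting to try the same thing, a rough sketch of applying the command and reviewing what the switch generated. The interface name GigabitEthernet1/0/1 is a placeholder, not taken from our configuration, and the generated commands may differ per IOS version.

```
Switch# configure terminal
! Placeholder interface: use the port facing the phone or downstream switch
Switch(config)# interface GigabitEthernet1/0/1
Switch(config-if)# auto qos voip trust
Switch(config-if)# end

! Review the auto-QoS settings and the resulting interface configuration
Switch# show auto qos interface GigabitEthernet1/0/1
Switch# show running-config interface GigabitEthernet1/0/1
```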

The link we used:

http://www.cisco.com/univercd/cc/td/doc/product/lan/cat3750/12225sea/3750scg/swqos.htm

It seems this solved the problem.

This is the answer we got from Cisco:

I have looked just now into your configuration.

The reason why you have performance degradation once you enable mls qos is because you did not configure egress queues at all.

3550 and 3750 switches have different egress queues and different default egress QoS configuration. In 3750 default egress queue configuration is that default queue 1 receives only 25% bandwidth allocation and does not share it. In 3550 switch egress queues by default share unoccupied bandwidth by default.
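
As a rough illustration of what "configuring egress queues" could look like on a 3750 port, the sketch below first displays the current egress queueing and then sets the SRR weights explicitly. The interface name and all weight values are placeholders chosen for illustration; they are not taken from this thread and are not a recommendation.

```
! Inspect the current egress queue settings on a port (placeholder interface)
Switch# show mls qos interface FastEthernet1/0/1 queueing

Switch# configure terminal
Switch(config)# interface FastEthernet1/0/1
! Illustrative shared-mode weights for egress queues 1-4
Switch(config-if)# srr-queue bandwidth share 10 10 60 20
! Illustrative shaped-mode weights; 0 leaves a queue in shared mode
Switch(config-if)# srr-queue bandwidth shape 10 0 0 0
Switch(config-if)# end
```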

David