01-06-2013 07:07 PM - edited 03-07-2019 10:56 AM
Hi Cisco Magi,
I have a serious issue here which goes on for a long time and I couldn`t find a solution for it so far.
The problem affects at least the 2960, 3560 and 3750 series switches. There are no issues if all the interfaces on the switch run at 1G. However, if I take two laptops, one connected at 100 Mbit and the other at gigabit, I cannot get 100 Mbit throughput in both directions at the same time. Traffic in one direction is fine (about 93 Mbit/s), but the other direction only gets about 7-10 Mbit/s.
As a lab test I tried to isolate the problem with two 2960 switches (IOS - c2960-lanbasek9-mz.122-55.SE6, models: WS-C2960-48TT-L and WS-C2960-24TT-L). Test topology: Laptop 1 --> 100 Mbit switchport on switch 1 --> gigabit interconnect --> switch 2 --> gigabit switchport on switch 2 --> Laptop 2
The test shows the same results if I use only one switch with a 100 Mbit and a gigabit port. Please note that it affects only TCP traffic (tested with iperf and FTP). Nevertheless, if I manually configure the gigabit port to speed 100, the problem disappears and I get perfect results in both directions.
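For clarity, the throughput test is a simultaneous bidirectional (dual) iperf test, roughly along these lines (the address is just a placeholder for laptop 2 and the duration is arbitrary):
on laptop 2:  iperf -s
on laptop 1:  iperf -c <laptop2-ip> -d -t 30
The -d option runs the transfer in both directions at the same time, which is where the asymmetry shows up; a single-direction run looks fine.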
I tried with and without mls qos, with the same results. I also tried adjusting interface buffers and hold queues, as well as the queue-set buffers and thresholds, with no success (the sort of commands I mean is sketched below).
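For reference, the tuning I experimented with was along these lines (the values here are only placeholders, not a recommendation; the four numbers after the queue number are drop-threshold 1, drop-threshold 2, reserved, and maximum):
mls qos queue-set output 1 buffers 15 25 40 20
mls qos queue-set output 1 threshold 2 3100 3100 100 3200
interface FastEthernet0/24
 hold-queue 100 out
None of the combinations I tried made a difference.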
Maybe it is just a trivial config change that I'm not aware of, but it is really annoying. Any help or suggestion is appreciated.
Thanks,
Balazs
01-07-2013 02:36 AM
What OS are your hosts running?
The reason I ask is that Windows XP, by default, sets the TCP RWIN to about 64 KB when connected to a gig port, but to only about a quarter of that when connected at 100 Mbps.
01-07-2013 03:05 AM
Balazs and Joseph,
Is there perhaps a possibility of the NICs using an MTU of 9K when running at 1 Gbps, and 1.5K when running at lower speeds? In that case, it would result in MTU mismatches and possibly packet drops.
Best regards,
Peter
01-07-2013 03:15 AM
Peter, that's an interesting question!
01-07-2013 02:09 PM
Hi Peter,
I double-checked the MTUs on both interfaces and they show 1500 bytes. No, DF bits are not set in the packets; unfortunately it is not simply an MTU problem, nor a speed or duplex issue.
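For what it is worth, the interface MTU and the path can be checked from the Debian side with something like this (interface name and address are placeholders; 1472 bytes of payload plus 28 bytes of headers equals 1500, and -M do sets the DF bit):
ip link show eth0
ping -M do -s 1472 <other-laptop>
If the path MTU were smaller than 1500, that ping would report that fragmentation is needed.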
I ran several tests on different switches (2960 series) using different laptops, with no success...
02-18-2015 11:03 AM
Hi,
Did you ever find a solution to this?
Thanks
01-07-2013 01:59 PM
Hi Joseph,
Both computers run Debian 6.0 as the OS. I forgot to mention in my original post that years ago I ran tests with laptops running Win XP and Ubuntu 9.10 on 3750 switches, unfortunately with the same results.
Moreover, both laptops have gig ports, and if I run a dual iperf test between them over a single Cat5 cable I get 900/900 Mbit/s.
I would say it is not directly related to the end hosts' TCP windowing.
01-07-2013 02:26 PM
Hi Joseph,
The problem occurs in the direction from the gig interface towards the 100M port; basically, the "download" of the laptop on the 100 Mbit port is the faulty direction.
No drops in the interface stats on the FA port:
FastEthernet0/24 is up, line protocol is up (connected)
Hardware is Fast Ethernet, address is 001b.537f.3518 (bia 001b.537f.3518)
MTU 1600 bytes, BW 100000 Kbit, DLY 100 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 100Mb/s, media type is 10/100BaseTX
input flow-control is off, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input never, output 00:00:01, output hang never
Last clearing of "show interface" counters never
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 1816
Queueing strategy: fifo
Output queue: 0/40 (size/max)
30 second input rate 0 bits/sec, 0 packets/sec
30 second output rate 0 bits/sec, 0 packets/sec
3639376 packets input, 4425186564 bytes, 0 no buffer
Received 40 broadcasts (28 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 28 multicast, 0 pause input
0 input packets with dribble condition detected
2989503 packets output, 2594906882 bytes, 0 underruns
0 output errors, 0 collisions, 1 interface resets
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 PAUSE output
0 output buffer failures, 0 output buffers swapped out
The only drops I see are in the output queues, but I believe that's normal due to the initial burst of the gig traffic:
sh mls qos interface fa0/24 statistics
FastEthernet0/24 (All statistics are in packets)
dscp: incoming
-------------------------------
0 - 4 : 3639352 0 0 0 0
5 - 9 : 0 0 0 0 0
10 - 14 : 0 0 0 0 0
15 - 19 : 0 0 0 0 0
20 - 24 : 0 0 0 0 0
25 - 29 : 0 0 0 0 0
30 - 34 : 0 0 0 0 0
35 - 39 : 0 0 0 0 0
40 - 44 : 0 0 0 0 0
45 - 49 : 0 0 0 0 0
50 - 54 : 0 0 0 0 0
55 - 59 : 0 0 0 0 0
60 - 64 : 0 0 0 0
dscp: outgoing
-------------------------------
0 - 4 : 2940604 0 0 0 0
5 - 9 : 0 0 0 0 0
10 - 14 : 0 0 0 0 0
15 - 19 : 0 0 0 0 0
20 - 24 : 0 0 0 0 0
25 - 29 : 0 0 0 0 0
30 - 34 : 0 0 0 0 0
35 - 39 : 0 0 0 0 0
40 - 44 : 0 0 0 0 0
45 - 49 : 0 0 0 0 0
50 - 54 : 0 0 0 0 0
55 - 59 : 0 0 0 0 0
60 - 64 : 0 0 0 0
cos: incoming
-------------------------------
0 - 4 : 3639399 0 0 0 0
5 - 7 : 0 0 0
cos: outgoing
-------------------------------
0 - 4 : 2941529 0 0 0 0
5 - 7 : 0 0 0
output queues enqueued:
queue: threshold1 threshold2 threshold3
-----------------------------------------------
queue 0: 28 0 0
queue 1: 522537 3050 1707057
queue 2: 0 0 0
queue 3: 0 0 760998
output queues dropped:
queue: threshold1 threshold2 threshold3
-----------------------------------------------
queue 0: 0 0 0
queue 1: 367 0 1249
queue 2: 0 0 0
queue 3: 0 0 200
Policer: Inprofile: 0 OutofProfile: 0
I would say it is a queuing/memory problem or some configuration fine-tuning that I'm just not aware of. Maybe it is a well-known issue, but I could not find any reference to it. It is hard to believe that I am the only one who has experienced this so far.
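In case it helps to interpret the per-queue counters above, the DSCP/CoS-to-output-queue mappings and the buffer allocation can be displayed with commands like these (exact output varies by IOS version; note that the statistics label the queues 0-3 while the maps and queue-set configuration refer to them as 1-4):
show mls qos maps dscp-output-q
show mls qos maps cos-output-q
show mls qos interface fa0/24 buffers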
I'll provide any information or output that can lead toward a solution.
Thanks for the help on this so far, guys.
01-07-2013 06:24 PM
I've been thinking about this today, and the most likely explanation seems to be that, going from gig to 100, the former is overrunning the latter and encountering drops. I would expect performance to be somewhat worse than a 100-to-100 Mbps transfer, but what's surprising is how large a reduction you're seeing. I would expect up to around a 25% reduction, not a 90% reduction.
Off the top of my head, I could see the sender waiting on timeouts greatly slowing the transfer rate, but I can't see a reason why you would be hitting many timeouts.
Regarding TCP RWIN - that might still be a factor, although on the receiving host of the gig-to-100 Mbps transfer it might be too large; i.e. making it smaller might improve the gig-to-100 Mbps direction.
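One quick way to experiment with that (just a suggestion; the window value and address are placeholders to vary) would be to cap the socket buffer in the iperf test itself and compare:
on the 100 Mbps laptop:  iperf -s -w 64k
on the gig laptop:       iperf -c <100M-laptop> -w 64k -d -t 30
The -w option sets the TCP window/socket buffer size used for the test connections.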
Ever get a packet capture of one of these transfer tests?
01-08-2013 09:55 PM
Hi Joseph,
I uploaded a pcap capture file if you want to have a look at it; it is 340 MB though...
The link is:
http://www.spidernet.net.au/iperfcapture.pcap
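In case it saves you some time, counting the retransmissions and duplicate ACKs in it can be done with something like this (depending on the tshark version the display-filter option is -R or -Y):
tshark -r iperfcapture.pcap -Y "tcp.analysis.retransmission" | wc -l
tshark -r iperfcapture.pcap -Y "tcp.analysis.duplicate_ack" | wc -l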
Thanks,
Balazs
01-07-2013 05:29 AM
In your test topology, which direction is showing the low transfer rate? I.e. toward the host at 100 Mbps or toward the host at gig?
Do any of the switch ports record drops, for example on the 100 Mbps egress?
01-14-2013 04:56 PM
Hi Guys,
Any other ideas? I saw a couple of similar posts, but none of them had a correct answer. I would be keen to know whether anyone has at least been able to reproduce this fault.
01-14-2013 05:53 PM
One more thing. Maybe I haven't emphasised it enough yet: the problem occurs only if traffic flows in both directions at the same time. If I run an iperf test in a single direction, the results are absolutely perfect. The problem only appears when a dual test is run.
01-14-2013 06:17 PM
I haven't had a chance to look at your pcap file. I didn't have any ideas until seeing your last post. If you're transmitting at full capacity in both directions, gig to 100 Mbps should congest on egress, but from what you're now describing, you'll also have contention between the main test flow and the other direction's TCP ACK flow. If you QoS-protect the ACKs, I wonder whether you would see a difference.
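Just to illustrate the idea, something along these lines might be worth a try on the 2960. This is only a rough sketch: the addresses are placeholders, the ACL assumes the pure-ACK return stream can be identified by address and port (e.g. source port 5001 from the gig-attached laptop in a default iperf dual test), and queue 1 only acts as the expedite queue once priority-queue out is enabled.
mls qos
!
! mark the return ACK stream where it enters the switch
ip access-list extended RETURN-ACKS
 permit tcp host <gig-laptop> eq 5001 host <100M-laptop>
class-map match-all ACKS
 match access-group name RETURN-ACKS
policy-map MARK-ACKS
 class ACKS
  set dscp cs6
interface GigabitEthernet0/1
 service-policy input MARK-ACKS
!
! map CS6 (DSCP 48) into egress queue 1 and expedite that queue on the congested 100M port
mls qos srr-queue output dscp-map queue 1 threshold 3 48
interface FastEthernet0/24
 priority-queue out
If the ACK flow is what is being starved, protecting it this way should show up as a clear change in the dual-test numbers.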
01-14-2013 06:41 PM
Hi Joseph,
I was thinking the same regarding the TCP ACKs; maybe the ACKs don't reach the other end, causing retransmissions. However, as I mentioned in the original post, if I issue a speed 100 command on the gig interface, forcing it to negotiate Fast Ethernet with that laptop, the problem disappears. If both ports are Fa ports, I get approximately 100 Mbit/100 Mbit during a dual test, as expected from a Cisco switch. The results are good between two gig ports as well. The only scenario where the traffic is asymmetric is when one of the ports runs at 1 Gig and the other one at 100 Mbit.
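(For reference, the workaround I mean is literally just this on the gig-attached port - the interface name is only an example:
interface GigabitEthernet0/1
 speed 100
With that in place the dual test becomes symmetric again.)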
I hope this clarifies the problem a little bit more.
I found a similar topic here:
http://www.sadikhov.com/forum/index.php?/topic/165691-me3750s-wred-threshold-es-port/
Unfortunately there was no solution there either. I tried dozens of threshold and buffer allocation combinations, no success...