05-01-2018 07:09 AM - edited 03-08-2019 02:51 PM
Hi,
We have an L2 EPL ("Ethernet Private Line") between two sites. The headquarters site has a 5-member stack of various 3850 switches (2x 3850-12X48U-S and 3x 3850-48T-S, StackWise, IOS 3.7.5E). The remote site has a single 3560-48TS-S with IOS 12.2(55)SE11.
Each site has its own internet access; the private line only routes internal traffic between subnets. The remote site has only one subnet, while headquarters has many. There are static routes for each subnet for now.
I am using iperf3 to test bandwidth. The EPL bandwidth should be 50 Mbps up/down. Both ends of the EPL are plugged into ports set to 100 Mbps full duplex.
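For anyone who wants the exact test, a basic iperf3 run looks like this (the address is just a placeholder, not our real one):

    iperf3 -s                           (on the remote-site machine)
    iperf3 -c 192.168.100.10 -t 30      (on the headquarters machine, HQ -> remote)
    iperf3 -c 192.168.100.10 -t 30 -R   (same client, reversed direction: remote -> HQ)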
The issue is that the network behaves differently with different devices, and also with different VLANs it seems. There is one particular VLAN where traffic flows fine back and forth as long as the endpoint devices are virtual machines (ESXi with vmxnet virtual NICs). If I connect a physical device, like a laptop or a bare-metal server, traffic going to the remote site slows down to 1/4-1/5 of 50 Mbps (traffic going back to headquarters is always good). On a different VLAN, traffic between virtual machines is just a little slower going to the remote site, with occasional dips to 30 Mbps here and there. Physical devices always get bad speed going to the remote site.
Our VMware ESXi hosts are connected to the 10Gbps ports in trunk mode, while our physical devices are connected to access ports.
I checked the TCP offload and flow control settings; they are all at their defaults (on).
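For what it's worth, on a Linux endpoint those settings can be checked with ethtool (interface name is just an example; Windows and ESXi endpoints have their own equivalents):

    ethtool -k eth0    (offload features: TSO, GSO, checksum offload, etc.)
    ethtool -a eth0    (pause / flow-control parameters)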
When I did a Wireshark capture for each scenario, there were always some "TCP Dup ACKs" and retransmissions, but when traffic is bad there are just more of those captured. I'm not sure what to make of that yet.
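For reference, those packets can be isolated with the standard Wireshark display filters (nothing specific to our capture):

    tcp.analysis.duplicate_ack
    tcp.analysis.retransmission
    tcp.analysis.flags           (all TCP analysis flags at once)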
When I complained to the provider, they held on to the fact that traffic is "fine" in that one scenario (that particular VLAN, between virtual machines).
Traffic within each site's network is fine. iperf shows max bandwidth (1 Gbps) between any VLANs with any devices. Our VMs have 10 Gbps networking; although we can't really reach 10 Gbps in iperf because of storage performance, on good devices we are hitting 2-4 Gbps with no TCP Dup ACKs or retransmissions. The switches don't show any errors/dropped packets.
I hope one of you can help me in case I'm missing something really obvious. I'm a newbie in networking, so I'm sorry if I'm ignorant of something.
Thanks for reading,
Regards,
05-01-2018 07:41 AM
Hi,
Are there any QoS parameters configured on either the 3850 stack or the 3560 interfaces connecting to the provider?
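You can check quickly with something like this (interface numbers are just examples; the 3850 uses MQC service policies, the 3560 uses mls qos):

    show run interface GigabitEthernet1/0/48 | include service-policy|priority    (3850)
    show mls qos interface GigabitEthernet0/48                                     (3560)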
Also, from experience, this does not seem to be a provider issue, since with some devices you get the speed (or close to it) you are paying for and with other devices (physical servers) you do not. To be sure, you can ask the provider to run a test on your circuit. They will run it on their equipment only (end to end) and give you the results.
HTH
05-01-2018 07:55 AM
Hi Reza,
Thanks for the reply. There are no QoS parameters at this time (I had them removed to troubleshoot this issue; the QoS was for Avaya H.323 / DSCP).
Thanks,
05-01-2018 08:07 AM
As additional info, this is the kind of throughput I am getting in iperf3 (see screenshots).
Note: the results below are for traffic from headquarters to the remote site (iperf3 server on the remote side). In the other direction, the traffic is always slightly better at 47-48 Mbps. I don't know if a 4-5 Mbps difference tells us anything.
Thanks
05-01-2018 09:59 AM
Are there any CRC or other errors on the interfaces connecting to the provider or inside the network?
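Something like this on the circuit-facing ports of both switches will show the counters (interface names are just examples):

    show interfaces GigabitEthernet1/0/48 | include duplex|errors|CRC|drops
    show interfaces counters errors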
05-01-2018 10:15 AM
There are no CRC errors in the show interface output, but there are some drops on the remote site (see attached).
05-02-2018 12:43 PM
We kind of found the problem after talking with the provider.
The provider had to configure the port on their "ADVA" fiber/modem/router device (not sure exactly what it is) for auto-negotiation. I'm not sure if that was directly the problem.
Previously the "ADVA" devices on each end were configured for 100Mb/Full, and I had to manually configure the router/firewall for 100Mb/Full as well. That is when the weird things happened (VM-to-VM transfers were good, but physical-to-physical and physical-to-VM suffered).
After setting the "ADVA" devices to auto-negotiation, and my firewall/router to auto as well, they auto-negotiate at 1000Mb/Full, and everything works fine at 50/50 Mbps.
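In Cisco IOS terms, the equivalent change on my side was basically this (interface name is just an example; our firewall/router has its own syntax):

    interface GigabitEthernet1/0/48
     speed 100
     duplex full
    ! changed to
    interface GigabitEthernet1/0/48
     speed auto
     duplex auto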
I still don't understand why; if you experts can explain it, that would be great. I don't understand why different client devices behaved differently over that particular port configuration (when it was hard-set to 100Mb/Full).
Thanks