LACP Link faster in one direction

toprock1970 · ‎02-18-2015

We have two sites that are linked by a pair of 1Gb links aggregated in an LACP EtherChannel.

When I run iperf with the server in Site B and the client in Site A, the throughput is less than half of what I get when I run the test in the other direction.

There are no firewalls between these links.

Dropping link A or B between the sites (so only one is up) gives the same speed results as with both links up, and it's consistent: Site B to Site A is always twice as fast as Site A to Site B.

Site A has a stack of 9 3750E switches running 15.0(1)SE2.

Site B has a pair of 3750G switches running 12.2(25)SEE4.

The etherchannel at Site A looks like this:

interface GigabitEthernet5/0/49
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 101
 switchport mode trunk
 mls qos trust cos
 channel-group 5 mode active

interface GigabitEthernet6/0/49
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 101
 switchport mode trunk
 mls qos trust cos
 channel-group 5 mode active

interface Port-channel5
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 101
 switchport mode trunk

Site B looks like this:

interface GigabitEthernet1/0/26
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 101
 switchport mode trunk
 mls qos trust cos
 channel-group 5 mode active

interface GigabitEthernet2/0/25
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 101
 switchport mode trunk
 mls qos trust cos
 channel-group 5 mode active

interface Port-channel5
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 101
 switchport mode trunk

Can anyone see anything wrong with this, or have any suggestions as to why this network could be behaving in this manner?

Any help would be greatly appreciated.

Regards
Jason

Vasilii Mikhailovskii · ‎02-24-2015

Hello.

Bandwidth allocation on a port-channel depends on the load-balancing mechanism you are using.

Your iperf flow[s] could have been mapped to a single link, which is why the throughput never goes above the capacity of one link.
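
To illustrate how a flow ends up pinned to one member link, here is a toy model in Python. It is only a sketch: the real hash runs in switch hardware and depends on the configured load-balance method, and the XOR-then-modulo rule, the function name pick_link and the IP addresses below are all assumptions made up for the example.

```python
import ipaddress


def pick_link(src_ip: str, dst_ip: str, n_links: int = 2) -> int:
    """Toy hash: XOR the two addresses, then reduce modulo the link count.

    This is an illustrative assumption, not the actual switch algorithm.
    """
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % n_links


# Hypothetical iperf client and server addresses.
client, server = "10.1.0.10", "10.2.0.20"

# Every packet of the same flow hashes to the same member link,
# so a single flow can never use more than one 1Gb link.
links_used = {pick_link(client, server) for _ in range(1000)}
print("links used by one flow:", links_used)

# Different address pairs may land on different links.
for host in range(10, 14):
    src = f"10.1.0.{host}"
    print(src, "->", server, "uses link", pick_link(src, server))
```

With a hash like this, only tests between several different host pairs (or, with a port-based method, several parallel flows) would have a chance of spreading across both links.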

Regarding the different speeds: this might be caused by any equipment on the path (or by a difference in latency). The best test is to run iperf with the link connected directly to the NIC.

Try running iperf with different numbers of flows and a large window size.
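
On the window size: a single TCP flow cannot go faster than window / round-trip time, so the window needed to fill a link is roughly bandwidth x RTT. Here is a quick sketch of that arithmetic in Python. The 5 ms round-trip time and the 64 KiB window are assumed example values, not measurements from this network.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the link."""
    return bandwidth_bps * rtt_s / 8


def max_flow_bps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on one TCP flow's throughput for a given window."""
    return window_bytes * 8 / rtt_s


LINK_BPS = 1_000_000_000   # one 1Gb member link
RTT_S = 0.005              # assumed 5 ms round trip (example value only)
WINDOW = 64 * 1024         # assumed 64 KiB TCP window (example value only)

print(f"window needed to fill the link: {bdp_bytes(LINK_BPS, RTT_S) / 1024:.0f} KiB")
print(f"ceiling with a 64 KiB window:   {max_flow_bps(WINDOW, RTT_S) / 1e6:.0f} Mb/s")
```

If the two test hosts happen to have different default window sizes, that alone could make one direction look slower, so it may be worth setting the window explicitly (iperf's -w option) on both ends.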

toprock1970 · ‎02-26-2015

Thanks for the reply Vasilii.

The different speed issue seems a little strange in that we have a pair of 1Gb fibre connections - both with the same telco.

Site A has two links, with each NTE having a fibre connection going to a switch stack. Each fibre terminates in one of the stack members.

Site B has the other two fibres, and the NTEs there also have fibres going to a switch stack.

We basically have:

Site A                                                                Site B
[Switch Stack]                                                        [Switch Stack]
Link A (Switch Member)---------NTE-------------------------NTE--------(Switch Member)
Link B (Switch Member)---------NTE-------------------------NTE--------(Switch Member)

No other equipment is in between.
The speed is always faster when the test is fired from B to A.

While I appreciate that other settings could be used with iperf, I'd still expect to see very similar results in each direction when using the same method for each test.

I'm debating having the telco perform some tests on the lines, but given that the speed tests generate the same results regardless of which link is taken down, I feel the issue most likely lies outside of the telco links.

Given the minimal number of devices linking Site A to Site B, do you have any thoughts, suggestions or recommended tests that might help isolate what is causing the difference in speeds?

Regards
Jason

Vasilii Mikhailovskii · ‎02-26-2015

Hello, Jason.

Just to clarify - are you sure the links are pure fiber and not L2VPNs?

toprock1970 · ‎03-02-2015

Hi Vasilii,

Having checked, I can see that it is fiber to the distribution box and then fiber to the switches.

I've had a look at the SNMP monitoring for this switch and can see that I'm getting a large number of drops on what looks like all the interfaces that are up.

Conversely, the switch at the secondary site is not experiencing any drops.

Looking further into our monitoring, I can see that memory utilisation on the switch at the primary site is just over 40% and CPU is at 57%.

I think this issue might be at the switch and not the LACP.

I've looked at some of my other switch stacks, and all the stacks that have IOS 15.0(1)SE2 seem to be experiencing high levels of output drops. I think I need to investigate whether there are any known issues with this IOS release.
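
For comparing the stacks, something like the following Python sketch could turn two polls of the interface counters into a drop percentage. It assumes the monitoring exports two 32-bit counters per interface, one for output drops and one for packets actually transmitted; how those map onto real SNMP objects is left open, and the sample numbers are invented for illustration.

```python
COUNTER32_MOD = 2 ** 32


def delta(prev: int, cur: int) -> int:
    """Difference between two 32-bit counter samples, allowing one wrap."""
    return (cur - prev) % COUNTER32_MOD


def drop_percent(prev_drops: int, cur_drops: int,
                 prev_sent: int, cur_sent: int) -> float:
    """Share of packets dropped on output between two polls, in percent."""
    dropped = delta(prev_drops, cur_drops)
    sent = delta(prev_sent, cur_sent)
    total = dropped + sent
    return 100.0 * dropped / total if total else 0.0


# Invented sample values; the "sent" counter wraps between the two polls.
pct = drop_percent(prev_drops=1_000, cur_drops=6_000,
                   prev_sent=4_294_900_000, cur_sent=927_704)
print(f"output drops: {pct:.2f}% of packets in this interval")
```

Tracking this per interface over time would show whether the drops line up with the iperf runs or are present all the time.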

Regards

Jason