11-28-2012 02:51 PM - edited 03-07-2019 10:18 AM
Hello,
We have a problem with throughput over an etherchannel in LACP with 2 or 4 ports.
The iSCSI traffic (VMware ESXi 4.1 U3) goes from 2 separate NICs (ports) to the etherchannel (with 2 or 4 ports) that the SAN (Nexenta) is connected to.
The SAN is configured in passive LACP and the switch in active LACP. Actually it does not matter if we do LACP or just mode on, we get the same result: ~1 Gbit/s throughput in either direction. As already mentioned, 2 or 4 ports in the etherchannel make no difference, nor does the configuration of the etherchannel.
I will post some config data below, but here is the question: why can't we see traffic beyond 1 Gbit/s? Source and destination are capable of doing much more than that (VMware ESXi: RAID 5 of 1 TB SATA; SAN: 16 x 1 TB NL-SAS). If we look with CNA (Cisco Network Assistant), we can see that the traffic is balanced equally over the etherchannel ports. With or without QoS or flow control, no difference. All of this traffic stays on this one switch. What are we missing?
Thank you for the answers!
!
port-channel load-balance src-dst-ip
!
interface Port-channel5
switchport access vlan 80
switchport mode access
flowcontrol receive desired
!
interface GigabitEthernet0/30
description vmware iscsi1
switchport access vlan 80
switchport mode access
srr-queue bandwidth share 10 10 60 20
srr-queue bandwidth shape 10 0 0 0
queue-set 2
mls qos trust cos
flowcontrol receive desired
!
interface GigabitEthernet0/32
description vmware iscsi2
switchport access vlan 80
switchport mode access
srr-queue bandwidth share 10 10 60 20
srr-queue bandwidth shape 10 0 0 0
queue-set 2
mls qos trust cos
flowcontrol receive desired
!
interface GigabitEthernet0/41
description SAN_aggr1
switchport access vlan 80
switchport mode access
srr-queue bandwidth share 10 10 60 20
srr-queue bandwidth shape 10 0 0 0
queue-set 2
mls qos trust cos
flowcontrol receive desired
channel-group 5 mode active
spanning-tree portfast
!
interface GigabitEthernet0/42
description SAN_aggr1
switchport access vlan 80
switchport mode access
srr-queue bandwidth share 10 10 60 20
srr-queue bandwidth shape 10 0 0 0
queue-set 2
mls qos trust cos
flowcontrol receive desired
channel-group 5 mode active
spanning-tree portfast
!
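For readers following along, the bundle state and the hash method can be confirmed on the switch itself (a sketch, using the channel-group number from the config above):

#show etherchannel load-balance
#show etherchannel 5 summary

In the summary output, both SAN-facing ports should carry the bundled (P) flag; a standalone (I) or suspended (s) flag would point to a negotiation problem rather than a hashing one.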
11-28-2012 03:15 PM
Hi,
Since the physical links in your port channel are 1 Gig interfaces, the traffic will be load-balanced across them, but the throughput of a single flow will not go above the 1 Gig maximum of one member link.
Do you have jumbo frames enabled on the 3560?
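If not, note that on the 3560 the jumbo MTU is a global setting and only takes effect after a reload; roughly (a sketch, with 9000 as an assumed target value):

!
! global config; requires a reload to take effect
system mtu jumbo 9000
!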
HTH
11-28-2012 03:59 PM
Reza,
yes, jumbo frames are enabled and are also used by the clients (ESXi and SAN).
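For completeness, the value the switch is actually running with can be read back directly (a sketch):

#show system mtu

The System Jumbo MTU line should reflect what was applied at the last reload.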
11-28-2012 03:16 PM
Hi
What's the etherchannel load-balancing algorithm being used ('show etherchannel load-balance' on the switch)? Also, what are the bandwidth stats for the port-channel interface and the physical interfaces? Send a 'show interface | in rate' for the port-channel interface and the physical interfaces. You say you see equal load balancing; how are you getting this info?
Regards
Stephen
==========================
http://www.rConfig.com
A free, open source network device configuration management tool, customizable to your needs!
11-28-2012 04:01 PM
Stephen,
I posted the algorithm in the partial config. It is src-dst-ip.
iSCSI
#sh int gi 0/30 | in rate
Queueing strategy: fifo
5 minute input rate 68521000 bits/sec, 3071 packets/sec
5 minute output rate 96899000 bits/sec, 2737 packets/sec
#sh int gi 0/32 | in rate
Queueing strategy: fifo
5 minute input rate 77165000 bits/sec, 3461 packets/sec
5 minute output rate 104605000 bits/sec, 2910 packets/sec
SAN
#sh int gi 0/41 | in rate
Queueing strategy: fifo
5 minute input rate 5000 bits/sec, 2 packets/sec
5 minute output rate 98721000 bits/sec, 3003 packets/sec
#sh int gi 0/42 | in rate
Queueing strategy: fifo
5 minute input rate 154967000 bits/sec, 4801 packets/sec
5 minute output rate 103207000 bits/sec, 2874 packets/sec
They each max out at 500-550 Mbit/s, so in total I am getting ~1 Gbit/s.
We just ran another test where we were pushing data to two different SANs on the same switch, and it looks like the problem is on the iSCSI NICs (VMware ESXi), as I see them maxing out at about 660 Mbit/s in CNA.
Do we have to set up something special on the iSCSI NIC ports on the switch?
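One switch-side thing worth ruling out here is egress-queue drops on the iSCSI-facing ports: with mls qos enabled, the 3560's per-queue buffers are small, and output drops would cap TCP throughput well below line rate. A sketch, using the ports from the config above:

#show mls qos interface gigabitEthernet 0/30 statistics
#show interfaces gigabitEthernet 0/30 | include drops

Non-zero output drops or climbing per-queue drop counters would point at the QoS/queue-set tuning rather than at the etherchannel.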
Btw, we are using CNA (Cisco Network Assistant) to get the live Port Statistics; this is how we see exactly what is going on with the balancing on the switch.
Thank you again for the help!
11-29-2012 11:47 AM
We ran some tests with dd from a Linux VM (a 10 GB file in 1 MB blocks); writing from /dev/zero to the HDD we get full speed without problems.
It is the other direction where we have issues, and the table below shows the problem: only one port is used on the SAN.
Could somebody point out where the problem may lie? Is it the ESXi or the SAN?
Thank you for the answers!
Interface | Port Description | Tx Rate(Mbps) | Rx Rate(Mbps) | Tx BW Usage % | Rx BW Usage % | Tx Rate(pps) | Rx Rate(pps) | Tx Mcast/Bcast Rate(pps) | Rx Mcast/Bcast Rate (pps) | Discarded Pkts | Pkts with Errors |
Gi0/30 | ESXi iSCSI1 | 444.01717 | 5.0494 | 44.40172 | 0.50494 | 7619.9 | 6713.6 | 0.5 | 0 | 0 | 0 |
Gi0/32 | ESXi iSCSI2 | 412.66722 | 4.97219 | 41.26672 | 0.49722 | 6669.4 | 5810.7 | 0.5 | 0 | 0 | 0 |
Gi0/41 | nexenta00 | 4.68162 | 0.00151 | 0.46816 | 0.00015 | 6264.1 | 1 | 1 | 0 | 0 | 0 |
Gi0/42 | nexenta00 | 4.73874 | 862.62481 | 0.47387 | 86.26248 | 6275.2 | 14293.2 | 1.5 | 0 | 0 | 0 |
Transmit Statistics
Interface | Port Description | Unicast | Multicast | Broadcast | Total Collisions | Excessive Collisions | Late Collisions |
Gi0/30 | ESXi iSCSI1 | 373652812 | 795048 | 6289 | 0 | 0 | 0 |
Gi0/32 | ESXi iSCSI2 | 371977503 | 795059 | 6263 | 0 | 0 | 0 |
Gi0/41 | nexenta00 | 66356579 | 229057 | 322 | 0 | 0 | 0 |
Gi0/42 | nexenta00 | 66918403 | 229209 | 957 | 0 | 0 | 0 |
Receive Statistics
Interface | Port Description | Unicast | Multicast | Broadcast | Discarded | Alignment Errors | FCS Errors | Collision Fragments | Undersized | Oversized |
Gi0/30 | ESXi iSCSI1 | 522886909 | 0 | 108 | 0 | 0 | 0 | 0 | 0 | 0 |
Gi0/32 | ESXi iSCSI2 | 519340193 | 0 | 123 | 0 | 0 | 0 | 0 | 0 | 0 |
Gi0/41 | nexenta00 | 34336586 | 5649 | 350 | 0 | 0 | 0 | 0 | 0 | 0 |
Gi0/42 | nexenta00 | 55085067 | 5653 | 344 | 0 | 0 | 0 | 0 | 0 | 0 |
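Given that only Gi0/42 shows meaningful Rx from the SAN, it is worth checking whether both SAN-facing links actually completed LACP negotiation (a sketch):

#show lacp 5 neighbor
#show lacp 5 internal

If one of the two ports shows no partner details, the SAN is negotiating (or at least transmitting) on a single link only.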
11-29-2012 01:08 PM
Do you really need (auto)QoS on these ports? Which queue does this traffic go into?
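If the queuing is not needed for this traffic, one test is to take QoS out of the picture on the iSCSI-facing ports; a sketch that reverses the commands in the posted config (the global variant, no mls qos, would affect every port on the switch):

interface GigabitEthernet0/30
 ! back to the default SRR weights and shaping
 no srr-queue bandwidth share
 no srr-queue bandwidth shape
 ! back to the default queue-set, and stop trusting CoS
 queue-set 1
 no mls qos trust cos
!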
11-29-2012 01:22 PM
We have the same results with or without QoS. Looking at the table posted before, we think the issue is on the SAN side: Gi0/42 is the only port receiving data from the SAN.
We will notify the SAN company to ask for some more information.
11-29-2012 02:41 PM
Can you post the output of the command "sh controllers ethernet-controller" for the SAN-facing ports?
11-29-2012 03:04 PM
The output is below.
We just figured out that the LACP setting on the SAN aggregate was set to passive. As soon as we set it to active, the traffic was load-balanced across the SAN ports on the switch, but still only up to 1 Gbit/s (on the Rx side of the SAN ports).
We now get full speed with dd from RAM (ESXi) to HDD (SAN): a full 240 MBytes/s! Now we want the same in the other direction. The SAN itself is capable of far more than 120 MBytes/s; Nexenta is a cache-based system, so the data is held in RAM if it fits, and it does. So we should see full speed over the network in this direction too, as we do in the other.
Any thoughts?
#sh controllers ethernet-controller gi0/41
Transmit GigabitEthernet0/41 Receive
83735890 Bytes 3178869377 Bytes
87884519 Unicast frames 44880450 Unicast frames
241072 Multicast frames 5996 Multicast frames
354 Broadcast frames 393 Broadcast frames
0 Too old frames 3178076737 Unicast bytes
0 Deferred frames 767488 Multicast bytes
0 MTU exceeded frames 25152 Broadcast bytes
0 1 collision frames 0 Alignment errors
0 2 collision frames 0 FCS errors
0 3 collision frames 0 Oversize frames
0 4 collision frames 0 Undersize frames
0 5 collision frames 0 Collision fragments
0 6 collision frames
0 7 collision frames 61657 Minimum size frames
0 8 collision frames 26476377 65 to 127 byte frames
0 9 collision frames 361070 128 to 255 byte frames
0 10 collision frames 502141 256 to 511 byte frames
0 11 collision frames 168468 512 to 1023 byte frames
0 12 collision frames 97614 1024 to 1518 byte frames
0 13 collision frames 0 Overrun frames
0 14 collision frames 0 Pause frames
0 15 collision frames
0 Excessive collisions 0 Symbol error frames
0 Late collisions 0 Invalid frames, too large
0 VLAN discard frames 17219512 Valid frames, too large
0 Excess defer frames 0 Invalid frames, too small
64520 64 byte frames 0 Valid frames, too small
31903830 127 byte frames
1125313 255 byte frames 0 Too old frames
137643 511 byte frames 0 Valid oversize frames
256794 1023 byte frames 0 System FCS error frames
39415 1518 byte frames 0 RxPortFifoFull drop frame
54598430 Too large frames
0 Good (1 coll) frames
0 Good (>1 coll) frames
#sh controllers ethernet-controller gi0/42
Transmit GigabitEthernet0/42 Receive
2884287979 Bytes 3532812866 Bytes
89972969 Unicast frames 80461475 Unicast frames
242665 Multicast frames 6004 Multicast frames
1015 Broadcast frames 458 Broadcast frames
0 Too old frames 3532018251 Unicast bytes
0 Deferred frames 768512 Multicast bytes
0 MTU exceeded frames 29312 Broadcast bytes
0 1 collision frames 0 Alignment errors
0 2 collision frames 0 FCS errors
0 3 collision frames 0 Oversize frames
0 4 collision frames 0 Undersize frames
0 5 collision frames 0 Collision fragments
0 6 collision frames
0 7 collision frames 67378 Minimum size frames
0 8 collision frames 51493523 65 to 127 byte frames
0 9 collision frames 573981 128 to 255 byte frames
0 10 collision frames 852199 256 to 511 byte frames
0 11 collision frames 226562 512 to 1023 byte frames
0 12 collision frames 139093 1024 to 1518 byte frames
0 13 collision frames 0 Overrun frames
0 14 collision frames 0 Pause frames
0 15 collision frames
0 Excessive collisions 0 Symbol error frames
0 Late collisions 0 Invalid frames, too large
0 VLAN discard frames 27115201 Valid frames, too large
0 Excess defer frames 0 Invalid frames, too small
624128 64 byte frames 0 Valid frames, too small
33445111 127 byte frames
1101032 255 byte frames 0 Too old frames
164893 511 byte frames 0 Valid oversize frames
432415 1023 byte frames 0 System FCS error frames
159024 1518 byte frames 0 RxPortFifoFull drop frame
54290046 Too large frames
0 Good (1 coll) frames
0 Good (>1 coll) frames
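Since the read path is the direction that stalls, flow control is also worth a look: the controller output above shows 0 pause frames received on both ports, and the negotiated state can be read directly (a sketch):

#show flowcontrol interface gigabitEthernet 0/41

With "flowcontrol receive desired" configured, the switch only throttles its transmit toward the SAN if the SAN actually sends pause frames; zero pause counters here mean the SAN is not pausing the switch.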
11-29-2012 04:33 PM
Nothing wrong with your output.
11-30-2012 04:53 AM
On a "gigabit" etherchannel, a given conversation will never get more than 1 Gig. The way etherchannel works, a given conversation is sent across a single member link; it does not get load-balanced across all 4 links. So although a 4-port etherchannel gives you a bigger pipe for all conversations combined, any single given IP conversation goes across "one" of those links, which is proper behavior, and you will never see throughput of more than 1 Gig for a given conversation. If that's not high enough for you, then you have to go to the big expense of 10 Gig cards.
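As a concrete illustration: with src-dst-ip hashing, the switch can predict which member link a given address pair will use (a sketch; the addresses are placeholders, not from this thread):

#test etherchannel load-balance interface port-channel 5 ip 10.0.80.10 10.0.80.20
Would select Gi0/41 of Po5

Two flows with different address pairs may land on different links, but any one pair always maps to exactly one link.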
11-30-2012 07:44 AM
Glen,
we do not have a single stream going on here. We have two NICs pumping iSCSI traffic from the ESXi server. We use a round-robin (RR) multipathing policy and we can see the traffic splitting beautifully. Your comment applies to a single IP pair, i.e. a single stream.
Still thank you for your response!
11-30-2012 07:42 AM
leolaohoo,
yes, we know that nothing is wrong with the output.
How do you explain the change in the traffic behavior on the etherchannel after switching from passive to active on the SAN? It looks like the SAN has the issue then, right?
Also, although we now get the full 2 Gbit/s inbound to the SAN, we are still getting only 1 Gbit/s inbound to the ESXi iSCSI, but this time it is load-balanced properly. We know we are on the right track, but I guess I will need to involve the SAN vendor to help out here, as we don't think this is switch-related anymore, unless somebody has had a similar experience.
Thanks for the help!