With Cisco VIC 1400 and 15000 series adapters using EtherChannel, TCP flows across the EtherChannel may not reach a throughput equal to the aggregate line rate of the member links.
In some cases this shortfall is expected; in others it is not.
By default, EtherChannel load balancing uses MAC addresses, IP addresses, and Layer 4 port numbers in its link selection hash algorithm. Each flow is hashed to a single member link, so when the number of flows is less than 4 times the number of links in the EtherChannel, the flows may be distributed unevenly across the links. For example, with 4 flows over 2 links, 3 flows may go to link 0 while just 1 flow goes to link 1. The extreme case is a single flow: all of its traffic must go through 1 link, so the maximum throughput is that of the one link. With a very small number of flows, throughput is therefore expected to be lower and inconsistent between tests (depending on hash selection). Both throughput and consistency improve as the number of flows increases.
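The effect of hashing a small number of flows can be sketched with a short simulation. This is only an illustration: the CRC32-based `pick_link` function, the IP addresses, and the port numbers are assumptions made for the example and do not represent the actual hash algorithm used by the adapter or switch.

```python
import zlib
from collections import Counter


def pick_link(src_ip, dst_ip, src_port, dst_port, num_links):
    """Map a flow's addresses and ports to a member link index.

    Illustrative hash only (CRC32); NOT the real EtherChannel algorithm.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % num_links


def flows_per_link(num_flows, num_links, base_port=50000):
    """Count how many flows land on each link when only the source port varies."""
    counts = Counter(
        pick_link("10.0.0.1", "10.0.0.2", base_port + i, 5201, num_links)
        for i in range(num_flows)
    )
    return [counts.get(link, 0) for link in range(num_links)]


def usable_fraction(per_link):
    """Fraction of member links carrying at least one flow.

    A link with no flows carries no traffic, so this is an upper bound on
    aggregate throughput relative to the aggregate line rate.
    """
    return sum(1 for n in per_link if n > 0) / len(per_link)


if __name__ == "__main__":
    for num_flows in (1, 2, 4, 8, 32):
        per_link = flows_per_link(num_flows, num_links=2)
        print(num_flows, "flows ->", per_link,
              "usable fraction <=", usable_fraction(per_link))
```

With a single flow, only one of the two links can ever be used, so the bound is half of the aggregate line rate regardless of which hash is chosen. As the flow count grows, the per-link counts tend to even out, although the exact split depends on the hash and on the flow tuples.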
That said, even on a correctly tuned host with 8 or more TCP egress flows, aggregate line rate will not be reached because of a separate issue. Throughput across a 2x25 EtherChannel is expected to be near 50Gb/s, with the traffic split evenly between the 2 links. The issue reduces the aggregate bandwidth to the 33-40Gb/s range, with the traffic split unevenly between the links. Similar percentage drops in throughput may be seen in other EtherChannel configurations. This issue is internal to the VIC ASIC and is caused by a queuing bottleneck. It will be addressed by a firmware patch in a later release. See bug CSCwj57044 for more details about this issue.
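For reference, the 2x25 figures above can be expressed as a fraction of the expected aggregate rate. The helper name `observed_fraction_range` is hypothetical, and applying the resulting percentages to other EtherChannel configurations is only an assumption based on the "similar percentage drops" statement, not a measured result.

```python
def observed_fraction_range(expected_gbps, observed_low_gbps, observed_high_gbps):
    """Return observed throughput as (low, high) fractions of the expected rate."""
    return observed_low_gbps / expected_gbps, observed_high_gbps / expected_gbps


# Figures from the 2x25 case: about 50Gb/s expected, 33-40Gb/s observed.
low, high = observed_fraction_range(50, 33, 40)
print(f"{low:.0%} to {high:.0%} of aggregate line rate")
# prints "66% to 80% of aggregate line rate"
```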