Nexus 3164Q Port-channel balance issue if ingress is from a VXLAN-enabled VLAN

ss1
Level 1

Hello,

I recently ran into an LACP port-channel load-balancing issue that I have been unable to solve, so I am sharing what I have found so far to see whether anybody else has seen it.
The switch is a 3164Q running NX-OS 7.0.3.I7.6. I reproduced the same issue on another 3164Q running NX-OS 7.0.3.I6.1.
The topology is as follows:
Incoming unicast traffic -> switch 3164Q (1) and switch 3164Q (2) - both in a VPC -> Outgoing on Port-channel 1
   * Port-channel 1 is a simple switchport trunk interface allowing several VLANs on Layer 2 towards the destination device. It's 16x10G in a 160G bundle, but I have also reproduced this with 8x10G in an 80G bundle.

The issue:
   * If the "Incoming unicast traffic" comes from another Port-channel or switchport, the load is balanced perfectly across all ports, just as expected.
   * If part of the "Incoming unicast traffic" ingresses from a VXLAN-enabled VLAN through the nve1 interface, the balance is poor: 2 out of 16 ports egress twice as much traffic as the others on the switchport LACP Port-channel 1.
It does NOT matter if the port-channel on egress is an orphan or not. I have reproduced the same in both cases.

See the attached image, a screenshot from our Cacti graphing system. It shows 12x 10G ports egressing towards the switchport Port-channel. The peak egress load is approx. 40G towards the Port-channel; however, there are two ports carrying twice as much as all the others: 3.5G to 4G egress.
Approximately 10G of the 40G total egress load ingresses the switch from a couple of VXLAN-enabled VLANs. The remaining 30G ingresses from other switchport trunks and/or Port-channels.

The Load-balance algorithm is as follows:

# show port-channel load-balance  
System config:  
 Non-IP: src-dst mac  
 IP: src-dst ip-l4port  rotate 0 
Port Channel Load-Balancing Configuration for all modules: 
Module 1: 
 Non-IP: src-dst mac 
 IP: src-dst ip-l4port rotate 0

I'm aware of the circumstances in which a Port-channel may not balance well. There is no one-to-one IP address communication that could cause this, and no multicast traffic is involved. There is no excessive broadcast either. Just to be sure, I configured the same traffic to ingress the switches from a regular trunk switchport: it all balances on egress more than perfectly, regardless of whether the ingress is an orphan port or not. The issue always starts as soon as I let the traffic come through nve1 as VXLAN.

Any ideas would be greatly appreciated. I suspect that the switch starts balancing based on the source IP of the nve1 loopback interface rather than the actual source address found after decapsulation.
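
The suspicion can be illustrated with a small toy model. This is only a sketch, not the NX-OS hash: SHA-256 stands in for the hardware hash, the VTEP addresses are hypothetical documentation addresses, all flows are assumed to be the same size, and the 25% VXLAN share is taken from the roughly 10G out of 40G mentioned earlier. It compares hashing on the inner headers with hashing on the outer source and destination IP only.

```python
import hashlib
import random

NUM_LINKS = 16
REMOTE_VTEPS = ["192.0.2.1", "192.0.2.2"]  # hypothetical remote nve1 loopbacks
LOCAL_VTEP = "192.0.2.10"                  # hypothetical local nve1 loopback


def pick_link(fields):
    """Map a tuple of header fields to a member link (stand-in hash)."""
    digest = hashlib.sha256(repr(fields).encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_LINKS


def simulate(hash_on_outer, num_flows=20000, vxlan_fraction=0.25, seed=1):
    """Return the share of equal-sized flows that lands on each link."""
    rng = random.Random(seed)
    load = [0] * NUM_LINKS
    for _ in range(num_flows):
        # Random inner src IP, dst IP, src port, dst port.
        inner = (rng.getrandbits(32), rng.getrandbits(32),
                 rng.randrange(1024, 65536), rng.randrange(1024, 65536))
        arrived_via_vxlan = rng.random() < vxlan_fraction
        if arrived_via_vxlan and hash_on_outer:
            # Hypothesis: only the outer IP pair is fed to the hash.
            fields = (rng.choice(REMOTE_VTEPS), LOCAL_VTEP)
        else:
            fields = inner
        load[pick_link(fields)] += 1
    return [count / num_flows for count in load]


inner_shares = simulate(hash_on_outer=False)
outer_shares = simulate(hash_on_outer=True)
print("hash on inner headers:  max link share", round(max(inner_shares), 3))
print("hash on outer IPs only: max link share", round(max(outer_shares), 3))
```

In this model the VXLAN portion collapses onto at most as many links as there are remote VTEPs, which exaggerates the effect compared with the graphs. Note also that VXLAN encapsulation normally carries a hash of the inner headers in the outer UDP source port (RFC 7348 recommends this), so a hash that includes the outer L4 ports would not necessarily behave this way.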

If any tests or further information are needed, I'm here to assist or clarify.

Thank you.

 


Sergiu.Daniluk
VIP Alumni

Hi @ss1 

Interesting scenario. What does the output of the following command show?

show port-channel traffic interface port-channel 1

Have you also tried modifying the rotate value? If so, what was the result?

The rotate option causes the hash algorithm to rotate the link picking selection so that it does not continually choose the same link across all nodes in the network. It does so by influencing the bit pattern for the hash algorithm. This option shifts the flow from one link to another and load balances the already load-balanced (polarized) traffic from the first ECMP level across multiple links.
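
To make the idea concrete, here is a toy model of a rotate-style knob. It is an assumption-laden sketch, not the actual NX-OS algorithm: SHA-256 stands in for the hardware hash, `rotate` simply selects a different 4-bit window of a 32-bit hash value, and the flows (200 small ones plus one heavy one) are made up.

```python
import hashlib
from collections import Counter

NUM_LINKS = 16


def flow_hash(flow):
    """32-bit stand-in hash of a flow tuple."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return int.from_bytes(digest[:4], "big")


def rotl32(value, bits):
    """Rotate a 32-bit value left by the given number of bits."""
    bits %= 32
    return ((value << bits) | (value >> (32 - bits))) & 0xFFFFFFFF


def pick_link(flow, rotate=0):
    # Toy assumption: each rotate step shifts the hash by one nibble,
    # so a different part of the hash decides the member link.
    return rotl32(flow_hash(flow), 4 * rotate) % NUM_LINKS


def link_shares(flows, rotate):
    """Share of total traffic volume carried by each link."""
    load = Counter()
    for flow, size in flows:
        load[pick_link(flow, rotate)] += size
    total = sum(size for _, size in flows)
    return [load[i] / total for i in range(NUM_LINKS)]


# Hypothetical traffic mix: 200 flows of size 1 and one flow of size 50.
flows = [((f"10.0.0.{i}", "10.0.1.1", 40000 + i, 443), 1) for i in range(200)]
flows.append((("10.0.0.250", "10.0.1.1", 50000, 443), 50))

for rotate in range(4):
    shares = link_shares(flows, rotate)
    hot = max(range(NUM_LINKS), key=shares.__getitem__)
    print(f"rotate {rotate}: busiest link {hot} carries {shares[hot]:.1%}")
```

In this model a different rotate value changes which link a given flow lands on, but a flow still lands on exactly one link, so rotation on its own cannot split traffic that the hash treats as a single flow.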

 

 

Stay safe,

Sergiu

 

Hi

Thanks for your message.

The rotate option does not improve the balance. It only moves the unbalanced traffic to some other port. Here are 4 examples from the show port-channel traffic command.

1. Port-channel 1 as an orphan link:

ChanId      Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
     1  Eth1/11/1   6.19%   5.93%   6.24%  12.84%    0.0%   6.02%
     1  Eth1/11/2   6.10%   5.96%   6.24%   3.24%    0.0%   4.88%
     1  Eth1/11/3   6.12%   5.95%   6.25%   6.34%    0.0%   6.31%
     1  Eth1/11/4   6.47%   5.99%   6.25%   5.13% 100.00%   4.52%
     1  Eth1/9/1    6.65%   5.94%   6.25%   4.04%    0.0%   6.21%
     1  Eth1/9/2    6.24%   5.97%   6.24%   7.18%    0.0%  11.49%
     1  Eth1/9/3    6.27%   5.92%   6.24%   4.10%    0.0%   3.58%
     1  Eth1/9/4    6.21%   8.23%   6.25%   3.73%    0.0%   8.88%
     1  Eth1/13/1   6.16%   5.96%   6.25%   3.33%    0.0%   3.75%
     1  Eth1/13/2   6.23%   8.02%   6.24%   3.51%    0.0%   5.23%
     1  Eth1/13/3   6.21%   5.97%   6.24%  13.49%    0.0%   9.10%
     1  Eth1/13/4   6.43%   5.95%   6.25%   3.73%    0.0%   6.40%
     1  Eth1/7/1    6.18%   5.94%   6.25%   8.63%    0.0%   3.81%
     1  Eth1/7/2    6.04%   5.97%   6.25%   7.44%    0.0%   3.28%
     1  Eth1/7/3    6.30%   6.32%   6.25%   9.26%    0.0%   6.58%
     1  Eth1/7/4    6.10%   5.91%   6.25%   3.93%    0.0%   9.88%

2. Another port-channel which is also an orphan on the same switch:

ChanId      Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
    22  Eth1/5/3   12.69%  11.15%  97.72%   2.57% 100.00%   7.92%
    22  Eth1/5/4   12.49%  11.12%   0.32%   1.43%    0.0%  17.64%
    22  Eth1/43/2  12.23%  11.14%   0.32%   1.44%    0.0%   0.98%
    22  Eth1/6/4   12.71%  20.53%   0.32%   1.43%    0.0%   3.37%
    22  Eth1/43/3  12.85%  11.45%   0.32%   1.43%    0.0%  23.41%
    22  Eth1/3/3   12.01%  11.29%   0.32%  87.66%    0.0%   4.51%
    22  Eth1/43/4  12.51%  11.18%   0.32%   2.51%    0.0%   0.00%
    22  Eth1/3/4   12.48%  12.09%   0.32%   1.48%    0.0%  42.13%

3. Another port-channel - a different switch, VPC node 1:

ChanId      Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
     1  Eth1/9/1  14.30%  10.96%  12.78%  12.32%  99.99%   9.59%
     1  Eth1/9/2  12.59%  11.49%  12.51%   5.24%   0.00%   6.78%
     1  Eth1/9/3  12.99%  11.97%  12.78%  55.44%   0.00%  21.30%
     1  Eth1/9/4  13.41%  22.22%  12.51%   5.12%    0.0%   1.99%
     1  Eth1/7/1  11.59%  12.71%  12.57%   5.49%    0.0%  20.70%
     1  Eth1/7/2  11.32%   9.78%  12.12%   4.80%    0.0%  11.62%
     1  Eth1/7/3  11.99%  10.45%  12.57%   5.81%    0.0%  11.76%
     1  Eth1/7/4  11.75%  10.37%  12.12%   5.76%    0.0%  16.22%

4. The same port-channel on a third switch, VPC node 2 (the second and third switches are a VPC pair):

ChanId      Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
     1  Eth1/11/1  10.37%  10.21%   8.66%   8.27%    0.0%   0.00%
     1  Eth1/11/2  10.72%  15.61%   8.66%   8.27%    0.0%   0.00%
     1  Eth1/10/1  19.21%  13.32%  20.67%  19.74%    0.0%  45.46%
     1  Eth1/10/2  19.13%  21.93%  20.66%  19.74%    0.0%   0.71%
     1  Eth1/10/3  20.06%  16.95%  20.66%  19.74%    0.0%   0.95%
     1  Eth1/10/4  20.48%  21.95%  20.66%  24.21%    0.0%  52.86%
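
For anyone comparing outputs like these, a short script can flag the hot members automatically. This is a minimal sketch: the function names and the 1.25x threshold are arbitrary choices for illustration, and it assumes the column layout shown above. The sample data is the first table (Port-channel 1 as an orphan link).

```python
COLUMNS = ["Rx-Ucst", "Tx-Ucst", "Rx-Mcst", "Tx-Mcst", "Rx-Bcst", "Tx-Bcst"]

SAMPLE = """\
ChanId      Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
     1  Eth1/11/1   6.19%   5.93%   6.24%  12.84%    0.0%   6.02%
     1  Eth1/11/2   6.10%   5.96%   6.24%   3.24%    0.0%   4.88%
     1  Eth1/11/3   6.12%   5.95%   6.25%   6.34%    0.0%   6.31%
     1  Eth1/11/4   6.47%   5.99%   6.25%   5.13% 100.00%   4.52%
     1  Eth1/9/1    6.65%   5.94%   6.25%   4.04%    0.0%   6.21%
     1  Eth1/9/2    6.24%   5.97%   6.24%   7.18%    0.0%  11.49%
     1  Eth1/9/3    6.27%   5.92%   6.24%   4.10%    0.0%   3.58%
     1  Eth1/9/4    6.21%   8.23%   6.25%   3.73%    0.0%   8.88%
     1  Eth1/13/1   6.16%   5.96%   6.25%   3.33%    0.0%   3.75%
     1  Eth1/13/2   6.23%   8.02%   6.24%   3.51%    0.0%   5.23%
     1  Eth1/13/3   6.21%   5.97%   6.24%  13.49%    0.0%   9.10%
     1  Eth1/13/4   6.43%   5.95%   6.25%   3.73%    0.0%   6.40%
     1  Eth1/7/1    6.18%   5.94%   6.25%   8.63%    0.0%   3.81%
     1  Eth1/7/2    6.04%   5.97%   6.25%   7.44%    0.0%   3.28%
     1  Eth1/7/3    6.30%   6.32%   6.25%   9.26%    0.0%   6.58%
     1  Eth1/7/4    6.10%   5.91%   6.25%   3.93%    0.0%   9.88%
"""


def parse_port_channel_traffic(text):
    """Parse 'show port-channel traffic' rows into dictionaries."""
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        # Data rows have 8 fields and start with the numeric channel ID;
        # this skips the header and the dashed separator line.
        if len(tokens) != 8 or not tokens[0].isdigit():
            continue
        values = [float(tok.rstrip("%")) for tok in tokens[2:]]
        row = {"chan": int(tokens[0]), "port": tokens[1]}
        row.update(zip(COLUMNS, values))
        rows.append(row)
    return rows


def flag_hot_ports(rows, column="Tx-Ucst", ratio=1.25):
    """Return (port, share) pairs exceeding ratio times the even share."""
    even_share = 100.0 / len(rows)
    return [(r["port"], r[column]) for r in rows if r[column] > ratio * even_share]


rows = parse_port_channel_traffic(SAMPLE)
for port, share in flag_hot_ports(rows):
    print(f"{port}: {share:.2f}% of Tx-Ucst (even share {100 / len(rows):.2f}%)")
```

On the first table this flags Eth1/9/4 (8.23%) and Eth1/13/2 (8.02%) against an even share of 6.25% across 16 members.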

Thank you.

ss1
Level 1

Hi

Let me explain how I progressed with this. I tried a lot of things, but with no positive result so far. I eliminated all orphan port-channels and ports and built connections from each leaf device to each VPC node. I also upgraded the 3164Q to 7.0.3.I7.9. Unfortunately, neither change made any difference.

Here is a quick topology of the latest thing I tried. This is the latest production setup I had to deploy, and unfortunately it also failed to balance.

[Attached image: topology diagram (supportforum.cisco.com.png)]

The traffic destination receives unbalanced traffic on ingress if the traffic arrives as VXLAN from the first DC to the second DC. Fortunately, I have plenty of capacity between the datacenters and was able to run a switchport between them. As soon as I ran the traffic through switchports, the switches at datacenter #2 started to balance more than perfectly towards the traffic destination. I suspect that any traffic arriving as VXLAN from DC 1 to DC 2 is balanced based on source MAC address rather than on IP addresses, but that might not necessarily be true.

OSPF with a /30 is used on each side of the routed port, which is actually a port-channel across multiple 40G links. I don't know why I enabled LACP over multiple routed ports rather than giving a different /30 to each port, but that's easy to fix if recommended.

Do I need to establish full-mesh BGP sessions between every node in both DCs, given that topology? Currently there are EVPN BGP sessions between node #1 DC #1 and node #1 DC #2, and likewise between node #2 and node #2. A BGP session over a VLAN interface through the VPC peer-link is also up for failover purposes, as recommended. However, I don't have any BGP sessions between node #1 DC #1 and node #2 DC #2, or the other way around.

Thanks for any replies. Really appreciated.