07-15-2024 06:33 PM
We have a recurring issue across the board with our N9Ks. Each site has a pair of N9Ks connected redundantly to many Catalyst switches. All traffic going out to these switches uses only one link; the N9Ks do not load balance the outbound traffic. Failover works as expected.
For the simplest example, we have a pair of N9Ks with redundant connections to a customer's Cisco router, with the ports configured as access ports. The customer is a service provider that offers free Wi-Fi to many customers. All traffic leaving our N9Ks prefers switch 1 and is not load balanced at all. We are contracted to provide a redundant 2Gbps (1Gbps + 1Gbps) service. We are currently maxing out one of the 1Gbps links during peak times, resulting in dropped packets. Again, failover works as expected. The customer router does not have 10Gbps ports.
We were on nxos.7.0.3.I7.10.1; we tried upgrading to nxos.9.3.9, and that did not change anything.
Both switches have this config:
interface port-channel60
description LAG-60: Trunk to XXXXX 2Gbps service
switchport access vlan 299
vpc 60
interface Ethernet1/30
description LAG-60 XXXXX 2Gbps Circuit
switchport access vlan 299
spanning-tree port type edge
channel-group 60 mode active
switch 1:
show port-channel load-balance
System config:
Non-IP: src-dst mac
IP: src-dst mac rotate 0
Port Channel Load-Balancing Configuration for all modules:
Module 1:
Non-IP: src-dst mac
IP: src-dst mac rotate 0
sh port-channel traffic int port 60
NOTE: Clear the port-channel member counters to get accurate statistics
ChanId Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
60 Eth1/30 92.06% 100.00% 89.53% 93.95% 0.0% 0.0%
sh int e1/30
Ethernet1/30 is up
admin state is up, Dedicated Interface
Belongs to Po60
Hardware: 1000/10000 Ethernet, address: cc46.d6b3.9af1 (bia cc46.d6b3.9af1)
Description: LAG-60 MCCS 2Gbps Circuit
MTU 1500 bytes, BW 1000000 Kbit , DLY 10 usec
reliability 255/255, txload 138/255, rxload 7/255
Encapsulation ARPA, medium is broadcast
Port mode is access
full-duplex, 1000 Mb/s, media type is 1G
Beacon is turned off
Auto-Negotiation is turned on FEC mode is Auto
Input flow-control is off, output flow-control is off
Auto-mdix is turned off
Rate mode is dedicated
Switchport monitor is off
EtherType is 0x8100
EEE (efficient-ethernet) : n/a
admin fec state is auto, oper fec state is off
Last link flapped 00:52:22
Last clearing of "show interface" counters 00:42:48
0 interface resets
Load-Interval #1: 30 seconds
30 seconds input rate 28293864 bits/sec, 9282 packets/sec
30 seconds output rate 542656384 bits/sec, 52958 packets/sec
input rate 28.29 Mbps, 9.28 Kpps; output rate 542.66 Mbps, 52.96 Kpps
Load-Interval #2: 5 minute (300 seconds)
300 seconds input rate 29151240 bits/sec, 8958 packets/sec
300 seconds output rate 505885544 bits/sec, 50074 packets/sec
input rate 29.15 Mbps, 8.96 Kpps; output rate 505.89 Mbps, 50.07 Kpps
RX
24968459 unicast packets 92 multicast packets 0 broadcast packets
24968551 input packets 9663539018 bytes
0 jumbo packets 0 storm suppression packets
0 runts 0 giants 0 CRC 0 no buffer
0 input error 0 short frame 0 overrun 0 underrun 0 ignored
0 watchdog 0 bad etype drop 0 bad proto drop 0 if down drop
0 input with dribble 0 input discard
0 Rx pause
TX
130388484 unicast packets 346 multicast packets 0 broadcast packets
130388830 output packets 165826559252 bytes
0 jumbo packets
0 output error 0 collision 0 deferred 0 late collision
0 lost carrier 0 no carrier 0 babble 3254413 output discard
0 Tx pause
switch 2:
sh port-channel load-balance
System config:
Non-IP: src-dst mac
IP: src-dst mac rotate 0
Port Channel Load-Balancing Configuration for all modules:
Module 1:
Non-IP: src-dst mac
IP: src-dst mac rotate 0
sh port-channel traffic int port 60
NOTE: Clear the port-channel member counters to get accurate statistics
ChanId Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
60 Eth1/30 76.56% 0.0% 65.21% 85.14% 0.0% 0.0%
sh int e1/30
Ethernet1/30 is up
admin state is up, Dedicated Interface
Belongs to Po60
Hardware: 1000/10000 Ethernet, address: cc46.d6b3.9e55 (bia cc46.d6b3.9e55)
Description: LAG-60 MCCS 2Gbps Circuit
MTU 1500 bytes, BW 1000000 Kbit , DLY 10 usec
reliability 255/255, txload 1/255, rxload 9/255
Encapsulation ARPA, medium is broadcast
Port mode is access
full-duplex, 1000 Mb/s, media type is 1G
Beacon is turned off
Auto-Negotiation is turned on FEC mode is Auto
Input flow-control is off, output flow-control is off
Auto-mdix is turned off
Rate mode is dedicated
Switchport monitor is off
EtherType is 0x8100
EEE (efficient-ethernet) : n/a
admin fec state is auto, oper fec state is off
Last link flapped 00:38:29
Last clearing of "show interface" counters 00:51:43
1 interface resets
Load-Interval #1: 30 seconds
30 seconds input rate 36408568 bits/sec, 9263 packets/sec
30 seconds output rate 456 bits/sec, 0 packets/sec
input rate 36.41 Mbps, 9.26 Kpps; output rate 456 bps, 0 pps
Load-Interval #2: 5 minute (300 seconds)
300 seconds input rate 45266168 bits/sec, 10345 packets/sec
300 seconds output rate 248 bits/sec, 0 packets/sec
input rate 45.27 Mbps, 10.35 Kpps; output rate 248 bps, 0 pps
RX
23668767 unicast packets 101 multicast packets 0 broadcast packets
23668868 input packets 8934126866 bytes
0 jumbo packets 0 storm suppression packets
0 runts 0 giants 0 CRC 0 no buffer
0 input error 0 short frame 0 overrun 0 underrun 0 ignored
0 watchdog 0 bad etype drop 0 bad proto drop 0 if down drop
0 input with dribble 0 input discard
0 Rx pause
TX
0 unicast packets 1560 multicast packets 0 broadcast packets
1560 output packets 142946 bytes
0 jumbo packets
0 output error 0 collision 0 deferred 0 late collision
0 lost carrier 0 no carrier 0 babble 0 output discard
0 Tx pause
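To put the two interface outputs side by side, here is a minimal Python sketch that computes how lopsided the transmit direction is. The counter values are copied from the outputs above; the function names and the "discards per transmitted packet" metric are illustrative assumptions, and the counters were cleared at slightly different times on each switch, so treat the result as approximate.

```python
# Counter values copied from the two "sh int e1/30" outputs above.
switch1 = {"tx_packets": 130388830, "tx_discards": 3254413}
switch2 = {"tx_packets": 1560, "tx_discards": 0}


def tx_share(local, peer):
    """Fraction of the pair's transmitted packets that `local` sent."""
    total = local["tx_packets"] + peer["tx_packets"]
    return local["tx_packets"] / total


def discard_ratio(counters):
    """Output discards relative to transmitted packets (illustrative metric)."""
    return counters["tx_discards"] / counters["tx_packets"]


share1 = tx_share(switch1, switch2)
ratio1 = discard_ratio(switch1)

print(f"switch 1 share of TX packets: {share1:.4%}")
print(f"switch 1 discards per transmitted packet: {ratio1:.2%}")
```

Switch 2 transmits only a handful of multicast frames on Eth1/30, so practically the entire outbound load, and all of the output discards, sit on switch 1.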
Accepted solution:
07-21-2024 04:41 PM
Hello!
Based on this most recent topology, let's assign some names to specific devices to make the conversation easier.
Let's also assume that your traffic is flowing from your customers to the Internet (although you would likely see this same issue in the reverse direction, and I believe you've identified that's the case too).
When traffic from your customers enters Customer-Router, it will route the packet and choose the Port-channel60 interface (which has two members - Gi0/0/2 and Gi0/0/3) as the egress interface. It will subsequently hash the packet out of one of the Port-channel60 members (either Gi0/0/2 or Gi0/0/3).
Let's say Customer-Router chooses to route this packet out of Gi0/0/2, and let's also say Gi0/0/2 connects to Customer-N9K-1. When Customer-N9K-1 switches this packet/frame according to the packet's destination MAC address, it will choose vPC Po60 as the egress interface. Even though Po60 is a vPC, Customer-N9K-1 will choose to forward this frame out of interface Ethernet1/30; it will not attempt to load balance traffic across the vPC Peer-Link to Customer-N9K-2's vPC Po60.
This is a key point - when two Nexus switches are in a vPC domain, one vPC peer is not cognizant of what or how much data plane traffic the other vPC peer is forwarding. The other vPC peer could be forwarding several terabits of traffic, or none at all. Therefore, the two vPC peers will not forward data plane traffic across the vPC Peer-Link in an attempt to load balance data plane traffic between the two peers.
Therefore, the only way to solve this polarization issue is to focus on the routers sending traffic towards the Nexus switches. It is the duty of Customer-Router to balance traffic it forwards towards Customer-N9K-1 and Customer-N9K-2. We will need to investigate its hashing algorithm (as well as the profile of traffic it is trying to send towards the Internet) in order to resolve this issue.
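The local-forwarding behavior described above can be sketched as a small Python model. This is a simplification under stated assumptions, not NX-OS code: `vpc_egress`, `egress_counts`, and the traffic lists are hypothetical names, and the model only captures the rule that a vPC peer transmits on its own member port while that port is up.

```python
from collections import Counter

PEERS = ("Customer-N9K-1", "Customer-N9K-2")


def vpc_egress(ingress_peer, member_up):
    """Return the peer whose Po60 member port transmits the frame.

    Simplified model: a peer always uses its own vPC member port when it
    is up, and hands the frame across the peer-link only when it is down.
    """
    if member_up[ingress_peer]:
        return ingress_peer
    other = PEERS[1] if ingress_peer == PEERS[0] else PEERS[0]
    return other if member_up[other] else None


def egress_counts(ingress_sequence, member_up):
    """Count frames transmitted per peer for a sequence of ingress peers."""
    return Counter(vpc_egress(peer, member_up) for peer in ingress_sequence)


both_up = {PEERS[0]: True, PEERS[1]: True}
polarized = [PEERS[0]] * 1000   # the sending device picks N9K-1 every time
balanced = list(PEERS) * 500    # the sending device alternates between peers

print(egress_counts(polarized, both_up))
print(egress_counts(balanced, both_up))
print(egress_counts(polarized, {PEERS[0]: False, PEERS[1]: True}))
```

In this model the split across the two vPC member ports simply mirrors whatever split the sending device produced, and the last call shows why failover still works even though load balancing does not.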
I hope this helps explain this behavior in a bit more detail - thank you!
-Christopher
07-15-2024 10:27 PM
This is the Nexus side; how is the device on the other side configured?
The method you have is the default, which is always the suggested one.
If you would like to try a different method, see the options below:
You can configure the device to use one of the following methods to load-balance across the port channel:
Destination MAC address
Source MAC address
Source and destination MAC address
Destination IP address
Source IP address
Source and destination IP address
Source TCP/UDP port number
Destination TCP/UDP port number
Source and destination TCP/UDP port number
GRE inner IP headers with source, destination, and source-destination
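To illustrate why the choice of hash fields matters, here is a minimal Python sketch. It assumes a port channel with two local members and uses SHA-256 as a stand-in hash; the real hardware hash is different and is not reproduced here. The MAC addresses, IP addresses, and ports are made-up example values. The point is only that when every frame carries the same MAC pair (as happens between two routed hops), a MAC-based hash can select only one member, while IP and L4 fields give the hash something to vary on.

```python
import hashlib


def pick_member(fields, member_count):
    """Map a list of header fields to a member index (stand-in hash)."""
    key = "|".join(str(f) for f in fields).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % member_count


# Hypothetical flows between two routers: one MAC pair, many IP/port pairs.
flows = [
    {
        "src_mac": "0000.0000.000a",
        "dst_mac": "0000.0000.000b",
        "src_ip": f"198.51.100.{i}",
        "dst_ip": f"203.0.113.{i % 7}",
        "src_port": 443,
        "dst_port": 40000 + i,
    }
    for i in range(200)
]


def distribution(flows, field_names, member_count=2):
    """Count how many flows land on each member for the given hash fields."""
    counts = [0] * member_count
    for flow in flows:
        idx = pick_member([flow[name] for name in field_names], member_count)
        counts[idx] += 1
    return counts


by_mac = distribution(flows, ["src_mac", "dst_mac"])
by_ip_l4 = distribution(flows, ["src_ip", "dst_ip", "src_port", "dst_port"])

print("src-dst mac:      ", by_mac)
print("src-dst ip-l4port:", by_ip_l4)
```

This only applies when the port channel has more than one member on the same switch to choose from.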
07-16-2024 02:26 AM
Please share the output of the following:
show etherchannel summary
or
show port-channel summary
on both N9Ks, plus:
show vpc brief
show spanning-tree
MHM
07-17-2024 12:50 AM
Here is the router side:
interface Port-channel60
ip address x.x.x.x m.m.m.m
negotiation auto
interface GigabitEthernet0/0/2
no ip address
no ip redirects
no ip unreachables
no ip proxy-arp
negotiation auto
channel-group 60 mode active
interface GigabitEthernet0/0/3
no ip address
no ip redirects
no ip unreachables
no ip proxy-arp
negotiation auto
channel-group 60 mode active
Currently the N9K's are set to - Source and destination MAC address
Switch 1:
show port-channel summary
Flags: D - Down P - Up in port-channel (members)
I - Individual H - Hot-standby (LACP only)
s - Suspended r - Module-removed
b - BFD Session Wait
S - Switched R - Routed
U - Up (port-channel)
p - Up in delay-lacp mode (member)
M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port- Type Protocol Member Ports
Channel
--------------------------------------------------------------------------------
10 Po10(SU) Eth LACP Eth1/53(P) Eth1/54(P)
60 Po60(SU) Eth LACP Eth1/29(D) Eth1/30(P)
show vpc brief
Legend:
(*) - local vPC is down, forwarding via vPC peer-link
vPC domain id : 10
Peer status : peer adjacency formed ok
vPC keep-alive status : peer is alive
Configuration consistency status : success
Per-vlan consistency status : success
Type-2 consistency status : success
vPC role : primary, operational secondary
Number of vPCs configured : 14
Peer Gateway : Enabled
Dual-active excluded VLANs : -
Graceful Consistency Check : Enabled
Auto-recovery status : Enabled, timer is off.(timeout = 240s)
Delay-restore status : Timer is off.(timeout = 30s)
Delay-restore SVI status : Timer is off.(timeout = 10s)
Operational Layer3 Peer-router : Disabled
Virtual-peerlink mode : Disabled
vPC Peer-link status
---------------------------------------------------------------------
id Port Status Active vlans
-- ---- ------ -------------------------------------------------
1 Po10 up 3-5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,39
,41,45,47,49,51,65-74,76,200-203,205-207,254,
299-322,360,400-401,500,610,700,800-801
vPC status
----------------------------------------------------------------------------
Id Port Status Consistency Reason Active vlans
-- ------------ ------ ----------- ------ ---------------
60 Po60 up success success 299
show spanning-tree vlan 299
VLAN0299
Spanning tree enabled protocol rstp
Root ID Priority 4395
Address 0023.04ee.be01
Cost 2
Port 4105 (port-channel10)
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Bridge ID Priority 33067 (priority 32768 sys-id-ext 299)
Address 0023.04ee.be0a
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Interface Role Sts Cost Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Po10 Root FWD 1 128.4105 (vPC peer-link) Network P2p
Po20 Desg FWD 1 128.4115 (vPC) P2p
Po45 Root FWD 1 128.4140 (vPC) P2p
Po60 Desg FWD 1 128.4155 (vPC) P2p
Eth1/3 Desg FWD 4 128.3 P2p
Eth1/21 Desg FWD 2 128.21 P2p
Eth1/43 Desg FWD 2 128.43 P2p
Eth1/46 Desg FWD 2 128.46 P2p
Switch 2:
show port-channel summary
Flags: D - Down P - Up in port-channel (members)
I - Individual H - Hot-standby (LACP only)
s - Suspended r - Module-removed
b - BFD Session Wait
S - Switched R - Routed
U - Up (port-channel)
p - Up in delay-lacp mode (member)
M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port- Type Protocol Member Ports
Channel
--------------------------------------------------------------------------------
10 Po10(SU) Eth LACP Eth1/53(P) Eth1/54(P)
60 Po60(SU) Eth LACP Eth1/29(D) Eth1/30(P)
show vpc brief
Legend:
(*) - local vPC is down, forwarding via vPC peer-link
vPC domain id : 10
Peer status : peer adjacency formed ok
vPC keep-alive status : peer is alive
Configuration consistency status : success
Per-vlan consistency status : success
Type-2 consistency status : success
vPC role : secondary, operational primary
Number of vPCs configured : 14
Peer Gateway : Enabled
Dual-active excluded VLANs : -
Graceful Consistency Check : Enabled
Auto-recovery status : Enabled, timer is off.(timeout = 240s)
Delay-restore status : Timer is off.(timeout = 30s)
Delay-restore SVI status : Timer is off.(timeout = 10s)
Operational Layer3 Peer-router : Disabled
Virtual-peerlink mode : Disabled
vPC Peer-link status
---------------------------------------------------------------------
id Port Status Active vlans
-- ---- ------ -------------------------------------------------
1 Po10 up 3-5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,39
,41,45,47,49,51,65-74,76,200-203,205-207,254,
299-322,360,400-401,500,610,700,800-801
vPC status
----------------------------------------------------------------------------
Id Port Status Consistency Reason Active vlans
-- ------------ ------ ----------- ------ ---------------
60 Po60 up success success 299
show spanning-tree vlan 299
VLAN0299
Spanning tree enabled protocol rstp
Root ID Priority 4395
Address 0023.04ee.be01
Cost 1
Port 4140 (port-channel45)
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Bridge ID Priority 33067 (priority 32768 sys-id-ext 299)
Address 0023.04ee.be0a
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Interface Role Sts Cost Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Po10 Desg FWD 1 128.4105 (vPC peer-link) Network P2p
Po20 Desg FWD 1 128.4115 (vPC) P2p
Po45 Root FWD 1 128.4140 (vPC) P2p
Po60 Desg FWD 1 128.4155 (vPC) P2p
Eth1/24 Desg FWD 2 128.24 P2p
07-17-2024 02:57 AM
60 Po60(SU) Eth LACP Eth1/29(D) Eth1/30(P) <<- this is N9K-1
60 Po60(SU) Eth LACP Eth1/29(D) Eth1/30(P) <<- this is N9K-2
interface GigabitEthernet0/0/2
no ip address
no ip redirects
no ip unreachables
no ip proxy-arp
negotiation auto
channel-group 60 mode active
interface GigabitEthernet0/0/3
no ip address
no ip redirects
no ip unreachables
no ip proxy-arp
negotiation auto
channel-group 60 mode active
On the router, two interfaces are used for the port channel that connects to both N9Ks, but I see four ports across the two N9Ks, where there should be two in total, one on each N9K.
MHM
07-18-2024 12:31 AM
Correct, but this isn't the reason for this behavior. We tried moving both ports from the router to a single N9K to see if it would load balance on a single switch. I can remove these ports from the config, but it doesn't change anything.
07-18-2024 08:19 PM
We have dual N9Ks in many locations, with a very basic config that does only basic switching. We have dual N9Ks with vPC port channels to other dual N9Ks, dual N9Ks to almost 100 Catalyst IOS switches, and this router, all with the same issue and always with outbound traffic.
07-19-2024 06:46 AM
I mentioned the ports because this could be a cabling issue.
Can you confirm that the router connects to each N9K vPC switch with one correct link?
MHM
07-20-2024 10:58 AM
07-20-2024 07:52 PM - edited 07-20-2024 07:56 PM
Thanks again for continued support. I went ahead and took out E1/29 from the port channel on both N9Ks. There is no IGP between us and the customer, we are a L2 transport for them to the Internet.
Switch 1:
show port-channel summary
Flags: D - Down P - Up in port-channel (members)
I - Individual H - Hot-standby (LACP only)
s - Suspended r - Module-removed
b - BFD Session Wait
S - Switched R - Routed
U - Up (port-channel)
p - Up in delay-lacp mode (member)
M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port- Type Protocol Member Ports
Channel
--------------------------------------------------------------------------------
60 Po60(SU) Eth LACP Eth1/30(P)
show port-channel traffic int por 60
NOTE: Clear the port-channel member counters to get accurate statistics
ChanId Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
60 Eth1/30 100.00% 100.00% 100.00% 100.00% 0.0% 0.0%
Switch 2:
show port-channel summary interface port 60
Flags: D - Down P - Up in port-channel (members)
I - Individual H - Hot-standby (LACP only)
s - Suspended r - Module-removed
b - BFD Session Wait
S - Switched R - Routed
U - Up (port-channel)
p - Up in delay-lacp mode (member)
M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port- Type Protocol Member Ports
Channel
--------------------------------------------------------------------------------
60 Po60(SU) Eth LACP Eth1/30(P)
show port-channel traffic int por 60
NOTE: Clear the port-channel member counters to get accurate statistics
ChanId Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
60 Eth1/30 100.00% 0.0% 100.00% 100.00% 0.0% 0.0%
I have also tried the following commands:
port-channel load-balance src-dst mac rotate 32
port-channel load-balance src-dst l4port
port-channel load-balance src-dst l4port rotate 32
port-channel load-balance src-dst ip symmetric
port-channel load-balance src l4port
port-channel load-balance src ip-l4port rotate 32
port-channel load-balance src mac rotate 32
These did not help at all. What is strange is that a bug ID, CSCvq26885, used to be posted, but it seems to be an internal ID that has been removed from the original posting.
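One possible reason none of these commands made a difference, consistent with the vPC explanation in this thread: after E1/29 was removed, each N9K has exactly one local member (Eth1/30) in Po60, so any hash has only one port to pick from. The Python sketch below shows this with a stand-in SHA-256 hash; `member_index` and the flow keys are hypothetical, and this is not the actual NX-OS hashing code.

```python
import hashlib


def member_index(key: str, local_members: int) -> int:
    """Pick a local member index for a flow key (stand-in hash)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % local_members


# Hypothetical flow keys; what goes into the key does not matter here.
keys = [f"flow-{i}" for i in range(1000)]

one_local_member = {member_index(k, 1) for k in keys}
two_local_members = {member_index(k, 2) for k in keys}

print("indices used with 1 local member: ", sorted(one_local_member))
print("indices used with 2 local members:", sorted(two_local_members))
```

With a single local member, the modulo always yields index 0, whichever fields or rotate value feed the hash.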
07-21-2024 03:52 AM
OK,
so there is a port-channel config with a specific VLAN on both N9Ks,
and there is an L3 port-channel config on the router.
Did you configure a default route on both N9Ks toward the L3 port-channel IP to reach the Internet?
MHM
07-21-2024 03:50 PM - edited 07-30-2024 08:33 PM
This is the basic setup. We have the polarization issue even between the 4 N9k's, always in the outbound direction towards the customer router.
07-21-2024 05:06 PM
You have four N9Ks,
in two vPC pairs.
Are you using the same vPC domain on both pairs?
For the port channel that connects the two N9K vPC pairs, can you check its status on all four N9Ks, and whether its STP state is FWD or BLK?
STP may be blocking one link and keeping the other, or LACP may not be working, leaving one link up (P) and the other suspended (s).
Please share the output here if you can.
Thanks
MHM
07-23-2024 12:27 AM
Mr. Hart,
Thanks for the explanation, that makes sense. We will start looking more closely at the Internet-side devices and then share our findings.
MHM, I will follow up with your questions soon.