06-27-2023 11:53 AM
Hi! I'm currently facing a network issue between Cisco Nexus 93108 switches (9.3.11) with an LACP port-channel configuration.
The replica below helps us reproduce the issue (using 4x Nexus 9000v, qcow2 image version 9.3.9). As you can see, my network topology spans 2 datacenters.
The configuration in each datacenter is similar: 1 vPC domain between 2 switches (pretty basic: 1 peer-link, 1 peer-keepalive, no layer 3, no peer-gateway) and a WAN link through vPC Po100 between both datacenters that allows ALL VLANs to transit.
The vPC configuration is consistent and works well: when SWC001 is directly linked to SWC011 and SWC002 to SWC012, everything runs smoothly with no issues. The thing is, in reality there's an ISP between both datacenters, and that's where the pain begins... All we know from the ISP is that they use a QinQ configuration in their own network; as a datacenter client we don't know which configuration or devices they're using.
Also, before using Nexus, we had old HP core switches and we didn't set any particular configuration regarding QinQ. After this HP => Nexus migration, everything was fine except the status of port-channel 100 (i.e. the extended LAN between both DCs).
To simulate an ISP in between, I set up a basic GNS3 switch configured with QinQ on e0 and e1 (VLAN 1 and Ethertype 0x88A8).
In my example Po100 is configured with only one physical interface (Eth1/13), so the current configuration of Eth1/13 on all 4 switches is:
version 9.3(9) Bios:version

interface Ethernet1/13
  lacp rate fast
  switchport mode trunk
  spanning-tree port type edge trunk
  spanning-tree bpdufilter enable
  channel-group 100 mode active

interface port-channel100
  switchport mode trunk
  spanning-tree port type edge trunk
  spanning-tree bpdufilter enable
  no lacp suspend-individual
With the channel-group in active/active mode (after enabling feature lacp), the port-channel protocol is LACP but Po100 stays Switched/Down:
SWC001# sh port-channel summary interface port-channel 100
Flags:  D - Down        P - Up in port-channel (members)
        I - Individual  H - Hot-standby (LACP only)
        s - Suspended   r - Module-removed
        b - BFD Session Wait
        S - Switched    R - Routed
        U - Up (port-channel)
        p - Up in delay-lacp mode (member)
        M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port-       Type     Protocol  Member Ports
      Channel
--------------------------------------------------------------------------------
100   Po100(SD)   Eth      LACP      Eth1/13(I)
If we set "channel-group 100 mode on" on both sides, the port-channel protocol is NONE, BUT the interface is up and Po100 is Switched/Up:
FRHD01SWC001(config-if)# sh port-channel summary interface port-channel 100
Flags:  D - Down        P - Up in port-channel (members)
        I - Individual  H - Hot-standby (LACP only)
        s - Suspended   r - Module-removed
        b - BFD Session Wait
        S - Switched    R - Routed
        U - Up (port-channel)
        p - Up in delay-lacp mode (member)
        M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port-       Type     Protocol  Member Ports
      Channel
--------------------------------------------------------------------------------
100   Po100(SU)   Eth      NONE      Eth1/13(P)
Now, if we look at the LACP counters, both sides send LACPDUs, but neither side receives any.
SWC001# show lacp counters
NOTE: Clear lacp counters to get accurate statistics
------------------------------------------------------------------------------
                    LACPDUs         Markers/Resp LACPDUs
Port              Sent    Recv     Recv   Sent    Pkts Err
------------------------------------------------------------------------------
port-channel1
Ethernet1/11       28      24       0      0        0
port-channel100
Ethernet1/13       12      0        0      0        0
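To spot this "sent but never received" pattern across many ports, the counter rows can be checked with a small script. This is only a rough sketch: it assumes the column order shown in the output above (port, sent, received), and the function name `one_way_lacp_ports` and the `SAMPLE` text are made up for illustration, not part of any Cisco tooling.

```python
import re

# Counter rows copied from the output above; header lines are ignored anyway.
SAMPLE = """\
port-channel1
Ethernet1/11       28      24       0      0        0
port-channel100
Ethernet1/13       12      0        0      0        0
"""


def one_way_lacp_ports(text):
    """Return (port_channel, member, sent) for members that send LACPDUs
    but have received none. Assumes columns: port, sent, recv, ..."""
    flagged = []
    current_po = None
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 1 and fields[0].lower().startswith("port-channel"):
            # A line holding only the port-channel name starts a new group.
            current_po = fields[0]
        elif (
            len(fields) >= 3
            and re.fullmatch(r"Ethernet\d+(/\d+)+", fields[0])
            and fields[1].isdigit()
            and fields[2].isdigit()
        ):
            sent, recv = int(fields[1]), int(fields[2])
            if sent > 0 and recv == 0:
                flagged.append((current_po, fields[0], sent))
    return flagged


print(one_way_lacp_ports(SAMPLE))
# [('port-channel100', 'Ethernet1/13', 12)]
```

Here only Ethernet1/13 in port-channel100 is flagged; the peer-link member Ethernet1/11 both sends and receives, so it is left out.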
Finally, when I look at sh vpc brief, Po100 status is down whereas consistency is success:
vPC Peer-link status
---------------------------------------------------------------------
id    Port   Status Active vlans
--    ----   ------ -------------------------------------------------
1     Po1    up     1,5,10-11,13-14,22-23,30,41,44,50-51,55,80-81,90,
                    100,110-114,120-121,130-135,140-141,150-154,
                    160-174,176,180,201-205,230,256-257,999

vPC status
----------------------------------------------------------------------------
Id    Port          Status Consistency Reason                Active vlans
--    ------------  ------ ----------- ------                ---------------
100   Po100         down*  success     success               -
We tried different configurations in GNS3 to narrow down where the issue could be (active/active, active/passive, or on/on), but we still have no idea what causes it. We also had a look at the spanning-tree part, but it seems fine. It seems that as soon as anything is connected between the 2 Nexus, LACP fails, even though the link may be up (no LACPDUs are received).
Has anyone already had an issue like this, with or without QinQ in between?
(PS: I know GNS3 is an emulator, but at the moment it's the best way to test configuration outside of production, and it's a nice tool to show you the topology.)
06-27-2023 04:00 PM
If you remove the SW, then Po100 interconnecting the two sites DC1 and DC2 must be UP and the port must be "P".
06-27-2023 11:49 PM
Hi
I didn't mention it, but yes, it works that way; that was my first try in GNS3. Unfortunately this is not the reality: as I wrote, as soon as we have any device between the DCs, the Po won't come up.
06-28-2023 05:07 AM
Friend, it is not an issue of vPC, nor of GNS3.
LACP needs to see LACP frames from the same MAC. When you use two SWs, you have two MACs; to make your network work you need to merge both SWs into one virtual SW via VSS or a stack.
Or, for the interconnect between DCs, ask the ISP whether they support mLAG. mLAG can merge two ISP SW/R into ONE virtual device, and hence your Po between the two sites comes UP and works.
06-28-2023 05:14 AM
"the LACP need to see LACP frame from the same MAC, when you use two SW meaning you have two MAC, to make your network work you need to merge both SW to one virtual SW via VSS or stack."
With our old HP core switches, the configuration was basic (1x LACP with 2 interfaces, as you can see in the screenshot) and everything worked fine. I mean, I'd like to know what could cause the difference in behaviour between the HP core switches and the Nexus, as the configuration is pretty simple (from an interface perspective).
(About the DC and ISPs, I'll keep that in mind, but unfortunately I know they won't change anything, as the datacenter hosts many clients.)
06-28-2023 05:19 AM - edited 06-28-2023 05:24 AM
Does the HP core work fine with the same SP?
show lacp counters
show lacp neighbor
I need to see both from the real network.
06-28-2023 05:45 AM - edited 06-28-2023 05:49 AM
Yes, the HP core switch works well with the same service provider here; that's why we don't understand why we have an issue with Nexus. Here's a screenshot from the HP core switch, plus another one showing LACP neighbors and counters from Nexus (unfortunately from GNS3 at the moment...).
06-28-2023 08:14 AM
For HP, LACP is both sent and received, which is why it works fine. I don't clearly understand the screenshot you shared, but if the neighbor system-id ends with f0 00 on every interface, then LACP sees the same neighbor on all interfaces.
For the Nexus in GNS3, you can see the partner system-id is 0 0 0 0, and the LACP counters show a few LACPDUs sent without anything received.
So, as I mentioned before, it depends on the SP.
NOTE: in GNS3, try removing the SW and check the LACP counters and the LACP neighbor system-id.
06-29-2023 02:06 AM
Yes, I agree: it works on HP because both sides receive LACPDUs from each other, and that's the issue on Nexus 9000v: they send but never receive LACPDUs. Still, in GNS3 it works when you connect 2 Nexus without any equipment in between, just as it works with 2 real Nexus directly connected. However, I put a switch or a third Nexus between them to check why they can't receive LACPDUs.
06-29-2023 02:21 AM
Nexus-1xSW-Nexus <<- this is what you run in your lab. This will never work: the SW in the middle never passes LACP from one side to the other.
Nexus-1xCSR1000-Nexus <<- this can work if we configure a bridge domain on the CSR1000 (l2vpn), which lets LACP pass from one interface to the other.
Nexus-2xCSR1000-Nexus <<- this works if both CSR1000s are configured with mLAG and with a bridge domain.
You configured the two DCs, but you forgot how the SP is configured.
In the real network HP works because the SP surely runs l2vpn, which passes LACP from one interface to the other.
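One likely reason a plain switch in the middle does not pass LACP: LACPDUs are sent to the Slow Protocols multicast address 01:80:C2:00:00:02 (Ethertype 0x8809), which sits in the reserved block 01:80:C2:00:00:00 to 01:80:C2:00:00:0F that a standards-compliant 802.1D bridge does not relay. The sketch below only illustrates that address check; the function name `is_bridge_filtered` is made up for this example, and it says nothing about what the ISP's devices or a GNS3 switch actually do with such frames.

```python
LACP_DST_MAC = "01:80:c2:00:00:02"  # Slow Protocols multicast address used by LACP


def is_bridge_filtered(mac):
    """Return True if the destination MAC falls in the reserved block
    01:80:C2:00:00:00 - 01:80:C2:00:00:0F, which a standard 802.1D
    bridge consumes or drops instead of forwarding."""
    octets = [int(part, 16) for part in mac.replace("-", ":").split(":")]
    if len(octets) != 6:
        raise ValueError("expected a 6-octet MAC address: %r" % mac)
    return octets[:5] == [0x01, 0x80, 0xC2, 0x00, 0x00] and octets[5] <= 0x0F


print(is_bridge_filtered(LACP_DST_MAC))         # True: LACPDUs stop at the first bridge
print(is_bridge_filtered("ff:ff:ff:ff:ff:ff"))  # False: broadcast is flooded as usual
```

This is why the path between the two sites has to carry these frames transparently (for example through the l2vpn mentioned above) for the Nexus on each side to see the other's LACPDUs.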
Then what can I do in this case in my lab?
1- Use one SW between the two DCs
2- Configure LACP on the SW for each side
2-A Make sure the DCs and the SW use the same STP mode
2-B Make sure the SW sends and receives LACP
Friend, this summarizes what you are facing.