08-30-2024 05:26 AM - edited 08-30-2024 05:33 AM
Hello, and sorry for my bad English.
I have a LAG/EtherChannel between my Cisco C3850 and two Aruba 8100 switches in a stack. I am having trouble because a lot of packets are being discarded, as you can see in the picture below.
My configuration on the Cisco side is:
interface Port-channel32
description aruba
switchport trunk native vlan 999
switchport mode trunk
switchport nonegotiate
interface TenGigabitEthernet1/0/39
description Aruba A8100
switchport trunk native vlan 999
switchport mode trunk
switchport nonegotiate
channel-group 32 mode active
!
interface TenGigabitEthernet1/0/40
description Aruba A8100
switchport trunk native vlan 999
switchport mode trunk
switchport nonegotiate
channel-group 32 mode active
and on the Aruba side:
interface lag 1 multi-chassis
description LACP-to-Coeur
no shutdown
no routing
vlan trunk native 999
vlan trunk allowed 1-2,5,15,17,21-22,25-26,45,51,54,56,61-62,70,89,100,102,104,110,999
lacp mode active
spanning-tree bpdu-filter
spanning-tree rpvst-filter
interface 1/1/47
description aggr LAG1 to core
no shutdown
lag 1
!
interface 1/1/48
description aggr LAG1 to core
no shutdown
lag 1
Do you think this configuration is bad?
09-04-2024 07:46 AM
What would be better, as to the STP variant to use, would probably be to run MST on the Cisco too.
Actually, since your topology shows a loop between the 3 logical switches, why you don't have an L2 loop is an interesting question. Unless you somehow have transient L2 loops, which would also be interesting.
BTW, I'm not against trying @MHM Cisco World's suggestion of taking down one of the two EtherChannels to see what happens. (I would normally expect an increased load with increased drops.)
09-04-2024 08:05 AM
The Aruba switches were configured by an external company. I see that the Aruba has RPVST capability:
8100-48X-E011-124(config)# spanning-tree mode
mstp Multiple spanning tree mode(Default)
rpvst Rapid PVST mode
So is the best way to change from MST to RPVST on the Aruba?
09-04-2024 08:31 AM - edited 09-04-2024 08:33 AM
What's the rest of the L2 infrastructure?
I.e., what other switches do the Cisco 3850s and Aruba 8100s connect to, switch<>switch?
Certainly, rapid-PVST would be simpler.
A good goal is that all switches sharing an L2 domain run the same STP variant.
PS:
Just saw @MHM Cisco World's latest reply.
He's 100% correct that, before changing the Aruba to its rapid-PVST, you should confirm it's compatible with Cisco's.
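If you do decide to align the STP variants, a minimal sketch of the two options (verify against your own STP design, region names, and VLAN-to-instance mappings before applying anything) would be roughly:
! Option A - run MST on the Cisco 3850s too (match region name/revision/instance mapping everywhere)
spanning-tree mode mst
!
! Option B - move the Arubas to rapid-PVST, per the CLI help you showed
spanning-tree mode rpvst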
09-04-2024 07:35 AM
". . . check if network be stable or not."
Is the network unstable now?
From the OP, the issue appears to be a transient high drop rate that's concurrent with high transmission loads. Are you making a case that the high transient loads are due to transient L2 loops?
08-30-2024 09:38 AM
What are the port-channel's member interface stats? (I'm wondering how well your load sharing is working.)
09-02-2024 05:50 AM
Hello, here are stats on the member interfaces (4 interfaces):
TenGigabitEthernet1/0/39 is up, line protocol is up (connected)
Hardware is Ten Gigabit Ethernet, address is 8024.8f7c.8527 (bia 8024.8f7c.8527)
Description: Aruba A8100-48X-E011-123
MTU 1500 bytes, BW 10000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive not set
Full-duplex, 10Gb/s, link type is auto, media type is SFP-10GBase-SR
input flow-control is on, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:08, output 00:00:00, output hang never
Last clearing of "show interface" counters 10w3d
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 24231515
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 2412000 bits/sec, 612 packets/sec
5 minute output rate 2066000 bits/sec, 582 packets/sec
52663866631 packets input, 73508693037430 bytes, 0 no buffer
Received 7353380 broadcasts (1553286 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 1553286 multicast, 0 pause input
0 input packets with dribble condition detected
9953822663 packets output, 8547270682732 bytes, 0 underruns
Output 44229090 broadcasts (0 multicasts)
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out
TenGigabitEthernet1/0/40 is up, line protocol is up (connected)
Hardware is Ten Gigabit Ethernet, address is 8024.8f7c.8528 (bia 8024.8f7c.8528)
Description: Aruba A8100-48X-E011-124
MTU 1500 bytes, BW 10000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 5/255, rxload 2/255
Encapsulation ARPA, loopback not set
Keepalive not set
Full-duplex, 10Gb/s, link type is auto, media type is SFP-10GBase-SR
input flow-control is on, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:00, output 00:00:04, output hang never
Last clearing of "show interface" counters 10w3d
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 598860244
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 110262000 bits/sec, 10666 packets/sec
5 minute output rate 196512000 bits/sec, 17708 packets/sec
71950330175 packets input, 96531463196932 bytes, 0 no buffer
Received 60760548 broadcasts (56798875 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 56798875 multicast, 0 pause input
0 input packets with dribble condition detected
48873720385 packets output, 55061793514407 bytes, 0 underruns
Output 42676441 broadcasts (0 multicasts)
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out
show interfaces Te2/0/37
TenGigabitEthernet2/0/37 is up, line protocol is up (connected)
Hardware is Ten Gigabit Ethernet, address is 7486.0ba1.48a5 (bia 7486.0ba1.48a5)
Description: Aruba A8100-48X-E011-123
MTU 1500 bytes, BW 10000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive not set
Full-duplex, 10Gb/s, link type is auto, media type is SFP-10GBase-SR
input flow-control is on, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:03, output 00:00:01, output hang never
Last clearing of "show interface" counters 10w3d
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 32429118
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 40795000 bits/sec, 5814 packets/sec
5 minute output rate 72727000 bits/sec, 6762 packets/sec
65314505813 packets input, 83103437657245 bytes, 0 no buffer
Received 5631590 broadcasts (1388926 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 1388926 multicast, 0 pause input
0 input packets with dribble condition detected
21834228968 packets output, 24627283272095 bytes, 0 underruns
Output 35396480 broadcasts (0 multicasts)
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out
show interfaces Te2/0/38
TenGigabitEthernet2/0/38 is up, line protocol is up (connected)
Hardware is Ten Gigabit Ethernet, address is 7486.0ba1.48a6 (bia 7486.0ba1.48a6)
Description: Aruba A8100-48X-E011-124
MTU 1500 bytes, BW 10000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive not set
Full-duplex, 10Gb/s, link type is auto, media type is SFP-10GBase-SR
input flow-control is on, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:04, output 00:00:04, output hang never
Last clearing of "show interface" counters 10w3d
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 2649752856
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 24644000 bits/sec, 2745 packets/sec
5 minute output rate 27632000 bits/sec, 3245 packets/sec
55200156641 packets input, 73933372003516 bytes, 0 no buffer
Received 4433600 broadcasts (1406606 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 1406606 multicast, 0 pause input
0 input packets with dribble condition detected
52053747777 packets output, 61789909460957 bytes, 0 underruns
Output 94955083 broadcasts (0 multicasts)
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out
09-02-2024 09:07 AM
That's helpful.
However, the OP has just Te1/0/39 and 40 shown as member ports. Now you show 4 ports. Are these all in the same port-channel, or two different port-channels? If the latter, which ports belong to which port-channels? (From the earlier PC32 stats I suspect two port-channels.) What's the LB algorithm being used? If two port-channels, why not just one port-channel?
Below is a spreadsheet of egress packets per interface and its overall drop rate percentage.
Interface | Total output drops | Packets output | Drop % | Description | Packets in
TenGigabitEthernet1/0/39 | 24,231,515 | 9,953,822,663 | 0.24% | Aruba A8100-48X-E011-123 | 52,663,866,631
TenGigabitEthernet1/0/40 | 59,886,024 | 48,873,720,385 | 0.12% | Aruba A8100-48X-E011-124 | 71,950,330,175
TenGigabitEthernet2/0/37 | 32,429,118 | 21,834,228,968 | 0.15% | Aruba A8100-48X-E011-123 | 65,314,505,813
TenGigabitEthernet2/0/38 | 2,649,752,856 | 52,053,747,777 | 4.84% | Aruba A8100-48X-E011-124 | 55,200,156,641
As I don't yet know the actual interface-to-port-channel mappings, I cannot say how well load sharing is happening, but overall drop percentages do vary per interface, as do the overall packets egressing each interface. (Note: all interface counters are the same age [good].)
One interface has a notably high overall drop percentage.
Is the Aruba device a wireless controller?
BTW, any visibility of traffic stats from Aruba to Cisco?
In my prior reply, I mentioned it may not be possible to totally eliminate drops (as, by design, TCP [and some other protocols] discovers maximum available bandwidth by exceeding the actual available bandwidth, often resulting in drops). The impact of this can be mitigated (again, possibly not eliminated) by dropping the fewest number of packets needed to obtain ideal flow rates (NB: this is a goal of RED, but it's tricky to get it "right") and/or by ensuring one flow's probing for maximum available bandwidth is not adverse to other concurrent flows (which it often is when a shared FIFO egress queue is being used [which is common]).
Also BTW, low-end 3K switches (like the 3850) often lack the hardware resources of higher-end models. So, like having more transit bandwidth, having a higher grade of switch often works better in demanding cases. That said, complex QoS (which, again, can be limited by the features of a switch) might be used to offset the lack of hardware capacity.
You mention an Aruba stack. From the ports provided, your Cisco switch is a dual 3850 stack? What particular 3850 model and OS?
09-02-2024 11:20 PM - edited 09-02-2024 11:36 PM
To help you understand, here is the infrastructure:
The core switch is two Cisco WS-C3850-48XS-S connected with StackWise, running 16.12.09 CAT3K_CAA-UNIVERSALK9.
I have 4 Aruba 8100s, stacked two by two with VSX.
I have 2 port-channels on the core switch: one port-channel with 4 ports to the first Aruba stack, and another port-channel with 4 ports connected to the other Aruba stack. What I posted before were the interface stats of all ports of one port-channel. For the LB algorithm, I see in the global config: port-channel load-balance src-dst-mac
I have a lot of dropped packets on the two port-channels connected to the Arubas.
The Arubas are connected to NetApp storage (not in production yet) and also to ESXi hypervisors (in production).
09-03-2024 03:26 PM
Do all the sources and destinations, for the traffic crossing between the Cisco 3850s and the Arubas, have a single-IP-to-single-MAC relationship on those transit VLANs? I.e., no hosts have multiple IPs, and no multiple hosts are behind an L3 interface MAC? If not, src-dst-ip might be a better LB choice.
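For reference, that change on the 3850 is a single global command (it affects every port-channel on the stack, so plan accordingly), something like:
port-channel load-balance src-dst-ip
!
! then verify what's active
show etherchannel load-balance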
How is the performance of PC33?
One reason I thought the four links you provided were two port-channels is that the sum of the drop stats for Te1/0/39 and 40 seems to be just a bit more than the drop stats you earlier provided for the PC32 interface, but when you sum up all four ports, the drops are much, much higher.(?)
Now that I know all the member ports for the PC, the LB stats are interesting. Ingress LB (from the Aruba) appears not too badly balanced:
Interface load % |
21.48% |
29.35% |
26.64% |
22.52% |
100.00% |
But not so for the 3850s:
Interface load % |
7.36% |
36.12% |
16.14% |
40.38% |
100.00% |
As these stats are across 10 plus weeks, they are poor LB ratios.
Do you know what the Aruba is using for its load sharing algorithm? (Choices?: lacp hash [l2-src-dst | l3-src-dst | l4-src-dst])
(Possibly the issue with the 3850 port ratios is that it is still using the older 3-bit hash, rather than the later 8-bit hash, and the Aruba is either using a larger hash and/or looking at more packet attributes.)
Since the 3850 LB ratios are so poor, showing one link averaging 40% of the load, it's no surprise that link also shows the highest drop percentage.
Since you're using 48XS models, any possibility of adding more member links or using the 40G?
I cannot say what the underlying architecture is for your 3850, but on the earlier 3750, Cisco eventually documented that it assigned buffer RAM to each bank of 24 edge ports and to the bank of 4 uplink ports. On those switches, to maximize the physical buffer available to a port, you could take into account how the port load was distributed across ports. (This consideration wasn't unique to the 3750s; similar considerations might apply to many network devices, so something similar might apply to the 3850s too.)
Cisco has a Tech Note Troubleshoot Catalyst 3850 Output Drops, but for your OS, that document has:
Note: 16.x.x and later QoS CLI command changes are documented in this guide Troubleshoot Output Drops on Catalyst 9000 Switches. This document is Catalyst 9000 Series, but shares the same ASIC as the 3850. Use this guide for 3850 on 16.x.x or later Cisco IOS® XE versions.
As I wrote earlier, with "complex" QoS configuration changes, I believe your drops might be further mitigated, but, if possible, I would suggest the simplest, and least likely to backfire, changes first. @MHM Cisco World suggested a multiplier change (as in the 9K document), but I would first like to see better load sharing ratios and more bandwidth.
If you want to pursue the QoS approach, we can use the 9K document, at least as a great source for what to examine stat wise.
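For what it's worth, the multiplier change mentioned above is, as I read the 9K document, a single global command along the lines of:
qos queue-softmax-multiplier 1200
which lets a port's soft queues borrow more of the shared buffer pool. Verify the exact syntax and range on your 16.12.09 image before applying it, and watch the queue stats afterwards.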
09-04-2024 06:14 AM - edited 09-04-2024 06:14 AM
On the Aruba side, the LACP hashing algorithm is l3-src-dst.
Statistics of the second port-channel:
show interface po33
Port-channel33 is up, line protocol is up (connected)
Hardware is EtherChannel, address is 8024.8f7c.8526 (bia 8024.8f7c.8526)
Description: Aruba salle D011
MTU 1500 bytes, BW 40000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 2/255, rxload 3/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 10Gb/s, link type is auto, media type is N/A
input flow-control is on, output flow-control is unsupported
Members in this channel: Te1/0/37 Te1/0/38 Te2/0/39 Te2/0/40
ARP type: ARPA, ARP Timeout 04:00:00
Last input 14w2d, output 00:00:01, output hang never
Last clearing of "show interface" counters 10w5d
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 51719398990
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 564159000 bits/sec, 50717 packets/sec
5 minute output rate 381640000 bits/sec, 37807 packets/sec
91663407339 packets input, 112771052335781 bytes, 0 no buffer
Received 46632379 broadcasts (9355096 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 9355096 multicast, 0 pause input
0 input packets with dribble condition detected
171377497808 packets output, 233351651050432 bytes, 0 underruns
Output 208725507 broadcasts (0 multicasts)
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out
"Since you're using 48XS models, any possibility of adding more member links or using the 40G?"
Not for the moment; I don't have 40G transceivers.
09-04-2024 07:28 AM
"On Aruba side, algorithm LACP is l3-src-dst"
Ah, well, since the Aruba's LB (L3) appears to be working better than Cisco's (L2), this would appear to confirm that using a similar algorithm on the Cisco side, as I suggested, may provide a better LB distribution too.
So, there's a drop issue on both Cisco<>Aruba Etherchannels, correct?
Okay, you don't have QSFP+ transceivers, but do the Aruba switches support them? If they do, are there free 40G ports on both the Cisco and the Aruba?
I assume your Cisco 40g ports are unused. If you have free 10g ports on the Arubas, you might use a 40g<>4x10g breakout cable.
Even if you don't have free 10g ports on the Aruba, there might be benefit to moving the 4 10g Cisco ports to such a 40g port (this assumes the 40g ports have more HW resources supporting them).
BTW, a 40g port, usually, performs better, possibly much better, than a 4x10g. What you lose is port redundancy.
Notwithstanding 40g options, do the Cisco and Aruba switches have free 10g ports allowing any increase to inter-switch bandwidth?
09-03-2024 07:35 AM - edited 09-03-2024 07:36 AM
Also, one basic question:
On the Aruba, we are configured like this:
vlan trunk allowed 1-2,5,15,17,21-22,25-26,45,51,54,56,61-62,70,89,100,102,104,110,999
And on the Cisco side, where the Aruba is connected, the configuration is:
switchport mode trunk
Do you think that on the Cisco side we have to configure switchport trunk allowed vlan 1-2,5,15,17,21-22,25-26,45,51,54,56,61-62,70,89,100,102,104,110,999,
or is it not a problem?
09-03-2024 09:46 AM
"do you think from Cisco side, we have to configure like switchport trunk allowed 1-2,5,15,17,21-22,25-26,45,51,54,56,61-62,70,89,100,102,104,110,999"
"have to", if Aruba works like Cisco, which I recall allowed blocks both ingress and egress, it shouldn't matter.
I.e. if prior is true, is should not be a problem (logically).
However, not knowing what other VLAN traffic might being sent down those links, only to be blocked by Aruba, it not only "wastes" bandwidth, but it could add, needlessly, to interface congestion, leading to, needless, extra drops.
I.e. so, yes, it could be a problem (physically).
What we don't know is what other VLANs are in play on the Cisco switches (possibly none), or, if there are some, how much traffic they would be adding to these interfaces. If you do decide to prune, a sketch follows.
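To mirror the Aruba allowed list on the Cisco side, a sketch (assuming Port-channel32 is the EtherChannel facing this Aruba pair; the member ports inherit it) would be:
interface Port-channel32
 switchport trunk allowed vlan 1-2,5,15,17,21-22,25-26,45,51,54,56,61-62,70,89,100,102,104,110,999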
09-04-2024 12:31 PM
Hi,
Can you please post the outputs when issuing the following commands:
show platform hardware fed switch 1 qos queue stats int te 1/0/39
show platform hardware fed switch 1 qos queue stats int te 1/0/40
show platform hardware fed switch 2 qos queue stats int te 2/0/37
show platform hardware fed switch 2 qos queue stats int te 2/0/38
Also, you MAY consider changing the load-distribution method from "src-dst-mac" to "src-dst-ip" or "extended src-ip src-mac dst-ip dst-mac" and checking whether there is any improvement in the output drops.
Thanks & Regards,
Antonin
09-04-2024 02:39 PM
Wow, I was unaware of the "extended" options! I was also unaware of all the additional non-extended hashing options on that platform (i.e. on the 3850 in 16.12.x - I haven't checked earlier IOS versions [possibly this started in 16.x.x, as perhaps did the 9K-like QoS too?]).
If I'm reading the documentation correctly, you can mix and match some different attributes using extended, but the non-extended choices are pretty comprehensive.
As to the suggestion to use "extended src-ip src-mac dst-ip dst-mac", I suspect that wouldn't provide a lot of extra variability, because I would suspect the src or dst IPs would map to the same MACs.
Given the additional options, beyond src-dst-ip, I would suggest either:
port-channel load-balance src-dst-mixed-ip-port !useful for UDP/TCP flows between two hosts
or
port-channel load-balance extended src-ip src-port dst-ip dst-port l3-proto !may help with non-UDP/TCP flows, between hosts
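Whichever method you pick, you can confirm what's active, and how a sample flow would hash, with something like the following (the test command's exact syntax can vary by platform/IOS version, so check it on your 16.12.09 image):
show etherchannel load-balance
test etherchannel load-balance interface port-channel 32 ip <src-ip> <dst-ip>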