06-13-2024 08:59 AM
Hi, we have a Cisco 6880-X core switch connected to a 1 Gbps WAN circuit. We are now seeing packet discards on the interface connected to the WAN circuit. The maximum transmit rate reached 350 Mbps according to SolarWinds monitoring. Can anyone please advise why transmitted packets start to drop when the circuit bandwidth is 1 Gbps and the C6880-X interface is a Tx port with a 1G SFP? What is the maximum throughput of the C6880-X? Thanks in advance!
06-13-2024 10:29 AM
You may not get 100% of 1 Gbps because of protocol overhead, but you should be able to reach at least 80-90% of the interface speed if this is a plain Layer 2 configuration.
Which Sup card do you have and which IOS code is it running? How loaded is your line card? Is that blade fully populated?
How is your port configured? Do you see any packet drops when you issue show interface x/x? Has the port negotiated 1 Gbps full duplex?
If you connect a PC directly to the ISP, what speed do you get?
Check these troubleshooting tips:
https://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-switches/12027-53.html
06-13-2024 07:20 PM
Thanks @balaji.bandi for your advice! The sup module PID is C6880-X-LE-SUP, running IOS v15.5. The interface is running at full duplex and 1G speed: Full-duplex, 1000Mb/s, media type is 1000BaseLH
interface config
06-14-2024 12:15 AM
When you are connected to the ISP, why do you have a NAT inside config, and why is the MTU 1560?
Check the example below of how the INSIDE and OUTSIDE interfaces are configured when using Cat 6K switches:
Could you post show run with the confidential information removed?
I suspect you may have some kind of QoS configuration. Also post show interface x/x (full output, to see what kind of drops the interface is recording).
Have you tested connecting a PC directly to the ISP? What throughput do you get?
06-13-2024 10:32 AM - edited 06-13-2024 03:44 PM
Very likely, the max transmission rate on a 6880 gig interface is gig, both short term and sustained.
If you're seeing drops, a common cause is oversubscription of the interface, i.e. sending more than a gig to it. The interface will queue the excess, but there are limits to how much can be queued before the excess is dropped.
As to why SolarWinds is showing a transmit rate of 350 Mbps, likely that's an average rate over time (5 minutes?). By definition, an interface always sends frames at line rate. Short-term micro bursts (microseconds to milliseconds) will often overflow the queue but not be "seen" in longer-term averages, which are often multi-second to multi-minute.
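To illustrate the point about averages hiding micro bursts, here is a rough sketch in Python. It is a coarse 1 ms step model, not a model of the 6880's actual queueing, and every number in it (2 Gbps arriving for 5 ms out of every 20 ms, a 200 KB egress buffer, a 30 second measurement window) is an assumption picked for illustration.

```python
def simulate_bursts(line_rate_bps=1_000_000_000,
                    offered_burst_bps=2_000_000_000,
                    burst_ms=5, period_ms=20,
                    queue_limit_bytes=200_000,
                    total_ms=30_000):
    """Step a 1 ms clock and return (average_tx_bps, dropped_bytes).

    All defaults are illustrative assumptions, not measured values.
    """
    drain_per_ms = line_rate_bps // 8 // 1000      # bytes the port can send per ms
    offer_per_ms = offered_burst_bps // 8 // 1000  # bytes arriving per ms in a burst
    queue = sent = dropped = 0
    for t in range(total_ms):
        if t % period_ms < burst_ms:               # traffic only arrives during bursts
            queue += offer_per_ms
        tx = min(queue, drain_per_ms)              # the port sends at line rate
        queue -= tx
        sent += tx
        if queue > queue_limit_bytes:              # tail drop whatever does not fit
            dropped += queue - queue_limit_bytes
            queue = queue_limit_bytes
    average_tx_bps = sent * 8 * 1000 // total_ms
    return average_tx_bps, dropped


avg_bps, dropped_bytes = simulate_bursts()
print(f"average tx: {avg_bps / 1e6:.0f} Mbps, dropped: {dropped_bytes} bytes")
```

With these assumed numbers the 30 second average works out to 330 Mbps even though the port is transmitting at line rate whenever it has data, and output drops still accumulate during every burst.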
06-13-2024 07:23 PM
Thanks @Joseph W. Doherty. It is possible that the root cause is micro bursts. The load interval we configured on the interface is 30 seconds rather than 5 minutes.
load-interval 30
06-14-2024 02:03 AM
Firstly, Solarwinds may be using its own load interval timer, not the value (30 seconds) set on the interface.
Secondly, micro bursts are generally not visible at 30 seconds. Again, they occur on microsecond to millisecond timescales (did you see the second reference I provided in my prior reply?).
BTW, from your later posted interface config, this is some form of private WAN link?
Can you have Solarwinds plot drops? Too many drops will reduce throughput performance although I suspect that's not the case.
If you want to verify your WAN can provide gig throughput, use a UDP traffic generator. With it you may be able to show that the interface can send at gig and, if it's a p2p private link, that the far side receives gig. Note: without QoS, such testing tends to disrupt concurrent prod traffic.
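For reference, a paced UDP sender can be sketched in a few lines of Python. This is only a minimal illustration of the idea: the function name and parameters are made up for this example, a single Python process is unlikely to reach a full gig, and a dedicated tool such as iperf3 is the usual choice for a real test.

```python
import socket
import time


def send_udp(host, port, rate_bps, duration_s, payload_bytes=1400):
    """Send UDP datagrams at roughly rate_bps for duration_s seconds.

    Returns the number of payload bytes handed to the socket.
    Hypothetical helper for illustration; pacing is a simple busy-wait.
    """
    payload = b"\x00" * payload_bytes
    interval = payload_bytes * 8 / rate_bps   # seconds between datagrams
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        start = time.perf_counter()
        next_send = start
        while True:
            now = time.perf_counter()
            if now - start >= duration_s:
                break
            if now >= next_send:
                sent += sock.sendto(payload, (host, port))
                next_send += interval
    return sent
```

A call such as `send_udp("192.0.2.10", 5001, 500_000_000, 10)` (address and port are placeholders) would then be compared against the byte count seen by a receiver on the far side, and against the output drop counter on the switch interface.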
06-13-2024 03:24 PM
@Herman2018 wrote:
The max transmit rate reached 350 Mbps based on SolarWinds monitoring.
The 6840/6880 series family can do line rate at 1- and 10 Gbps.
If SolarWinds reports a maximum rate of 350 Mbps, this means the upstream provider is unable to support 1 Gbps.
06-13-2024 03:51 PM
"If Solarwinds report maximum rate of 350 Mbps, this means the upstream provider is unable to support 1 Gbps."
Certainly possible but wouldn't account for OP noting drops on their interface.
06-13-2024 04:00 PM
@Joseph W. Doherty wrote:
but wouldn't account for OP noting drops on their interface.
That depends on whether the "packet drops" @Herman2018 reported are actually "Total Output Drops", which I suspect is the case.
06-13-2024 06:25 PM
Yes, the C6880-X interface shows "Total Output Drops" when you run the command "sh int Tx/x/x", and this is also reflected in SolarWinds, which shows packets being discarded on that interface.
06-13-2024 06:22 PM
Thanks @Leo Laohoo for your advice. The packets are being dropped at the C6880-X, not at the upstream device, so it does not seem to be a service provider issue. Is the queue on the C6880-X not large enough to hold more packets?
06-14-2024 08:15 AM
Hello @Herman2018 ,
There is flow control to be checked.
If the upstream device sends PAUSE frames, the Cat6880 cannot send out packets, so they have to wait in the software queue, which increases the probability of them being dropped.
The Cat 6880 cannot have performance issues on a simple 1GE port, but it can be driven to drop by the upstream device.
Each PAUSE frame stops transmission for a short time.
If so, I agree with @Leo Laohoo that the circuit or service is likely not a full 1 GE.
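To put a number on "a short time": in IEEE 802.3x a PAUSE frame carries a 16-bit pause time counted in quanta of 512 bit times, so the longest hold-off a single frame can request depends on link speed. A small Python sketch of that arithmetic (the function name is just for illustration):

```python
def pause_duration_s(quanta, link_bps):
    """Seconds of transmit hold-off requested by one PAUSE frame.

    quanta: 16-bit pause_time field (0..65535); one quantum is 512 bit times.
    """
    if not 0 <= quanta <= 0xFFFF:
        raise ValueError("pause_time is a 16-bit field")
    return quanta * 512 / link_bps


# Longest pause a single frame can request on a 1 Gbps link, in milliseconds.
print(pause_duration_s(0xFFFF, 1_000_000_000) * 1000)
```

At 1 Gbps that maximum is about 33.5 ms per frame, and the sender of the PAUSE can keep refreshing it, which is plenty of time for an egress queue to fill.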
Hope to help
Giuseppe
06-14-2024 10:09 AM
Oh, excellent point if PAUSE frames are being used, although that's unusual usage because it can cause the issues @Giuseppe Larosa describes. (Also why the later "data center" PAUSE Ethernet variants can be selective about what's paused.)
That aside, I think it unlikely the problem is with the provider. Don't misunderstand: a provider could certainly have capacity issues that don't allow you to obtain your full gig, assuming that's supposed to be available, but such issues tend to cause drops within the provider's network, not on the CE interface to the PE. The exception is PAUSE frames. You can easily check whether the 6880 interface is enabled to accept them (off by default, as I recall), and if it is enabled, I believe you might get a count of them too.
You noted your interface is running at a 30 second load interval and, as I noted, SolarWinds is possibly using an even longer one. Either can show a lower throughput average than you're actually obtaining during short intervals.
One factoid not described is how many drops you're getting, and how often they are occurring. Such information can tell much about what's happening. BTW, a really high drop rate can drive average throughput to the floor. One of the biggest throughput killers of TCP is transmission timeouts.
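As a rough feel for how sensitive TCP is to drops, the well-known Mathis approximation bounds steady-state throughput of a single Reno-style flow at about (MSS / RTT) x (C / sqrt(p)), with C = sqrt(3/2). It models loss recovered by fast retransmit rather than by timeouts (timeouts make things worse), and the MSS, RTT and loss rates below are assumed values, not measurements from this circuit.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate upper bound on one TCP flow's throughput (Mathis et al.)."""
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    c = math.sqrt(3 / 2)
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


# Assumed example: 1460-byte MSS, 20 ms RTT, a range of loss rates.
for p in (0.0001, 0.001, 0.01):
    print(p, round(mathis_throughput_bps(1460, 0.020, p) / 1e6, 1), "Mbps")
```

The takeaway is only the shape of the curve: each 100x increase in loss rate cuts the per-flow ceiling by 10x.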
Also BTW, I'm a big fan of QoS (a tool for obtaining best/selective performance pushing traffic across a network), but one overlooked packet type, not usually mentioned for special QoS processing, is TCP ACKs, regardless of application. Specifically, trying to ensure they are not dropped.
Anyway, my browser's AI has this to say:
TCP (Transmission Control Protocol) is a transport-layer protocol that ensures reliable data transfer between devices over IP networks. One of the key mechanisms used by TCP to ensure reliable data transfer is the transmission timeout, also known as the Retransmission Timeout (RTO). The RTO is a timer that is started when a segment is sent by the sender, and it is used to determine when to retransmit a segment if no acknowledgment is received.
How TCP Transmission Timeout Works
When a segment is sent by the sender, the RTO timer is started. Before any round-trip measurement is available the RTO uses an initial value (3 seconds in the older RFC 2988, 1 second in the current RFC 6298). If no acknowledgment is received for the segment before the timer expires, the segment is retransmitted. The timer is doubled after each retransmission, up to a maximum number of retransmissions that depends on the implementation.
The RTO is adjusted on the fly to match the characteristics of the connection by using Smoothed Round Trip Time (SRTT) calculations. SRTT is a measure of the average round-trip time between the sender and receiver. The RTO is calculated as a function of SRTT, which ensures that the timer is adjusted to match the normal delay of the connection.
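The SRTT-based calculation described above can be sketched as follows, using the constants from RFC 6298 (alpha = 1/8, beta = 1/4, K = 4, initial and minimum RTO of 1 second). The class and method names are made up for illustration, clock granularity is ignored, and real TCP stacks often use different minimums.

```python
class RtoEstimator:
    """Minimal RFC 6298-style retransmission timeout estimator (sketch)."""

    ALPHA = 1 / 8   # gain for SRTT
    BETA = 1 / 4    # gain for RTTVAR
    K = 4           # RTTVAR multiplier

    def __init__(self, initial_rto=1.0, min_rto=1.0, max_rto=60.0):
        self.srtt = None
        self.rttvar = None
        self.min_rto = min_rto
        self.max_rto = max_rto
        self.rto = initial_rto

    def on_rtt_sample(self, rtt):
        """Update SRTT/RTTVAR from a new RTT measurement (seconds)."""
        if self.srtt is None:
            # First measurement seeds both estimators.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated first, using the previous SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        rto = self.srtt + self.K * self.rttvar
        self.rto = min(max(rto, self.min_rto), self.max_rto)
        return self.rto

    def on_timeout(self):
        """Exponential backoff: double the RTO after a retransmission timeout."""
        self.rto = min(self.rto * 2, self.max_rto)
        return self.rto
```

With the default 1 second floor, a path with a 100 ms RTT still waits a full second before the first timeout retransmission and then 2 s, 4 s and so on, which is why a few timeouts can flatten average throughput.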
Factors Affecting TCP Transmission Timeout
Several factors can affect the TCP transmission timeout, including:
TCP Transmission Timeout in Practice
In practice, TCP transmission timeout can have a significant impact on network performance. For example:
Conclusion
In conclusion, TCP transmission timeout is a critical mechanism used by TCP to ensure reliable data transfer over IP networks. Understanding how TCP transmission timeout works and the factors that affect it is essential for optimizing network performance and troubleshooting network issues.
06-24-2024 07:49 PM
@Giuseppe Larosa @Joseph W. Doherty, thank you both for your advice. I believe the ISP just applied QoS and set the bandwidth to 1G. Regarding flow control, normally we just leave the default setting, right? How do I check flow control? Can you please advise? Thanks.