01-13-2023 11:31 AM - edited 01-13-2023 11:31 AM
Hi all,
I hope I can draw on your expertise.
I'm currently in the middle of an installation where I'm having a weird issue. We have the following installed:
The network is bigger than this, but I'm going to start with the basics, since this is happening between all the switches with this configuration. As per the image, I have two Cisco 9300s that are running OSPF and routing several VLANs. Everything is working as it's supposed to, but my customer wants my company to guarantee that we are achieving 10Gbps throughput. The radios are Siklu radios running at 80 GHz and configured for a throughput of 10Gbps (according to Siklu they can achieve 9.5Gbps).
To execute the tests we are using a couple of EXFO Pro 1 testers with the EtherBERT software configured with a frame size of 1250, but I'm having some issues validating and certifying the links for the "10Gbps" speed. In essence, if I connect one tester on one side directly to the fibre that comes from the radio, and another tester on the other side directly to the radio, I'm able to pass 9.5Gbps without any issue, and I have a certificate stating that the BERT test passed. But if I then connect the radios to the switches as it's supposed to be, connect the EXFO testers to the switches and run exactly the same test, the testers start giving errors (Pattern Loss) and I start getting receive errors on the port, more precisely "symbolerr frames".
These errors normally indicate hardware-related issues, so I tried an SFP from FS.com and got the same result, then tried a generic compatible SFP and got the same, and finally tried an official Cisco SFP and got the same. I changed ports on the switch and got the same. Other switches in the network show exactly the same issue as these switches. It's impossible to have ten 9300 switches all failing in exactly the same way.
This type of issue only happens when I start to reach the limit (10Gbps); if I run 8 or even 9Gbps I don't have issues.
One thing I forgot to mention as well: the switches are running MACsec from point to point, passing through the radios, and everything connects and works fine. But I noticed one thing. When using the testers set to 9.0Gbps, the SFP port where the tester was connected was showing a controller utilization of 90, but on the port going to the radio I was seeing 92 (there was no other traffic passing to the radio). Then I disabled MACsec and saw that this wasn't happening anymore: I had 90 utilization on the port from the tester and 90 as well on the port to the radio. I know MACsec will increase the frame size, but will it increase the throughput like this?
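As a rough sanity check on the 90 versus 92 observation, the per-frame overhead can be worked out numerically. This is only a sketch under assumptions that are not confirmed anywhere in this thread: MACsec is assumed to add 32 bytes per frame (a 16-byte SecTAG with explicit SCI plus a 16-byte ICV), each frame is assumed to occupy an extra 20 bytes on the wire (preamble, SFD and inter-frame gap), and the utilization counters are assumed to count those bytes. The function names are made up for illustration.

```python
# Rough estimate of how MACsec expands on-the-wire utilization.
# ASSUMPTIONS (not from the original post):
#   - 32 bytes of MACsec overhead per frame (16-byte SecTAG + 16-byte ICV)
#   - 20 bytes of preamble/SFD/inter-frame gap per frame
#   - utilization counters include those bytes

PREAMBLE_AND_IFG = 20  # bytes: 7 preamble + 1 SFD + 12 inter-frame gap
MACSEC_OVERHEAD = 32   # bytes: assumed SecTAG (16) + ICV (16)


def macsec_utilization(plain_util_pct, frame_size, overhead=MACSEC_OVERHEAD):
    """Utilization (%) on the encrypted port for a given utilization (%)
    measured on the unencrypted (tester-facing) port."""
    plain_wire = frame_size + PREAMBLE_AND_IFG
    secure_wire = plain_wire + overhead
    return plain_util_pct * secure_wire / plain_wire


def max_plain_rate_gbps(link_gbps, frame_size, overhead=MACSEC_OVERHEAD):
    """Highest unencrypted rate (Gbps) that still fits in link_gbps
    once every frame has grown by the MACsec overhead."""
    plain_wire = frame_size + PREAMBLE_AND_IFG
    secure_wire = plain_wire + overhead
    return link_gbps * plain_wire / secure_wire


if __name__ == "__main__":
    # 90% on the tester port with 1250-byte frames
    print(round(macsec_utilization(90, 1250), 1))   # 92.3
    # Largest tester rate that fits in a 9.5 Gbps radio link
    print(round(max_plain_rate_gbps(9.5, 1250), 2)) # 9.27
```

Under these assumptions, a 2-point increase in utilization with MACsec enabled is about what the extra bytes per frame would predict, and a tester rate above roughly 9.27Gbps would no longer fit in a 9.5Gbps radio link once encrypted. It does not by itself explain why the counter that increments is a symbol error.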
I hope you can help me understand what's happening. This issue is delaying the delivery of the system to the customer, and I can't really understand what's going on.
Thank you
02-14-2023 02:14 PM - edited 02-14-2023 02:18 PM
Regarding output, @Leo Laohoo, everything was good: no errors or drops at all, only the input errors would increase. On the 9300L with 4x10Gb ports it would give CRC errors; on the 9300T with the 8-port SFP module it would give symbol errors, but no output errors. With only the EXFO traffic plus the normal control plane traffic, everything would be fine.
Could this be an IOS bug? With normal traffic we shouldn't be seeing symbol errors, which are reported as indicating L1 problems, when in this case that's obviously not the cause. This is a real pickle to understand and to explain to my managers.
Thank you for the help
02-14-2023 02:28 PM
It is beginning to look like a bug where the Total Output Drops (Deferred Frames) counter is being "confused" with Symbol Errors.
Raise a TAC Case with RTP/LATAM or EUMA.
02-15-2023 12:21 AM - edited 02-15-2023 12:31 AM
Thank you very much @Leo Laohoo! I'm going to try to start the process with TAC today. I say try, because my company doesn't have a contract with Cisco TAC, and to make matters worse I'm the only network engineer in the company, so to debate these things internally I only have myself. Thank you for all the support, and for letting me run this by you and the community.
Just out of curiosity, could QoS be doing this? I wondered if it could, but I tried without it and got the same result. I did have an interesting finding, though. All the production traffic is video traffic, which I mark with CS4, and on the output I have a certain policy. When I did all the tests I put the EXFO as best effort, since I don't want this traffic to go on top of the existing traffic. But when I marked the EXFO traffic as CS4, I started having excess deferred frames, and when I removed the marking I would have symbol errors. The symbol errors should never be present, but this got me intrigued: is my QoS badly configured?
The QoS configuration I have on site is below:
class-map match-any VIDEO_IN
 match access-group 100
class-map match-any VIDEO_OUT
 match dscp cs4
class-map match-any QOS_SERVER_OUT
 match dscp cs7
!
policy-map VIDEO_IN
 class VIDEO_IN
  set dscp cs4 (setting UDP traffic)
policy-map VIDEO_OUT
 class VIDEO_OUT
  priority level 2 percent 80
  queue-buffers ratio 30
 class QOS_SERVER_OUT
  priority level 1 percent 5
  queue-buffers ratio 10
What I need is for the server traffic to have priority above anything else, for the video traffic to have the second-highest priority, and for all the remaining traffic to be handled by the switch as best effort. But if the priority queues don't need their bandwidth, the default traffic should be able to use whatever bandwidth is available and not in use by the priority traffic.
Is the config above good for that? I couldn't understand why, when marking the traffic as CS4, I started getting excess deferred frames when the link was only at around 89/90%...
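One way to reason about the CS4 observation is to work out what the percentages in the policy would mean as absolute rates. The sketch below assumes that `priority level N percent P` caps that queue at P% of the port rate; that behaviour is an assumption here and should be checked against the platform's QoS documentation. The 10Gbps port rate, the dictionary and the function names are illustrative only.

```python
# ASSUMPTION (not confirmed in this thread): "priority level N percent P"
# limits the priority queue to P% of the port rate.

LINK_GBPS = 10.0  # assumed port speed

# Percentages taken from the VIDEO_OUT policy-map above
PRIORITY_PERCENT = {
    "QOS_SERVER_OUT": 5,   # priority level 1
    "VIDEO_OUT": 80,       # priority level 2
}


def queue_cap_gbps(class_name, link_gbps=LINK_GBPS):
    """Rate cap (Gbps) of a priority class under the assumption above."""
    return link_gbps * PRIORITY_PERCENT[class_name] / 100


def excess_gbps(class_name, offered_gbps, link_gbps=LINK_GBPS):
    """Traffic (Gbps) offered to a class beyond its assumed cap."""
    return max(0.0, offered_gbps - queue_cap_gbps(class_name, link_gbps))


if __name__ == "__main__":
    print(queue_cap_gbps("VIDEO_OUT"))    # 8.0
    print(excess_gbps("VIDEO_OUT", 9.0))  # 1.0
```

If that assumption holds, 9Gbps of tester traffic marked CS4 would exceed the 8Gbps available to VIDEO_OUT, whereas the same traffic left as best effort would not be subject to that cap, which could be one reason the deferred frames only appear with the CS4 marking.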
Thank you