
Symbol Error when reaching the 10Gbps limit of the interface

Othacon
Level 1

Hi all,

I hope I can draw on your expertise.

I'm currently in the middle of an installation where I'm having a weird issue. We have the following installed:

Capture Cisco 9300 Radios.png

The network is bigger than this, but I'll start with the basics, since this is happening between all the switches with this configuration. As per the image, I have two Cisco 9300s that are running OSPF and routing several VLANs. Everything is working as it's supposed to, but my customer wants my company to guarantee that we are achieving 10Gbps throughput. The radios are Siklu radios running at 80 GHz and configured for a throughput of 10Gbps (according to Siklu they can achieve 9.5Gbps). To execute the tests we are using a pair of EXFO Pro 1 testers with the EtherBERT software configured with a frame size of 1250, but I'm having trouble validating and certifying the links for the "10Gbps" speed.

If I connect one tester on one side directly to the fibre that comes from the radio, and another tester on the other side directly to the radio, I'm able to pass 9.5Gbps without any issue, and I have a certificate stating that the BERT test passed. But if I then connect the radios to the switches as they are supposed to be, connect the EXFO testers to the switches and run exactly the same test, the testers start giving errors (Pattern Loss) and I start getting receive errors on the port, more precisely "symbolerr frames".

These errors normally indicate hardware-related issues, so I tried an SFP from FS.com and got the same result, then tried a generic compatible SFP and got the same, and finally tried an official Cisco SFP and got the same. I changed the switch ports and got the same. Other switches in the network show exactly the same issue as these switches. It's impossible to have ten 9300 switches all doing exactly the same thing.

This type of issue only happens when I start to reach the limit (10Gbps); if I do 8 or even 9Gbps I don't have issues.

One thing I forgot to mention: the switches are running MACsec point to point through the radios, and everything connects and works fine. But I noticed one thing. With the testers set to 9.0Gbps, the controller utilization on the SFP port where the tester was connected showed 90, but on the port going to the radio I was seeing 92 (there was no other traffic going to the radio). When I disabled MACsec this no longer happened: I had 90 utilization on the port from the tester and 90 on the port to the radio. I know MACsec increases the frame size, but would it increase the throughput like this?
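
A rough way to sanity-check the 90 versus 92 reading is to scale the tester's rate by the extra bytes MACsec puts on every frame. The sketch below is only an estimate under stated assumptions: the 32-byte MACsec overhead is the figure quoted later in this thread, the 20 bytes of preamble, start-of-frame delimiter and inter-frame gap are standard Ethernet, and the function name is made up for illustration. Whether the tester and the switch counters include those 20 bytes is not confirmed here, but the ratio comes out at roughly 2.5% either way.

```python
# Bytes on the wire per frame that are not part of the frame itself:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
ETHERNET_WIRE_OVERHEAD = 20


def utilization_with_macsec(tester_util_pct, frame_bytes, macsec_bytes=32):
    """Scale a utilization figure by the extra bytes MACsec adds per frame.

    macsec_bytes=32 is the per-frame overhead quoted in this thread
    (an assumption here, not a measured value).
    """
    plain = frame_bytes + ETHERNET_WIRE_OVERHEAD
    secured = plain + macsec_bytes
    return tester_util_pct * secured / plain


radio_side = utilization_with_macsec(90.0, 1250)
print(f"{radio_side:.1f}%")  # prints 92.3%
```

With 1250-byte frames, 90% offered on the tester port works out to about 92% on the MACsec port, which is in line with the reading described above. The frame rate is unchanged; only the number of bytes per frame on the encrypted link grows.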

 

I hope you can help me understand what's happening. This issue is delaying the delivery of the system to the customer, and I can't really understand what's going on.

 

Thank you


balaji.bandi
Hall of Fame

Until you post results here, we can't really guess what happened in transit. What config does the switch have?

I have done the same test many times, over 10G DWDM instead of wireless. I am able to get 9.68Gbps (with L2 security on the provider side); if I do MACsec, the max I get is 9.2Gbps because of some overheads.

I used 9K jumbo frames; the switch has just a basic config, with the switch port at layer 2.

Cisco 9300 UXM - 10G-SR SFP / 10G-LRM

 

 

BB

***** Rate All Helpful Responses *****

How to Ask The Cisco Community for Help

Thank you for the reply @balaji.bandi. Below is the edited config of the switch.

!
ip routing
!
qos queue-softmax-multiplier 1200
vtp domain x.x.x.x
vtp mode transparent
!
transceiver type all
 monitoring
!
vlan 30
 name MISC
!
vlan 40
 name MISC
!
vlan 50
 name MANAGEMENT
!
vlan 999
 name SPARE
!
vlan 4010
 name MISC
!
vlan 4011
 name MISC
!
interface TenGigabitEthernet1/1/1
 description LINK TO CISCO 9300 LEFT
 switchport trunk native vlan 4010
 switchport trunk allowed vlan 4010
 switchport mode trunk
 switchport nonegotiate
 switchport port-security maximum 2
 switchport port-security violation restrict
 switchport port-security mac-address sticky
 switchport port-security
 mtu 1532
 cts manual
  sap pmk xxxx
 spanning-tree guard loop
 service-policy output VIDEO_OUT
!
interface TenGigabitEthernet1/1/2
 description LINK TO CISCO 9300 RIGHT
 switchport trunk native vlan 4011
 switchport trunk allowed vlan 4011
 switchport mode trunk
 switchport nonegotiate
 switchport port-security maximum 2
 switchport port-security violation restrict
 switchport port-security mac-address sticky
 switchport port-security
 mtu 1532
 cts manual
  sap pmk xxxxx
 spanning-tree guard loop
 service-policy output VIDEO_OUT
!
interface Vlan1
 no ip address
 shutdown
!
interface Vlan30
 description MISC
 ip address x.x.x.x x.x.x.x
 ip pim sparse-mode
 spanning-tree guard root
 ip access-group VLAN30_CLAMP in
 service-policy input VIDEO_IN
!
interface Vlan40
 description MISC
 ip address x.x.x.x x.x.x.x
 ip pim sparse-mode
 spanning-tree guard root
 ip access-group VLAN40_CLAMP in
 service-policy input VIDEO_IN
!
interface Vlan50
 description MANAGEMENT
 ip address x.x.x.x x.x.x.x
 ip pim sparse-mode
 spanning-tree guard root
!
interface Vlan4010
 description MISC
 ip address x.x.x.x x.x.x.x
 ip pim sparse-mode
 ip ospf authentication key-chain MISC
 ip ospf network point-to-point
!
interface Vlan4011
 description
 ip address x.x.x.x x.x.x.x
 ip pim sparse-mode
 ip ospf authentication key-chain MISC
 ip ospf network point-to-point
!
router ospf 1
 auto-cost reference-bandwidth 10000
 passive-interface Vlan30
 passive-interface Vlan40
 passive-interface Vlan50
 network x.x.x.x x.x.x.x area 1
 network x.x.x.x x.x.x.x area 1

 

The switches are C9300-24T running IOS XE version 17.06.03.

The EXFO is connected to port Te1/1/8, which is configured as an access port for VLAN 50.

Unfortunately I can't get the controller-side interface counters from the switch anymore, since the switch was rebooted, but it was reporting thousands of symbolerr frames and the same number of receive errors.

The problem I have is that the customer wants to see the "10Gbps" no matter what, and we need a really, really good reason if we don't achieve it... and this is becoming a headache. I was able to get a stable 9.4Gbps, but he is going to come back to me and ask why we don't get 9.5Gbps. It's only 100Mbps, but this is a really difficult customer, and if we only reached 9.2 he would never accept it unless we had good documentation supporting it...

Thank you

When you add config complexity, the overhead increases, so you will get less bandwidth.

Make the config simple: an access port in VLAN 50, without any additional config, and test. What is the outcome?

I used iperf3 with 4 streams and consistently got 9.5 to 9.8.

 

https://www.cisco.com/c/en/us/support/docs/switches/catalyst-9300-switch/216236-troubleshoot-output-drops-on-catalyst-90.html


Thank you @balaji.bandi. I'm going to install two new switches without passing through the radios, switch to switch, first with and then without config, and see what I get.

As mentioned, my issue is my customer. He doesn't see things that way, unfortunately: he specced MACsec and all the bells and whistles, but he wants 10Gbps, and if we can't achieve it we need to show why in writing. I think I'll need to open a Cisco TAC case to get this explanation, as well as an official explanation for the symbolerr frames. Even if the link can't support the throughput, I shouldn't be seeing this kind of error, correct?

Thank you

It's your customer, so you have to justify technically what is feasible and possible.

Sure, TAC can give you the right suggestion. As for demarcation, TAC only covers what concerns Cisco products.

 


Indeed, and this kind of thing needs to come directly from the manufacturer, unfortunately. From us, the customer won't accept it... the customer in question has a CCNP, and he really doubts what we tell him.

The symbol errors, though, are doing my head in; I don't have an explanation for those...

Leo Laohoo
Hall of Fame

Post the complete output of the command "sh interface <PORT>".

 

Please see below @Leo Laohoo. It's not from the switch I was testing today, but from another switch with exactly the same output. I rebooted the switch I was working on and lost the counters.

Capture cisco stats.PNG

Thank you

Check for MTU mismatch.

Thank you for the reply @Leo Laohoo, but the MTU configured on the interfaces is 1532 on both sides. According to Cisco, MACsec adds 32 bytes on top; to account for that I set the MTU of the MACsec interfaces to 1532, and both are configured that way. I can't see the mismatch, hmm.

Thank you
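
The MTU arithmetic in this exchange can be checked with a few lines. This is a minimal sketch, assuming the 32-byte MACsec overhead quoted above and treating the tester's 1250-byte frame, conservatively, as if all of it were payload; the helper name is hypothetical.

```python
MACSEC_OVERHEAD = 32  # bytes per frame, the figure quoted in this thread


def fits_interface_mtu(payload_bytes, interface_mtu, overhead=MACSEC_OVERHEAD):
    """Return True if payload plus MACsec overhead fits within the interface MTU."""
    return payload_bytes + overhead <= interface_mtu


# A full 1500-byte payload with MACsec fits exactly in the configured 1532.
print(fits_interface_mtu(1500, 1532))  # True

# The 1250-byte test frames are far below the limit in either case.
print(fits_interface_mtu(1250, 1532))  # True
print(fits_interface_mtu(1250, 1500))  # True
```

Under these assumptions the 1250-byte test traffic stays within the limit whether the interface MTU is 1532 or the default 1500, so this check alone would not account for errors that only show up near line rate.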

 


@Othacon wrote:
According to Cisco, MACsec adds 32 bytes on top; to account for that I set the MTU of the MACsec interfaces to 1532, and both are configured that way.


I always associate symbol errors with an MTU mismatch. Can you try changing the MTU down to 1468 (1500-32) and see if the errors are still incrementing?

Thank you @Leo Laohoo, I will try it. I won't be on site next week to test, but as soon as I'm there I'll try it and post the results.

This is a late reply, but I was finally able to do all the tests @Leo Laohoo.

I had a switch that was giving CRC errors instead of symbol errors. I tried everything, from SFPs to changing ports, and nothing helped. I grabbed a new 9300, tested it connected back to back over fibre with the receiving switch, and was able to reach 9.7Gbps with MACsec enabled. I took that switch to the site of the one giving me CRC errors, connected it to the radio, ran the EXFO tester, and was able to get 9.3Gbps between the switches (9300s); the radio is limited to 9.55Gbps. All happy, I replaced the switch and moved all the connections from the old one to the new one, then ran the same test, but this time instead of CRC errors I started to get symbol errors again...

Intrigued, and since I had tested this same switch successfully, I started to shut down the ports that had production traffic (traffic that amounted to around 200Mbps), and to my surprise the errors stopped completely. So basically, with the production traffic of around 200Mbps I can only reach around 8.2Gbps before it starts giving symbol errors; the controller port utilisation shows around 8.8Gbps, and if I increase the rate further I start getting symbol errors. If I remove this traffic, I can reach 9.3Gbps without any errors whatsoever.
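
To put these numbers side by side, here is a rough capacity budget. It is a sketch under assumptions, not a diagnosis: it uses the 32-byte MACsec overhead quoted earlier in the thread, the standard 20 bytes of preamble and inter-frame gap, the 1250-byte test frame, and it treats the ~200Mbps of production traffic as a steady load; the function name is made up for illustration.

```python
ETHERNET_WIRE_OVERHEAD = 20  # preamble + start-of-frame delimiter + inter-frame gap
MACSEC_OVERHEAD = 32         # bytes per frame, the figure quoted in this thread


def max_tester_rate_gbps(link_capacity_gbps, other_traffic_gbps=0.0,
                         frame_bytes=1250):
    """Highest tester rate that still fits on the MACsec link, per this simple model."""
    plain = frame_bytes + ETHERNET_WIRE_OVERHEAD
    secured = plain + MACSEC_OVERHEAD
    return (link_capacity_gbps - other_traffic_gbps) * plain / secured


print(round(max_tester_rate_gbps(10.0), 2))       # 9.75, back to back over fibre
print(round(max_tester_rate_gbps(9.55), 2))       # 9.32, through the radio
print(round(max_tester_rate_gbps(9.55, 0.2), 2))  # 9.12, radio plus ~200Mbps production
```

The first two estimates are close to the 9.7Gbps and 9.3Gbps actually measured. The third is not: this model would allow roughly 9.1Gbps alongside 200Mbps of steady production traffic, while errors were seen from about 8.2Gbps, so the per-frame overhead alone does not explain that case.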

This is truly weird: how can symbol errors increment when some production traffic is present, while with only the tester's data, completely maxing out the link, the test is successful without any issues whatsoever... I'm truly confused and don't know how to approach this... Any ideas please?

Thank you


@Othacon wrote:
So basically, with the production traffic of around 200Mbps I can only reach around 8.2Gbps before it starts giving symbol errors

Interesting result.  

Are you also getting any Total Output Drops?