04-09-2022 02:59 PM
Hi All,
I'm in the middle of a large (250+) 9300L switch refresh project for one of my customers and have noticed that on some of the new switch stacks the active, standby, or member switches randomly reboot.
Troubleshooting has led to the stacking cables / interfaces. Some stacks show no errors at all, but others do. All equipment is brand new. When I issue show switch stack-ports, the stack comes back with:
#sh sw stack-ports
Switch# Port1 Port2
----------------------------
1 OK OK
2 OK OK
But on the same stack I see lots of CRC errors. Example:
#sh sw stack-ports detail
1/1 is OK Loopback No
Cable Length 50cm Neighbor 2
Link Ok Yes Sync Ok Yes Link Active Yes
Changes to LinkOK 1
Five minute input rate 266 bytes/sec
Five minute output rate 928 bytes/sec
2383784958 bytes input
455333557 bytes output
CRC Errors
Data CRC 0
Ringword CRC 0
InvRingWord 0
PcsCodeWord 0
1/2 is OK Loopback No
Cable Length 50cm Neighbor 2
Link Ok Yes Sync Ok Yes Link Active Yes
Changes to LinkOK 1
Five minute input rate 2355 bytes/sec
Five minute output rate 1529 bytes/sec
2348706037 bytes input
97328466 bytes output
CRC Errors
Data CRC 281
Ringword CRC 318
InvRingWord 336
PcsCodeWord 11205406
2/1 is OK Loopback No
Cable Length 50cm Neighbor 1
Link Ok Yes Sync Ok Yes Link Active Yes
Changes to LinkOK 1
Five minute input rate 2791 bytes/sec
Five minute output rate 50554 bytes/sec
570971450 bytes input
18165147962 bytes output
CRC Errors
Data CRC 394
Ringword CRC 790
InvRingWord 528
PcsCodeWord 1002
2/2 is OK Loopback No
Cable Length 50cm Neighbor 1
Link Ok Yes Sync Ok Yes Link Active Yes
Changes to LinkOK 1
Five minute input rate 8 bytes/sec
Five minute output rate 0 bytes/sec
28637843 bytes input
320 bytes output
CRC Errors
Data CRC 0
Ringword CRC 0
InvRingWord 0
PcsCodeWord 0
I have not replaced the stacking cable between ports 1/2 and 2/1 yet. I am seeing these stack-port CRC errors on a lot of stacks.
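With this many stacks, it may help to pull the CRC counters out of the pasted output with a script rather than by eye. The following is only a minimal sketch in Python: it assumes the output looks exactly like the sample in this post (a "<switch>/<port> is ..." line followed by the four counter lines), and the function names parse_stack_ports and ports_with_errors are made up for illustration. Other software releases may format the output differently.

```python
import re

# Matches a port header such as "1/2 is OK Loopback No".
PORT_RE = re.compile(r"^\s*(\d+/\d+)\s+is\s+(\S+)")
# Matches the four counter lines listed under "CRC Errors".
COUNTER_RE = re.compile(
    r"^\s*(Data CRC|Ringword CRC|InvRingWord|PcsCodeWord)\s+(\d+)\s*$"
)


def parse_stack_ports(text):
    """Return {port: {counter_name: value}} from the pasted CLI output."""
    ports = {}
    current = None
    for line in text.splitlines():
        header = PORT_RE.match(line)
        if header:
            current = header.group(1)
            ports[current] = {}
            continue
        counter = COUNTER_RE.match(line)
        if counter and current is not None:
            ports[current][counter.group(1)] = int(counter.group(2))
    return ports


def ports_with_errors(ports):
    """Keep only the ports where at least one CRC counter is non-zero."""
    return {p: c for p, c in ports.items() if any(v > 0 for v in c.values())}


# Two ports copied from the output in this post.
SAMPLE = """\
1/1 is OK Loopback No
Cable Length 50cm Neighbor 2
CRC Errors
Data CRC 0
Ringword CRC 0
InvRingWord 0
PcsCodeWord 0
1/2 is OK Loopback No
Cable Length 50cm Neighbor 2
CRC Errors
Data CRC 281
Ringword CRC 318
InvRingWord 336
PcsCodeWord 11205406
"""

if __name__ == "__main__":
    for port, counters in ports_with_errors(parse_stack_ports(SAMPLE)).items():
        print(port, counters)
```

On the sample above this reports only port 1/2, which matches what the raw output shows.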
Any thoughts? Switch models used in the stacks: C9300L-48UXG-4X, C9300L-48P-4X and C9300L-24P-4X, using 50cm stack cables with a maximum of 4 switches per stack. Switch code is 17.03.04, CAT9K_IOSXE, INSTALL mode.
Cheers,
04-09-2022 11:31 PM
Hello,
According to the stacking guide:
CRC Errors
Different types of Cyclic Redundancy Check (CRC) errors that are seen on a stack interface:
Data CRC: Stack interface data CRC error
Ringword CRC: Stack interface ring word CRC error
InvRingWord: Stack interface invalid ring word error
PcsCodeWord: Stack interface Physical Coding Sublayer (PCS) error
These errors normally occur when a stack interface state changes due to a switchover or a switch reload. You can ignore such errors.
But when these error counters increase significantly or when they increase continuously over a period of time, check the stack cable for issues.
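Since the guide's advice hinges on whether the counters keep increasing, one way to check is to take two snapshots some time apart and compare them. The sketch below is an illustration in Python, not anything from the guide: the function names are hypothetical, the "before" values for port 1/2 are taken from the output earlier in this thread, and the "after" values are invented purely for the example. Treating a counter that went down as a reset (for example after a reload) is also an assumption.

```python
def counter_deltas(before, after):
    """Return {port: {counter: increase}} between two counter snapshots."""
    deltas = {}
    for port, counters in after.items():
        previous = before.get(port, {})
        port_delta = {}
        for name, value in counters.items():
            old = previous.get(name, 0)
            # Assumption: a value lower than before means the counters were
            # reset (e.g. by a reload), so count from zero in that case.
            port_delta[name] = value - old if value >= old else value
        deltas[port] = port_delta
    return deltas


def still_increasing(deltas):
    """List the ports where any counter grew between the snapshots."""
    return sorted(p for p, d in deltas.items() if any(v > 0 for v in d.values()))


# "before" uses values from the thread; "after" is hypothetical.
before = {
    "1/2": {"Data CRC": 281, "Ringword CRC": 318,
            "InvRingWord": 336, "PcsCodeWord": 11205406},
    "2/2": {"Data CRC": 0, "Ringword CRC": 0,
            "InvRingWord": 0, "PcsCodeWord": 0},
}
after = {
    "1/2": {"Data CRC": 290, "Ringword CRC": 318,
            "InvRingWord": 340, "PcsCodeWord": 11209000},
    "2/2": {"Data CRC": 0, "Ringword CRC": 0,
            "InvRingWord": 0, "PcsCodeWord": 0},
}

if __name__ == "__main__":
    deltas = counter_deltas(before, after)
    print(still_increasing(deltas))
    print(deltas["1/2"])
```

A port that shows a one-off count from a switchover but a zero delta afterwards would fall under the "you can ignore such errors" case; a port that keeps appearing in the increasing list points at the cable or the stack interface.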
04-10-2022 12:54 AM - edited 04-10-2022 12:55 AM
Hello
@Jay233 wrote:
In the middle of a large (250+) 9300L switch refresh project for one of my customers and have noticed that on some of the new switch stacks, active, standby or member switches randomly reboot?
What software version are you running? This is possibly a bug.
04-10-2022 06:21 AM
Hi Paul,
Switch code 17.03.04 CAT9K_IOSXE INSTALL mode.
I also got a reason code for the reboot: Critical process sif_mgr fault on rp_0_0 (rc=143)
03-20-2023 05:44 AM
I have the same issue.
One stack with four members, as follows:
Switch Ports Model       SW Version  SW Image     Mode
------ ----- -----       ----------  ----------   ----
*    1 41    C9300-24T   17.06.04    CAT9K_IOSXE  INSTALL
     2 41    C9300-24T   17.06.04    CAT9K_IOSXE  INSTALL
     3 32    C9300X-12Y  17.06.04    CAT9K_IOSXE  INSTALL
     4 32    C9300X-12Y  17.06.04    CAT9K_IOSXE  INSTALL
The stack would randomly reboot after several hours of running fine. All "show stack" commands reported no issues, except "show switch stack-ports detail", which indicated a high CRC error count on some cables / interfaces.
Cables were re-seated, swapped for new ones and moved around to see if the issue followed the cable or not. The results were sporadic. Cables were always finger tight on the screws and correctly seated.
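The swap test described above boils down to a simple rule: after swapping the cable from an erroring link with the cable from a clean link, see which link shows growing errors. Here is a rough sketch of that rule in Python. The function name swap_verdict, the way links are written as pairs of port names, and the example port pairs are all assumptions for illustration. It cannot settle a case like this one where the results are sporadic, which it reports as inconclusive.

```python
def swap_verdict(suspect_link, other_link, erroring_after):
    """Classify a cable swap test.

    suspect_link   -- link (pair of ports) that showed CRC errors before the swap
    other_link     -- clean link whose cable was exchanged with the suspect one
    erroring_after -- set of links whose CRC counters increase after the swap
    """
    on_suspect = suspect_link in erroring_after
    on_other = other_link in erroring_after
    if on_other and not on_suspect:
        # The errors moved together with the cable.
        return "follows cable"
    if on_suspect and not on_other:
        # The errors stayed on the same ports despite a different cable.
        return "stays with ports"
    # Errors on both links or on neither: the test does not decide anything.
    return "inconclusive"


if __name__ == "__main__":
    suspect = ("1/2", "2/1")  # hypothetical erroring link
    clean = ("2/2", "3/1")    # hypothetical clean link
    print(swap_verdict(suspect, clean, {clean}))
    print(swap_verdict(suspect, clean, {suspect}))
    print(swap_verdict(suspect, clean, {suspect, clean}))
```

Only links with counters that are still increasing after the swap should go into erroring_after, since old counts left over from before the swap would skew the verdict.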
At one point the stack reloaded and a switch completely died. It is now no longer responding to console input (the console light isn't lit at the rear) and no LEDs are on the front yet the PSU is up and fans spin up.
The faulty switch was replaced with an RMA'd unit but the CRC issues persist.
The switches were in a production environment but were too unstable, so they are now on a bench for testing. Given that the fault can take hours to manifest as a reboot, this is an absolute pain to diagnose.
03-20-2023 02:47 PM
Post the complete output to the following commands:
08-01-2023 11:58 AM
Check the order of your stacking cables. Make sure they are connected as documentation states. I had hooked mine up in the reverse order and devices would randomly reboot.