
C9300L-XXX Stacking Issue (Unexpected/Random reloads)

Jay233
Level 1

Hi All,

 

I'm in the middle of a large (250+) 9300L switch refresh project for one of my customers and have noticed that on some of the new switch stacks, the active, standby or member switches randomly reboot.

Troubleshooting has led to the stacking cables / interfaces. Some stacks have no errors at all, but some do. All equipment is brand new. When I issue #show switch stack-ports, the stack comes back with:

 

#sh sw stack-ports
Switch# Port1 Port2
----------------------------
1 OK OK
2 OK OK

 

But on the same stack I see lots of CRC errors. Example:

 

#sh sw stack-ports detail
1/1 is OK Loopback No
Cable Length 50cm Neighbor 2
Link Ok Yes Sync Ok Yes Link Active Yes
Changes to LinkOK 1
Five minute input rate 266 bytes/sec
Five minute output rate 928 bytes/sec
2383784958 bytes input
455333557 bytes output
CRC Errors
Data CRC 0
Ringword CRC 0
InvRingWord 0
PcsCodeWord 0
1/2 is OK Loopback No
Cable Length 50cm Neighbor 2
Link Ok Yes Sync Ok Yes Link Active Yes
Changes to LinkOK 1
Five minute input rate 2355 bytes/sec
Five minute output rate 1529 bytes/sec
2348706037 bytes input
97328466 bytes output
CRC Errors
Data CRC 281
Ringword CRC 318
InvRingWord 336
PcsCodeWord 11205406
2/1 is OK Loopback No
Cable Length 50cm Neighbor 1
Link Ok Yes Sync Ok Yes Link Active Yes
Changes to LinkOK 1
Five minute input rate 2791 bytes/sec
Five minute output rate 50554 bytes/sec
570971450 bytes input
18165147962 bytes output
CRC Errors
Data CRC 394
Ringword CRC 790
InvRingWord 528
PcsCodeWord 1002
2/2 is OK Loopback No
Cable Length 50cm Neighbor 1
Link Ok Yes Sync Ok Yes Link Active Yes
Changes to LinkOK 1
Five minute input rate 8 bytes/sec
Five minute output rate 0 bytes/sec
28637843 bytes input
320 bytes output
CRC Errors
Data CRC 0
Ringword CRC 0
InvRingWord 0
PcsCodeWord 0


I have not replaced the stacking cable between ports 1/2 and 2/1 yet. I am seeing these stacking CRC errors on a lot of stacks.
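With this many stacks, it may help to script the check rather than read each output by eye. The following is a minimal sketch, not a Cisco tool: it assumes the text of "show switch stack-ports detail" has already been captured to a string in the layout shown above, and the function names `parse_stack_ports` and `ports_with_errors` are made up for illustration. It pulls the four CRC counters per stack port and lists the ports where any counter is non-zero.

```python
import re

# Counter names as they appear in the output pasted above.
CRC_FIELDS = ("Data CRC", "Ringword CRC", "InvRingWord", "PcsCodeWord")


def parse_stack_ports(text):
    """Return {port: {counter: value}} from captured command output."""
    ports = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"^(\d+/\d+) is ", line)  # e.g. "1/2 is OK Loopback No"
        if header:
            current = header.group(1)
            ports[current] = {}
            continue
        if current is None:
            continue
        for field in CRC_FIELDS:
            m = re.match(rf"^{re.escape(field)}\s+(\d+)$", line)
            if m:
                ports[current][field] = int(m.group(1))
    return ports


def ports_with_errors(ports):
    """Keep only the ports where at least one CRC counter is non-zero."""
    return {p: c for p, c in ports.items() if any(c.values())}


# Trimmed excerpt of the output above (ports 1/1 and 1/2 only).
SAMPLE = """\
1/1 is OK Loopback No
Cable Length 50cm Neighbor 2
CRC Errors
Data CRC 0
Ringword CRC 0
InvRingWord 0
PcsCodeWord 0
1/2 is OK Loopback No
Cable Length 50cm Neighbor 2
CRC Errors
Data CRC 281
Ringword CRC 318
InvRingWord 336
PcsCodeWord 11205406
"""

if __name__ == "__main__":
    for port, counters in ports_with_errors(parse_stack_ports(SAMPLE)).items():
        print(port, counters)
```

On the full output above this would flag 1/2 and 2/1, which are the two ends of the same cable.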

Any thoughts? Switch combinations used in the stacks: C9300L-48UXG-4X, C9300L-48P-4X and C9300L-24P-4X, using 50cm stack cables and a maximum of 4 switches per stack. Switch code: 17.03.04 CAT9K_IOSXE, INSTALL mode.

 

Cheers,

   

 

 

6 Replies

Hello,

 

According to the stacking guide:

 

CRC Errors

Different types of Cyclic Redundancy Check (CRC) errors that are seen on a stack interface:

Data CRC: Stack interface data CRC error

Ringword CRC: Stack interface ring word CRC error

InvRingWord: Stack interface invalid ring word error

PcsCodeWord: Stack interface Physical Coding Sublayer (PCS) error

These errors normally occur when a stack interface state changes due to a switchover or a switch reload. You can ignore such errors.

But when these error counters increase significantly or when they increase continuously over a period of time, check the stack cable for issues.
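Since the guide says the counters matter when they keep increasing rather than when they are merely non-zero, one way to apply that is to compare two readings taken some time apart. This is only a sketch under assumptions: the function name `crc_deltas` is hypothetical, the counters are assumed to be already collected into dictionaries keyed by stack port, and the second reading below uses invented numbers purely for illustration (only the first reading comes from the output in the original post).

```python
def crc_deltas(before, after):
    """Return per-port counter growth between two readings.

    Counters that stayed the same or went down (for example after a
    reload cleared them) are left out of the result.
    """
    growth = {}
    for port, counters in after.items():
        previous = before.get(port, {})
        changed = {
            name: value - previous.get(name, 0)
            for name, value in counters.items()
            if value > previous.get(name, 0)
        }
        if changed:
            growth[port] = changed
    return growth


# First reading: values from the original post.
first = {
    "1/1": {"Data CRC": 0, "PcsCodeWord": 0},
    "1/2": {"Data CRC": 281, "PcsCodeWord": 11205406},
}
# Second reading: hypothetical values, not real measurements.
second = {
    "1/1": {"Data CRC": 0, "PcsCodeWord": 0},
    "1/2": {"Data CRC": 281, "PcsCodeWord": 11205999},
}

if __name__ == "__main__":
    print(crc_deltas(first, second))
```

A port that shows up here on repeated readings, with no switchover or reload in between, is the case where the guide suggests checking the stack cable.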

Hello


@Jay233 wrote:

In the middle of a large (250+) 9300L switch refresh project for one of my customers and have noticed that on some of the new switch stacks, active, standby or member switches randomly reboot?


What software are you running? This is possibly a bug.


Please rate and mark as an accepted solution if you have found any of the information provided useful.
This then could assist others on these forums to find a valuable answer and broadens the community’s global network.

Kind Regards
Paul

Hi Paul,

 

Switch code 17.03.04 CAT9K_IOSXE INSTALL mode.

 

Also got the reason code for the reboot: Critical process sif_mgr fault on rp_0_0 (rc=143)

weafawefa
Level 1

I have the same issue. 
One stack with 4 members as follows:

Switch  Ports  Model       SW Version  SW Image     Mode
------  -----  -----       ----------  ----------   ----
*    1  41     C9300-24T   17.06.04    CAT9K_IOSXE  INSTALL
     2  41     C9300-24T   17.06.04    CAT9K_IOSXE  INSTALL
     3  32     C9300X-12Y  17.06.04    CAT9K_IOSXE  INSTALL
     4  32     C9300X-12Y  17.06.04    CAT9K_IOSXE  INSTALL

The stack would randomly reboot after several hours of being fine. All "show stack" commands reported no issues, except "show switch stack-ports detail", which indicated a high CRC error count on some cables / interfaces.

Cables were re-seated, swapped for new ones and moved around to see if the issue followed the cable or not. The results were sporadic. Cables were always finger tight on the screws and correctly seated. 

At one point the stack reloaded and a switch completely died. It is now no longer responding to console input (the console light isn't lit at the rear) and no LEDs are on the front yet the PSU is up and fans spin up.

The faulty switch was replaced with an RMA'd unit but the CRC issues persist. 

The switches were in a production environment but were too unstable and are now on a bench for testing. Given the nature of the fault, which can take hours to manifest as a reboot, this is an absolute pain to diagnose.

 

Post the complete output of the following commands:

  1. sh version
  2. dir flash:core
  3. dir crashinfo:
  4. dir flash:tracelogs
  5. sh log on switch active uptime detail

Check the order of your stacking cables. Make sure they are connected as the documentation states. I had hooked mine up in the reverse order and the devices would randomly reboot.
