9200PXG Stack Standby Switch Reboots (Last reload reason: "Critical pr

mrkaufman
Level 1

I wanted to bring to your attention an issue we've been experiencing with our current setup. At the moment, our system is running version 17.3.5, and we've encountered intermittent reboots specifically on the standby switch within our 9200PXG stack. The Last reload reason indicated is "Critical process sessmgrd fault on rp_0_0 (rc=134)."

This unexpected behavior has prompted us to investigate further, and we are reaching out to inquire if anyone has encountered a similar issue or possesses insights into what might be causing this. Any information or guidance would be immensely helpful.

Thanks.

balaji.bandi
Hall of Fame

Check the crash logs on the flash, and also check the output of the command below:

# show platform hardware slot switch <X> R0 soft-error statistics

This looks like a bug to me. Upgrade to 17.9.3 or higher and check whether the reboots stop.

BB

Leo Laohoo
Hall of Fame

Post the complete output to the following commands: 

dir crashinfo:
dir crashinfo:tracelogs | exclude gz
dir flash:core

My plan at this point is to upgrade the software, but I would rather know why this is happening. Here is the info requested. I included the output for both the primary and the secondary, since the secondary is the one crashing.

elk-north-left#dir crashinfo:
Directory of crashinfo:/

7329 drwx 57344 Nov 16 2023 16:41:18 -06:00 tracelogs
15 -rw- 7160063 Nov 15 2023 23:37:52 -06:00 elk-north-left_1_RP_0_trace_archive_3-20231115-233745.tar.gz
14 -rw- 7122492 Nov 12 2023 06:22:36 -06:00 elk-north-left_1_RP_0_trace_archive_2-20231112-062230.tar.gz
13 -rw- 7157270 Nov 9 2023 23:01:16 -06:00 elk-north-left_1_RP_0_trace_archive_1-20231109-230108.tar.gz
11 -rw- 6089556 Oct 7 2023 23:01:13 -05:00 elk-north-left_1_RP_0_trace_archive_0-20231007-230106.tar.gz
12 -rw- 0 Sep 21 2019 16:07:05 -05:00 koops.dat

825638912 bytes total (742420480 bytes free)
elk-north-left#dir crashinfo:tracelogs | exclude gz
Directory of crashinfo:/tracelogs/

7367 -rw- 10 Oct 6 2023 12:08:36 -05:00 timestamp
7366 -rw- 22089 Oct 6 2023 12:08:36 -05:00 dmesg

825638912 bytes total (742420480 bytes free)
elk-north-left#dir flash:core
Directory of flash:/core/

64800 -rw- 1 Nov 16 2023 16:38:24 -06:00 .callhome
40481 drwx 4096 Dec 12 2022 09:09:06 -06:00 modules

1956839424 bytes total (885035008 bytes free)

And from the secondary:

 

elk-north-left#dir crashinfo-5:
Directory of crashinfo-5:/

7329 drwx 16384 Nov 16 2023 16:43:01 -06:00 tracelogs
15 -rw- 5429320 Nov 15 2023 23:37:48 -06:00 elk-north-left_5_RP_0_trace_archive_0-20231115-233745.tar.gz
14 -rw- 20757055 Nov 12 2023 06:22:47 -06:00 elk-north-left_5_RP_0-system-report_5_20231112-062235-CST.tar.gz
13 -rw- 5660022 Nov 9 2023 23:01:11 -06:00 elk-north-left_5_RP_0_trace_archive_0-20231109-230108.tar.gz
11 -rw- 19895465 Oct 7 2023 23:01:24 -05:00 elk-north-left_5_RP_0-system-report_5_20231007-230114-CST.tar.gz
12 -rw- 0 Sep 21 2019 16:07:05 -05:00 koops.dat

825753600 bytes total (721420288 bytes free)
elk-north-left#dir crashinfo-5:tracelogs | exclude gz
Directory of crashinfo-5:/tracelogs/

7367 -rw- 35998 Nov 12 2023 06:22:56 -06:00 shutdown_rp0.log
7330 -rw- 162308 Nov 12 2023 06:22:55 -06:00 shutdown_journal_rp0.log
7380 -rw- 382 Nov 12 2023 06:22:31 -06:00 shutdown_cc0.log
7353 -rw- 383 Nov 12 2023 06:22:30 -06:00 shutdown_fp0.log
7364 -rw- 10 Oct 4 2023 12:44:09 -05:00 timestamp
7352 -rw- 22337 Oct 4 2023 12:44:09 -05:00 dmesg

825753600 bytes total (721420288 bytes free)
elk-north-left#dir flash-5:core
Directory of flash-5:/core/

64801 -rw- 1 Nov 16 2023 16:41:14 -06:00 .callhome
40481 drwx 4096 Dec 12 2022 19:28:14 -06:00 modules

1957167104 bytes total (1356857344 bytes free)
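
For anyone skimming a listing like this, a rough Python sketch for picking out the system-report archives might help. It assumes that a `system-report` archive in `crashinfo` marks a reload event, while `trace_archive` files are routine log rotations. That is an assumption on my part, not something confirmed in this thread. The sample lines are copied from the `dir crashinfo-5:` output above; `find_system_reports` is a hypothetical helper name.

```python
import re

# Sample lines copied from the `dir crashinfo-5:` output in this thread.
DIR_OUTPUT = """\
15 -rw- 5429320 Nov 15 2023 23:37:48 -06:00 elk-north-left_5_RP_0_trace_archive_0-20231115-233745.tar.gz
14 -rw- 20757055 Nov 12 2023 06:22:47 -06:00 elk-north-left_5_RP_0-system-report_5_20231112-062235-CST.tar.gz
13 -rw- 5660022 Nov 9 2023 23:01:11 -06:00 elk-north-left_5_RP_0_trace_archive_0-20231109-230108.tar.gz
11 -rw- 19895465 Oct 7 2023 23:01:24 -05:00 elk-north-left_5_RP_0-system-report_5_20231007-230114-CST.tar.gz
12 -rw- 0 Sep 21 2019 16:07:05 -05:00 koops.dat
"""

# One `dir` entry: index, permissions, size, date and time, UTC offset, file name.
LINE_RE = re.compile(
    r"^\s*(?P<index>\d+)\s+(?P<perms>\S+)\s+(?P<size>\d+)\s+"
    r"(?P<date>\w{3}\s+\d{1,2}\s+\d{4}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<tz>[+-]\d{2}:\d{2})\s+(?P<name>\S+)$"
)


def find_system_reports(dir_output):
    """Return (timestamp, size_in_bytes, file_name) for each system-report archive."""
    reports = []
    for line in dir_output.splitlines():
        match = LINE_RE.match(line)
        if match and "system-report" in match.group("name"):
            reports.append(
                (match.group("date"), int(match.group("size")), match.group("name"))
            )
    return reports


for timestamp, size, name in find_system_reports(DIR_OUTPUT):
    print(timestamp, size, name)
```

On the listing above this picks out the Oct 7 and Nov 12 archives on switch 5, which are the ones worth attaching to a TAC case.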

mrkaufman
Level 1

Listed below are the directories you requested. The first set is from the primary switch; the second set (switch 5) is from the secondary.

My plan at this moment is to upgrade the switch, but I would rather figure out what's causing it.

 

 

elk-north-left#dir crashinfo:
Directory of crashinfo:/

7329 drwx 57344 Nov 16 2023 15:19:51 -06:00 tracelogs
15 -rw- 7160063 Nov 15 2023 23:37:52 -06:00 elk-north-left_1_RP_0_trace_archive_3-20231115-233745.tar.gz
14 -rw- 7122492 Nov 12 2023 06:22:36 -06:00 elk-north-left_1_RP_0_trace_archive_2-20231112-062230.tar.gz
13 -rw- 7157270 Nov 9 2023 23:01:16 -06:00 elk-north-left_1_RP_0_trace_archive_1-20231109-230108.tar.gz
11 -rw- 6089556 Oct 7 2023 23:01:13 -05:00 elk-north-left_1_RP_0_trace_archive_0-20231007-230106.tar.gz
12 -rw- 0 Sep 21 2019 16:07:05 -05:00 koops.dat

825638912 bytes total (742473728 bytes free)
elk-north-left#dir crashinfo:tracelogs | exclude gz
Directory of crashinfo:/tracelogs/

7367 -rw- 10 Oct 6 2023 12:08:36 -05:00 timestamp
7366 -rw- 22089 Oct 6 2023 12:08:36 -05:00 dmesg

825638912 bytes total (742473728 bytes free)
elk-north-left#dir flash:core
Directory of flash:/core/

64800 -rw- 1 Nov 16 2023 15:08:23 -06:00 .callhome
40481 drwx 4096 Dec 12 2022 09:09:06 -06:00 modules

 

 

 

elk-north-left#dir crashinfo-5:
Directory of crashinfo-5:/

7329 drwx 16384 Nov 16 2023 15:21:53 -06:00 tracelogs
15 -rw- 5429320 Nov 15 2023 23:37:48 -06:00 elk-north-left_5_RP_0_trace_archive_0-20231115-233745.tar.gz
14 -rw- 20757055 Nov 12 2023 06:22:47 -06:00 elk-north-left_5_RP_0-system-report_5_20231112-062235-CST.tar.gz
13 -rw- 5660022 Nov 9 2023 23:01:11 -06:00 elk-north-left_5_RP_0_trace_archive_0-20231109-230108.tar.gz
11 -rw- 19895465 Oct 7 2023 23:01:24 -05:00 elk-north-left_5_RP_0-system-report_5_20231007-230114-CST.tar.gz
12 -rw- 0 Sep 21 2019 16:07:05 -05:00 koops.dat

825753600 bytes total (721420288 bytes free)

elk-north-left#dir crashinfo-5:tracelogs | exclude gz
Directory of crashinfo-5:/tracelogs/

7367 -rw- 35998 Nov 12 2023 06:22:56 -06:00 shutdown_rp0.log
7330 -rw- 162308 Nov 12 2023 06:22:55 -06:00 shutdown_journal_rp0.log
7380 -rw- 382 Nov 12 2023 06:22:31 -06:00 shutdown_cc0.log
7353 -rw- 383 Nov 12 2023 06:22:30 -06:00 shutdown_fp0.log
7364 -rw- 10 Oct 4 2023 12:44:09 -05:00 timestamp
7352 -rw- 22337 Oct 4 2023 12:44:09 -05:00 dmesg

825753600 bytes total (721420288 bytes free)

elk-north-left#dir flash-5:core
Directory of flash-5:/core/

64801 -rw- 1 Nov 16 2023 15:11:13 -06:00 .callhome
40481 drwx 4096 Dec 12 2022 19:28:14 -06:00 modules

1957167104 bytes total (1356857344 bytes free)

 

 

 


@mrkaufman wrote:
7330 -rw- 162308 Nov 12 2023 06:22:55 -06:00 shutdown_journal_rp0.log

Bingo!

Please share this file.

-- Logs begin at Sun 2023-11-12 06:19:22 CST, end at Sun 2023-11-12 06:22:55 CST. --
Nov 12 06:19:22 elk-north-left_5_RP_0 xinetd[30979]: execve /usr/bin/rsync
Nov 12 06:19:22 elk-north-left_5_RP_0 xinetd[30981]: execve /usr/bin/rsync
...
Nov 12 06:19:59 elk-north-left_5_RP_0 xinetd[32755]: execve /usr/bin/rsync
Nov 12 06:19:59 elk-north-left_5_RP_0 xinetd[32756]: execve /usr/bin/rsync
Nov 12 06:19:59 elk-north-left_5_RP_0 xinetd[32758]: execve /usr/bin/rsync
Nov 12 06:19:59 elk-north-left_5_RP_0 xinetd[32762]: execve /usr/bin/rsync
Nov 12 06:19:59 elk-north-left_5_RP_0 xinetd[32763]: execve /usr/bin/rsync
Nov 12 06:19:59 elk-north-left_5_RP_0 xinetd[309]: execve /usr/bin/rsync
Nov 12 06:19:59 elk-north-left_5_RP_0 xinetd[311]: execve /usr/bin/rsync
Nov 12 06:19:59 elk-north-left_5_RP_0 xinetd[314]: execve /usr/bin/rsync
...
Nov 12 06:20:16 elk-north-left_5_RP_0 xinetd[995]: execve /usr/bin/rsync
Nov 12 06:20:16 elk-north-left_5_RP_0 xinetd[996]: execve /usr/bin/rsync
Nov 12 06:20:16 elk-north-left_5_RP_0 xinetd[1000]: execve /usr/bin/rsync
Nov 12 06:20:16 elk-north-left_5_RP_0 xinetd[1001]: execve /usr/bin/rsync
...
Nov 12 06:22:27 elk-north-left_5_RP_0 xinetd[7273]: execve /usr/bin/rsync
Nov 12 06:22:27 elk-north-left_5_RP_0 xinetd[7274]: execve /usr/bin/rsync
Nov 12 06:22:27 elk-north-left_5_RP_0 xinetd[7278]: execve /usr/bin/rsync
Nov 12 06:22:27 elk-north-left_5_RP_0 xinetd[7292]: execve /usr/bin/rsync
Nov 12 06:22:27 elk-north-left_5_RP_0 xinetd[7294]: execve /usr/bin/rsync
Nov 12 06:22:28 elk-north-left_5_RP_0 xinetd[7299]: execve /usr/bin/rsync
Nov 12 06:22:28 elk-north-left_5_RP_0 xinetd[7300]: execve /usr/bin/rsync
Nov 12 06:22:28 elk-north-left_5_RP_0 audispd[410]: type=ANOM_ABEND msg=audit(1699791748.016:93): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=kernel pid=20313 comm="sessmgrd" exe="/tmp/sw/mount/cat9k_lite-rpbase.17.03.05.SPA.pkg/quake2/usr/binos/bin/sessmgrd" sig=6 res=1
Nov 12 06:22:28 elk-north-left_5_RP_0 kernel: dplr_intrpt: deregister interrupt
Nov 12 06:22:28 elk-north-left_5_RP_0 kernel: dplr_intrpt: Doppler found: slotid: 0,dplrid: 0
Nov 12 06:22:28 elk-north-left_5_RP_0 kernel: dplr_intrpt: Irq 22 indx 0 is cleaned up
Nov 12 06:22:28 elk-north-left_5_RP_0 kernel: dplr_intrpt: Irq 23 indx 1 is cleaned up
Nov 12 06:22:28 elk-north-left_5_RP_0 kernel: dplr_intrpt: Irq 21 indx 2 is cleaned up
Nov 12 06:22:28 elk-north-left_5_RP_0 kernel: dplr_intrpt: Irq 20 indx 3 is cleaned up
Nov 12 06:22:28 elk-north-left_5_RP_0 kernel: dplr_intrpt: Irq 24 indx 4 is cleaned up
Nov 12 06:22:28 elk-north-left_5_RP_0 kernel: dplr_intrpt: Irq 25 indx 5 is cleaned up
Nov 12 06:22:29 elk-north-left_5_RP_0 pvp[7896]: %PMAN-5-EXITACTION: Process manager is exiting: reload fp action requested
Nov 12 06:22:30 elk-north-left_5_RP_0 btman_rotate_immediate[8249]: %SERVICES-2-NORESOLVE_LOCAL: Error resolving local FRU: Invalid argument
Nov 12 06:22:30 elk-north-left_5_RP_0 btman_rotate_immediate[8249]: %SERVICES-3-INVALID_CHASFS: Thread 0xf63e6010 has no global chasfs context
Nov 12 06:22:30 elk-north-left_5_RP_0 pvp[8463]: %PMAN-5-EXITACTION: Process manager is exiting: reload cc action requested
Nov 12 06:22:30 elk-north-left_5_RP_0 btman_rotate_immediate[8735]: %SERVICES-2-NORESOLVE_LOCAL: Error resolving local FRU: Invalid argument
Nov 12 06:22:30 elk-north-left_5_RP_0 btman_rotate_immediate[8735]: %SERVICES-3-INVALID_CHASFS: Thread 0xf66c3010 has no global chasfs context
Nov 12 06:22:32 elk-north-left_5_RP_0 root[9005]: %PMAN-3-PROCHOLDDOWN: The process sessmgrd has been helddown (rc 134)
Nov 12 06:22:32 elk-north-left_5_RP_0 kernel: LSMPI: Deregister dual stack diverter
Nov 12 06:22:33 elk-north-left_5_RP_0 pvp[9166]: %PMAN-5-EXITACTION: Process manager is exiting: rp processes exit with reload switch code
Nov 12 06:22:33 elk-north-left_5_RP_0 systemd[1]: Stopping Console relay...
Nov 12 06:22:33 elk-north-left_5_RP_0 systemd[1]: Stopped Console relay.
Nov 12 06:22:33 elk-north-left_5_RP_0 audispd[410]: type=SERVICE_STOP msg=audit(1699791753.924:94): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=agetty-iosd comm="systemd" exe="/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Nov 12 06:22:34 elk-north-left_5_RP_0 btman_rotate_immediate[9248]: %SERVICES-2-NORESOLVE_LOCAL: Error resolving local FRU: Invalid argument
Nov 12 06:22:34 elk-north-left_5_RP_0 btman_rotate_immediate[9248]: %SERVICES-3-INVALID_CHASFS: Thread 0xf6189010 has no global chasfs context
Nov 12 06:22:55 elk-north-left_5_RP_0 pvp[9778]: %PMAN-3-PROCESS_NOTIFICATION: System report /crashinfo/elk-north-left_5_RP_0-system-report_5_20231112-062235-CST.tar.gz (size: 20271 KB) generated
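
To pull the relevant lines out of a long journal like this one, here is a minimal Python sketch. It looks for the audit `ANOM_ABEND` record and the `%PMAN-3-PROCHOLDDOWN` message, then checks whether the return code lines up with the signal number. It assumes the usual Linux convention that an exit status above 128 means "terminated by signal (status - 128)"; I have not seen Cisco document that for `rc`, so treat it as an assumption. The two sample lines are copied from the log above, and the function names are hypothetical.

```python
import re

# Two lines copied from the shutdown_journal_rp0.log excerpt in this thread.
JOURNAL = '''\
Nov 12 06:22:28 elk-north-left_5_RP_0 audispd[410]: type=ANOM_ABEND msg=audit(1699791748.016:93): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=kernel pid=20313 comm="sessmgrd" exe="/tmp/sw/mount/cat9k_lite-rpbase.17.03.05.SPA.pkg/quake2/usr/binos/bin/sessmgrd" sig=6 res=1
Nov 12 06:22:32 elk-north-left_5_RP_0 root[9005]: %PMAN-3-PROCHOLDDOWN: The process sessmgrd has been helddown (rc 134)
'''

# Audit record for a process that ended abnormally: capture process name and signal.
ABEND_RE = re.compile(
    r'type=ANOM_ABEND\b.*?\bcomm="(?P<comm>[^"]+)".*?\bsig=(?P<sig>\d+)'
)
# Process manager message: capture process name and return code.
HOLDDOWN_RE = re.compile(
    r"%PMAN-3-PROCHOLDDOWN: The process (?P<proc>\S+) has been helddown "
    r"\(rc (?P<rc>\d+)\)"
)


def rc_to_signal(rc):
    """Assumed convention: status above 128 means killed by signal (status - 128)."""
    return rc - 128 if rc > 128 else None


def find_crash_events(journal):
    """Return ([(process, signal)], [(process, rc)]) found in the journal text."""
    abends, holddowns = [], []
    for line in journal.splitlines():
        match = ABEND_RE.search(line)
        if match:
            abends.append((match.group("comm"), int(match.group("sig"))))
        match = HOLDDOWN_RE.search(line)
        if match:
            holddowns.append((match.group("proc"), int(match.group("rc"))))
    return abends, holddowns


abends, holddowns = find_crash_events(JOURNAL)
for proc, rc in holddowns:
    print(proc, "rc", rc, "-> signal", rc_to_signal(rc))
```

Under that assumption, `rc 134` maps to signal 6, which matches the `sig=6` in the audit record. On Linux, signal 6 is SIGABRT, meaning the process aborted itself, so this reads more like a software fault in `sessmgrd` than a hardware problem.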

Does this stack have NetFlow or DNAC enabled?

If Netflow is enabled, install the SMU file "cat9k_lite_iosxe.17.03.05.CSCwe89814.SPA.smu.bin".  

It does not have DNAC or NetFlow enabled.

sphillips
Level 1

This is a known Cisco bug. My property experienced the same bug on several different stacks in different closets. We upgraded from 16.9.3 to 17.6.4 about 8 months ago and haven't experienced a reboot since.

Do you know the bug ID?

Not sure if @mrkaufman's crash is CSCvv86246.

CSCvv86246 crash is caused by the cmand process.  @mrkaufman's crash is due to sessmgrd.  

Granted, CSCvv86246 does contain bugger-all information.

You're absolutely correct, Leo. Sorry, that's my fault, mrkaufman. That command looked very familiar to me and I thought it was the bug I had. There is a difference, now that Leo has pointed it out.

All good. Apologies are not necessary.

We are all here to pool our brains.
