
C9300-48P-A %FMANRP-3-PEER_IPC_STUCK error and switch restart

david_casey
Level 1

Good evening, everyone!  I have a stack of 3 Cisco C9300-48P-A switches which has been seeing a strange problem since early this morning.  Every so often, switch #3 in the stack will simply reboot itself.  This is the log message we are seeing:

 

%FMANRP-3-PEER_IPC_STUCK: Switch 1 R0/0: fman_rp: IPC to fman-log-bay0-peer3 is stuck for more than 30 seconds

 

Things I've tried so far:

1. Rebooted the entire stack.

2. Upgraded from 16.9.4 to 16.12.4.

3. Upgrading from 16.12.4 to 16.12.5 right now.

 

I did find a bug ID via a Google search, but the workaround is to simply reload the affected switch, which has not had any effect on the problem.  Has anyone else seen this before, and if so, what did you do to sort it out?

 

ETA: Upgrading to 16.12.5 had no effect on the issue. The switch continues to reboot at random intervals.

 

Thank you!

12 Replies

Leo Laohoo
Hall of Fame

@david_casey wrote:

Every so often, switch #3 in the stack will simply reboot itself. 


Post the complete output to the following commands: 

sh version
dir crashinfo-1:
dir crashinfo-2:
dir crashinfo-3:
dir flash-3:core
sh log on switch 3 uptime detail

As requested.


@david_casey wrote:
Mar 9 2021 07:13:27 -07:00  LEG-PMG-01-NORTH-LAN-A_1_RP_0_trace_archive_11-20210309-071325.tar.gz

Switch 1 and Switch 2 both have this file (note the timestamp of the file creation).  
Can you attach both to the thread?

See attached.  Thank you!


@david_casey wrote:
03/09/2021 04:09:11   PowerOn                       0     0     0     0     10 
03/09/2021 06:04:51   PowerOn                       0     0     0     1     0  
03/09/2021 06:17:19   PowerOn                       0     0     0     0     5  
03/09/2021 06:30:06   PowerOn                       0     0     0     0     5  
03/09/2021 07:41:35   PowerOn                       0     0     0     1     0  
03/09/2021 08:43:17   PowerOn                       0     0     0     0     55 
03/09/2021 08:54:29   PowerOn                       0     0     0     0     5  
03/09/2021 09:29:42   PowerOn                       0     0     0     0     30 
03/09/2021 09:37:25   PowerOn                       0     0     0     0     0  
03/09/2021 09:46:09   PowerOn                       0     0     0     0     5  
03/09/2021 13:54:20   PowerOn                       0     0     0     4     0  
03/09/2021 14:15:46   PowerOn                       0     0     0     0     15 

How many power supplies are attached to Switch 3? 

The two so-called "crashlogs" do not contain anything useful that I can see.  TAC can probably analyze the files a lot better than me.  
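For anyone who wants to summarize that uptime history instead of reading it by eye, here is a minimal Python sketch. It assumes the columns are timestamp, reset reason, then uptime in years, weeks, days, hours and minutes; the header row is not shown in the quoted output, so that column order is an assumption. The names `HISTORY` and `parse_row` are made up for this example.

```python
from collections import Counter
from datetime import datetime, timedelta

# Rows copied from the quoted uptime output for switch 3.
HISTORY = """\
03/09/2021 04:09:11   PowerOn                       0     0     0     0     10
03/09/2021 06:04:51   PowerOn                       0     0     0     1     0
03/09/2021 06:17:19   PowerOn                       0     0     0     0     5
03/09/2021 06:30:06   PowerOn                       0     0     0     0     5
03/09/2021 07:41:35   PowerOn                       0     0     0     1     0
03/09/2021 08:43:17   PowerOn                       0     0     0     0     55
03/09/2021 08:54:29   PowerOn                       0     0     0     0     5
03/09/2021 09:29:42   PowerOn                       0     0     0     0     30
03/09/2021 09:37:25   PowerOn                       0     0     0     0     0
03/09/2021 09:46:09   PowerOn                       0     0     0     0     5
03/09/2021 13:54:20   PowerOn                       0     0     0     4     0
03/09/2021 14:15:46   PowerOn                       0     0     0     0     15
"""


def parse_row(line):
    """Return (timestamp, reset reason, recorded uptime) for one row."""
    date, time, reason, *uptime = line.split()
    # Assumed column order: years, weeks, days, hours, minutes.
    years, weeks, days, hours, minutes = map(int, uptime)
    stamp = datetime.strptime(f"{date} {time}", "%m/%d/%Y %H:%M:%S")
    recorded = timedelta(days=365 * years + days, weeks=weeks,
                         hours=hours, minutes=minutes)
    return stamp, reason, recorded


rows = [parse_row(line) for line in HISTORY.splitlines() if line.strip()]

# How often does each reset reason appear?
reasons = Counter(reason for _, reason, _ in rows)

# Rows whose recorded uptime is under one hour.
short = [row for row in rows if row[2] < timedelta(hours=1)]

# Wall-clock time between consecutive entries.
gaps = [later[0] - earlier[0] for earlier, later in zip(rows, rows[1:])]

print("reset reasons:", dict(reasons))
print("entries with under 1 hour of uptime:", len(short), "of", len(rows))
print("shortest gap between entries:", min(gaps))
```

Every entry in the quoted history has the reset reason PowerOn, which is why the question about power supplies comes next.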

There are two power supplies in each switch of the stack and all power supplies are connected to the same pair of UPS devices.


@david_casey wrote:

all power supplies are connected to the same pair of UPS devices.


Can one of the power supplies be connected to raw power?

I'll have to check to see if we have any standard outlets in that closet, it's pretty small and dedicated to just our network equipment. Yesterday we did migrate the few connections that were on switch #3 onto other open ports on the other 2 switches in the stack.  Right now switch #3 is empty of any connections at all.

 

These are the log messages we are seeing from a bounce event this morning:

 

Mar 10 04:20:58.170 MST: %HMANRP-5-CHASSIS_DOWN_EVENT: Chassis 3 gone DOWN!
Mar 10 04:20:58.041 MST: %STACKMGR-6-STACK_LINK_CHANGE: Switch 2 R0/0: stack_mgr: Stack port 1 on Switch 2 is down
Mar 10 04:20:58.043 MST: %STACKMGR-4-SWITCH_REMOVED: Switch 2 R0/0: stack_mgr: Switch 3 has been removed from the stack.
Mar 10 04:20:58.042 MST: %STACKMGR-6-STACK_LINK_CHANGE: Switch 1 R0/0: stack_mgr: Stack port 2 on Switch 1 is down
Mar 10 04:20:58.043 MST: %STACKMGR-4-SWITCH_REMOVED: Switch 1 R0/0: stack_mgr: Switch 3 has been removed from the stack.
Mar 10 04:21:37.150 MST: %FMANRP-3-PEER_IPC_STUCK: Switch 1 R0/0: fman_rp: IPC to fman-log-bay0-peer3 is stuck for more than 30 seconds
Mar 10 04:23:07.151 MST: %FMANRP-3-PEER_IPC_RESUME: Switch 1 R0/0: fman_rp: IPC to fman-log-bay0-peer3 has returned to normal after previous stuck
Mar 10 04:23:48.679 MST: %STACKMGR-6-STACK_LINK_CHANGE: Switch 1 R0/0: stack_mgr: Stack port 2 on Switch 1 is up
Mar 10 04:23:48.678 MST: %STACKMGR-6-STACK_LINK_CHANGE: Switch 2 R0/0: stack_mgr: Stack port 1 on Switch 2 is up
Mar 10 04:23:52.030 MST: %STACKMGR-4-SWITCH_ADDED: Switch 2 R0/0: stack_mgr: Switch 3 has been added to the stack.
Mar 10 04:23:52.031 MST: %STACKMGR-4-SWITCH_ADDED: Switch 1 R0/0: stack_mgr: Switch 3 has been added to the stack.
Mar 10 04:23:53.747 MST: %STACKMGR-4-SWITCH_ADDED: Switch 2 R0/0: stack_mgr: Switch 3 has been added to the stack.
Mar 10 04:23:53.748 MST: %STACKMGR-4-SWITCH_ADDED: Switch 1 R0/0: stack_mgr: Switch 3 has been added to the stack.
Mar 10 04:24:04.817 MST: %PLATFORM_FEP-6-FRU_PS_OIR: Switch 3: FRU power supply A inserted
Mar 10 04:24:04.818 MST: %PLATFORM_FEP-6-FRU_PS_OIR: Switch 3: FRU power supply B inserted
Mar 10 04:24:04.819 MST: %PLATFORM_THERMAL-6-FRU_FAN_OIR: Switch 3: System fan 1 inserted
Mar 10 04:24:04.819 MST: %PLATFORM_THERMAL-6-FRU_FAN_OIR: Switch 3: System fan 2 inserted
Mar 10 04:24:04.820 MST: %PLATFORM_THERMAL-6-FRU_FAN_OIR: Switch 3: System fan 3 inserted
Mar 10 04:24:05.872 MST: %PLATFORM_FEP-6-FRU_PS_OIR: Switch 3: FRU power supply A inserted
Mar 10 04:24:05.873 MST: %PLATFORM_FEP-6-FRU_PS_OIR: Switch 3: FRU power supply B inserted
Mar 10 04:23:48.451 MST: %CHUNK-2-NULLCHUNK : Switch 3 R0/0: stack_mgr: NULL chunk pointer argument
Mar 10 04:23:50.111 MST: %STACKMGR-6-STACK_LINK_CHANGE: Switch 3 R0/0: stack_mgr: Stack port 1 on Switch 3 is down
Mar 10 04:23:50.111 MST: %STACKMGR-6-STACK_LINK_CHANGE: Switch 3 R0/0: stack_mgr: Stack port 2 on Switch 3 is down
Mar 10 04:23:50.409 MST: %STACKMGR-6-STACK_LINK_CHANGE: Switch 3 R0/0: stack_mgr: Stack port 1 on Switch 3 is up
Mar 10 04:23:50.410 MST: %STACKMGR-6-STACK_LINK_CHANGE: Switch 3 R0/0: stack_mgr: Stack port 2 on Switch 3 is up
Mar 10 04:23:50.414 MST: %STACKMGR-4-SWITCH_ADDED: Switch 3 R0/0: stack_mgr: Switch 3 has been added to the stack.
Mar 10 04:23:52.141 MST: %STACKMGR-4-SWITCH_ADDED: Switch 3 R0/0: stack_mgr: Switch 3 has been added to the stack.
Mar 10 04:24:08.446 MST: %HMANRP-6-HMAN_IOS_CHANNEL_INFO: HMAN-IOS channel event for switch 3: EMP_RELAY: Channel UP!
Mar 10 04:24:08.525 MST: %HMANRP-6-EMP_NO_ELECTION_INFO: Could not elect active EMP switch, setting emp active switch to 0: EMP_RELAY: Could not elect switch with mgmt port UP
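To put numbers on the sequence in that log, here is a rough Python sketch that pulls the timestamps out of four of the lines above and measures the gaps between them. The year is not in the log lines, so 2021 is assumed from the earlier file listing; the regular expression and the helper names `parse` and `first` are made up for this example and only tested against the lines shown.

```python
import re
from datetime import datetime

# Four lines copied from the log above (Switch 1's view of the event).
LOG = """\
Mar 10 04:20:58.043 MST: %STACKMGR-4-SWITCH_REMOVED: Switch 1 R0/0: stack_mgr: Switch 3 has been removed from the stack.
Mar 10 04:21:37.150 MST: %FMANRP-3-PEER_IPC_STUCK: Switch 1 R0/0: fman_rp: IPC to fman-log-bay0-peer3 is stuck for more than 30 seconds
Mar 10 04:23:07.151 MST: %FMANRP-3-PEER_IPC_RESUME: Switch 1 R0/0: fman_rp: IPC to fman-log-bay0-peer3 has returned to normal after previous stuck
Mar 10 04:23:52.031 MST: %STACKMGR-4-SWITCH_ADDED: Switch 1 R0/0: stack_mgr: Switch 3 has been added to the stack.
"""

# timestamp, time zone, %FACILITY-SEVERITY-MNEMONIC, then the message text
LINE = re.compile(
    r"^(?P<stamp>\w{3} +\d+ [\d:.]+) \w+: %(?P<tag>\S+?)\s*: (?P<msg>.*)$"
)


def parse(log, year=2021):
    """Return a list of (datetime, tag, message); the year is an assumption."""
    events = []
    for line in log.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip anything that is not a syslog line
        when = datetime.strptime(f"{year} {m['stamp']}",
                                 "%Y %b %d %H:%M:%S.%f")
        events.append((when, m["tag"], m["msg"]))
    return events


def first(events, tag):
    """Earliest timestamp logged for a given message tag."""
    return min(when for when, t, _ in events if t == tag)


events = parse(LOG)
removed = first(events, "STACKMGR-4-SWITCH_REMOVED")
stuck = first(events, "FMANRP-3-PEER_IPC_STUCK")
resumed = first(events, "FMANRP-3-PEER_IPC_RESUME")
added = first(events, "STACKMGR-4-SWITCH_ADDED")

outage = added - removed      # how long switch 3 was out of the stack
lag = stuck - removed         # removal to the first STUCK message
stuck_for = resumed - stuck   # STUCK to RESUME

print("switch 3 out of the stack for", outage)
print("STUCK logged", lag, "after removal")
print("IPC reported stuck for", stuck_for)
```

In this excerpt the PEER_IPC_STUCK message is logged roughly 39 seconds after switch 3 leaves the stack. That would be consistent with the message being a symptom of the reboot rather than its trigger, though the logs alone do not prove it.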

Swap the power cables with ones that work. 

So switch 3, PS 1, swap with switch 2 PS 2. 

Switch 3, PS2, swap with switch 1 PS 2. 

Okay, I'll run this test either tomorrow or early next week and report back.

agionetworks
Level 1

I am seeing a similar issue. Were you able to fix it?

Could you create a fresh post describing your situation? It would bring more attention, since this one is a year old, and we can try to help you out there.
