08-21-2009 10:36 AM - edited 03-06-2019 07:22 AM
On one of our switch stacks we received the following log messages:
Aug 20 09:47:05: %XDR-6-XDRIPCNOTIFY: Fatal IPC error occurred for peer in slot 4. Message not sent due to timeout. Disabling linecard
-Traceback= AFFD58 21BCA4 BBCE14 BC1148 BC1874 BD3FC0 909AAC 90000C
Aug 20 10:00:05: %PLATFORM_RPC-3-MSG_THROTTLED: RPC Msg Dropped by throttle mechanism: type 1, class 21, max_msg 8, total throttled 0
-Traceback= AFFD58 577E98 24F3EC 24F530 24FFA8 2501A0 909AAC 90000C
Aug 20 10:03:05: %PLATFORM_RPC-3-MSG_THROTTLED: RPC Msg Dropped by throttle mechanism: type 3, class 21, max_msg 8, total throttled 1
-Traceback= AFFD58 577E98 24F3EC 24F530 24FF8C 2501A0 909AAC 90000C
We disconnected and reconnected the power cord on the member that was disabled to get it to restart. It re-joined the stack correctly and is functioning normally again. I could not find anything in the bug toolkit or release notes regarding fatal IPC problems on 3750 switches.
Has anyone seen these messages before on 3750s? Any thoughts on whether this error condition was a fluke, whether it indicates hardware that may be going bad, or whether it could result from some type of "bad" network traffic?
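For anyone sorting through similar output, here is a minimal Python sketch that splits messages like the ones quoted above into facility, severity, and mnemonic. It assumes the usual IOS `%FACILITY-SEVERITY-MNEMONIC: text` layout; the function name `parse_ios_message` and the regular expression are illustrative and not taken from any Cisco tool.

```python
import re

# Assumed layout: %FACILITY-SEVERITY-MNEMONIC: free text
# Severity is a single digit, 0 (emergency) through 7 (debug).
IOS_MSG = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+):\s*(?P<text>.*)"
)

def parse_ios_message(line):
    """Return a dict with facility, severity, mnemonic, and text, or None if no match."""
    match = IOS_MSG.search(line)
    if match is None:
        return None  # e.g. a -Traceback= line
    fields = match.groupdict()
    fields["severity"] = int(fields["severity"])
    return fields

if __name__ == "__main__":
    sample = (
        "Aug 20 09:47:05: %XDR-6-XDRIPCNOTIFY: Fatal IPC error occurred "
        "for peer in slot 4. Message not sent due to timeout. Disabling linecard"
    )
    print(parse_ios_message(sample))
```

Run against the log above, this shows that the XDR message is logged at severity 6 and the two PLATFORM_RPC messages at severity 3.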
08-23-2009 10:05 AM
What version of code is this 3750 running? How many switches are in this stack?
08-24-2009 05:37 AM
The switches in the stack are running 12.2(25)SEE4, and there are 4 switches in the stack.
08-24-2009 07:22 AM
I need the exact image name (from "show version") to properly decode the stack trace.
08-24-2009 07:30 AM
Here's the image name:
System image file is "flash:c3750-ipbasek9-mz.122-25.SEE4/c3750-ipbasek9-mz.122-25.SEE4.bin"
08-24-2009 08:20 AM
There are a few bugs that match the symptoms and stack trace, but without knowing the full config, these seem most likely:
CSCse51203
CSCsi74526
CSCsd26784
The minimum revision of code to run to get all three fixes is 12.2(40)SE.
This may also be CSCeg57839 if you have configured the unsupported "snmp-server ifindex persist" on the cluster. This command is not supported on 3750s, and should be removed.
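If you keep a saved copy of the running configuration, a rough way to check for that command is to scan the text for it. The sketch below is only an illustration: the helper name `find_unsupported_commands` and the sample config are hypothetical, and the only command it looks for is the one mentioned above.

```python
def find_unsupported_commands(config_text,
                              unsupported=("snmp-server ifindex persist",)):
    """Return (line_number, line) pairs for config lines that start with a listed command."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        # A negated line ("no ...") means the command is not configured.
        if stripped.startswith("no "):
            continue
        if any(stripped.startswith(cmd) for cmd in unsupported):
            hits.append((number, stripped))
    return hits

if __name__ == "__main__":
    # Hypothetical excerpt of a saved configuration, for illustration only.
    sample_config = (
        "hostname stack1\n"
        "snmp-server ifindex persist\n"
        "snmp-server community public RO\n"
    )
    for number, line in find_unsupported_commands(sample_config):
        print(f"line {number}: {line}")
```

An empty result means the command does not appear in the saved text. It does not rule out the other bugs listed above.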
08-24-2009 01:13 PM
Thank you for the info, but none of the mentioned bug reports seem to fit the situation. We don't have arp inspection trust configured, we don't have "snmp-server ifindex persist" configured, and no one was logged into the switch entering commands at the time the problem occurred.
I'm not so concerned about the RPC throttled messages, I'm more concerned about the "Fatal IPC error occurred for peer in slot 4. Message not sent due to timeout. Disabling linecard" message.
08-24-2009 01:17 PM
It could be a new bug then, or faulty hardware. I couldn't find any other issues that would match the version of code that you're running. If you can reproduce the problem, I suggest you open a TAC service request so live troubleshooting can be done.