04-01-2010 06:14 PM - edited 03-04-2019 08:00 AM
GSR failure
Hello everyone,
My active PRP failed and left the log below:
" SEC 8:.Mar 2 12:51:22: %MBUS-6-FAILEDPEER: Failed peer RP in slot 7 reason peer: pri heartbeat t/o "
My question: when I see this kind of log, how can I tell whether it is a hardware failure or a software failure?
If I get this log again, what should I check?
Are there any commands to investigate this failure?
If so, please let me know.
Detailed information is below:
SEC 8:50w5d: Not all config may be removed and may reappear after reactivating the sub-interface
SEC 8:1y2w: Not all config may be removed and may reappear after reactivating the sub-interface
SEC 8:1y12w: Not all config may be removed and may reappear after reactivating the sub-interface
SEC 8:1y12w: Not all config may be removed and may reappear after reactivating the sub-interface
SEC 8:1y32w: Not all config may be removed and may reappear after reactivating the sub-interface
SEC 8:1y36w: Not all config may be removed and may reappear after reactivating the sub-interface
SEC 8:2y6w: Not all config may be removed and may reappear after reactivating the sub-interface
SEC 8:2y20w: Not all config may be removed and may reappear after reactivating the sub-interface
SEC 8:2y21w: Not all config may be removed and may reappear after reactivating the sub-interface
##
SEC 8:.Mar 2 12:51:22: %MBUS-6-FAILEDPEER: Failed peer RP in slot 7 reason peer: pri heartbeat t/o
SEC 8:.Mar 2 12:51:22: %RP-5-NEWPRIMARY: Switchover to new RP
.Mar 2 12:51:23: %FIB-4-FIBNULLIDB: Missing idb for fibidb ATM2/0.14 (if_number 63).
-Traceback= 1DB4E8 161E40 177338 178560 582EDC 583400 58419C 5842EC 584500 57F958 582BFC 2B14EC
.Mar 2 12:51:24: %MBUS-6-RP_STATUS: RP in Slot 8 Mode = MBUS Active
.Mar 2 12:51:30: %MBUS-6-FABCONFIG: Switch Cards 0x1F (bitmask) Primary Clock is CSC_1 Fabric Clock is Redundant
Bandwidth Mode : 10Gbps Bandwidth
##
Mar 22 10:35:49: %SONET-4-ALARM: ATM4/2: ~SLOF ~SLOS ~LAIS LRDI ~PAIS PRDI ~PLOP
Mar 22 10:35:59: %SONET-4-ALARM: ATM4/2: ~SLOF ~SLOS ~LAIS LRDI ~PAIS PRDI ~PLOP
Mar 22 10:35:59: %SONET-4-ALARM: ATM4/2: ~SLOF ~SLOS ~LAIS ~LRDI ~PAIS ~PRDI ~PLOP
GSR_B# sh redundancy
Redundant System Information :
------------------------------
Available system uptime = 4 years, 42 weeks, 4 days, 23 hours, 28 minutes, 43 seconds
Switchovers system experienced = 3
Standby failures = 0
Last switchover reason = active unit failed
Hardware Mode = Duplex
Configured Redundancy Mode = SSO (Stateful Switchover)
Operating Redundancy Mode = SSO (Stateful Switchover)
Maintenance Mode = Disabled
Communications = Down Reason: Simplex mode
Current Processor Information :
-------------------------------
Active Location = slot 8
Current Software state = ACTIVE
Uptime in current state = 4 weeks, 2 days, 1 hour, 27 minutes, 51 seconds
Image Version = Cisco Internetwork Operating System Software
IOS (tm) GS Software (C12KPRP-P-M), Version 12.0(30)S2, RELEASE SOFTWARE (fc1)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2005 by cisco Systems, Inc.
Compiled Thu 31-Mar-05 13:29 by pwade
BOOT = disk0:c12kprp-p-mz.120-30.S2.bin,1;
CONFIG_FILE =
BOOTLDR =
Configuration register = 0x2102
Peer (slot: unavailable) information is not available because it is in 'DISABLED' state
GSR_B# sh context
GSR_B# sh context ?
all show all context info for all slots
slot specify a slot for which to show context information
summary display list of context information available
| Output modifiers
<cr>
GSR_B# sh context slot 7
GSR_B#sh gsr slot 7
SLOT STATE TRACE TABLE -- Slot 7 (Current Time is 151982313.864)
+-----------------------------------------------------------------------
| Timestamp Pid State Event Flags
+-----------------------------------------------------------------------
0.944 3 ABSENT EV_NULL
2.284 34 ACTV RP EV_RP_DEDUCE_PRIMARY
149376152.332 31 RP RDY EV_RP_INSTANTIATE
04-01-2010 11:57 PM
Hello Java,
>> %MBUS-6-FAILEDPEER: Failed peer RP in slot 7 reason peer: pri heartbeat t/o
>> Peer (slot: unavailable) information is not available because it is in 'DISABLED' state
The PRP in slot 7 was disabled because it stopped communicating with the other PRP: its heartbeat timed out ("pri heartbeat t/o").
It is not currently operational and is not ready to take over should the PRP in slot 8 fail.
See
http://www.cisco.com/univercd/cc/td/doc/product/core/cis12000/cis12410/icg/hfdm_c06.htm
for diagnostic tests.
Hope this helps
Giuseppe
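As a rough illustration of the check described above, here is a minimal Python sketch that reads saved `show redundancy` output and reports whether a standby RP looks ready. The function names (`parse_redundancy`, `standby_ready`) are hypothetical, and treating a `Communications` value that starts with `Up` as healthy is an assumption, not documented IOS behavior. The sample text is trimmed from the output posted in this thread.

```python
def parse_redundancy(text):
    """Collect 'key = value' pairs from saved `show redundancy` output."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # only lines that actually contain '='
            fields[key.strip()] = value.strip()
    return fields


def standby_ready(text):
    """Heuristic (assumption): standby is usable only if communications
    are up and the peer is not reported in 'DISABLED' state."""
    fields = parse_redundancy(text)
    comms_up = fields.get("Communications", "").startswith("Up")
    peer_disabled = "'DISABLED' state" in text
    return comms_up and not peer_disabled


# Trimmed from the output posted in this thread.
SAMPLE = """\
Switchovers system experienced = 3
Standby failures = 0
Last switchover reason = active unit failed
Hardware Mode = Duplex
Communications = Down Reason: Simplex mode
Active Location = slot 8
Peer (slot: unavailable) information is not available because it is in 'DISABLED' state
"""

if __name__ == "__main__":
    info = parse_redundancy(SAMPLE)
    print(info["Last switchover reason"])
    print("standby ready:", standby_ready(SAMPLE))
```

On the sample above, the peer is reported as disabled and communications are down, so the heuristic reports that no standby is ready.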
04-02-2010 01:56 AM
Hi Java,
The causes range from a poorly seated card (in which case a firm reseat might solve the issue), to a crash (is there a crashinfo file on the faulty card?), to a hardware failure of the card or, in the worst case, of the slot in the chassis.
Bottom line: first check whether there is a crashinfo file, then reseat the card gently but firmly.
Monitor it until the next occurrence. If it happens again, open a TAC case.
Regards,
Antonio
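For the "monitor it until the next occurrence" step, a small script run over saved syslog text can flag peer-RP failures and switchovers. This is only a sketch: the function name `find_peer_failures` is hypothetical, and the regular expression is modeled solely on the two message formats quoted in this thread, so other variants of these messages may not match.

```python
import re

# Pattern modeled on the single FAILEDPEER line quoted in this thread (assumption).
FAILEDPEER = re.compile(
    r"%MBUS-6-FAILEDPEER: Failed peer RP in slot (?P<slot>\d+) reason (?P<reason>.+)$"
)
NEWPRIMARY = "%RP-5-NEWPRIMARY"


def find_peer_failures(log_text):
    """Return (failures, switchovers): a list of (slot, reason) tuples
    and a count of NEWPRIMARY switchover messages."""
    failures = []
    switchovers = 0
    for line in log_text.splitlines():
        match = FAILEDPEER.search(line)
        if match:
            failures.append((int(match.group("slot")), match.group("reason").strip()))
        if NEWPRIMARY in line:
            switchovers += 1
    return failures, switchovers


# Lines copied from the log posted in this thread.
SAMPLE_LOG = """\
SEC 8:.Mar 2 12:51:22: %MBUS-6-FAILEDPEER: Failed peer RP in slot 7 reason peer: pri heartbeat t/o
SEC 8:.Mar 2 12:51:22: %RP-5-NEWPRIMARY: Switchover to new RP
.Mar 2 12:51:24: %MBUS-6-RP_STATUS: RP in Slot 8 Mode = MBUS Active
"""

if __name__ == "__main__":
    failures, switchovers = find_peer_failures(SAMPLE_LOG)
    for slot, reason in failures:
        print(f"peer RP in slot {slot} failed: {reason}")
    print(f"switchovers seen: {switchovers}")
```

A repeat of the same slot in the results after a reseat would be the cue to open a TAC case, as suggested above.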