08-09-2011 08:27 AM - edited 03-04-2019 01:14 PM
Hi All,
I have a problem with my 6509-E. Card number one (slot 1) disables and re-enables itself about every 10 minutes. Below are the module list and then the logs:
rtr-019#sh module
Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
1   24    CEF720 24 port 1000mb SFP              WS-X6724-SFP       SAL1048958X
3   4     CEF720 4 port 10-Gigabit Ethernet      WS-X6704-10GE      SAL1031WPAY
5   2     Supervisor Engine 720 (Hot)            WS-SUP720-BASE     SAD084600S8
6   2     Supervisor Engine 720 (Active)         WS-SUP720-BASE     SAL1228X83G

Mod MAC addresses                      Hw     Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
1   0019.5606.5448 to 0019.5606.545f   2.5    12.2(18r)S1  12.2(33)SXH7 Ok
3   0018.73b8.5ec0 to 0018.73b8.5ec3   2.4    12.2(14r)S5  12.2(33)SXH7 Ok
5   0011.21b9.bb74 to 0011.21b9.bb77   3.2    8.5(4)       12.2(33)SXH7 Ok
6   000a.b86d.75f8 to 000a.b86d.75fb   4.0    8.5(4)       12.2(33)SXH7 Ok

Mod  Sub-Module                  Model              Serial      Hw      Status
---- --------------------------- ------------------ ----------- ------- -------
1    Distributed Forwarding Card WS-F6700-DFC3BXL   SAL1106G5KL 5.3     Ok
3    Distributed Forwarding Card WS-F6700-DFC3BXL   SAL1051BJJ3 5.3     Ok
5    Policy Feature Card 3       WS-F6K-PFC3BXL     SAL1030WGRC 1.8     Ok
5    MSFC3 Daughterboard         WS-SUP720          SAD083902XF 2.4     Ok
6    Policy Feature Card 3       WS-F6K-PFC3BXL     SAD09100DKU 1.4     Ok
6    MSFC3 Daughterboard         WS-SUP720          SAL1228X6TH 3.1     Ok

Mod  Online Diag Status
---- -------------------
1    Pass
3    Pass
5    Pass
6    Pass
==================================================
IOS: s72033-adventerprisek9_wan-vz.122-33.SXH7.bin
==================================================
04:39:49: %CPU_MONITOR-SP-6-NOT_HEARD: CPU_MONITOR messages have not been heard for 90 seconds [1/0]
04:40:19: %CPU_MONITOR-SP-6-NOT_HEARD: CPU_MONITOR messages have not been heard for 120 seconds [1/0]
04:38:23: %XDR-6-XDRIPCNOTIFY: Message not sent to slot 1/0 (1) because of IPC error timeout. Disabling linecard. (Expected during linecard OIR or system reloads)
Could you help me find the root cause?
08-09-2011 09:09 AM
Hello Ali,
IPC is the inter-processor communication between the IOS processors, and it runs over the EOBC (Ethernet Out-of-Band Channel). The messages above indicate that either the RP or the SP believes there is a problem with its peer. The SP will reset a line card if it does not hear CPU_MONITOR messages from it for more than 150 seconds (the default timeout value).
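To make the timeout easier to follow in a long syslog, here is a minimal Python sketch that pulls the silent interval out of each NOT_HEARD message and shows how much headroom is left before the reset. It assumes the 150-second default mentioned above and assumes the bracketed pair at the end of the message (for example [1/0]) is slot/index, based on the XDR message naming "slot 1/0". The function names are hypothetical and this is only an illustration, not a Cisco tool.

```python
import re

# Default CPU_MONITOR timeout as quoted in this thread (assumption: not
# verified against a specific IOS release or a tuned configuration).
RESET_THRESHOLD_S = 150

# Matches lines such as:
# %CPU_MONITOR-SP-6-NOT_HEARD: CPU_MONITOR messages have not been heard for 90 seconds [1/0]
NOT_HEARD = re.compile(
    r"%CPU_MONITOR-\w+-\d-NOT_HEARD: .*?for (\d+) seconds \[(\d+)/(\d+)\]"
)


def parse_not_heard(lines):
    """Return a list of (slot, index, silent_seconds) for NOT_HEARD lines."""
    events = []
    for line in lines:
        match = NOT_HEARD.search(line)
        if match:
            seconds, slot, index = (int(g) for g in match.groups())
            events.append((slot, index, seconds))
    return events


def seconds_until_reset(silent_seconds, threshold=RESET_THRESHOLD_S):
    """Seconds of silence left before the assumed reset threshold is hit."""
    return max(threshold - silent_seconds, 0)


# The two CPU_MONITOR lines from the original post.
log = [
    "04:39:49: %CPU_MONITOR-SP-6-NOT_HEARD: CPU_MONITOR messages have "
    "not been heard for 90 seconds [1/0]",
    "04:40:19: %CPU_MONITOR-SP-6-NOT_HEARD: CPU_MONITOR messages have "
    "not been heard for 120 seconds [1/0]",
]

for slot, index, secs in parse_not_heard(log):
    remaining = seconds_until_reset(secs)
    print(f"slot {slot}/{index}: silent for {secs}s, "
          f"{remaining}s left before the assumed reset threshold")
```

Running this over a full log capture would show whether the silent interval always climbs toward the threshold on the same slot, which is consistent with the card being reset on a timer rather than failing at random.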
Could you please share the output of the following commands so we can investigate this further:
sh heart
remote com sw sh heartbeat
sh ipc port
sh ipc session tx verbose
remote login switch
debug heartbeat
I would also suggest doing a manual switchover with the redundancy force-switchover command and observing whether these errors appear again on the switch.
Thanks,
Ricky Micky
*Rate if this is useful
08-09-2011 12:10 PM
Hello Richard,
Thank you for your reply. I have attached the logs and outputs you requested. Unfortunately, after switching over the supervisor the problem still exists.
08-09-2011 01:05 PM
Hi Ali,
This SUP needs more investigation. Would it be possible for you to open a TAC case?
Thanks,
Richard
08-10-2011 05:20 PM
Hi Richard,
Actually, we don't have a support contract, so I cannot open a TAC case. Do you think the problem is with the SUP or with the card? Currently the 10G card on this platform works fine.
Regards,
Ali