02-05-2012 04:13 AM - edited 03-04-2019 03:08 PM
Hi All
This card has fallen over in the following way 3 times in the past 3 weeks (addresses and serial numbers replaced with ??? to protect the innocent):
Feb 3 17:27:31.209 ACDT: %MBUS-6-SWITCHED_FABCLK: Slot 4 primary clock switched to clock 0
Feb 3 17:27:33.360 ACDT: %LDP-5-NBRCHG: LDP Neighbor ???.???.???.??:0 (1) is DOWN (Received error notification from peer: Holddown time expired)
Feb 3 17:27:36.380 ACDT: %PIM-5-NBRCHG: neighbor ???.???.???.??? DOWN on interface GigabitEthernet4/0/0 non DR
Feb 3 17:28:03.267 ACDT: %FIB-2-FIBDISABLE: Fatal error, slot 4: IPC Failure: timeout
Feb 3 17:28:03.267 ACDT: %RP-4-RSTSLOT: Resetting the card in the slot: 4,Event: CEF failure
Feb 3 17:28:03.303 ACDT: %LINK-5-CHANGED: Interface GigabitEthernet4/0/0, changed state to administratively down
Feb 3 17:28:03.307 ACDT: %OSPFv3-5-ADJCHG: Process 42, Nbr ???.???.???.??? on GigabitEthernet4/0/0 from FULL to DOWN, Neighbor Down: Interface down or detached
Feb 3 17:28:03.307 ACDT: %OSPF-5-ADJCHG: Process 42, Nbr ???.???.???.??? on GigabitEthernet4/0/0 from FULL to DOWN, Neighbor Down: Interface down or detached
Feb 3 17:28:03.315 ACDT: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet4/0/0, changed state to down
Feb 3 17:28:03.323 ACDT: %SPA_OIR-6-OFFLINECARD: SPA (SPA-1X10GE-L-V2) offline in subslot 4/0
Feb 3 17:32:29.862 ACDT: %RP-3-LC_ROMMON_STARTUP_FAILURE: Slot 4, output =
PON_09
PON_09
PON_09
PON_09
POFF09
POFF09
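For anyone reading the log above, the sequence is easier to follow as offsets from the first event: the fabric clock switch at 17:27:31.209 comes about 32 seconds before the FIB-2-FIBDISABLE at 17:28:03.267. Below is a minimal Python sketch that extracts the timestamp and the facility-severity-mnemonic tag from each line and prints those offsets. The function names are made up for illustration, and the year is an assumption because the log lines do not carry one (2012 is taken from the post date).

```python
import re
from datetime import datetime

# Matches lines such as:
#   Feb 3 17:27:31.209 ACDT: %MBUS-6-SWITCHED_FABCLK: Slot 4 primary clock ...
LOG_RE = re.compile(
    r"^(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+\S+:\s+"
    r"%(?P<facility>[A-Za-z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+):"
)


def parse_line(line, year=2012):
    """Return (datetime, 'FACILITY-SEV-MNEMONIC'), or None for non-log lines."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    # The year is NOT in the log; it is an assumption passed in by the caller.
    stamp = datetime.strptime(
        f"{year} {m['mon']} {m['day']} {m['time']}", "%Y %b %d %H:%M:%S.%f"
    )
    return stamp, f"{m['facility']}-{m['severity']}-{m['mnemonic']}"


def timeline(lines, year=2012):
    """Return [(seconds since first parsed event, tag), ...] in input order."""
    events = [e for e in (parse_line(line, year) for line in lines) if e]
    if not events:
        return []
    start = events[0][0]
    return [((stamp - start).total_seconds(), tag) for stamp, tag in events]


# A few lines copied from the log above; continuation lines are skipped.
SAMPLE = [
    "Feb 3 17:27:31.209 ACDT: %MBUS-6-SWITCHED_FABCLK: Slot 4 primary clock switched to clock 0",
    "Feb 3 17:28:03.267 ACDT: %FIB-2-FIBDISABLE: Fatal error, slot 4: IPC Failure: timeout",
    "Feb 3 17:32:29.862 ACDT: %RP-3-LC_ROMMON_STARTUP_FAILURE: Slot 4, output =",
    "PON_09",
]

if __name__ == "__main__":
    for offset, tag in timeline(SAMPLE):
        print(f"+{offset:9.3f}s  {tag}")
```

Lines that do not start with a timestamp, such as the PON_09 / POFF09 output, are ignored rather than attached to the preceding message.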
Chassis code release:
IOS (tm) GS Software (C12KPRP-K4P-M), Version 12.0(32)SY11, RELEASE SOFTWARE (fc2)
Hardware info:
NAME: "slot 4", DESCR: "ISE 10G Modular Services Card v2"
PID: 12000-SIP-601 , VID: V03, SN: ?????
NAME: "SPA subslot 4/0", DESCR: "1-port 10 Gigabit Ethernet Shared Port Adapter XFP based"
PID: SPA-1X10GE-L-V2 , VID: V02, SN: ?????
NAME: "subslot 4/0 transceiver 0", DESCR: "OC192 + 10GBASE-L"
PID: 10-1989-02XFP , VID: C , SN: ??????
After the initial failure and a card reseat it lasted almost 3 weeks. I reseated it again today and it fell over again within a minute. I reseated it once more and it is still up and passing traffic after an hour.
Trying to power cycle the card from the chassis fails; a physical remove/insert is required to get the card running.
Is this a memory failure on the SIP (CEF failure??) or some other odd fault with the SIP that makes it require a card remove/reinsert?
Thanks
David
02-08-2012 08:06 AM
David
Looks like some sort of fabric issue:
Feb 3 17:27:31.209 ACDT: %MBUS-6-SWITCHED_FABCLK: Slot 4 primary clock switched to clock 0
Can you capture "show controller fia" from both the line card and the global CLI?
Thanks
Dave Aicher
02-08-2012 03:50 PM
Hi David
The card is still in an operational state.
From the global cli:
#show controllers fia
Fabric configuration: 10Gbps bandwidth, redundant fabric
Master Scheduler: Slot 17 Backup Scheduler: Slot 16
Fab epoch no 0 Halt count 0
From Fabric FIA Errors
-----------------------
redund overflow 0 cell drops 0
cell parity 0
Switch cards present 0x007C Slots 18 19 20 21 22
Switch cards monitored 0x007C Slots 18 19 20 21 22
Slot: 18 19 20 21 22
Name: sfc0 sfc1 sfc2 sfc3 sfc4
-------- -------- -------- -------- --------
los 0 0 0 0 0
state Off Off Off Off Off
crc16 0 0 0 0 0
To Fabric FIA Errors
-----------------------
sca not pres 0 req error 0 uni fifo overflow 0
grant parity 0 multi req 0 uni fifo undrflow 0
cntrl parity 0 uni req 0
multi fifo 0 empty dst req 0 handshake error 0
cell parity 0
From the SIP in slot 4:
========= Line Card (Slot 4) =========
Fabric configuration: Full bandwidth redundant
Master Scheduler: Slot 17
Fab epoch no 0 Halt count 0
From Fabric FIA Errors
-----------------------
cell fifo parity 0 no 125 MHz clock 0
cell processor ctrl wd error 0 reassembly mem ctrl wd error 0
reassembly mem single ECC 0 reassembly mem multi ECC 0
first last err 0 sequence err 0 pkt length err 0
Switch cards present: 0x1F
Switch cards monitored: 0x1F
0 1 2 3 4
-------- -------- -------- -------- --------
los 0 0 0 0 0
state Off Off Off Off Off
crc16 0 0 0 0 0
xor error 0 0 0 0
cell drops 0 0 0 0
drop packets from these linecards
0 1 2 3
-------- -------- -------- --------
Unicast High 0 0 0 0
Unicast Low 0 0 0 0
Multicast High 0 0 0 0
Multicast Low 0 0 0 0
4 5 6 7
-------- -------- -------- --------
Unicast High 0 0 0 0
Unicast Low 0 0 0 0
Multicast High 0 0 0 0
Multicast Low 0 0 0 0
8 9 10 11
-------- -------- -------- --------
Unicast High 0 0 0 0
Unicast Low 0 0 0 0
Multicast High 0 0 0 0
Multicast Low 0 0 0 0
12 13 14 15
-------- -------- -------- --------
Unicast High 0 0 0 0
Unicast Low 0 0 0 0
Multicast High 0 0 0 0
Multicast Low 0 0 0 0
To Fabric FIA Errors
-----------------------
piranha data parity 0 piranha cmd parity 0
assem fifo data parity 0 assem fifo cmd par 0
sca gnt parity 0 request error 0
cfifo overflow 0 cfifo underflow 0
piranha miss pkt end 0 piranha miss pkt start 0
pir start pkt LT 40b 0 pir mid pkt LT 32b 0 pir end pkt GT 32b 0
output mask all zeros 0 cfifo single ecc 0 cfifo multi ecc 0
pir no even clk 0 pir on odd clk 0 no 150MHz clk 0
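A side note on the "Switch cards present" / "Switch cards monitored" values in the two outputs above: they read as bitmasks. In the global output 0x007C is printed next to slots 18 to 22, and on the line card 0x1F sits above columns 0 to 4. A small Python sketch of that decoding follows. The base offset of 16 for the global mask is inferred only from this output (bit 2 lining up with slot 18), not taken from Cisco documentation, and decode_mask is a hypothetical helper name.

```python
def decode_mask(mask, base=0):
    """Return the positions of the set bits in mask, each offset by base."""
    return [base + bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Global CLI view: assumption that bit n corresponds to chassis slot 16 + n.
global_slots = decode_mask(0x007C, base=16)

# Line card view: bits taken as switch fabric card indexes 0..4.
lc_cards = decode_mask(0x1F)

if __name__ == "__main__":
    print("global:", global_slots)
    print("line card:", lc_cards)
```

With that assumed offset the global mask decodes to the same five slots the router prints (18 19 20 21 22), and both views agree that all five switch fabric cards are present and monitored.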
02-09-2012 07:25 AM
Looks clean at the moment but the logs do seem to indicate a fabric issue. Any chance you have an open slot you can move the card to?
Dave
02-09-2012 03:43 PM
No spare slots unfortunately, but we have replaced the SIP and SPA, so we will wait and see what happens.