Cisco 12000-SIP-601 keeps going offline
02-05-2012 04:13 AM - edited 03-04-2019 03:08 PM
Hi All
This card has fallen over in the following way 3 times in the past 3 weeks (addresses and serial numbers replaced with ??? to protect the innocent):
Feb 3 17:27:31.209 ACDT: %MBUS-6-SWITCHED_FABCLK: Slot 4 primary clock switched to clock 0
Feb 3 17:27:33.360 ACDT: %LDP-5-NBRCHG: LDP Neighbor ???.???.???.??:0 (1) is DOWN (Received error notification from peer: Holddown time expired)
Feb 3 17:27:36.380 ACDT: %PIM-5-NBRCHG: neighbor ???.???.???.??? DOWN on interface GigabitEthernet4/0/0 non DR
Feb 3 17:28:03.267 ACDT: %FIB-2-FIBDISABLE: Fatal error, slot 4: IPC Failure: timeout
Feb 3 17:28:03.267 ACDT: %RP-4-RSTSLOT: Resetting the card in the slot: 4,Event: CEF failure
Feb 3 17:28:03.303 ACDT: %LINK-5-CHANGED: Interface GigabitEthernet4/0/0, changed state to administratively down
Feb 3 17:28:03.307 ACDT: %OSPFv3-5-ADJCHG: Process 42, Nbr ???.???.???.??? on GigabitEthernet4/0/0 from FULL to DOWN, Neighbor Down: Interface down or detached
Feb 3 17:28:03.307 ACDT: %OSPF-5-ADJCHG: Process 42, Nbr ???.???.???.??? on GigabitEthernet4/0/0 from FULL to DOWN, Neighbor Down: Interface down or detached
Feb 3 17:28:03.315 ACDT: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet4/0/0, changed state to down
Feb 3 17:28:03.323 ACDT: %SPA_OIR-6-OFFLINECARD: SPA (SPA-1X10GE-L-V2) offline in subslot 4/0
Feb 3 17:32:29.862 ACDT: %RP-3-LC_ROMMON_STARTUP_FAILURE: Slot 4, output =
PON_09
PON_09
PON_09
PON_09
POFF09
POFF09
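The ordering in the log above is easier to reason about once the timestamps are parsed: the fabric clock switch comes first, and the CEF-triggered reset follows about 32 seconds later. Below is a minimal Python sketch, assuming the syslog format shown in this post. The helper names `parse_line`, `parse_log`, and `seconds_between` are hypothetical, and the year is an assumption because the log lines do not carry one.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Feb 3 17:28:03.267 ACDT: %FIB-2-FIBDISABLE: Fatal error, slot 4: IPC Failure: timeout
LOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}\.\d{3}) \w+: "
    r"%(?P<facility>[A-Za-z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+): "
    r"(?P<text>.*)$"
)

def parse_line(line, year=2012):
    """Parse one syslog line; return None for non-syslog lines such as PON_09."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    # The year is not in the log, so it is supplied by the caller (assumption).
    when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S.%f")
    return {
        "time": when,
        "facility": m["facility"],
        "severity": int(m["severity"]),
        "mnemonic": m["mnemonic"],
        "text": m["text"],
    }

def parse_log(text, year=2012):
    """Parse every recognisable syslog line in a block of text."""
    parsed = (parse_line(line, year) for line in text.splitlines())
    return [event for event in parsed if event is not None]

def seconds_between(events, first, second):
    """Seconds from the first `first` event to the first `second` event.

    Raises StopIteration if either mnemonic is absent.
    """
    start = next(e["time"] for e in events if e["mnemonic"] == first)
    end = next(e["time"] for e in events if e["mnemonic"] == second)
    return (end - start).total_seconds()

SAMPLE = """\
Feb 3 17:27:31.209 ACDT: %MBUS-6-SWITCHED_FABCLK: Slot 4 primary clock switched to clock 0
Feb 3 17:28:03.267 ACDT: %FIB-2-FIBDISABLE: Fatal error, slot 4: IPC Failure: timeout
Feb 3 17:28:03.267 ACDT: %RP-4-RSTSLOT: Resetting the card in the slot: 4,Event: CEF failure
PON_09
"""

events = parse_log(SAMPLE)
gap = seconds_between(events, "SWITCHED_FABCLK", "RSTSLOT")
print(f"{gap:.3f} s between clock switch and reset")
```

Running the same helper over the logs from each of the three failures would show whether the clock switch always precedes the reset by a similar interval.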
Chassis code release:
IOS (tm) GS Software (C12KPRP-K4P-M), Version 12.0(32)SY11, RELEASE SOFTWARE (fc2)
Hardware info:
NAME: "slot 4", DESCR: "ISE 10G Modular Services Card v2"
PID: 12000-SIP-601 , VID: V03, SN: ?????
NAME: "SPA subslot 4/0", DESCR: "1-port 10 Gigabit Ethernet Shared Port Adapter XFP based"
PID: SPA-1X10GE-L-V2 , VID: V02, SN: ?????
NAME: "subslot 4/0 transceiver 0", DESCR: "OC192 + 10GBASE-L"
PID: 10-1989-02XFP , VID: C , SN: ??????
After the initial failure and card reseat it lasted almost 3 weeks. I reseated it again today and it fell over again within a minute. I reseated it once more and it is still up and passing traffic after an hour.
Trying to power cycle the card from the chassis fails; a remove/insert is required to get the card running.
Is this a memory failure on the SIP (CEF failure??) or some other odd fault with the SIP, that makes it require a card remove/reinsert?
Thanks
David
- Labels: Other Routing
02-08-2012 08:06 AM
David
Looks like some sort of fabric issue:
Feb 3 17:27:31.209 ACDT: %MBUS-6-SWITCHED_FABCLK: Slot 4 primary clock switched to clock 0
Please capture "show controllers fia" from the line card and from the global CLI.
Thanks
Dave Aicher
02-08-2012 03:50 PM
Hi David
The card is still in an operational state at the moment.
From the global cli:
#show controllers fia
Fabric configuration: 10Gbps bandwidth, redundant fabric
Master Scheduler: Slot 17 Backup Scheduler: Slot 16
Fab epoch no 0 Halt count 0
From Fabric FIA Errors
-----------------------
redund overflow 0 cell drops 0
cell parity 0
Switch cards present 0x007C Slots 18 19 20 21 22
Switch cards monitored 0x007C Slots 18 19 20 21 22
Slot: 18 19 20 21 22
Name: sfc0 sfc1 sfc2 sfc3 sfc4
-------- -------- -------- -------- --------
los 0 0 0 0 0
state Off Off Off Off Off
crc16 0 0 0 0 0
To Fabric FIA Errors
-----------------------
sca not pres 0 req error 0 uni fifo overflow 0
grant parity 0 multi req 0 uni fifo undrflow 0
cntrl parity 0 uni req 0
multi fifo 0 empty dst req 0 handshake error 0
cell parity 0
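The "Switch cards present" and "Switch cards monitored" values are bitmasks. In the global output, 0x007C is printed next to slots 18 to 22, which is consistent with bit 0 standing for slot 16. That base is inferred from this output only, not taken from Cisco documentation. A small Python sketch of the decoding, with `slots_from_mask` as a hypothetical helper name:

```python
def slots_from_mask(mask, base_slot=0):
    """Return the slot numbers whose bits are set in mask, lowest first."""
    slots = []
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            slots.append(base_slot + bit)
        bit += 1
    return slots

# Global view: base slot 16 is an assumption inferred from the output above.
print(slots_from_mask(0x007C, base_slot=16))  # [18, 19, 20, 21, 22]

# Line card view: the SIP reports 0x1F and numbers the switch cards 0 to 4.
print(slots_from_mask(0x1F))  # [0, 1, 2, 3, 4]
```

Both views decode to the same five switch fabric cards, so the present and monitored masks agree.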
From the SIP in slot 4:
========= Line Card (Slot 4) =========
Fabric configuration: Full bandwidth redundant
Master Scheduler: Slot 17
Fab epoch no 0 Halt count 0
From Fabric FIA Errors
-----------------------
cell fifo parity 0 no 125 MHz clock 0
cell processor ctrl wd error 0 reassembly mem ctrl wd error 0
reassembly mem single ECC 0 reassembly mem multi ECC 0
first last err 0 sequence err 0 pkt length err 0
Switch cards present: 0x1F
Switch cards monitored: 0x1F
0 1 2 3 4
-------- -------- -------- -------- --------
los 0 0 0 0 0
state Off Off Off Off Off
crc16 0 0 0 0 0
xor error 0 0 0 0
cell drops 0 0 0 0
drop packets from these linecards
0 1 2 3
-------- -------- -------- --------
Unicast High 0 0 0 0
Unicast Low 0 0 0 0
Multicast High 0 0 0 0
Multicast Low 0 0 0 0
4 5 6 7
-------- -------- -------- --------
Unicast High 0 0 0 0
Unicast Low 0 0 0 0
Multicast High 0 0 0 0
Multicast Low 0 0 0 0
8 9 10 11
-------- -------- -------- --------
Unicast High 0 0 0 0
Unicast Low 0 0 0 0
Multicast High 0 0 0 0
Multicast Low 0 0 0 0
12 13 14 15
-------- -------- -------- --------
Unicast High 0 0 0 0
Unicast Low 0 0 0 0
Multicast High 0 0 0 0
Multicast Low 0 0 0 0
To Fabric FIA Errors
-----------------------
piranha data parity 0 piranha cmd parity 0
assem fifo data parity 0 assem fifo cmd par 0
sca gnt parity 0 request error 0
cfifo overflow 0 cfifo underflow 0
piranha miss pkt end 0 piranha miss pkt start 0
pir start pkt LT 40b 0 pir mid pkt LT 32b 0 pir end pkt GT 32b 0
output mask all zeros 0 cfifo single ecc 0 cfifo multi ecc 0
pir no even clk 0 pir on odd clk 0 no 150MHz clk 0
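Every counter in both captures reads zero. When the output is captured repeatedly, the per-switch-card rows can be checked with a short script rather than by eye. The following is a rough Python sketch, assuming the row labels `los` and `crc16` and the whitespace-separated layout shown in this post. `nonzero_fabric_counters` is a hypothetical helper, and only those two rows are checked.

```python
import re

# Rows that carry one counter per switch fabric card in the captures above.
PER_CARD_ROW = re.compile(r"^(los|crc16)\s+(\d+(?:\s+\d+)*)\s*$")

def nonzero_fabric_counters(output):
    """Return (row label, column index, value) for each non-zero counter."""
    hits = []
    for line in output.splitlines():
        m = PER_CARD_ROW.match(line.strip())
        if m is None:
            continue  # not a per-card counter row (for example the "state" row)
        for col, value in enumerate(m.group(2).split()):
            if int(value) != 0:
                hits.append((m.group(1), col, int(value)))
    return hits

CAPTURE = """\
Switch cards present: 0x1F
los 0 0 0 0 0
state Off Off Off Off Off
crc16 0 0 0 0 0
"""

print(nonzero_fabric_counters(CAPTURE))  # []
```

An empty list matches what is seen here. A non-empty list would identify which switch card column is accumulating errors.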
02-09-2012 07:25 AM
Looks clean at the moment but the logs do seem to indicate a fabric issue. Any chance you have an open slot you can move the card to?
Dave
02-09-2012 03:43 PM
No spare slots unfortunately, but we have replaced the SIP and SPA, so we will wait and see what happens.
