01-17-2020 11:32 AM
We are migrating some of these A99-8X100GE-TR modules to the smaller ASR-9904 platform, running A9K-RSP880 and IOS XR 6.5.3. When a module is installed, messages come up requesting an FPD upgrade, and after the FPD upgrade new errors appear. Also, on many modules 1 or 2 of the NP processors fail, so once the module is fully booted it shows only 6 or 4 of the 100G ports active instead of all 8. Some outputs are below.
Let me know if there are any suggestions.
Thank you.
==========================================
BEFORE THE FPD UPGRADE
RP/0/RSP0/CPU0:ios(admin)#LC/0/1/CPU0:Jan 17 00:28:43.559 UTC: lda_server[65]: %PLATFORM-LDA-3-TIMING_FAILURE : 1588 timing configuration failure : PWM25 PLL in Meldun is unlocked Cannot configure timing. (0x00000016 - Invalid argument) : pkg/bin/lda_server : (PID=69665) : -Traceback= b8582dd 10fd9fad 42001a1 4200395 10ac0018 1022eef7 1022ee9d 1022bf4e 4201bb6 4200054
RP/0/RSP0/CPU0:ios(admin)#LC/0/1/CPU0:Jan 17 00:29:49.868 UTC: x86rmon_fpd_agent[395]: %PLATFORM-UPGRADE_FPD-4-DOWN_REV : rommon instance 0 is down-rev (V8.40), upgrade to (V8.47). Use the "upgrade hw-module fpd" CLI in admin mode.
LC/0/1/CPU0:Jan 17 00:30:05.694 UTC: tom_fpd_fpga_agent[356]: %PLATFORM-UPGRADE_FPD-4-DOWN_REV : fpga2 instance 0 is down-rev (V1.76), upgrade to (V1.97). Use the "upgrade hw-module fpd" CLI in admin mode.
LC/0/1/CPU0:Jan 17 00:30:05.701 UTC: tom_fpd_fpga_agent[356]: %PLATFORM-UPGRADE_FPD-4-DOWN_REV : fsbl instance 0 is down-rev (V1.78), upgrade to (V1.103). Use the "upgrade hw-module fpd" CLI in admin mode.
LC/0/1/CPU0:Jan 17 00:30:05.710 UTC: tom_fpd_fpga_agent[356]: %PLATFORM-UPGRADE_FPD-4-DOWN_REV : lnxfw instance 0 is down-rev (V1.78), upgrade to (V1.103). Use the "upgrade hw-module fpd" CLI in admin mode.
LC/0/1/CPU0:Jan 17 00:30:05.719 UTC: tom_fpd_fpga_agent[356]: %PLATFORM-UPGRADE_FPD-4-DOWN_REV : fpga3 instance 1 is down-rev (V1.04), upgrade to (V1.07). Use the "upgrade hw-module fpd" CLI in admin mode.
LC/0/1/CPU0:Jan 17 00:30:05.719 UTC: tom_fpd_fpga_agent[356]: %PLATFORM-UPGRADE_FPD-4-DOWN_REV : fpga3 instance 0 is down-rev (V1.04), upgrade to (V1.07). Use the "upgrade hw-module fpd" CLI in admin mode.
LC/0/1/CPU0:Jan 17 00:30:05.723 UTC: tom_fpd_fpga_agent[356]: %PLATFORM-UPGRADE_FPD-4-DOWN_REV : fpga4 instance 0 is down-rev (V1.06), upgrade to (V1.09). Use the "upgrade hw-module fpd" CLI in admin mode.
Starting the upgrade/download of following FPDs:
=========== ==== ======= ======= =========== ========= ====
                                 Current     Upg/Dng
Location    Type Subtype Upg/Dng Version     Version   Inst
=========== ==== ======= ======= =========== ========= ====
0/1/CPU0    lc   rommon  upg     8.40        8.47      0
0/1/CPU0    lc   fpga2   upg     1.76        1.97      0
0/1/CPU0    lc   fsbl    upg     1.78        1.103     0
0/1/CPU0    lc   lnxfw   upg     1.78        1.103     0
0/1/CPU0    lc   fpga3   upg     1.04        1.07      0
0/1/CPU0    lc   fpga4   upg     1.06        1.09      0
0/1/CPU0    lc   fpga3   upg     1.04        1.07      1
------------------------------------------------------------
FPD upgrade in progress. Max timeout remaining 89 min.
FPD upgrade in progress. Max timeout remaining 88 min.
FPD upgrade in progress. Max timeout remaining 87 min.
FPD upgrade in progress. Max timeout remaining 86 min.
Successfully upgraded rommon for A99-8X100GE-TR on location 0/1/CPU0 from 8.40 to 8.47
Successfully upgraded fpga2 for A99-8X100GE-TR on location 0/1/CPU0 from 1.76 to 1.97
Successfully upgraded fsbl for A99-8X100GE-TR on location 0/1/CPU0 from 1.78 to 1.103
Successfully upgraded lnxfw for A99-8X100GE-TR on location 0/1/CPU0 from 1.78 to 1.103
Successfully upgraded fpga3 for A99-8X100GE-TR on location 0/1/CPU0 from 1.04 to 1.07
Successfully upgraded fpga4 for A99-8X100GE-TR on location 0/1/CPU0 from 1.06 to 1.09
Successfully upgraded fpga3 for A99-8X100GE-TR on location 0/1/CPU0 from 1.04 to 1.07
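To cross-check that every FPD flagged as down-rev was actually reported as upgraded, the two kinds of messages above can be compared offline. The following is only a rough Python sketch: the regular expressions are written against the message formats quoted in this thread and may not match other releases, and the function name `pending_fpds` is made up for illustration. The "Successfully upgraded" lines do not carry the instance number, so entries are matched by FPD name and versions only.

```python
import re

# Patterns modelled on the messages quoted in this thread (an assumption,
# not an official format description).
DOWN_REV = re.compile(
    r"%PLATFORM-UPGRADE_FPD-4-DOWN_REV : (\S+) instance (\d+) is down-rev "
    r"\(V([\d.]+)\), upgrade to \(V([\d.]+)\)"
)
UPGRADED = re.compile(
    r"Successfully upgraded (\S+) for (\S+) on location (\S+) "
    r"from ([\d.]+) to ([\d.]+)"
)


def pending_fpds(log_text):
    """Return (fpd, from_version, to_version) entries that were flagged
    down-rev but have no matching 'Successfully upgraded' line."""
    flagged = [(m[1], m[3], m[4]) for m in DOWN_REV.finditer(log_text)]
    done = [(m[1], m[4], m[5]) for m in UPGRADED.finditer(log_text)]
    remaining = list(flagged)
    # Remove one flagged entry per success line, so an FPD with two
    # instances (fpga3 here) needs two success lines to be cleared.
    for item in done:
        if item in remaining:
            remaining.remove(item)
    return remaining


sample = """\
%PLATFORM-UPGRADE_FPD-4-DOWN_REV : rommon instance 0 is down-rev (V8.40), upgrade to (V8.47).
%PLATFORM-UPGRADE_FPD-4-DOWN_REV : fpga3 instance 1 is down-rev (V1.04), upgrade to (V1.07).
%PLATFORM-UPGRADE_FPD-4-DOWN_REV : fpga3 instance 0 is down-rev (V1.04), upgrade to (V1.07).
Successfully upgraded rommon for A99-8X100GE-TR on location 0/1/CPU0 from 8.40 to 8.47
Successfully upgraded fpga3 for A99-8X100GE-TR on location 0/1/CPU0 from 1.04 to 1.07
"""
print(pending_fpds(sample))
```

In this shortened sample only one of the two fpga3 instances has a success line, so one fpga3 entry is left over. With the full output above, every flagged FPD has a matching success line.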
AFTER THE FPD UPGRADE
RP/0/RSP0/CPU0:ios(admin)#LC/0/0/CPU0:Jan 17 10:13:12.894 UTC: pfm_node_lc[292]: %FABRIC-FIA-1-ASIC_INIT_ERROR : Set|fialc[159811]|0x108a000|ASIC INIT Error detected on FIA instance 0
RP/0/RSP0/CPU0:Jan 17 10:13:34.951 UTC: FABMGR[230]: %PLATFORM-FABMGR-2-FABRIC_INTERNAL_FAULT : 0/0/CPU0 (slot 0) encountered fabric fault. Interfaces are going to be shutdown.
RP/0/RSP0/CPU0:ios(admin)#sh plat
Fri Jan 17 10:16:28.379 UTC
Node            Type                      State            Config State
-----------------------------------------------------------------------------
0/RSP0/CPU0     A9K-RSP880-SE(Active)     IOS XR RUN       PWR,NSHUT,MON
0/RSP1/CPU0     A9K-RSP880-SE(Standby)    IOS XR RUN       PWR,NSHUT,MON
0/FT0/SP        ASR-9904-FAN              READY
0/0/CPU0        A99-8X100GE-TR            IOS XR RUN       PWR,NSHUT,MON
0/PS0/M0/SP     PWR-3KW-AC-V2             READY            PWR,NSHUT,MON
0/PS0/M1/SP     PWR-3KW-AC-V2             READY            PWR,NSHUT,MON
0/PS0/M2/SP     PWR-3KW-AC-V2             READY            PWR,NSHUT,MON
RP/0/RSP0/CPU0:ios(admin)#sh ver
Fri Jan 17 10:16:37.974 UTC
Cisco IOS XR Software, Version 6.5.3[Default]
Copyright (c) 2019 by Cisco Systems, Inc.
ROM: System Bootstrap, Version 10.65(c) 1994-2014 by Cisco Systems, Inc.
ios uptime is 16 minutes
System image file is "disk0:asr9k-os-mbi-6.5.3/0x100305/mbiasr9k-rsp3.vm"
cisco ASR9K Series (Intel 686 F6M14S4) processor with 33554432K bytes of memory.
Intel 686 F6M14S4 processor at 1904MHz, Revision 2.174
ASR 9904 2 Line Card Slot Chassis with V2 AC PEM
4 Management Ethernet
1 FastEthernet
8 HundredGigE
8 DWDM controller(s)
375k bytes of non-volatile configuration memory.
6117M bytes of hard disk.
25012208k bytes of disk0: (Sector size 512 bytes).
25012208k bytes of disk1: (Sector size 512 bytes).
Configuration register on node 0/RSP0/CPU0 is 0x2102
RP/0/RSP0/CPU0:ios(admin)#RP/0/RSP0/CPU0:Jan 17 10:17:44.205 UTC: pfm_node_rp[369]: %PLATFORM-DIAGS-3-PUNT_FABRIC_DATA_PATH_FAILED : Set|online_diag_rsp[229504]|System Punt/Fabric/data Path Test(0x2000004)|failure threshold is 3, (slot, NP) failed: (0/0/CPU0, 0)
RP/0/RSP0/CPU0:ios(admin)#sh diagn res loc 0/RSP0/CPU0
Fri Jan 17 10:17:55.765 UTC
Current bootup diagnostic level for A9K-RSP880-SE 0/RSP0/CPU0: minimal
A9K-RSP880-SE 0/RSP0/CPU0:
Overall diagnostic result: MINOR ERROR
Diagnostic level at card bootup: minimal
Test results: (. = Pass, F = Fail, U = Untested)
1 ) SrspStandbyEobcHeartbeat --------> U
2 ) SrspActiveEobcHeartbeat ---------> U
3 ) FabricLoopback ------------------> .
4 ) PuntFabricDataPath --------------> F
5 ) CPUCtrlScratchRegister ----------> .
6 ) DeviceCtrlScratchRegister -------> .
7 ) ClkCtrlScratchRegister ----------> .
8 ) FabSwitchIdRegister -------------> .
9 ) NVRAMScratchRegister ------------> .
RP/0/RSP0/CPU0:ios(admin)#RP/0/RSP1/CPU0:Jan 17 10:18:04.554 UTC: pfm_node_rp[369]: %PLATFORM-DIAGS-3-PUNT_FABRIC_DATA_PATH_FAILED : Set|online_diag_rsp[225402]|System Punt/Fabric/data Path Test(0x2000004)|failure threshold is 3, (slot, NP) failed: (0/0/CPU0, 0) (0/0/CPU0, 1) (0/0/CPU0, 2) (0/0/CPU0, 3)
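When comparing several modules, it may help to pull the failed (slot, NP) pairs out of the PUNT_FABRIC_DATA_PATH_FAILED messages and relate them to the number of ports lost. The Python sketch below is only an illustration: the regular expression is based on the two messages quoted in this thread, the function names are hypothetical, and the figure of 2 ports per NP is an assumption inferred from the observation that 1 or 2 failed NPs leave 6 or 4 of the 8 ports active.

```python
import re

# Matches pairs such as "(0/0/CPU0, 3)" as they appear in the quoted messages.
PAIR = re.compile(r"\((\d+/\d+/CPU\d+), (\d+)\)")

TOTAL_PORTS = 8   # 8 HundredGigE ports, per the "sh ver" output above
PORTS_PER_NP = 2  # assumption inferred from the symptoms described, not from documentation


def failed_nps(message):
    """Return {slot: sorted NP numbers} from one PUNT_FABRIC_DATA_PATH_FAILED message."""
    # Only look at the text after the "(slot, NP) failed:" marker.
    _, _, tail = message.partition("(slot, NP) failed:")
    result = {}
    for slot, np_id in PAIR.findall(tail):
        result.setdefault(slot, set()).add(int(np_id))
    return {slot: sorted(nps) for slot, nps in result.items()}


def ports_still_active(failed_np_count):
    """Ports expected to remain active under the 2-ports-per-NP assumption."""
    return TOTAL_PORTS - PORTS_PER_NP * failed_np_count


msg = (
    "%PLATFORM-DIAGS-3-PUNT_FABRIC_DATA_PATH_FAILED : Set|online_diag_rsp[225402]|"
    "System Punt/Fabric/data Path Test(0x2000004)|failure threshold is 3, "
    "(slot, NP) failed: (0/0/CPU0, 0) (0/0/CPU0, 1) (0/0/CPU0, 2) (0/0/CPU0, 3)"
)
print(failed_nps(msg))
print(ports_still_active(1), ports_still_active(2))
```

Run against the second message above, this lists all four NPs on 0/0/CPU0 as failing the punt path test from RSP1, whereas the first message (from RSP0) reports only NP 0.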
01-17-2020 11:50 AM
Hello,
Have you observed this in multiple instances under the same conditions?
1. FPD Upgrade
2. Chassis is 9904
3. RSP-880
4. XR 6.5.3
If you have any devices still in the affected state, i.e. with the faulty LC inserted, can you collect and upload the output of:
show tech-support np
Did you try any recovery steps?
1. OIR (removing and reinserting the LC)
2. Inserting the LC into another supported chassis