02-11-2021 12:59 AM - edited 02-11-2021 01:16 AM
We have a 9006-V2 in our lab.
The 9006 currently has 1 x MOD80-SE and 1 x MOD80-TR. The MOD80-SE has an MPA-20X1GE installed.
When running with an RSP880-SE on IOS XR 6.5.3, everything runs smoothly.
When running with an RSP440-SE on IOS XR 6.5.3, we get the following errors:
RP/0/RSP0/CPU0:Feb 12 08:42:10.245 UTC: FABMGR[222]: %PLATFORM-FABMGR-2-FABRIC_LINK_DOWN_FAULT : (0/2/CPU0 XBAR 0) <--> (0/RSP0/CPU0 XBAR 1) fabric link is down
RP/0/RSP0/CPU0:Feb 12 08:42:10.257 UTC: FABMGR[222]: %PLATFORM-FABMGR-2-FABRIC_INTERNAL_FAULT : 0/2/CPU0 (slot 4) encountered fabric fault. Interfaces are going to be shutdown.
However, none of the interfaces ever go down. We then installed a different RSP440-SE with IOS XR 6.5.3 and got the same error. We also moved the MOD80-SE to a different slot, with the same result.
The strange thing is this behavior doesn't exist with the RSP880-SE.
I looked through several bugs, and they give different answers for this specific issue.
Has anyone experienced the same?
These 2 RSP440-SE will power up a MOD400-SE with no issues and no errors. Of course you get the Rate Limit warning, but that's to be expected.
Now this is also popping up with the RSP440-SE when the MOD80-SE is moved to a different slot:
fab_si[175]: %PLATFORM-STATS_INFRA-3-ERR_STR_1 : ErrStr:Unable to map stats infra shared mem
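For anyone collecting these messages from several test runs, here is a minimal Python sketch that pulls the two link endpoints out of a FABRIC_LINK_DOWN_FAULT line and counts how often each node shows up. The regular expression is written only against the message format pasted above, so treat it as an assumption that other releases print the line the same way. The function names are made up for illustration.

```python
import re
from collections import Counter

# Assumption: the message layout matches the line pasted in this thread.
LINK_DOWN = re.compile(
    r"FABRIC_LINK_DOWN_FAULT\s*:\s*"
    r"\((?P<a_node>\S+) XBAR (?P<a_xbar>\d+)\)\s*<-->\s*"
    r"\((?P<b_node>\S+) XBAR (?P<b_xbar>\d+)\)"
)

def link_down_endpoints(log_text):
    """Return one ((node, xbar), (node, xbar)) tuple per link-down message."""
    return [
        ((m["a_node"], int(m["a_xbar"])), (m["b_node"], int(m["b_xbar"])))
        for m in LINK_DOWN.finditer(log_text)
    ]

def count_nodes(log_text):
    """Count how many link-down messages mention each node."""
    counts = Counter()
    for end_a, end_b in link_down_endpoints(log_text):
        counts[end_a[0]] += 1
        counts[end_b[0]] += 1
    return counts

sample = (
    "RP/0/RSP0/CPU0:Feb 12 08:42:10.245 UTC: FABMGR[222]: "
    "%PLATFORM-FABMGR-2-FABRIC_LINK_DOWN_FAULT : "
    "(0/2/CPU0 XBAR 0) <--> (0/RSP0/CPU0 XBAR 1) fabric link is down"
)
print(link_down_endpoints(sample))
```

A node that appears in every message across different card combinations is a natural first suspect, although the count alone cannot say which end of the link is at fault.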
02-11-2021 01:38 PM
Hi PK99
RSP440 is not supported after 6.4.2, so perhaps this is your issue?
Please check this link as well as the EoS notice for the RSP440.
Hope this helps
Mark
02-11-2021 06:03 PM - edited 02-11-2021 06:04 PM
While Mark is right that the RSP440 is not supported by development or TAC beyond 6.4.2, there is nothing to prevent it from working (it's more of a support and troubleshooting issue). With that said, I am curious exactly which combinations were tried.
Is it always RSP0 that reports the link down to the same MOD80?
When you moved the same MOD80 between slots using the same RSP, did the issue follow the MOD80?
Using the same RSP, if you swap the MOD80 with the other MOD80, do you get no failure?
If you swap the RSP440 with another RSP440 and keep the original MOD80, do you still get the same failure?
Please note that MOD80 vs MOD400 and RSP440 vs RSP880 use different backplane pins/connectors, so that may be why you see a difference when connecting a different generation of card. So please perform the testing as I described above with the same type of card (TR vs SE does not matter for the pinout; the only difference is RAM, TCAM and CPU).
Sam
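The swap tests above boil down to finding what every failing combination has in common that no passing combination shares. A minimal Python sketch of that bookkeeping follows. The function name, the field names and the trial data are all hypothetical and only illustrate the reasoning; they are not results from this chassis.

```python
def suspects(trials):
    """Return the (field, value) pairs present in every failing trial
    and absent from every passing trial.

    trials: list of (components, failed), where components is a dict
    describing which card sat in which slot for that test.
    """
    failing = [set(c.items()) for c, failed in trials if failed]
    passing = [set(c.items()) for c, failed in trials if not failed]
    if not failing:
        return set()
    common = set.intersection(*failing)
    for p in passing:
        common -= p  # anything seen in a clean run is cleared
    return common

# Hypothetical trial records, invented for illustration only.
trials = [
    ({"rsp_slot": "RSP0", "rsp_card": "RSP440-A", "lc_slot": "0/2", "lc_card": "MOD80-SE"}, True),
    ({"rsp_slot": "RSP0", "rsp_card": "RSP440-B", "lc_slot": "0/2", "lc_card": "MOD80-SE"}, True),
    ({"rsp_slot": "RSP0", "rsp_card": "RSP440-A", "lc_slot": "0/1", "lc_card": "MOD80-SE"}, True),
    ({"rsp_slot": "RSP1", "rsp_card": "RSP440-A", "lc_slot": "0/2", "lc_card": "MOD80-SE"}, False),
]
print(suspects(trials))
```

With too few trials the result can contain several candidates, which is a hint that another swap is needed before drawing a conclusion.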
02-12-2021 02:12 AM
Thanks Sam, I will give it a try. In our lab we only have 1 x MOD80-SE and 1 x MOD80-TR, so I can't try a different MOD80-SE. I will try moving the RSP440 around and see if I get the same behavior. I can rule out using RSP880 and MOD400 with said RSP440s. BTW, the A9K-400G-DWDM-TR also works fine with the RSP440, albeit with the Rate Limit warning, but I understand that the DWDM card is a Tomahawk card.
I was well aware that the RSP440 is not supported beyond 6.4.2, but our needs require a minimum of 6.5 and, like you said, running 6.5 should have no bearing on the fact that I'm getting these errors.
Will test again and report back.
02-12-2021 04:17 AM
Sam, sure enough... our RSP slot 0 must be bad. I moved the RSP440-SE to slot RSP1 and got no fabric error. I did the same with the 2nd RSP440-SE and got no error. I tried the RSP880-SE in slot RSP0 and, sure enough, got the fabric error. I was going to try an RSP5, but if memory serves me correctly, the MOD80 won't even work with the RSP5.
I guess I never even considered that the RSP0 slot was bad. I'm guessing the fabric link from the RSP0 slot to the rest of the slots must be bad.
02-12-2021 04:26 AM
I am, however, getting this annoying message on one of the RSP440-SEs, now in the RSP1 slot:
RP/0/RSP1/CPU0:ios#LC/0/1/CPU0:Feb 13 12:08:36.883 UTC: pfm_node_lc[308]: %FABRIC-FIA-1-SKT_SP0_INTR_BAD_CODE_0 : Set|fialc[168004]|0x103d000|SKT_SP0_INTR_BAD_CODE on FIA 0
LC/0/1/CPU0:Feb 13 12:08:36.886 UTC: pfm_node_lc[308]: %FABRIC-FIA-1-SKT_SP0_INTR_LANE_CRC_ERR_0 : Set|fialc[168004]|0x103d000|SKT_SP0_INTR_LANE_CRC_ERR on FIA 0
LC/0/1/CPU0:Feb 13 12:08:36.886 UTC: pfm_node_lc[308]: %FABRIC-FIA-1-SKT_SP0_INTR_CW_CRC_ERR_0 : Set|fialc[168004]|0x103d000|SKT_SP0_INTR_CW_CRC_ERR on FIA 0
LC/0/1/CPU0:Feb 13 12:08:46.887 UTC: pfm_node_lc[308]: %FABRIC-FIA-1-SKT_SP0_INTR_BAD_CODE_0 : Clear|fialc[168004]|0x103d000|SKT_SP0_INTR_BAD_CODE on FIA 0
LC/0/1/CPU0:Feb 13 12:08:46.887 UTC: pfm_node_lc[308]: %FABRIC-FIA-1-SKT_SP0_INTR_LANE_CRC_ERR_0 : Clear|fialc[168004]|0x103d000|SKT_SP0_INTR_LANE_CRC_ERR on FIA 0
LC/0/1/CPU0:Feb 13 12:08:46.888 UTC: pfm_node_lc[308]: %FABRIC-FIA-1-SKT_SP0_INTR_CW_CRC_ERR_0 : Clear|fialc[168004]|0x103d000|SKT_SP0_INTR_CW_CRC_ERR on FIA 0
It's a "Clear" message, so it must be cosmetic, but it's annoying because it keeps popping up.
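To see how long each alarm stays raised, the Set and Clear lines can be paired up. Below is a minimal Python sketch that does this for the message format pasted above. The regular expression, the function name and the default year are assumptions (the log lines carry no year, so 2021 is taken from the post dates), and the sketch says nothing about the cause of the alarms.

```python
import re
from datetime import datetime

# Assumption: lines look like the FABRIC-FIA messages pasted in this thread.
EVENT = re.compile(
    r"(?P<ts>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}\.\d+) UTC: "
    r".*?%(?P<alarm>\S+) : (?P<state>Set|Clear)\|"
)

def flap_durations(lines, year=2021):
    """Pair each Set with the next Clear of the same alarm.

    Returns a list of (alarm, seconds_raised). The year is assumed
    because the log timestamps do not include one.
    """
    raised_at = {}
    durations = []
    for line in lines:
        m = EVENT.search(line)
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S.%f")
        alarm = m["alarm"]
        if m["state"] == "Set":
            raised_at.setdefault(alarm, ts)  # keep the earliest unmatched Set
        elif alarm in raised_at:
            durations.append((alarm, (ts - raised_at.pop(alarm)).total_seconds()))
    return durations

log = [
    "LC/0/1/CPU0:Feb 13 12:08:36.883 UTC: pfm_node_lc[308]: %FABRIC-FIA-1-SKT_SP0_INTR_BAD_CODE_0 : Set|fialc[168004]|0x103d000|SKT_SP0_INTR_BAD_CODE on FIA 0",
    "LC/0/1/CPU0:Feb 13 12:08:46.887 UTC: pfm_node_lc[308]: %FABRIC-FIA-1-SKT_SP0_INTR_BAD_CODE_0 : Clear|fialc[168004]|0x103d000|SKT_SP0_INTR_BAD_CODE on FIA 0",
]
for alarm, seconds in flap_durations(log):
    print(alarm, round(seconds, 3))
```

In the output pasted above, each Clear follows a Set of the same alarm by roughly ten seconds, so the messages come in Set/Clear pairs. Counting those pairs over time may help when deciding whether to raise the issue further.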