N540-24Z8Q2C-M - Routing errors - insert to hardware failed for NPU# 0

netgn
Level 1

We recently installed this device at the edge of our network to peer with 3 different providers. It worked fine until recently, when we started seeing the errors below. Are we hitting our limit for IPv4 routes? I can't find anything anywhere about how many IPv4 prefixes this model can handle. I've asked in other forums and gotten some good answers, but was advised to reach out to the Cisco IOS XR community for guidance as well. We are running version 7.4.2 of the code. We have one of the BGP peers shut down right now because of this issue, and routes are not being installed in the FIB. Are we out of luck here? If there are further commands or output worth looking at, let me know what to run. I'm fairly new to this model.

LC/0/0/CPU0:Jun 28 11:44:44.430 CDT: fia_driver[323]: %PLATFORM-DPA-3-ERROR : IPv4 route 151.236.162.0/24 insert to hardware failed for NPU# 0, Error code - 'DPA' detected the 'warning' condition 'SDK - No resources for operation'(0x4f9c2200),442286

LC/0/0/CPU0:Jun 28 11:44:44.468 CDT: fib_mgr[222]: %PLATFORM-PLAT_FIB-3-HW_PROG_ERROR : HW Programming failed for table,iproute,key,0,255.255.255.0,255.255.255.0,unit,0,dpa_trans_id,442286,failure-reason,NoResrc

 

If anyone has data sheets or numbers indicating what this platform can handle, please share. That's where I'm struggling: finding something that definitively says you can only do 'x' number of routes.
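
For anyone who wants to tally which prefixes failed to program, here is a minimal Python sketch that pulls them out of syslog lines like the ones quoted above. The regular expression is modeled only on the single PLATFORM-DPA-3-ERROR message in this post, so treat the pattern as an assumption; other releases may word the message differently. The function name `failed_inserts` is made up for illustration.

```python
import re

# Pattern modeled on the one PLATFORM-DPA-3-ERROR line quoted in this thread.
# Assumption: other IOS XR releases may format this message differently.
INSERT_FAIL = re.compile(
    r"%PLATFORM-DPA-3-ERROR : (?P<afi>IPv[46]) route (?P<prefix>\S+) "
    r"insert to hardware failed for NPU# (?P<npu>\d+)"
)

def failed_inserts(lines):
    """Return (address family, prefix, NPU number) for each failed insert."""
    hits = []
    for line in lines:
        m = INSERT_FAIL.search(line)
        if m:
            hits.append((m["afi"], m["prefix"], int(m["npu"])))
    return hits

# The two messages from the post, used as sample input.
sample = [
    "LC/0/0/CPU0:Jun 28 11:44:44.430 CDT: fia_driver[323]: "
    "%PLATFORM-DPA-3-ERROR : IPv4 route 151.236.162.0/24 insert to hardware "
    "failed for NPU# 0, Error code - 'DPA' detected the 'warning' condition "
    "'SDK - No resources for operation'(0x4f9c2200),442286",
    "LC/0/0/CPU0:Jun 28 11:44:44.468 CDT: fib_mgr[222]: "
    "%PLATFORM-PLAT_FIB-3-HW_PROG_ERROR : HW Programming failed for "
    "table,iproute,key,0,255.255.255.0,255.255.255.0,unit,0,"
    "dpa_trans_id,442286,failure-reason,NoResrc",
]

for afi, prefix, npu in failed_inserts(sample):
    print(afi, prefix, "NPU", npu)
```

Only the first message names the prefix, so the fib_mgr line is skipped on purpose. Counting the distinct prefixes over a longer log gives a rough idea of how far past the hardware limit the table is.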

3 Replies

Ramblin Tech
Spotlight

Cisco published some NCS 540 scale numbers in Cisco Live presentations in the past, including this one:

https://www.ciscolive.com/c/dam/r/ciscolive/us/docs/2020/pdf/DGTL-BRKSPG-2159.pdf

That said, what was the motivation for implementing peering on this particular product, codenamed "Tortin"? Tortin is aimed at SP access networks where the elements will be deployed in outside plant (OSP) needing Class 2 environmentals.  A specific use-case for Tortin is as a cell site router.  All of this to say: TCAM for IPv4/IPv6 routes in Tortin's QAX NPU is rather meager as it was not designed for peering.  If this is the platform that you must go with, there may be a "hw-module" command to optimally recarve TCAM for your specific use-case (at the cost of losing TCAM resources for other features).

Disclaimer: I am long in CSCO

Thanks for this info. The Cisco Live doc you provided is what I was looking for. It clearly is not the platform for peering. We got hung up on the CPU and memory, thinking that would be enough, but it's the "special" place Cisco likes to carve up that makes scale difficult to track down. We are looking at the 5501-SE now, as according to this doc

https://xrdocs.io/ncs5500/tutorials/ncs5500-routing-resource-with-2020-internet/

this platform can do a much better job of handling Internet peering. My only concern right now is the 2 million entry limit on the FIB, since we plan to peer with at least 3 providers using BGP full tables.

Hi @netgn,

Thanks for the link, I had not seen it before. The XR BU (aka, Mass-scale Infrastructure Group) has a lot of really sharp TMEs (Technical Marketing Engineers) and Nicolas is certainly one of them.

If I were still a Cisco SE (I retired earlier this year), I would hesitate to recommend the NCS 5501-SE to a new customer for a few reasons:

  • It is a 1st-gen 5500 product that is getting old; I believe it came out around 2017.
  • It is based on Broadcom's DNX Qumran MX NPU, which is a Jericho NPU without the fabric interface. Cisco's newest DNX-based products utilize the Jericho2/Qumran2 NPUs and their derivatives (eg, Jericho2C).
  • It runs eXR rather than XR7/LNT. The XR functionality between these two flavors of XR is the same, but XR7 has operational advantages.
  • It has only 64GB of SSD, which makes management of multiple XR images problematic.
  • The replacements for the 5501/5501-SE have been on the market for several years, making the 5501 ripe for an EOL announcement. Its replacements are the NCS 55A1-36H-S and NCS 55A1-36H-SE-S.

The direct replacement for the 5501-SE is the 55A1-36H-SE-S, which is based on the Jericho+ NPU (it falls between the J and J2 NPUs) and has a 128GB SSD. It still runs eXR, but eXR vs XR7 should be a tie-breaker, not a deal-breaker. The data sheet claims 4M FIB entries, and Nicolas' link, which projects Internet growth out to 2029, says: "Conclusion: these devices can handle the internet growth with zero concern or limitation, we have a lot of available space in the OP eTCAM." Note that the NCS 5500/5700 models with an "-SE" suffix are increased-scale editions with external TCAM (external to the NPU) to handle increased route scale.

I have no idea what your budget is, nor what your discount levels are, but you might also consider a couple of other options, though they might be overkill (or over budget), as these implement NPUs newer than J+ and run XR7/LNT:

I am less familiar with the 8K than the 5500/5700 models, but Silicon One appears to be the future for Cisco's custom-ASIC XR platforms. (Custom-ASIC as opposed to the product lines based on merchant silicon, like the 5500/5700. I expect Cisco will continue with 1RU merchant silicon products for the foreseeable future.) Anyway... you might talk over other options with your Cisco or VAR account team, rather than just defaulting to the 5501-SE.

ps - FIB only needs to be big enough to handle BGP's "best" routes for all the prefixes in its RIB, so peering with 3 providers does not necessarily mean 3x the FIB size is needed (just 3x the RIB size in RAM).
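
To make the RIB vs FIB point concrete, here is a toy Python sketch. The peer names and prefixes are made-up illustration data (documentation address ranges), and the model is deliberately simplified: it treats the FIB as one entry per distinct prefix and ignores connected, static, and IGP routes as well as any hardware-specific storage details.

```python
def rib_and_fib_sizes(tables):
    """Given {peer: iterable of prefixes}, return (RIB paths, FIB entries).

    RIB paths: every prefix learned from every peer is stored as a path.
    FIB entries: one entry per distinct prefix (the best path only).
    """
    per_peer = [set(prefixes) for prefixes in tables.values()]
    rib_paths = sum(len(p) for p in per_peer)
    fib_entries = len(set().union(*per_peer))
    return rib_paths, fib_entries

# Hypothetical peers advertising overlapping prefixes.
tables = {
    "provider_a": ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"],
    "provider_b": ["192.0.2.0/24", "198.51.100.0/24"],
    "provider_c": ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"],
}

rib, fib = rib_and_fib_sizes(tables)
print(f"RIB paths: {rib}, FIB entries: {fib}")
```

With three full tables the RIB path count scales with the number of peers, while the FIB entry count stays close to the size of a single full table, because the providers mostly advertise the same prefixes.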

Disclaimer: I am long in CSCO