kozharov
Beginner

ASR1001-X booting up problem due to heavy configuration

Dear experts,

 

I’ll appreciate your comment and advice on the situation we've encountered recently.

 

Our customer’s ASR1001-X router worked just fine until it required rebooting due to maintenance activity.

 

The router refused to boot up (we guess this is due to the heavy configuration file), and our customer finally managed to bring it up only after several hours, by uploading the configuration manually in small portions.
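The "small portions" approach could be prepared offline along these lines. This is only a sketch: it assumes an IOS-style text configuration in which a top-level stanza starts at a non-indented line, the helper names `split_stanzas` and `batches` are hypothetical, and the sample tunnel addresses are made up for illustration. It does not push anything to the router; how each batch is applied, and how long to wait between batches, is left open.

```python
def split_stanzas(config_text):
    """Split an IOS-style config into top-level stanzas.

    A stanza starts at a non-indented line and includes the
    indented lines that follow it.
    """
    stanzas, current = [], []
    for line in config_text.splitlines():
        if not line.strip() or line.strip() == "!":
            continue  # skip blank lines and "!" separators
        if not line[0].isspace() and current:
            stanzas.append(current)
            current = []
        current.append(line)
    if current:
        stanzas.append(current)
    return stanzas


def batches(stanzas, size):
    """Yield consecutive groups of at most `size` stanzas."""
    for i in range(0, len(stanzas), size):
        yield stanzas[i:i + size]


# Hypothetical sample; real tunnel definitions will differ.
sample = """\
interface Tunnel1
 ip address 10.0.0.1 255.255.255.252
!
interface Tunnel2
 ip address 10.0.0.5 255.255.255.252
!
interface Tunnel3
 ip address 10.0.0.9 255.255.255.252
"""

for n, batch in enumerate(batches(split_stanzas(sample), 2), start=1):
    print(f"--- batch {n}: {len(batch)} stanza(s)")
    for stanza in batch:
        print("\n".join(stanza))
```

Splitting on stanza boundaries keeps each tunnel definition whole, so no batch ends in the middle of an interface block.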

 

Here are some details:

 

  • cisco ASR1001-X (1NG) processor (revision 1NG), 16G of physical memory, asr1001x-universalk9.16.03.06.SPA.bin
  • 7000+ active GRE tunnels and 7000+ BGP peers over these tunnels (mainly advertising a default route and receiving a small number of specific routes).

 

 

While the ASR is running, “sh cpu” and “sh memory” look fine (25% CPU, 5GB of free memory) under a moderate traffic load (mostly telemetry).

 

 

After a reboot the router apparently runs out of resources, and we see messages like these:

 

%SYS-2-MALLOCFAIL: Memory allocation

Pool: Processor  Free:  Cause: Memory fragmentation

Alternate Pool: Cause: No Alternate pool

%SYS-2-CHUNKEXPANDFAIL: Could not expand chunk pool for Packet Elements. No memory available -Process= "Chunk Manager"

%SYS-2-CFORKMEM: Process creation of BGP Open failed (no memory). -Process= "BGP Router",

etc.
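To see which failures dominate a captured boot log, the message tags can be tallied with a few lines of Python. This is a minimal sketch, assuming the log is available as plain text and that messages follow the `%FACILITY-SEVERITY-MNEMONIC:` pattern visible above; the function name `tally_messages` is hypothetical, and the sample reuses only the lines quoted in this post.

```python
import re
from collections import Counter

# Matches IOS-style message tags such as %SYS-2-MALLOCFAIL:
TAG = re.compile(r"%([A-Z0-9_]+)-([0-7])-([A-Z0-9_]+):")


def tally_messages(log_text):
    """Count occurrences of each FACILITY-SEVERITY-MNEMONIC tag."""
    counts = Counter()
    for facility, severity, mnemonic in TAG.findall(log_text):
        counts[f"{facility}-{severity}-{mnemonic}"] += 1
    return counts


sample = """\
%SYS-2-MALLOCFAIL: Memory allocation
%SYS-2-CHUNKEXPANDFAIL: Could not expand chunk pool for Packet Elements. No memory available -Process= "Chunk Manager"
%SYS-2-CFORKMEM: Process creation of BGP Open failed (no memory). -Process= "BGP Router",
"""

for tag, count in tally_messages(sample).most_common():
    print(tag, count)
```

Run against a full boot log, the ranking would show whether allocation failures or process-creation failures appear more often.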

 

 

We know that we are exceeding the datasheet limits of the ASR1001-X, since according to the datasheet “Up to 4,000 tunnels GRE are supported”, but it works fine under load; the problem occurs only during boot-up.

 

 

The questions are:

 

  • Is there any workaround to help the ASR1001-X boot up with this heavy configuration?

 

  • What would be the recommended upgrade from the current ASR1001-X? Maybe a shift to a more powerful platform is needed?

 

Unfortunately, we can’t take this question to TAC because our service package has expired.

 

 

Also I’m under NDA and can’t upload full detailed config, logs, etc.

 

 

An upgrade to the latest software, asr1001x-universalk9.16.12.05.SPA, did not help.

 

Having second ASR1001-X is a clear option.

 

 

Thank you!

 

Mikhail

29 REPLIES
marce1000
VIP Advisor

 

 - Ref : https://bst.cloudapps.cisco.com/bugsearch/bug/CSCva80218

     You may try : https://software.cisco.com/download/home/284932298/type/282046477/release/Amsterdam-17.3.3

                                 ( highest suggested release)

 M.

Dear Marce1000,

 

Thank you for your reply. But it doesn’t look like our case.

 

 

The router doesn’t crash; it just can’t boot up and process the full configuration.

 

We tried the latest release in the 16.x train (16.12.05) and it did not help. I doubt 17.3.3 can help. Do you believe it can help?

 

 

Thank you!

 

Mikhail

 

                          >....I doubt 17.3.3 can help. Do you believe it can help

    I would certainly give it a try, since it is one of the few options left.

 M.

Dear Marce1000,

 

I’ll try to persuade the customer to attempt another IOS upgrade, though I don’t expect it to succeed myself…

Giuseppe Larosa
Hall of Fame Master

Hello @kozharov ,

 

>> Having second ASR1001-X is a clear option.

Adding a second ASR1001-X and dividing the load (GRE tunnels and BGP sessions) between the two is probably your best option, because you want a solution that can survive a reboot.

 

The system is not able to restart all the GRE tunnels and BGP sessions at once, but it is able to support them once it has reached a steady state.

 

As you have noted, you have gone beyond the limits given in the datasheet.

 

Hope to help

Giuseppe

 

 

Dear Giuseppe, your message is clear.

 

The current idea is to find a single-box solution that can support this number of GRE tunnels and BGP peers, and to buy a redundant box as well.

 

Simply sharing the current load between two ASRs will definitely solve the situation, but only halfway. If one router fails, half of the tunnels will go down.
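The arithmetic behind this concern can be sketched as follows. The sketch treats the datasheet figure of 4,000 GRE tunnels as a hard per-box limit and assumes tunnels can be re-homed and split evenly across the surviving boxes; both are simplifying assumptions, BGP peers are not modeled separately, and the function names are hypothetical.

```python
import math


def boxes_for_redundancy(tunnels, per_box_limit, tolerated_failures=1):
    """Smallest number of boxes such that, after `tolerated_failures`
    boxes fail, the survivors can carry every tunnel without any box
    exceeding `per_box_limit`."""
    survivors = math.ceil(tunnels / per_box_limit)
    return survivors + tolerated_failures


def worst_case_load(tunnels, boxes, failures=0):
    """Tunnels per surviving box, assuming an even split."""
    return math.ceil(tunnels / (boxes - failures))


print(boxes_for_redundancy(7000, 4000))        # 3
print(worst_case_load(7000, 2, failures=1))    # 7000
print(worst_case_load(7000, 3, failures=1))    # 3500
```

Under these assumptions, two boxes leave the survivor with 7,000 tunnels after a failure, which is the same over-limit situation as today, while three boxes keep each survivor at 3,500.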

 

What model(s) would you advise if nothing can be done with the ASR1001-X?

 

 

Thank you!

Hello @kozharov ,

The new Cisco platform has the misleading name of Catalyst 8000, but it is meant to be the replacement for the ASR 1000.

 

see presentation attached.

 

Of course, since it is a new platform, there are some concerns about it.

Alternatively, you could look at higher models in the ASR 1000 family.

 

Hope to help

Giuseppe

 

Dear Giuseppe, thank you very much for your comments!

 

“higher models in ASR 1000 family” – it is unclear whether they would help (they have more memory and a more powerful CPU) or not (all the higher models have the same restriction, “up to 4000 GRE tunnels”).

 

The 8000 platform lists “Up to 8000 SD-WAN IPsec Tunnels” but doesn’t mention GRE, so there is no 100% guarantee it will help with GRE.

 

So, I’m still in limbo about which way to go…

Leo Laohoo
VIP Community Legend


@kozharov wrote:

Also I’m under NDA and can’t upload full detailed config, logs, etc.


Well, this is going to make things a lot more complicated.

Dear Leo,

 

It depends on what you want to see... “Show tech-support” or “sh run” definitely can’t be shown in full, but some specific output can be shown if needed…

 

Thank you!

Leo Laohoo
VIP Community Legend

Kindly provide the complete output of the following commands:

  • sh platform software status con brief
  • sh processes memory platform sorted location r0

Dear Leo,

 

Here are the requested outputs:

 

 

Elliot Dierksen
Enthusiast

Stray thought, but does the router need more memory? You mentioned memory allocation failures during boot up.

Dear Elliot,

 

I believe the router runs short of memory during boot-up, which is why we see all these allocation problems, but the ASR1001-X can have only 8GB or 16GB on board and we already have 16GB.

 

Mikhail