Free up Memory on ASR1002-X

Peyman Zadmehr

Hi

Recently we have had a memory leak issue on our ASR1002-X. I searched and found the following troubleshooting document:

https://www.cisco.com/c/en/us/support/docs/routers/asr-1000-series-aggregation-services-routers/116777-technote-product-00.html

Question:

Is there any command to free up memory on this router? As far as I can see, the document above contains only some show commands.

Thank you

Leo Laohoo

@Peyman Zadmehr wrote:

Is there any command to free up memory on this router?


Rebooting the router will stop the memory leak. Otherwise, raise a TAC case so TAC can identify the cause of the memory leak and recommend a workaround and a fixed version.

Joseph W. Doherty

As a true memory leak is a bug, the only ways to truly fix a memory leak are to (if possible) stop using the feature that has it, or update your IOS to a version with a bug fix that addresses the leak.

For immediate recovery from the memory leak, you need to reboot. To be clear, this is not a permanent solution. However, if it's a slow memory leak, you might hobble along by reloading your ASR on a recurring schedule.

BTW, you can also run out of memory w/o a memory leak. In those cases there's no bug involved; the fixes are (again, if possible) to stop using the feature causing it, to modify how you use the feature so it's less memory demanding, or to add memory to the ASR.

Again, a memory leak comes from a bug, so adding memory (if possible) generally only delays the point at which the leak exhausts your device's free memory. Other features can exhaust memory without a leak; when they do, you simply have insufficient hardware (i.e. memory) for what you're doing.

Peyman Zadmehr

Dear Joseph

Thank you for the comprehensive reply. My problem is that I am terminating about 15K PPPoE sessions on the ASR1002-X. I am using service accounting with ACLs on these sessions (up to five services per PPPoE session to account, and one of these services uses two huge ACLs with about 10K ACEs). I tested several old and new IOS versions (15.3, 15.4, 16.6.7, 16.9.2) and none of them solved my issue.

The problem is that after every reload, memory usage climbs from 38% toward 95% over a period of about 20 days. I don't know which feature causes this; the only observable information is the output of the show process memory command, which shows that SSS Manager uses most of the memory.
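To confirm which process is growing rather than merely large, two snapshots of that output taken some time apart can be compared. The following is a minimal Python sketch under stated assumptions: it assumes the classic eight-column layout (PID, TTY, Allocated, Freed, Holding, Getbufs, Retbufs, Process), which may differ on a given IOS XE release, and the sample rows, PIDs, and byte counts are invented for illustration, not taken from this router. The helper names `parse_holding` and `top_growers` are hypothetical.

```python
def parse_holding(show_output):
    """Map process name -> Holding bytes from 'show processes memory' style text.

    Assumes eight whitespace-separated columns with the process name last.
    """
    holding = {}
    for line in show_output.splitlines():
        fields = line.split(None, 7)
        # Data rows have 8 columns and start with a numeric PID; skip headers.
        if len(fields) == 8 and fields[0].isdigit() and fields[4].isdigit():
            name = fields[7].strip()
            holding[name] = holding.get(name, 0) + int(fields[4])
    return holding


def top_growers(before, after, n=3):
    """Return the n processes whose Holding grew the most between snapshots."""
    growth = {name: after[name] - before.get(name, 0) for name in after}
    return sorted(growth.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Hypothetical sample data (illustrative values only).
SNAPSHOT_DAY_1 = """\
 PID TTY  Allocated      Freed    Holding    Getbufs    Retbufs Process
   0   0  254839784   86018984  153837320          0          0 *Init*
 415   0  900000000  500000000  400000000          0          0 SSS Manager
 230   0   12000000   11000000    1000000          0          0 IP Input
"""

SNAPSHOT_DAY_2 = """\
 PID TTY  Allocated      Freed    Holding    Getbufs    Retbufs Process
   0   0  254839784   86018984  153837320          0          0 *Init*
 415   0 1400000000  750000000  650000000          0          0 SSS Manager
 230   0   12500000   11499600    1000400          0          0 IP Input
"""

before = parse_holding(SNAPSHOT_DAY_1)
after = parse_holding(SNAPSHOT_DAY_2)
for name, delta in top_growers(before, after):
    print(f"{name}: {delta:+d} bytes held")
```

A process whose Holding value rises steadily across snapshots while the session count stays flat is the kind of evidence TAC is likely to ask for.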

As you mentioned, reloading the router solves the problem temporarily, but since I provide Internet service to end subscribers through this router, this is not a good solution. I should mention that I have more than one ASR1002-X, and the number of concurrent PPPoE sessions on each box varies from 15K to 19K. The same issue is observable on every box. Each box has the maximum supported RAM, which is 16GB.
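For a rough sense of scale, the figures above can be turned into a leak rate. This is a minimal Python sketch, not a measurement: it assumes the growth from 38% to 95% is linear over the 20 days and that the percentage refers to the full 16 GB, neither of which is confirmed here. The function name `leak_rate` is hypothetical.

```python
def leak_rate(start_pct, end_pct, days, total_gb):
    """Return (percentage points per day, GB per day), assuming linear growth."""
    if days <= 0:
        raise ValueError("days must be positive")
    pct_per_day = (end_pct - start_pct) / days
    gb_per_day = total_gb * pct_per_day / 100
    return pct_per_day, gb_per_day


# Figures reported in the post: 38% -> 95% in 20 days, 16 GB of RAM.
pct_per_day, gb_per_day = leak_rate(38, 95, 20, 16)
print(f"{pct_per_day:.2f} points/day, about {gb_per_day:.2f} GB/day")
```

Under those assumptions the growth is about 2.85 percentage points per day, a little under half a gigabyte per day.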

It is a little strange, since according to the ASR1000 scaling document this box supports 29K PPP/PPPoE sessions:

https://www.cisco.com/c/en/us/td/docs/routers/asr1000/configuration/guide/chassis/asr1000-software-config-guide/scaling-asr.pdf

It seems that each time a subscriber connects to the ASR1002-X, some memory is reserved for the session, but when the session ends this memory is not freed.

I was thinking that if there were commands for clearing the memory, I could write an EEM script to clear it automatically and avoid reloading the router each time.

Thank you

Peyman

  


@Peyman Zadmehr wrote:

SSS Manager


You really need to engage TAC.  There is a long (very long) list of memory leak bugs attributed to the "SSS Manager" process, starting with CSCur10056.

Dear Leo

Thank you for the tip.

"It seems that each time a subscriber connects to the ASR1002-X, some memory is reserved for the session, but when the session ends this memory is not freed."

That does sound like a memory leak bug.

I agree with Leo's "You really need to engage TAC". Make sure you classify the bug at the highest impact level (deserved, I think, because it appears on multiple ASRs and, so far, on multiple IOS versions).

"It is a little strange, since according to the ASR1000 scaling document this box supports 29K PPP/PPPoE sessions:"

Yes, but in the real world, vendor testing doesn't always catch "rare" usage bugs that show up only at large scale and/or over the long term. If it did, there would be no bug patches after software was released to customers.

"As you mentioned, reloading the router solves the problem temporarily, but since I provide Internet service to end subscribers through this router, this is not a good solution."

Understandable, but in the interim you don't want unplanned stoppages either. So, short term, perhaps your best option is scheduled reloads at the least busy time (e.g. a 2 AM down period every #th day until the issue is resolved). I assume your service allows for some system maintenance downtime. If not, it should. Notify your clients that you're dealing with a vendor bug.
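One way to choose the "#" in "every #th day" is to work backward from the growth reported earlier in the thread (38% after a reload, roughly 57 points over 20 days). The following Python sketch is illustrative only: the 80% ceiling, the start date, and the helper names `reload_interval_days` and `reload_windows` are assumptions, and it treats the growth as linear.

```python
import math
from datetime import datetime, timedelta


def reload_interval_days(baseline_pct, ceiling_pct, pct_per_day):
    """Largest whole number of days that keeps usage at or below the ceiling."""
    if pct_per_day <= 0:
        raise ValueError("pct_per_day must be positive")
    return math.floor((ceiling_pct - baseline_pct) / pct_per_day)


def reload_windows(last_reload, interval_days, count, hour=2):
    """Return the next `count` maintenance windows at `hour`:00."""
    first = last_reload.replace(hour=hour, minute=0, second=0, microsecond=0)
    return [first + timedelta(days=interval_days * i) for i in range(1, count + 1)]


# 38% baseline and 57 points / 20 days come from the thread;
# the 80% ceiling and the 2024-01-01 start date are hypothetical.
interval = reload_interval_days(38, 80, 57 / 20)
windows = reload_windows(datetime(2024, 1, 1, 2, 0), interval, 3)
print(f"Reload every {interval} days")
for window in windows:
    print(window.isoformat())
```

Choosing a ceiling well below the point where sessions start failing leaves margin for days when the leak runs faster than average.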

If you have enough Cisco hardware, and since this bug is so impactful to your customers, you might mention, to your Cisco sales team, "ya know, I wonder if brand X doesn't have this issue". 

Since you mention you have a couple of long ACLs, if you are not already doing so (and if supported on the platform), try Turbo ACLs. (I suggest this because, besides improving ACL processing performance, changing a software execution path can sometimes avoid a bug, even in an "unrelated" software feature. [NB: the converse is true too.])

Dear Joseph

Thank you for your answer. Turbo ACL is not supported on the ASR1000; the following command is not available:

access-list compiled

As for the bug, I searched for it and it seems it was fixed in IOS 15.5-3. I will test this IOS and see what the outcome is. Of course we have maintenance windows, so maybe I can do scheduled reloads until TAC looks into the issue.

Peyman


@Peyman Zadmehr wrote:

it seems it was fixed in IOS 15.5-3


Raise a TAC case and get TAC to verify the cause of the memory leak.

Bug IDs are rarely updated. Because of this, the information found in them is mostly inaccurate and outdated, and often does not make any sense.
