
ASR9001/XR 6.4.2 - strange process.

serjo7979
Level 1

Good day!

Recently we had to reload one of our ASR9001 routers running XR 6.4.2 because of a huge memory leak.
While trying to figure out the root cause of the leak, I found a syslog message from this router with a very strange process name:
wdsysmon[480]: %HA-HA_WD-4-TOP_MEMORY_USER_WARNING : 0: Process Name: .[65925], pid: 1328935301, Heap usage 364400 Kbytes, Virtual Shared memory usage: 69988 Kbytes

What is the process named "." with such a huge PID?

Appreciate any support.

7 Replies

anpetit
Cisco Employee

Hello @serjo7979 ,

I hope you are doing well.
I checked, but I am not able to determine what this process is.
The name of the process might show up in the output of show context or show processes blocked.
Also, as you probably already know, 6.4.2 is a very old release. An upgrade should be considered, since the memory leak will probably recur even though you reloaded the ASR9001.
Hope it helps and have a nice day.
Kind Regards,
Antoine

Hello Antoine,
Unfortunately, the router in question was so deeply comatose that we couldn't SSH or telnet to it, even from its p2p neighbor. Needless to say, all OSPF/LDP/BGP sessions were down. That is why we couldn't collect any diagnostic command output. And you are absolutely right about the age of XR 6.4.2; now we have every reason to upgrade it.

Sergey

Hello @serjo7979 ,
Thank you for your reply. 
If you need any recommendation regarding the release you want to upgrade to, let me know. 
Otherwise, if you already know which release you will upgrade to, please mark this thread as resolved to clear the queue. 
Thanks for your feedback and cooperation.
Kind Regards,
Antoine

smilstea
Cisco Employee

Based on the PID value, it is a transient process. Do you have the CLI history from around the time this message was logged, or do you know whether any scripts were logging in and collecting data?

Thanks,
Sam

Hello Sam,
here is the list of show commands our NMS periodically collects every day:

  • show controllers np fabric counters
  • show controllers np interrupts
  • show drops all
  • show interface summary
  • show pfm
  • show process memory
  • show run
  • show environment
  • show version
  • show watchdog memory state

The router's config hadn't been changed for weeks. You said it was a transient process. Transient of what kind: almost dead, or just respawned and not yet issued its birth certificate?
The amount of heap memory this process was holding looks pretty similar to that of Netconf.

Sergey

A transient process is one that is short-lived, such as the process spawned when you issue a show command. Once the command has finished executing, the process kills itself. For these processes we use very high PID values. Unfortunately, due to the nature of transient processes, once one is dead you can't identify what it was post-mortem. Netconf could be the culprit, as you suggested.

Sam
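
Since a transient process can't be identified once it has exited, it may help to pull the fields out of the wdsysmon warning on the syslog collector, so the timestamp and PID can be lined up against NMS polling or CLI history. Below is a minimal Python sketch, assuming the message always has the layout of the single line quoted in the original post. The function name, the regex and the field names (index, bracket_id and so on) are made up for illustration and are not part of any Cisco tool.

```python
import re

# Pattern written against the one sample line quoted in this thread.
# Other XR releases may format the message differently (assumption).
TOP_MEM_RE = re.compile(
    r"%HA-HA_WD-4-TOP_MEMORY_USER_WARNING\s*:\s*(?P<index>\d+):\s*"
    r"Process Name:\s*(?P<name>[^\[\s]+)\[(?P<bracket_id>\d+)\],\s*"
    r"pid:\s*(?P<pid>\d+),\s*"
    r"Heap usage\s+(?P<heap_kb>\d+)\s*Kbytes,\s*"
    r"Virtual Shared memory usage:\s*(?P<shared_kb>\d+)\s*Kbytes"
)


def parse_top_memory_warning(line):
    """Return the fields of a TOP_MEMORY_USER_WARNING line, or None."""
    match = TOP_MEM_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Everything except the process name is numeric in the sample line.
    return {
        key: (value if key == "name" else int(value))
        for key, value in fields.items()
    }


if __name__ == "__main__":
    sample = (
        "wdsysmon[480]: %HA-HA_WD-4-TOP_MEMORY_USER_WARNING : 0: "
        "Process Name: .[65925], pid: 1328935301, "
        "Heap usage 364400 Kbytes, Virtual Shared memory usage: 69988 Kbytes"
    )
    print(parse_top_memory_warning(sample))
```

This only extracts what the log line already contains. It does not recover the real name of the process, which, as noted above, is not possible after the fact.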

I'm sorry for the long radio silence. You helped me greatly, thank you very much!

Sergey