Introduction

xthuijs · ‎09-11-2013

Introduction
- Core Issue
Monitoring Areas
Configuration additions
- Kernel Dumper
- Online Diagnostics
Notes
- Periodic monitoring command summary
- Related Information

Introduction

Over the course of the last few months I have received a lot of questions about CPU and memory management in IOS-XR. Since we all grew up with IOS, I certainly recognize where those questions (and concerns) are coming from. However, the architecture of IOS (monolithic) and XR (more like a Linux kernel) is so different that CPU and memory are not managed the same way, and we can't carry the KPIs over from IOS to XR.

Core Issue

In this document we're discussing the precise details of the how-tos regarding this. Also take note of the IOS to XR migration guide, which has some more detail on the IOS to XR comparison.

IOS XR monitoring is substantially different than classic IOS. In IOS one process could hog the CPU; in XR similar issues don’t exist per se. Generally we see people using the Cisco process or memory MIBs to monitor XR who are then flabbergasted by the massive output they generate.

There is no total CPU utilization as such, like IOS used to have. Also, BGP for instance “claiming” 80% in XR may be a good thing during convergence, while the overall CPU utilization would still not be too bad. Note also that all IOS XR routers have at least dual-core CPUs.

The good old KPI of monitoring overall system cpu and memory doesn’t apply to XR for those reasons.

Memory is also monitored per process. Having little free memory available is not a worry in itself. A process that continuously allocates memory and never releases it is obviously not good, but that is hard to debug with the old method used in IOS.

Basically, a single process having allocated 1M of memory may not use all of that 1M, but if its individual use continues to increase over time (100k, 200k, 300k etc.), this is a sign that it *may* be leaking memory.

The good thing here is that even when a process leaks memory, it only affects that process and not others. So the impact of the leak is generally contained.
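The growth pattern described above (a process whose held memory only ever increases from sample to sample) can be checked mechanically. The following is a minimal Python sketch, assuming per-process memory samples have already been collected by some external means; the function name, the minimum sample count and the sample values are illustrative and are not part of IOS XR.

```python
def may_be_leaking(samples, min_samples=4):
    """Return True if every sample is strictly larger than the previous one.

    `samples` holds memory sizes (for example in KB) for ONE process,
    ordered oldest to newest. Steady growth is only a hint that the
    process *may* be leaking; it is not proof.
    """
    if len(samples) < min_samples:
        return False  # too little history to judge a trend
    return all(later > earlier for earlier, later in zip(samples, samples[1:]))


# Illustrative values only, following the 100k, 200k, 300k example.
print(may_be_leaking([100, 200, 300, 400]))  # steadily growing
print(may_be_leaking([100, 300, 200, 300]))  # grows and shrinks again
```

A process that grows during convergence and then settles would not be flagged, which fits the point that a large allocation on its own is not a concern.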

Monitoring Areas

Blocked Processes and Process States

To see which processes are in blocked state, issue the command ‘show processes blocked’

RP/0/RP1/CPU0:host#sh processes blocked location 0/RP1/CPU0
Jid       Pid Tid            Name State   TimeInState    Blocked-on
65546 13365258   1             ksh Reply    0:06:36:0903    8200 devc-conaux
   52     36890   2         attachd Reply 331:08:01:0495   32792 eth_server
   52     36890   3         attachd Reply 331:08:01:0494   12301 mqueue
   77     36892   6            qnet Reply    0:00:00:0052   32792 eth_server
   77     36892   7            qnet Reply    0:00:00:0051   32792 eth_server
   77     36892   8            qnet Reply    0:00:00:0051   32792 eth_server
   77     36892   9            qnet Reply    0:00:00:0048   32792 eth_server
   51     36898   2   attach_server Reply 331:08:01:0297   12301 mqueue
379    147534   1     tftp_server Reply 331:04:07:0584   12301 mqueue
270    188603   2         lpts_fm Reply    8:57:20:0657 168011 lpts_pa
65725 13369533   1            exec Reply    0:00:00:0240       1 kernel
65729 13390017   1            more Reply    0:00:00:0197   12299 pipe
65776 13390064   1 show_processes Reply    0:00:00:0000       1 kernel

At any point in time a thread can be in a specific state. A prolonged blocked state can be a symptom of a problem. This does not mean that a thread in blocked state indicates a problem, so please don’t issue the show process blocked command and go raise a case with the TAC. Blocked threads are also very normal in a distributed OS using IPC mechanisms.

Consider the above command output. The first thread in the list is ksh, and it is reply blocked on devc-conaux. What has happened here is that the client, ksh in this case, has sent a message to the devc-conaux process (the driver for the console and aux port); the server, devc-conaux, holds ksh reply blocked until it replies. Ksh is the unix shell that someone is using on the console or aux port, and it is waiting for input from the console. If there is none because the operator is not typing at the time, it will remain blocked until it has some input to process. After processing, ksh will return to reply blocked on devc-conaux, waiting for the next input. This is all perfectly normal and does not indicate a problem.

Blocked threads are normal; the XR version, the type of system you have, what you have configured and who is doing what will all alter the output of the show process blocked command. Having said that, the show process blocked command can be very helpful when troubleshooting OS problems, bearing in mind what the normal baseline is.

Note that from release 3.5.0 (CSCek34731) onwards, the ‘TimeInState’ column is introduced. This provides a cumulative count as to how long a process has been in a particular state. In this case, since we’re looking for ‘blocked’ processes, the ‘TimeInState’ value is just for that state. As noted previously, some processes are expected to remain in blocked state.
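When the output is collected by a script, the ‘TimeInState’ column makes it possible to pick out threads that have been blocked for a long time. The following is a minimal Python sketch, assuming the eight-column layout shown above and assuming that TimeInState reads hours:minutes:seconds:milliseconds (inferred from the sample, not confirmed by the source); the function name and the 60-second cut-off are illustrative.

```python
def parse_blocked(output):
    """Parse 'show processes blocked' text into a list of dicts.

    Assumes the 8-column layout with a TimeInState column and that
    TimeInState is hours:minutes:seconds:milliseconds.
    """
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have 8 whitespace-separated fields and start with a Jid.
        if len(fields) != 8 or not fields[0].isdigit():
            continue
        jid, pid, tid, name, state, time_in_state, _srv_pid, srv_name = fields
        hours, minutes, seconds, millis = (int(p) for p in time_in_state.split(":"))
        rows.append({
            "jid": int(jid),
            "pid": int(pid),
            "tid": int(tid),
            "name": name,
            "state": state,
            "seconds_in_state": hours * 3600 + minutes * 60 + seconds + millis / 1000.0,
            "blocked_on": srv_name,
        })
    return rows


sample = """\
Jid       Pid Tid            Name State   TimeInState    Blocked-on
65546 13365258   1             ksh Reply    0:06:36:0903    8200 devc-conaux
   77     36892   6            qnet Reply    0:00:00:0052   32792 eth_server
  270    188603   2         lpts_fm Reply    8:57:20:0657 168011 lpts_pa
"""

for row in parse_blocked(sample):
    if row["seconds_in_state"] > 60:
        print(row["name"], "blocked on", row["blocked_on"])
```

A long time in state is still not a fault by itself; the result only becomes meaningful when compared against the normal baseline of the router.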

At any point in time a thread can be in a particular state. The table below lists the states.

If the State is:             The Thread is:
DEAD            Dead. The Kernel is waiting to release the threads resources
RUNNING         Actively running on a CPU
READY           Not running on a CPU but is ready to run
STOPPED         Suspended (SIGSTOP signal)
SEND            Waiting for a server to receive a message
RECEIVE         Waiting for a client to send a message
REPLY           Waiting for a server to reply to a message
STACK           Waiting for more stack to be allocated
WAITPAGE        Waiting for the process manager to resolve a page fault
SIGSUSPEND      Waiting for a signal
SIGWAITINFO     Waiting for a signal
NANOSLEEP       Sleeping for a period of time
MUTEX           Waiting to acquire a mutex
CONDVAR         Waiting for a condition variable to be signaled
JOIN            Waiting for the completion of another thread
INTR            Waiting for an interrupt
SEM             Waiting to acquire a semaphore

Process Monitoring

Most users are used to using CPU utilization levels as a rough indication of the state of their routers. CPU utilization does indeed provide a guide, but given the distributed nature of operation in a router running IOS XR, it must be borne in mind that apparent high CPU utilization on the Active RP does not necessarily indicate a ‘problem’. An XR router can operate at 100% utilization for extended periods of time; the key to ‘safe’ operation is that processes are able to get sufficient run-time on the processor. XR implements pre-emptive scheduling whereby an individual thread may execute continuously for up to 4ms before it must yield to allow another thread to run. XR does not implement a ‘run to completion’ model. Peaks and troughs in CPU utilization are to be expected and are normal, based on the various periodic processes that are executing.

The ‘normal’ utilization level will depend on the version of XR that’s being used, the exact hardware configuration (a larger number of linecards in the chassis results in increased inter-card communication, for example), the features being used and how they have been configured (timer tuning, for example). Obtaining a router’s baseline utilization level over a period of time in the live network is the most valuable approach, since this provides the most realistic guide when a deviation from the baseline is observed.

Note that the utilization level associated with any particular process may vary from release to release as the code is optimized. Optimizations may include moving from single-threaded to multi-threaded operation, moving from blocking to non-blocking IPC calls, and critical thread separation.

Monitoring both the active and standby RP CPUs together with the linecard CPUs provides a detailed view of system behavior. CPU utilization levels can be obtained using SNMP polling of the XXX MIB object. The information may also be obtained via the command ‘show process cpu location <loc>’. ‘Live’ monitoring can be performed using the ‘monitor process / monitor threads’ commands, which provide a unix ‘top’-like output. Note that executing these commands will in and of itself add a small increase to the CPU levels.

233 processes; 788 threads; 4663 channels, 5906 fds
CPU states: 94.8% idle, 4.1% user, 1.0% kernel
Memory: 4096M total, 3599M avail, page size 4K

      JID TIDS Chans   FDs Tmrs   MEM   HH:MM:SS   CPU NAME
        1   26 236   183    1      0   67:18:56 1.06% procnto-600-smp-cisco…
      256    5   39    21    4   292K    0:02:44 0.79% packet
       69   10 454     9    3     2M    0:33:07 0.62% qnet
      331    8 254    21   13     2M    0:15:20 0.52% wdsysmon
       55   11   23    15    6    36M    0:31:18 0.50% eth_server
      241   12   96    83   13     1M    0:04:54 0.37% netio
      171   15   97    44    9     2M    0:03:33 0.12% gsp

Note that on a router with dual-core CPUs the command ‘run top_procs’ will display the CPU loading on both CPU cores:
RP/0/RP0/CPU0:host#run top_procs
Computing times...
node0_RP0_CPU0: 238 procs, 2 cpus, 5.11 delta, 331:34:01 uptime
Memory: 4096 MB total, 3.209 GB free, sample time: Mon Aug 17 07:06:06 2013
cpu 0 idle: 97.50%, cpu 1 idle: 97.70%, total idle: 97.60%, kernel: 0.13%

      pid   mem MB   user cpu kernel cpu   delta % ker % tot name
    32792    0.382 13444.362      0.192   0.047   0.00   0.46 eth_server
   147532    3.250   1406.130      0.480   0.042   0.00   0.41 gsp
   147537    1.062    661.831      0.054   0.037   0.00   0.36 sysdb_mc
   131107    1.031    684.142      0.064   0.034   0.00   0.33 sysdb_svr_admin
    36892    1.312   7629.390      0.735   0.015   0.00   0.14 qnet
   151640    0.441   4453.265    613.912   0.013   0.00   0.12 shelfmgr
     8200    0.281      9.085      1.295   0.008   0.00   0.07 devc-conaux
    90148    0.406   2490.764      1.383   0.007   0.00   0.06 sysmgr
   122930    2.097   1239.322    340.326   0.006   0.00   0.05 wdsysmon
13500656    0.156      0.026      0.030   0.006   0.03   0.05 top_procs

‘Top procs’ can be run on other nodes using the syntax ‘run top_procs -l node0_<slot|RP0|RP1>_CPU0’.

WDSysmon monitors the operation and behaviour of processes on the system. If a process is determined to be ‘hogging’ the CPU, after a period of time, WDsysmon will reset the process, terminating the CPUHog and recording a set of data captures.

Processes within the OS have assigned priorities between 0 and 63, with most processes and child threads operating at level 10. The priority values are used by the scheduler to determine which process should get CPU runtime. WDsysmon operates at level 63 and is able to terminate processes of a lower priority. There is a selected set of core processes for which a CPUHog event will be detected but which will not be terminated by WDsysmon, since terminating these processes could result in system instability.

The processes are listed below:

Process name    Thread Priority
devb-ata         10
Dumper           10
eth_server       10,30,49,50 & 55
Exec             10
Nvram            10
devf-scrp        10
parser_server    10
Wdsysmon         10,11,19 & 63

Wdsysmon will declare a CPUHog when a single process consumes 25% of total CPU time. Wdsysmon will generate log messages after 20 seconds of a CPUHog being detected. If the hog persists for 30 seconds, WDsysmon will identify the hogging process(es); the process(es) will then be killed, and process dumps taken and recorded to the bootflash media on the Active RP. The system will wait for another 150 seconds after killing any processes to allow the dumps to complete before resetting the RP. Once the RP reset is triggered, recovery time is as normal.
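The timeline above can be summarised as a small lookup. The following is a rough Python sketch that maps how long a hog has lasted to the stage described in the text; the stage names, the function name and the exact handling of the boundary values are assumptions made for illustration and do not describe the actual Wdsysmon implementation.

```python
HOG_THRESHOLD_PERCENT = 25  # share of total CPU time that counts as a hog
LOG_AFTER_SECONDS = 20      # syslog messages start
KILL_AFTER_SECONDS = 30     # hogging process(es) killed, dumps taken
RESET_WAIT_SECONDS = 150    # extra wait for dumps before the RP reset


def hog_stage(cpu_percent, hog_seconds):
    """Return an illustrative stage name for a hog lasting `hog_seconds`."""
    if cpu_percent < HOG_THRESHOLD_PERCENT:
        return "no hog"
    if hog_seconds < LOG_AFTER_SECONDS:
        return "hog detected"
    if hog_seconds < KILL_AFTER_SECONDS:
        return "logging"
    if hog_seconds < KILL_AFTER_SECONDS + RESET_WAIT_SECONDS:
        return "process killed, dumps in progress"
    return "RP reset"


for seconds in (5, 25, 60, 200):
    print(seconds, hog_stage(99, seconds))
```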

Syslog messages like the following example will be generated should a hog be detected and the process terminated:

RP/0/RP1/CPU0:Jun 8 23:58:53.007 : wdsysmon[334]: %HA-HA_WD-6-CPU_HOG_4 : Process wd_test pid 13770923 tid 2 prio 12 using 99% is the top user of CPU.
RP/0/RP1/CPU0:Jun 8 23:58:55.012 : wdsysmon[334]: %HA-HA_WD-1-CPU_HOG_5 : Process wd_test pid 13770923 tid 2 prio 12 using 99% is hogging CPU and will be terminated
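A syslog collector can pick these messages apart with a regular expression. The following is a minimal Python sketch, assuming the message layout is exactly as in the two examples above; the pattern and the function name are illustrative, and other releases may format the message differently.

```python
import re

# Pattern derived from the two sample messages only.
HOG_RE = re.compile(
    r"%HA-HA_WD-(?P<severity>\d)-(?P<mnemonic>CPU_HOG_\d+) : "
    r"Process (?P<name>\S+) pid (?P<pid>\d+) tid (?P<tid>\d+) "
    r"prio (?P<prio>\d+) using (?P<percent>\d+)%"
)


def parse_hog(line):
    """Return a dict of the fields in a CPU_HOG message, or None if no match."""
    match = HOG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    for key in ("severity", "pid", "tid", "prio", "percent"):
        info[key] = int(info[key])
    return info


line = ("RP/0/RP1/CPU0:Jun 8 23:58:55.012 : wdsysmon[334]: "
        "%HA-HA_WD-1-CPU_HOG_5 : Process wd_test pid 13770923 tid 2 "
        "prio 12 using 99% is hogging CPU and will be terminated")
print(parse_hog(line))
```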

The output of the command ‘show watchdog trace | inc hog’ can be used to see which processes have been recorded as ‘hogging’.

To get a fast overview of abnormally terminated processes one can check the output from ‘show process abort’ or ‘show sysmgr trace verbose | i PROC_ABORT’. In addition one can use the command ‘show context location all’ to provide a view across the router of any process crashes that may have occurred.

Since the Sysmgr process is already monitoring all processes on the system, it is not necessarily required to monitor vital processes with external management tools, since syslog messages will be generated should Sysmgr perform any action on a process. In an operational environment with a large number of syslog messages to be reviewed, checking the output of the command ‘show event manager metric process <name> location <node>’ for critical processes on a regular basis can be helpful. The output provides information about the termination behavior of the particular process and the reasons for it. Important items to look for in the command output are the number of times the process ended abnormally and the number of abnormal ends within the past time periods.

RP/0/RP0/CPU0:host#show event manager metric process bgp location 0/RP0/CPU0

=====================================
job id: 123, node name: 0/RP0/CPU0
process name: bgp, instance: 1
--------------------------------
last event type: process start
recent end type: terminated by signal (SIGTERM)
recent start time: Jul 17 02:45:14 2013
recent normal end time: n/a
recent abnormal end time: Jul 17 02:45:14 2013
recent abnormal end type: terminated by signal (SIGTERM)
number of times started: 2
number of times ended normally: 0
number of times ended abnormally: 1
most recent 10 process start times:
--------------------------
May 12 23:34:08 2013

Jul 17 02:45:14 2013
--------------------------

most recent 10 process end times and types:
--------------------------

, terminated by signal (SIGTERM)
--------------------------

cumulative process available time: 1563 hours 11 minutes 14 seconds 55 milliseconds
cumulative process unavailable time: 0 hours 0 minutes 1 seconds 276 milliseconds
process availability: 0.999999773
number of abnormal ends within the past 60 minutes (since reload): 1
number of abnormal ends within the past 24 hours (since reload): 1
number of abnormal ends within the past 30 days (since reload): 1
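The ‘process availability’ figure in the output above is consistent with available time divided by total time. The following short Python sketch reproduces the value from the two cumulative counters; that the router computes it exactly this way is an assumption, and the helper name is illustrative.

```python
def to_seconds(hours, minutes, seconds, millis):
    """Convert an 'H hours M minutes S seconds MS milliseconds' value to seconds."""
    return hours * 3600 + minutes * 60 + seconds + millis / 1000.0


# Values copied from the sample output above.
available = to_seconds(1563, 11, 14, 55)
unavailable = to_seconds(0, 0, 1, 276)

availability = available / (available + unavailable)
print(f"{availability:.9f}")
```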

Vital system processes are for example: qnet, gsp, qsm, ens, redcon, netio, ifmgr, fgid_aggregator, fgid_server, fgid_allocator, fsdb_server, fsdb_aserver, fabricq_mgr, fia_driver, shelfmgr and lrd on the RP and fabricq_mgr, ingressq, egressq, pse_driver, fia_driver, cpuctrl, pla_server on the LC. Obviously application level processes like isis, bgp, ospf, mpls_ldp, mpls_lsd, fib_mgr, ipv4_rib can be added.

As mentioned previously, a temporary blocked state is possible for any process. Therefore it is recommended to run the command ‘show process blocked’ two times consecutively per interval per node. To avoid false alarms, a third iteration could be triggered if a process is displayed as blocked after the first and second iteration.

The output of the command will always display blocked on ‘reply’ for a handful of processes, similar to the following example:

RP/0/RP1/CPU0:router#sh processes blocked
Jid       Pid Tid                 Name State Blocked-on
65546      4106   1                  ksh Reply    4104 devc-conaux
105     53304   2              attachd Reply   24597 eth_server
105     53304   3              attachd Reply    8205 mqueue
350     61511   1          tftp_server Reply    8205 mqueue
247     82079   2              lpts_fm Reply   77936 lpts_pa
159     82091   2               fdiagd Reply   24597 eth_server
159     82091   3               fdiagd Reply    8205 mqueue
361     94412   1            udp_snmpd Reply   82066 udp
65760 58855648   1                 exec Reply       1 kernel
65761 58933473   1                 more Reply    8203 pipe
65764 58933476   1       show_processes Reply       1 kernel

For these processes it is a normal output and not a matter of concern. For example, the line:

65764 58933476 1 show_processes Reply 1 kernel

is a direct result of executing the command ‘show process blocked’. Each time the command is applied the process ID (Pid) will change.
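The repeated-check idea can be expressed as a set operation: keep only the threads that are blocked in both iterations and are not part of the known-normal baseline. The following is a minimal Python sketch, assuming each iteration has already been reduced to (name, tid, blocked-on) tuples; the Pid is deliberately left out of the key because it changes for short-lived processes such as show_processes. The function name and the entry ‘example_proc’ are hypothetical.

```python
def persistent_blocked(first, second, baseline):
    """Return threads blocked in both iterations that are not in the baseline.

    Each argument is a set of (process_name, tid, blocked_on) tuples.
    """
    return (first & second) - baseline


# Baseline entries taken from the sample output; 'example_proc' is made up.
baseline = {("ksh", 1, "devc-conaux"), ("lpts_fm", 2, "lpts_pa")}
run1 = {("ksh", 1, "devc-conaux"), ("lpts_fm", 2, "lpts_pa"),
        ("example_proc", 4, "mqueue"), ("qnet", 6, "eth_server")}
run2 = {("ksh", 1, "devc-conaux"), ("lpts_fm", 2, "lpts_pa"),
        ("example_proc", 4, "mqueue")}

print(persistent_blocked(run1, run2, baseline))
```

Here qnet is dropped because it was only blocked in the first iteration, which is the kind of transient state the repeated check is meant to filter out.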

If vital system processes or fundamental applications controlling connectivity, such as routing protocols or MPLS LDP, appear to be consistently blocked in Reply, Send, Mutex or Condvar state, it is recommended to collect the data from the ‘follow job|process’ output or execute a ‘dumpcore running’ of the affected process. Note that unless otherwise configured by the ‘exception choice’ configuration command, the output from dumpcore is saved in harddisk:/dumper.

The ‘follow job|process’ command can be used to debug a live process or a live thread in a process. This command is particularly useful for debugging deadlock and livelock conditions, for examining the contents of a memory location or a variable in a process to determine the cause of a corruption issue or in investigating issues where a thread is stuck spinning in a loop.
The command ‘follow job 123 location 0/RP1/CPU0 verbose’ will follow the operations of the process with job ID ‘123’ on the Route Processor 0/RP1/CPU0. If the iteration option is not specified, the command performs the operation for 5 iterations with a delay of 5 seconds between each iteration.

The command ‘follow process 139406 blocked iteration 2 thread 4’ follows the chain of thread IDs (tids) or pids that are blocking the target process.

In an emergency situation, once all this data has been collected, a restart of the impacted process can be considered. It is recommended to involve your technical representative and follow the advice from the TAC engineers before performing the restart.

Wdsysmon critical service monitoring

From release 3.6 onwards, the functionality provided by Wdsysmon and wd-mbi has been replaced by three processes known as wd-critical-mon, wd-stat-publisher and wdsysmon. Wd-critical-mon contains the two functions known as ‘watcher’ and ‘ticker’ (also present in prior releases). These two threads check that processes are being run and that process scheduling is taking place within ‘normal’ time period boundaries. In the event that exceptions are detected, recovery behavior may be to trigger an RP failover or to signal the Wdsysmon process to perform an action such as detecting and terminating a process which is ‘CPU-hogging’, memory-leaking or file-descriptor leaking.

In addition, statistics are gathered on the operation of these key threads as well as of the ‘critical processes’, which are wdsysmon, sc-reddrv-main, sc-reddrv-broadcast and hbagent-main. Statistics are gathered and published via a show command that provides a history over a 24 hour period at 15 minute intervals.

While the system will detect, report and attempt to correct issues affecting system operation, monitoring the output of the show command also provides useful indications as to the system’s health and stability. The command ‘show critmon statistics all location <loc>’ can be used by users with ‘cisco-support’ taskgroup membership.

RP/0/RP0/CPU0:host(admin)#sh critmon statistics all location 0/RP0/CPU0

------------------------------------------------------------------------------
Ticker statistics info (Node: 0/RP0/CPU0)
------------------------------------------------------------------------------
Period        SnapShotTimestamp                Frequency
(min) CPU#   MM/DD/YYYY hh:mm:ss tick count (count/min)
------ ------ ------------------- ---------- ------------
15     cpu:0 07/17/2013 01:57:32 4456        297
15     cpu:0 07/17/2013 02:12:32 4455        297
15     cpu:0 07/17/2013 02:27:32 4455        297
15     cpu:0 07/17/2013 02:42:32 4456        297
------------------------------------------------------------------------------
Watcher statistics info (Node: 0/RP0/CPU0)
------------------------------------------------------------------------------
Period SnapShotTimestamp                  Frequency
(min) MM/DD/YYYY hh:mm:ss watch count (count/min)
------ ------------------- ----------- ------------
15     07/17/2013 01:57:32 1495         99
15     07/17/2013 02:12:32 1495         99
15     07/17/2013 02:27:32 1495         99
15     07/17/2013 02:42:32 1495         99
------------------------------------------------------------------------------
CPU congestion history (Node: 0/RP0/CPU0)
------------------------------------------------------------------------------
No congestion history
------------------------------------------------------------------------------
Deadline monitoring statistics info (Node: 0/RP0/CPU0)
------------------------------------------------------------------------------
client                   SnapShotTimestamp                Frequency
(name)                   MM/DD/YYYY hh:mm:ss tick count (count/min)
------------------------ ------------------- ---------- ------------
wdsysmon                 07/17/2013 01:57:32 449         29
wdsysmon                 07/17/2013 02:12:32 450         30
wdsysmon                 07/17/2013 02:27:32 449         29
wdsysmon                 07/17/2013 02:42:32 450         30
sc-reddrv-main           07/17/2013 01:57:32 1793        119
sc-reddrv-main           07/17/2013 02:12:32 1793        119
sc-reddrv-main           07/17/2013 02:27:32 1792        119
sc-reddrv-main           07/17/2013 02:42:32 1793        119
sc-reddrv-broadcast      07/17/2013 01:57:32 179         11
sc-reddrv-broadcast      07/17/2013 02:12:32 180         12
sc-reddrv-broadcast      07/17/2013 02:27:32 180         12
sc-reddrv-broadcast      07/17/2013 02:42:32 180         12
hbagent-main             07/17/2013 01:57:32 900         60
hbagent-main             07/17/2013 02:12:32 900         60
hbagent-main             07/17/2013 02:27:32 900         60
hbagent-main             07/17/2013 02:42:32 900         60

The output above shows a system in its steady state. Again, variations from the norm are good indicators of events that are impacting the system’s operation. When a CPU-hog is detected, Wd-critical-mon will signal the Wdsysmon CPU-hog handler, which in turn will perform the recovery procedure, recording information in syslog as well as in the ‘CPU congestion history’ field.
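One way to spot variations from the norm in the ‘Frequency’ column is to compare each 15-minute sample with the median of the history. The following is a minimal Python sketch; the 10% tolerance and the function name are arbitrary assumptions, and the second list contains an invented outlier purely for illustration.

```python
from statistics import median


def deviating_samples(frequencies, tolerance=0.10):
    """Return the indexes of samples that differ from the median by more
    than `tolerance` (a fraction of the median)."""
    if not frequencies:
        return []
    norm = median(frequencies)
    return [i for i, value in enumerate(frequencies)
            if abs(value - norm) > tolerance * norm]


ticker = [297, 297, 297, 297]   # ticker frequencies from the output above
wdsysmon = [29, 30, 29, 30]     # wdsysmon deadline frequencies from above
print(deviating_samples(ticker))
print(deviating_samples(wdsysmon))
print(deviating_samples([297, 297, 150, 297]))  # invented outlier
```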

Memory Monitoring

Wdsysmon performs memory-leak detection by checking the memory state of each node ((d)RP and linecard) at regular intervals. Four states are distinguished: Normal, Minor, Severe and Critical. The definition of the node state thresholds depends on the size of the physical memory. For instance, on a node with 4GB of physical memory, by default the memory state is considered NORMAL as long as the Free Memory is greater than 80MB.

The memory state can be verified with the CLI command ‘show watchdog memory-state <location>’.

RP/0/RP1/CPU0:router#sh watchdog memory-state
Memory information:
    Physical Memory: 4096     MB
    Free Memory:     3447.226 MB
    Memory State:         Normal

If the memory state changes from NORMAL to MINOR, the following syslog message is generated:

%HA-HA_WD-4-MEMORY_ALARM Memory threshold crossed: [chars] with [unsigned int].[unsigned int]MB free

Wdsysmon itself has a procedure to recover from memory-depletion conditions. When Wdsysmon determines that the state of the node is SEVERE, it attempts to identify the errant process or set of processes that are holding excessive memory and leading to the depletion condition. Wdsysmon will record process state information before restarting the process in an attempt to free the memory. This situation is accompanied by the following syslog message:

%HA-HA_WD-4-TOP_MEMORY_USER_WARNING [dec]: Process Name: [chars][[dec]], pid: [dec][chars], Kbytes used: [dec]

The default memory thresholds can be verified with the CLI command ‘show watchdog threshold memory defaults location <node>’

RP/0/RP0/CPU0:router#sh watchdog threshold memory defaults location 0/RP0/CPU0
Default memory thresholds:
Minor:     409      MB
Severe:    327      MB
Critical: 204.799 MB
Memory information:
    Physical Memory: 4096     MB
    Free Memory:     3419.582 MB
    Memory State:         Normal

The default limits can be overridden with the ‘watchdog threshold memory location’ configuration command. Recommended memory thresholds on a P router are 20% for minor, 10% for severe and 5% for critical. The following example configures the recommended thresholds on RP0:

(config)#watchdog threshold memory location 0/RP0/CPU0 minor 20 severe 10 critical 5

The configured thresholds can be verified with the CLI command ‘show watchdog threshold memory configured location <node>’

RP/0/RP0/CPU0:router#sh watchdog threshold memory configured location 0/RP0/CPU0
Configured memory thresholds:
    Minor:     819      MB,
    Severe:    409      MB
    Critical: 204.799 MB
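The configured values shown above correspond to 20%, 10% and 5% of the 4096 MB of physical memory. The following Python sketch works through that arithmetic and classifies a free-memory reading against the resulting limits; the comparison rule (a state applies once free memory is at or below its threshold) and both function names are assumptions, not the documented Wdsysmon logic.

```python
def thresholds_mb(physical_mb, minor_pct, severe_pct, critical_pct):
    """Convert percentage thresholds into MB of free memory."""
    return {
        "minor": physical_mb * minor_pct / 100.0,
        "severe": physical_mb * severe_pct / 100.0,
        "critical": physical_mb * critical_pct / 100.0,
    }


def memory_state(free_mb, limits):
    """Classify free memory; the <= comparisons are an assumption."""
    if free_mb <= limits["critical"]:
        return "Critical"
    if free_mb <= limits["severe"]:
        return "Severe"
    if free_mb <= limits["minor"]:
        return "Minor"
    return "Normal"


limits = thresholds_mb(4096, 20, 10, 5)
print(limits)                          # about 819.2, 409.6 and 204.8 MB
print(memory_state(3419.582, limits))  # free memory from the sample output
```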

Memory usage analyser

The memory usage analyser tool records brief details about the heap memory usage of all processes on the router in two different snapshots, compares the results and provides a report highlighting processes that are increasing or decreasing their held memory values. The tool is used as follows:

1.    Take an initial snapshot using the following command ‘show mem compare start’
2.    Take another snapshot using the following command ‘show mem compare end’
3.    Print the output using the following command ‘show mem compare report’

The tool could be integrated into a regular operations routine whereby the initial memory report is obtained at the beginning of the shift, the ‘end’ report part-way during the shift allowing time for the engineer to review the report.

The output contains information about each process whose heap memory usage has changed over the test period. It is ordered by the size of the change starting with the process with the largest increase. Again, the most efficient approach to detect memory leaks using the tool, is to use the tool on a stable system with no configuration changes taking place during the monitoring period.

For each process the following information is printed:

JID             Process Job ID
Name            Process name
mem before      Heap memory usage at start (in bytes)
mem after       Heap memory usage at end (in bytes)
difference      Difference in heap memory usage (in bytes)
mallocs         Number of unfreed allocations made during test period
restarted       Indicates if the process was restarted during test period
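The comparison the report performs can be mimicked offline on two sets of collected values. The following is a minimal Python sketch of the same idea (difference per process, largest increase first, unchanged processes omitted); it is not the tool's implementation, the function name is illustrative, and the byte values are invented.

```python
def compare_snapshots(before, after):
    """Return (name, mem_before, mem_after, difference) rows for processes
    whose heap usage changed, ordered by largest increase first.

    `before` and `after` map process name -> heap bytes.
    """
    report = []
    for name in before.keys() & after.keys():
        difference = after[name] - before[name]
        if difference != 0:
            report.append((name, before[name], after[name], difference))
    # Sort by descending difference; the name breaks ties deterministically.
    report.sort(key=lambda row: (-row[3], row[0]))
    return report


# Invented byte counts for three process names used in this document.
start = {"bgp": 1000, "gsp": 2000, "qnet": 3000}
end = {"bgp": 1500, "gsp": 1900, "qnet": 3000}

for row in compare_snapshots(start, end):
    print(row)
```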

File Descriptor leaks

XR is a file-oriented OS in that inter-process communication typically results in messages being sent to a ‘file’, which in turn relates to another process somewhere in the system. Processes allocate file descriptor handles for a number of functions. In general, processes should obtain and release file descriptors as required, with the number averaging out over time. File descriptor ‘leaks’ can occur where a process fails to release descriptors.

WDsysmon monitors descriptor utilization (from release 3.4 onwards) once every minute. Each process is allocated a descriptor limit, the default being 1000 descriptors. Individual applications may modify the limit. Syslog messages will be triggered if a process exceeds 80% of the maximum and then again at 85%. At 95% the errant process will be restarted after debug information has been obtained.
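The three percentages above translate into simple checks against the per-process limit. The following is a minimal Python sketch; the action names, the function name and the treatment of the exact boundary values are assumptions made for illustration.

```python
def fd_action(open_fds, limit=1000):
    """Return an illustrative action name for a process holding `open_fds`
    descriptors against `limit` (default limit of 1000 as stated above)."""
    usage_percent = 100.0 * open_fds / limit
    if usage_percent >= 95:
        return "restart"
    if usage_percent >= 85:
        return "second warning"
    if usage_percent >= 80:
        return "first warning"
    return "ok"


for count in (500, 800, 850, 950):
    print(count, fd_action(count))
```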

The command ‘show process files location <loc>’ can be run periodically to see how many descriptors are allocated to each process. Increases over periods of time may indicate an issue for further investigation. Again, process restarts will typically release file descriptors but should be performed with care.

Qnet monitoring

Qnet is a transport mechanism for interprocess communication between nodes in the chassis that uses the control-ethernet network. Qnet process operation is monitored by Wdsysmon. If the process is observed to have become blocked for more than 30sec, Wdsysmon will perform a process restart in an attempt to correct the situation. A coredump will be written to disk and syslog messages generated.

The primary method for monitoring Qnet operation is to check the output of ‘show process qnet’ and/or ‘show process blocked’. Qnet relies on the eth_server process to place packets onto the control-ethernet network. Communication to the eth_server process can be checked using the admin command ‘sh controllers backplane ethernet clients 1 statistics location <loc>’.

Client QNET, ES Client Id 1, PID 45089 running on FastEthernet0_RP0_CPU0
    LWM calls 1 open, 0 close, 0 close callback, 0 unblocks
    142745816 packets input, 18454528798 bytes
    142745816 packets delivered,418184718 bytes
    0 packets discarded (0 bytes) in garbage collection
    0 (0 bytes) unicast packets filtered
    0 (0 bytes) multicast packets filtered
    0 (0 bytes) buffer mgmt policy discards
    0 (0 bytes) locking error discards
    0 packets waiting for client

    143282818 packets output, 15705151452 bytes, 0 could not be transmitted
    Packets output at high priority : 0
    Packets output at med priority : 0
    Packets output at low priority : 143282818
    Out-of-packet write rejects (high) : 0
    Out-of-packet write rejects (med ) : 0
    Out-of-packet write rejects (low ) : 0
    DMA write rejects (high) : 0
    DMA write rejects (med ) : 0
    DMA write rejects (low ) : 0
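The reject and discard counters in output like the above can be scanned automatically for non-zero values. The following is a minimal Python sketch, assuming the two line shapes seen in the sample (count first for discards, count after a colon for rejects); the function name is illustrative and the non-zero value in the sample text is invented.

```python
def nonzero_error_counters(output):
    """Return (label, count) pairs for reject/discard lines with a non-zero count."""
    problems = []
    for line in output.splitlines():
        text = line.strip()
        lowered = text.lower()
        if "rejects" in lowered and ":" in text:
            # e.g. "DMA write rejects (high) : 0" -> count follows the colon
            label, _, value = text.rpartition(":")
            count = int(value.strip())
        elif "discard" in lowered and text.split()[0].isdigit():
            # e.g. "0 (0 bytes) locking error discards" -> count comes first
            value, _, label = text.partition(" ")
            count = int(value)
        else:
            continue
        if count:
            problems.append((label.strip(), count))
    return problems


# Lines shaped like the sample output; the value 3 is invented.
sample_stats = """\
    0 packets discarded (0 bytes) in garbage collection
    0 (0 bytes) locking error discards
    Out-of-packet write rejects (low ) : 0
    DMA write rejects (high) : 3
"""
print(nonzero_error_counters(sample_stats))
```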

Packet rejects and locking errors may indicate an issue which requires further investigation. Inter-node communication across the control-ethernet can be tested using the admin command ‘ping control-eth location <loc>’. The ‘location’ defines the target to which pings are sent from the Active RP.

Src node:        513 : 0/RP0/CPU0
Dest node:        49 : 0/3/CPU0
Local node:      513 : 0/RP0/CPU0
Packet cnt:        1 Packet size:   128 Payload ptn type: default (0)
Hold-off (ms):   300 Time-out(s):     2 Max retries: 5
Destination node has MAC addr 5246.4800.0031

Running CE node ping.
Please wait...
Src: 513, Dest: 49, Sent: 1, Rec'd: 1, Mismatched: 0
Min/Avg/Max RTT (usecs): 5000/5000/5000
CE node ping succeeded for node: 49

Packet loss across the control-ethernet should be investigated. An alternative is to use the Online-Diagnostics test ‘ControlEthernetPingTest’. This can be run using the admin command ‘diagnostic start location <loc> test 1’ if the diagnostics package is installed.

Group Services Protocol

The Group Services process provides a reliable multicast interprocess communications mechanism used to implement scalable, distributed applications. GSP can operate either over the control-ethernet network or over the switch fabric.

A group can be created or joined by a process. After a create or join operation succeeds, the process becomes a member of the group. A process can be part of more than one group, which can often happen when libraries are included, and the libraries create and use their own groups. When a process wishes to stop using a group it can leave the group.

When any member of a group sends a message, the message will be received by all members of the same group, on all nodes which have members of the group. Various options are available when creating groups, and different options can be used to send data to subsets of the group as well.

Basic GSP operation can be checked using the commands ‘run gsp_ping -g1 -c20 -rv -q0’, ‘run gsp_ping -g1001 -c20 -rv -q0’ (tests gsp over Control-ethernet) and ‘run gsp_ping -g2001 -c20 -rv -q0’ (tests gsp over fabric). Example outputs are shown below:

run gsp_ping -g1 -c20 -rv -q0

                 Node                   Sent     Rcv.    Late    Lost
        ______________________________________________________________
          0/SM2/SP (0x820:1)             20      20       0       0
          0/SM1/SP (0x810:2)             20      20       0       0
          0/SM0/SP (0x800:3)             20      20       0       0
            0/3/SP (0x30:4)              20      20       0       0
          0/SM3/SP (0x830:5)             20      20       0       0
            0/5/SP (0x50:6)              20      20       0       0
            0/1/SP (0x10:7)              20      20       0       0
        0/RP1/CPU0 (0x211:8)             20      20       0       0
          0/3/CPU0 (0x31:9)              20      20       0       0
          0/5/CPU0 (0x51:10)             20      20       0       0
gsp_ping: All 10 nodes responded to all 20 pings

run gsp_ping -g1001 -c20 -rv -q0

                 Node                   Sent     Rcv.    Late    Lost
        ______________________________________________________________
        0/RP1/CPU0 (0x211:1)             20      20       0       0
          0/3/CPU0 (0x31:2)              20      20       0       0
          0/5/CPU0 (0x51:3)              20      20       0       0
gsp_ping: All 3 nodes responded to all 20 pings

run gsp_ping -g2001 -c20 -rv -q0

                 Node                   Sent     Rcv.    Late    Lost
        ______________________________________________________________
        0/RP1/CPU0 (0x211:1)             20      20       0       0
          0/3/CPU0 (0x31:2)              20      20       0       0
          0/5/CPU0 (0x51:3)              20      20       0       0
gsp_ping: All 3 nodes responded to all 20 pings

The number of GSP ‘ping’ packets sent and received should be the same. If responses are late (timed out) or lost, gather the output of the command ‘show tech gsp’.
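The sent/received comparison can be automated. The following is a rough Python sketch, assuming the table rows keep the "node (nodeid:index) Sent Rcv. Late Lost" layout shown in the sample outputs above. The function name gsp_ping_problems is hypothetical.

```python
import re

# One row of the gsp_ping table: node, (nodeid:index), Sent, Rcv., Late, Lost.
# Header, separator and summary lines do not match and are skipped.
ROW_RE = re.compile(
    r"^\s*(\S+)\s+\(0x[0-9a-fA-F]+:\d+\)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)


def gsp_ping_problems(output):
    """Return (node, sent, rcv, late, lost) for every row needing attention."""
    problems = []
    for line in output.splitlines():
        match = ROW_RE.match(line)
        if match is None:
            continue
        node = match.group(1)
        sent, rcv, late, lost = (int(v) for v in match.groups()[1:])
        # A healthy row has Rcv. equal to Sent and no late or lost replies.
        if rcv != sent or late > 0 or lost > 0:
            problems.append((node, sent, rcv, late, lost))
    return problems
```

An empty list means all nodes responded to all pings; any returned row is a reason to collect ‘show tech gsp’.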

Configuration file systems

The Configuration File System (CFS) is a set of files and directories used to store the router's configuration. CFS is stored under the directory disk0:/config/, which is the default media used on the RP. Files and directories in CFS are internal to the router and should never be modified or removed by the user. Doing so can result in loss or corruption of the configuration and is service affecting.

The CFS is checkpointed to the standby-RP after every commit. This helps preserve the router's configuration file after a fail over.

CFS integrity can be confirmed using the ‘cfs check’ command on a periodic basis.

LC resource monitoring

FIB entries are split across two memory structures attached to each PSE (on both ingress and egress). Prefixes are stored in PLU memory; adjacency and load-balancing information is stored in TLU memory. Memory leaks in these structures can lead to situations in which prefix updates can no longer be applied, resulting in inconsistent forwarding behaviour. Resources can be monitored using the command ‘sh cef resource hardware <ingress|egress> detail location <loc>’:

CEF resource availability summary state: GREEN
ipv4 shared memory resource:
        CurrMode GREEN, CurrAvail 998563840 bytes, MaxAvail 1019772928 bytes
ipv6 shared memory resource:
        CurrMode GREEN, CurrAvail 998563840 bytes, MaxAvail 1019772928 bytes
mpls shared memory resource:
        CurrMode GREEN, CurrAvail 998563840 bytes, MaxAvail 1019772928 bytes
common shared memory resource:
        CurrMode GREEN, CurrAvail 998563840 bytes, MaxAvail 1019772928 bytes
DATA_TYPE_TABLE_SET hardware resource: GREEN
DATA_TYPE_TABLE hardware resource: GREEN
DATA_TYPE_IDB hardware resource: GREEN
DATA_TYPE_IDB_EXT hardware resource: GREEN
DATA_TYPE_LEAF hardware resource: GREEN
DATA_TYPE_LOADINFO hardware resource: GREEN
DATA_TYPE_PATH_LIST hardware resource: GREEN
DATA_TYPE_NHINFO hardware resource: GREEN
DATA_TYPE_LABEL_INFO hardware resource: GREEN
DATA_TYPE_FRR_NHINFO hardware resource: GREEN
DATA_TYPE_ECD hardware resource: GREEN
DATA_TYPE_RECURSIVE_NH hardware resource: GREEN
DATA_TYPE_TUNNEL_ENDPOINT hardware resource: GREEN
DATA_TYPE_LOCAL_TUNNEL_INTF hardware resource: GREEN
DATA_TYPE_ECD_TRACKER hardware resource: GREEN
DATA_TYPE_ECD_V2 hardware resource: GREEN
DATA_TYPE_ATTRIBUTE hardware resource: GREEN
DATA_TYPE_LSPA hardware resource: GREEN
DATA_TYPE_LDI_LW hardware resource: GREEN
DATA_TYPE_LDSH_ARRAY hardware resource: GREEN
DATA_TYPE_TE_TUN_INFO hardware resource: GREEN
DATA_TYPE_DUMMY hardware resource: GREEN
DATA_TYPE_IDB_VRF_LCL_CEF hardware resource: GREEN
DATA_TYPE_TABLE_UNRESOLVED hardware resource: GREEN
DATA_TYPE_MOL hardware resource: GREEN
DATA_TYPE_MPI hardware resource: GREEN
DATA_TYPE_SUBS_INFO hardware resource: GREEN
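
A periodic script only needs to know whether anything in this output has left the GREEN state. The following is a minimal Python sketch, assuming the line layouts shown in the sample output above. Only GREEN appears in that sample, so the sketch makes no assumption about the names of other states and simply reports anything that is not GREEN. The function name cef_not_green is hypothetical.

```python
import re

# "<name> hardware resource: <STATE>" and "... summary state: <STATE>" lines.
STATE_RE = re.compile(
    r"^\s*(.+?)\s+(?:hardware resource|summary state):\s*(\w+)\s*$"
)
# "<name> shared memory resource:" header, followed by a CurrMode line.
SHMEM_NAME_RE = re.compile(r"^\s*(.+?) shared memory resource:\s*$")
SHMEM_MODE_RE = re.compile(r"CurrMode\s+(\w+),")


def cef_not_green(output):
    """Return (resource, state) pairs for every resource that is not GREEN."""
    not_green = []
    shmem_name = None
    for line in output.splitlines():
        match = SHMEM_NAME_RE.match(line)
        if match:
            # Remember which shared memory region the next CurrMode line covers.
            shmem_name = match.group(1) + " shared memory"
            continue
        match = SHMEM_MODE_RE.search(line)
        if match:
            if match.group(1) != "GREEN":
                not_green.append((shmem_name or "shared memory", match.group(1)))
            continue
        match = STATE_RE.match(line)
        if match and match.group(2) != "GREEN":
            not_green.append((match.group(1), match.group(2)))
    return not_green
```

An empty result means every resource in the captured output reported GREEN.
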
To examine the prefix-carrying capacity of the linecard, use the command ‘show tbm ipv4 unicast dual detail location <loc>’.

IPV4 UNICAST TBM Table
----------------------
TBM Table type/id is: 0
    Num Prefixes: 200023
    Sw Shadow memory usage (static): 18900
    Sw Shadow memory usage (dynamic): 3248984
    Num Inserts: 2500041
    Num Deletes: 2300017
      Search Nodes:   809
      Leaf Nodes:     200277
      Internal Nodes: 12
      End Nodes:      50283

In this example, the router contains 200k /24 prefixes generated in a lab environment. In this contrived configuration, the leaf node to end node ratio is ~4:1. Depending on the prefix distribution, this ratio can vary from 1:1 (worst case) to 32:1 (best case). In a typical routing environment with a mixture of different prefix lengths, the internal node value will increase and the leaf node to end node ratio will change.

RP/0/RP1/CPU0:q#show tbm ipv4 unicast dual detail location 0/0/cpu0
IPV4 UNICAST TBM Table
----------------------
TBM Table type/id is: 0
    Num Prefixes: 294744
    Sw Shadow memory usage (static): 18900
    Sw Shadow memory usage (dynamic): 7920168
    Num Inserts: 1532357
    Num Deletes: 109363
      Search Nodes:   10417
      Leaf Nodes:     296833
      Internal Nodes: 8903
      End Nodes:      121454

In the example above, using a sample from a router with full internet routes, the leaf node to end node ratio is 2.44:1.
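The ratio quoted for both samples is simply Leaf Nodes divided by End Nodes. As a small Python sketch, assuming the counter lines keep the "Name: value" layout of the ‘show tbm’ outputs above (the function names tbm_counters and leaf_to_end_ratio are hypothetical):

```python
import re

# Counter lines look like "      Leaf Nodes:     200277" in the sample output.
COUNTER_RE = re.compile(
    r"^\s*(Num Prefixes|Search Nodes|Leaf Nodes|Internal Nodes|End Nodes):"
    r"\s*(\d+)\s*$"
)


def tbm_counters(output):
    """Collect the node and prefix counters from 'show tbm' output."""
    counters = {}
    for line in output.splitlines():
        match = COUNTER_RE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def leaf_to_end_ratio(output):
    """Return Leaf Nodes / End Nodes, or None if either counter is unusable."""
    counters = tbm_counters(output)
    leaf = counters.get("Leaf Nodes")
    end = counters.get("End Nodes")
    if leaf is None or not end:
        return None
    return leaf / end
```

With the lab sample (200277 leaf nodes, 50283 end nodes) this gives roughly 3.98, and with the full internet table sample (296833 and 121454) roughly 2.44.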

In addition to the ‘show tbm’ command, the command ‘show plu server summary ingress|egress location <loc>’ can be used to examine the memory usage for the prefixes.

RP/0/RP1/CPU0:q#show plu server summary ingress location 0/0/cpu0
Channel-1 free pages: 439
Channel-2 free pages: 443
Channel-3 free pages: 440
Channel-4 free pages: 448

Channel-0 allocations
        Table-0: 37 pages (Channel-0)
        Table-1: 1 pages (Channel-0)
        Table-3: 1 pages (Channel-0)
        Table-13: 2 pages (Channel-0)
        Table-14: 1 pages (Channel-0)
Channel-1 allocations
        Table-0: 62 pages (Channel-1)
        Table-1: 3 pages (Channel-1)
        Table-3: 1 pages (Channel-1)
        Table-13: 3 pages (Channel-1)
        Table-14: 3 pages (Channel-1)
Channel-2 allocations
        Table-0: 58 pages (Channel-2)
        Table-1: 3 pages (Channel-2)
        Table-13: 4 pages (Channel-2)
        Table-14: 3 pages (Channel-2)
Channel-3 allocations
        Table-0: 62 pages (Channel-3)
        Table-1: 3 pages (Channel-3)
        Table-13: 3 pages (Channel-3)
        Table-14: 3 pages (Channel-3)
Channel-4 allocations
        Table-0: 57 pages (Channel-4)
        Table-1: 2 pages (Channel-4)
        Table-13: 2 pages (Channel-4)
        Table-14: 2 pages (Channel-4)

In addition, data is split across up to 5 memory banks known as ‘Channels’. Within the channels, memory is organized into a series of tables, the table number depending on the data type that’s being stored. In 3.5 onwards, the table names are displayed in the command output (CSCse97366). The list below indicates the table number to data type mapping:

Table Number    Table Type
0    TBM_TABLE_TYPE_IP
1    TBM_TABLE_TYPE_IPV6
2    TBM_TABLE_TYPE_MULTICAST
3    TBM_TABLE_TYPE_MPLS
4    TBM_TABLE_TYPE_VPNV4
5    TBM_TABLE_TYPE_MULTICAST_SIGNAL
6    TBM_TABLE_TYPE_TEST1
7    TBM_TABLE_TYPE_TEST2
8    TBM_TABLE_TYPE_L2TPV3
9    TBM_TABLE_TYPE_IPV6_MULTICAST
10    TBM_TABLE_TYPE_IPV6_MULTICAST_HASH
11    TBM_TABLE_TYPE_IPV6_MULTICAST_SIGNAL_HASH
12    TBM_TABLE_TYPE_IPV6_MULTICAST_SIGNAL
13    TBM_TABLE_TYPE_IP_DEFAULT
14    TBM_TABLE_TYPE_IPV6_DEFAULT
16    TBM_TABLE_TYPE_SW_ONLY

Each channel has ‘tables’ allocated for each type of data structure that could be stored. Within each table are 512 ‘pages’ used for storing the actual prefixes. In the output from the ‘show plu’ command one can see that within a single channel, the maximum number of pages allocated across all tables was 72. Given that the limit is 512, 72 pages equates to ~14% utilization. The exact distribution of prefixes to channels (and pages within the channels) will depend on the prefix type and length being held, together with the lookup key (such as VRF table #).

Monitoring the maximum page allocation on the channels and calculating the utilization percentage provides a rough indication of the amount of ‘spare room’ available for additional prefixes. It must be emphasized that the allocation is not linear, since it depends on the prefix length, type and actual address.
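The utilization figure can be derived from the ‘show plu server summary’ output by summing the pages per channel and comparing the busiest channel with the 512-page limit quoted above. The following is a minimal Python sketch of that arithmetic, assuming the "Table-N: M pages (Channel-C)" line layout of the sample output. The function names are hypothetical.

```python
import re

PAGE_LIMIT = 512  # page limit quoted in the text above

# Allocation lines look like "        Table-0: 62 pages (Channel-1)".
ALLOC_RE = re.compile(
    r"^\s*Table-(\d+):\s*(\d+)\s+pages\s+\(Channel-(\d+)\)\s*$"
)


def pages_per_channel(output):
    """Sum the allocated pages of all tables for each channel."""
    totals = {}
    for line in output.splitlines():
        match = ALLOC_RE.match(line)
        if match:
            channel = int(match.group(3))
            totals[channel] = totals.get(channel, 0) + int(match.group(2))
    return totals


def peak_utilization(output, limit=PAGE_LIMIT):
    """Return (channel, pages, percent) for the busiest channel, or None."""
    totals = pages_per_channel(output)
    if not totals:
        return None
    channel = max(totals, key=totals.get)
    return channel, totals[channel], 100.0 * totals[channel] / limit
```

For the sample output, Channel-1 is the busiest with 62 + 3 + 1 + 3 + 3 = 72 pages, which is 72 / 512 = ~14%. As noted above, this is only a rough indicator because allocation is not linear.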

Route Processor Redundancy

Check the state of router redundancy using the command ‘show redundancy’. Check the node uptimes, paying particular attention to the standby node state:

RP/0/RP1/CPU0:host#sh redundancy

Redundancy information for node 0/RP1/CPU0:
==========================================
Node 0/RP1/CPU0 is in ACTIVE role
Partner node (0/RP0/CPU0) is in STANDBY role
Standby node in 0/RP0/CPU0 is ready
Standby node in 0/RP0/CPU0 is NSR-ready

Reload and boot info
----------------------
RP reloaded Jun 2 14:25:01 2013: 6 weeks, 2 days, 12 hours, 43 minutes ago
Active node booted Jun 22 01:00:56 2013: 3 weeks, 4 days, 2 hours, 7 minuteo
Last switch-over Jun 22 02:19:02 2013: 3 weeks, 4 days, 49 minutes ago
Standby node boot Jun 22 02:20:17 2013: 3 weeks, 4 days, 48 minutes ago
Standby node last went not ready Jun 22 02:26:58 2013: 3 weeks, 4 days, 41 o
Standby node last went ready Jun 22 02:26:58 2013: 3 weeks, 4 days, 41 minuo
There have been 5 switch-overs since reload

Active node reload "Cause: Initiating switch-over."
Standby node reload "Cause: Initiating switch-over."
The key is to ensure that the standby RP is in the ‘standby, ready’ state, indicating that if the Active RP fails, the standby RP is in a position to take over system control. Note that if NSR functionality is used (for protocols such as OSPF, LDP and BGP), the standby node also needs to be in the NSR-ready state.
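The two readiness lines can be checked by a script. The following is a rough Python sketch, assuming the "Standby node in <node> is ready" and "is NSR-ready" wording of the sample output above. The wording of a not-ready report is not shown in this document, so the sketch only tests for the presence of the ready lines. The function name standby_status is hypothetical.

```python
import re


def standby_status(output):
    """Extract standby readiness and switch-over count from 'show redundancy'."""
    ready = re.search(
        r"^\s*Standby node in \S+ is ready\s*$", output, re.MULTILINE
    )
    nsr_ready = re.search(
        r"^\s*Standby node in \S+ is NSR-ready\s*$", output, re.MULTILINE
    )
    switchovers = re.search(
        r"There have been (\d+) switch-overs since reload", output
    )
    return {
        "standby_ready": ready is not None,
        "nsr_ready": nsr_ready is not None,
        # None when the counter line is absent from the captured output.
        "switchovers": int(switchovers.group(1)) if switchovers else None,
    }
```

Tracking the switch-over count between runs also reveals unexpected failovers that happened since the last check.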

Show system verify

‘Show system verify’ provides a report which can be used to assess system health. One must be cautious with the use of this tool since it is a script-based function with no intelligence. It may report false positives. Again, as with other aspects covered in this document, understanding what is ‘normal’ and what triggers ‘false positives’ in the particular network/router will allow the user to understand deviations from the norm.

System verify works by taking a snapshot of a series of key functional areas, taking another snapshot at some point later (manually triggered) and then comparing the two reports. Data can be provided in summary or detailed output.

RP/0/RP1/CPU0:host# show system verify report
Getting current router status ...
System Verification Report
==========================
- Verifying Memory Usage
- Verified Memory Usage                                 : [OK]
- Verifying CPU Usage
- Verified CPU Usage                                    : [OK]
- Verifying Blocked Processes
- Verified Blocked Processes                            : [OK]
- Verifying Aborted Processes
- Verified Aborted Processes                            : [OK]
- Verifying Crashed Processes
- Verified Crashed Processes                            : [OK]
- Verifying LC Status
- Verified LC Status                                    : [OK]
- Verifying QNET Status
- Verified QNET Status                                  : [OK]
- Verifying GSP Fabric Status
- Verified GSP Fabric Status                            : [OK]
- Verifying GSP Ethernet Status
- Verified GSP Ethernet Status                          : [OK]

RP/0/RP1/CPU0:host# show system verify report detail
Getting current router status ...
System Verification Report
==========================
+ Verifying Memory Usage on node0_0_CPU0
Type                Initial   Current
Application         1907M     1907M
Available           1684M     1684M
Physical            2048M     2048M
+ Verified Memory Usage on node0_0_CPU0            : [OK]
+ Verifying Memory Usage on node0_RP0_CPU0
Type                Initial   Current
Application         3951M     3951M
Available           3677M     3677M
Physical            4096M     4096M
+ Verified Memory Usage on node0_RP0_CPU0          : [OK]
+ Verifying Memory Usage on node0_RP1_CPU0
Type                Initial   Current
Application         3951M     3951M
Available           3616M     3616M
Physical            4096M     4096M
+ Verified Memory Usage on node0_RP1_CPU0          : [OK]
+ Verifying CPU Usage on node0_0_CPU0
Initial CPU Usage   : 1.2
Current CPU Usage   : 0.6
+ Verified CPU Usage on node0_0_CPU0               : [OK]
+ Verifying CPU Usage on node0_RP0_CPU0
Initial CPU Usage   : 0.9
Current CPU Usage   : 3.68
+ Verified CPU Usage on node0_RP0_CPU0             : [OK]
+ Verifying CPU Usage on node0_RP1_CPU0
Initial CPU Usage   : 11.44
Current CPU Usage   : 1.74
+ Verified CPU Usage on node0_RP1_CPU0             : [OK]

A common false positive is CPU utilization. The script can result in raised CPU levels causing an ‘error’ to be reported:

Initial CPU Usage : 2.17
Current CPU Usage : 72.83
CPU usage is significantly increased
System may not be stable. Please look into WARNING messages

Obviously the alert can be investigated using the command ‘show process cpu’ to determine if the utilization level is just a spike or something prolonged.
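To pick out the items that need a follow-up, the report can be filtered for any check that did not end in [OK]. The following is a minimal Python sketch, assuming the "- Verified <item> : [<status>]" line layout of the sample reports above. Only [OK] appears in those samples, so any other status text is simply reported as found. The function name verify_failures is hypothetical.

```python
import re

# Result lines start with "-" (summary) or "+" (detail), for example
# "- Verified CPU Usage                  : [OK]".
RESULT_RE = re.compile(
    r"^\s*[-+]\s+Verified\s+(.+?)\s*:\s*\[([^\]]+)\]\s*$"
)


def verify_failures(report):
    """Return (item, status) for every verified item whose status is not OK."""
    failures = []
    for line in report.splitlines():
        match = RESULT_RE.match(line)
        if match and match.group(2).strip() != "OK":
            failures.append((match.group(1), match.group(2).strip()))
    return failures
```

A CPU entry in the returned list is a candidate false positive and can be cross-checked with ‘show process cpu’ as described above.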

Configuration additions

Kernel Dumper

This function is enabled by default from release 3.4 onwards. No configuration is required unless the target directory for the dump file is being changed using the configuration command ‘exception kernel memory kernel filepath <path>’. Note that the command must be applied in both ‘normal’ and ‘admin’ configuration modes.

The dump path can be displayed with ‘show exception | i ernel’:

Choice kernel path = harddisk:/dumper/ filename = kernel_core Memory = user+kernel
Tftp route for kernel core dump not configured

It is also recommended to ensure that system timezone configuration such as:

clock timezone EST 1
clock summer-time EDT recurring last sunday march 03:00 last sunday october 02:00

is present both in the LR AND Admin configurations so that timestamps are synchronized.

Note: if the filepath is overridden, it has to be configured at least in normal mode. If it is configured only in admin mode, the command has no effect and the kernel_core file is still written to harddisk: instead of harddisk:/dumper/. If a sub-directory is defined rather than the root of the destination filesystem, the sub-directory needs to be created manually on both the Active and Standby RP devices. If the sub-directory does not exist, the dump file will fail to be written.

Online Diagnostics

Online diagnostics are run constantly by the ASR9000. They are non-intrusive tests that can be run on a periodic (timed) basis, as saved in the admin configuration. Online diagnostics can also be run on demand.

Depending on the type of the selected test and the period desired for this test, the test(s) can be run as "Scheduled" diagnostics (that have a configurable daily or weekly period) and "Health Monitor" diagnostics (that have a period granularity in the millisecond range, and thus have numerous iterations daily or weekly) or a combination of both methods.

The recommendation is as a minimum:

Configure and run the FabricDiagnosisTest in the Health Monitor suite with the default period (a more aggressive period can be tested).

Configure and run the Control Ethernet Inactive Link test in the Health Monitor suite with the default period (a more aggressive period can be tested and deployed if desired).

Additional scheduled tests can be added according to the user's requirements.

The configuration example below shows the recommended options:

diagnostic monitor location 0/RP0/CPU0 test FabricDiagnosisTest
diagnostic monitor location 0/RP0/CPU0 test ControlEthernetInactiveLinkTest
diagnostic monitor location 0/RP1/CPU0 test FabricDiagnosisTest
diagnostic monitor location 0/RP1/CPU0 test ControlEthernetInactiveLinkTest
diagnostic monitor syslog

Note that FabricDiagnosisTest and ControlEthernetInactiveLinkTest execute on the standby RP, so test results will only be shown when viewing the output for the standby RP’s location:

RP/0/RP0/CPU0:host(admin)#sh diagnostic result location 0/RP1/CPU0

Current bootup diagnostic level for RP 0/RP1/CPU0: minimal

RP 0/RP1/CPU0:

Overall diagnostic result: PASS
Diagnostic level at card bootup: minimal

Test results: (. = Pass, F = Fail, U = Untested)

1 ) ControlEthernetPingTest ---------> U
2 ) SelfPingOverFabric --------------> U
3 ) FabricPingTest ------------------> U
4 ) ControlEthernetInactiveLinkTest -> .
5 ) RommonRevision ------------------> U
6 ) FabricDiagnosisTest -------------> .

Testing shows the CPU impact to be minimal for the default settings of the Health Monitor applications. The diagnostics PIE takes about 19.4M of disk0: flash disk space, which needs to be factored into upgrade and downgrade procedures.

Notes

Periodic monitoring command summary

This section provides a quick-reference list of the commands detailed in this document that could be used as part of a script that is run periodically to check the health of the system.

sh processes blocked location 0/RP0/CPU0
sh processes blocked location 0/RP1/CPU0
sh process cpu location 0/RP0/CPU0 | inc CPU
sh process cpu location 0/RP1/CPU0 | inc CPU
run top_procs -D -i 1 -l node0_RP0_CPU0 | grep idle
run top_procs -D -i 1 -l node0_RP1_CPU0 | grep idle
show watchdog trace | inc hog
show process abort
show context location all
sh critmon statistics all location 0/RP0/CPU0 (rls 3.6)
sh critmon statistics all location 0/RP1/CPU0 (rls 3.6)
show watchdog memory-state all
show processes files location <loc>
ping control-eth location <loc>
run gsp_ping -g1 -c20 -rv -q0
run gsp_ping -g1001 -c20 -rv -q0
run gsp_ping -g2001 -c20 -rv -q0
show controller fabric plane all statistics
show controllers fabric connectivity all detail
show asic-errors all location <loc>
sh cef resource hardware <ingress|egress> detail location <loc>
sh redundancy
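
Several commands in the list carry the placeholders <loc> and <ingress|egress>, which a periodic script has to fill in for each node. The following is a small Python sketch of that expansion only. It does not connect to the router; how the commands are sent (SSH, console, and so on) is left open. The function name expand_command is hypothetical.

```python
def expand_command(template, locations):
    """Expand '<ingress|egress>' and '<loc>' placeholders into full commands."""
    commands = [template]
    if "<ingress|egress>" in template:
        # One command per direction.
        commands = [
            cmd.replace("<ingress|egress>", direction)
            for cmd in commands
            for direction in ("ingress", "egress")
        ]
    if "<loc>" in template:
        # One command per node location, for each command built so far.
        commands = [
            cmd.replace("<loc>", loc)
            for cmd in commands
            for loc in locations
        ]
    return commands


def build_command_list(templates, locations):
    """Flatten a list of command templates into concrete commands."""
    result = []
    for template in templates:
        result.extend(expand_command(template, locations))
    return result
```

Commands without placeholders, such as ‘sh redundancy’, pass through unchanged.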

Related Information

IOS to XR migration guide: https://supportforums.cisco.com/docs/DOC-22848

Xander Thuijs CCIE#6775

Principal Engineer, ASR9000

yansenyansen · ‎01-14-2014

Hi Xander,

for memory (bytes) shown on "show route vrf all sum"

Is that memory on the RSP?

so maximum memory 6GB for RSP-440-TR. correct me if i am wrong.

BR.

xthuijs · ‎01-22-2014

totally missed this yansenyansen, sorry about that!

yes that memory reported is process memory on the RSP (if the location keyword is omitted we look at the RSP).

and that is 6G on the RSP440-TR (12G on the SE).

remember that XR is a 32 bit OS (for now) meaning a single proc can only allocate 4G of memory alone.

the mem-mapper is 64bit, so we can use the 4G+ but not on a single process (today).

cheers

xander

Alejandro Rivera · ‎09-25-2014

Hi Xander,

Great post!

I have a question...

We've been following a case from a local Service Provider, where there are two ASR9006, they're both the same, same RSPs and Software versions, except for their LCs. One has Trident LCs (A9K-4T-B & A9K-40GE-B), the other Typhoon (A9K-MOD80-SE).

The main issue lies in the fact that the one with Trident LCs has a constant CPU usage in the output (show process cpu) of more than 38%. The other ASR9006 has a normal level of 10%.

ipv4_mfwd_partner is the process with the highest percentage, while on the ASR9006 with Typhoon LCs it is not.

I'll be glad to hear any suggestion.

Thnx.

Best Regards,

Alejandro Rivera

xthuijs · ‎09-26-2014

hi alejandro,

if it is constant at 38% and never backs down, that is not normal indeed.

but it may have a good explanation. we need to "follow the process" and decode some of the PC's that it is spinning on. It may also be part related to configuration or mcast events that are being received.

this type of troubleshooting is not easy to handle over the forums I think, so I would want to recommend opening a tac case for that verification. However it will definitely be possible to identify why that is happening and to address/mitigate that situation.

Good news is that even though the proc is running high, there should not be a direct concern like there used to be with IOS.

cheers!

xander

Vladimir Pisarenko · ‎08-10-2015

Hi Xander!

After installing 5.2.4 on our ASR9010 I noticed a memory leak. I made a memory compare report and saw that the process pm_collector leaks memory. What is this process responsible for? Can I safely restart it to free the memory?

Thanks.

Aleksandar Vidakovic · ‎08-10-2015

hi Vladimir,

the role of this process is to collect statistics. If this is really a leak (please take one more snapshot to confirm), it could be CSCuu10349. Process restart will clear the leak.

hope this helps,

Aleksandar

Vladimir Pisarenko · ‎08-10-2015

Thanks Alexandar!

I'll restart process, then if it will continue to leak memory, I'll open ticket. Because I don't see SMU on CSCuu10349.

orramire · ‎08-22-2018

Hi Xander;

We have a question about the concepts of memory usage. I have a GSP customer that owns ASR9k routers with RSP440-TR. They have a lot of issues with capacity planning, and we are helping them with best practices to optimize the resources. When we looked at the different memory monitoring commands, we saw a small difference:

RP/0/RSP0/CPU0:ASR9K-1#show watchdog memory-state
Wed Aug 22 14:28:30.533 CDT
Memory information:
    Physical Memory: 6144     MB
    Free Memory:     1615.351 MB
    Memory State:         Normal
RP/0/RSP0/CPU0:ASR9K-1#show health memory
Wed Aug 22 14:28:22.441 CDT

Total physical memory     (bytes):                    4294967295
Available physical memory (bytes):                    1692401664

Available memory is 26.27% of physical memory
-----------------------------------------------------------------

Top 5 heap memory holders
-----------------------------------------------------------------
Process                 JID          Heapsize      High watermark
-----------------------------------------------------------------
bgp                    1058         699256832          879276032
ipv4_rib               1171         322220032          330244096
l2fib_mgr               307          44236800           44236800
isis                   1010          29433856           29761536
parser_server           365          27824128           31920128
-----------------------------------------------------------------

Total files in directory /dev/shmem is 554
Top 5 files in directory /dev/shmem
-----------------------------------------------------------------

RP/0/RSP0/CPU0:ASR9k-1#

The two commands report different amounts of physical memory. Which amount is correct? I need to know how to interpret these memory usage values, and against what baseline the percentage of used or free memory is calculated.

Thanks

Best Regards

Orlando Ramirez

hank · ‎11-01-2018

Both our ASR9000s have stopped handling SNMP. First it was sporadic and now they have stopped totally. The error we see is:

Internal Dispatch failed for xxx.68.3.139,5343 (reqID 45716) 'snmp-lib-ipc' detected the 'resource not available' condition 'Not enough memory'

Could be CSCve04643. Could be Zabbix was polling too hard (which we have stopped).

But now that we have lost memory, I have not found a way to reclaim lost memory other than doing a reload. I am hoping someone within the next 48 hours can suggest a way to restart snmp and its memory loss w/o us having to do a reload.

Will doing "no snmp-server" and then rebuilding all snmp-server commands do a reload to snmpd and cause it to find the lost memory?

We tried "process restart snmpd" but that helped for an hour before cacti stopped polling again.

Thanks!

panayiotiscy · ‎01-16-2019

Hello Xander,

can we run the "cfs check" outside a maintenance window?

on another note, we are experiencing long output delays (roughly 20 minutes) when executing "show bgp ipv4 uni neighbors x.x.x.x advertised-routes", especially after modifying the attached as-path RPL. Could this be related to low memory?

Thanks
