cancel
Showing results for 
Search instead for 
Did you mean: 
cancel
1753
Views
0
Helpful
0
Comments
Viten Patel
Cisco Employee
Cisco Employee

 

Introduction

This Document is written by "Viten Patel", is working as Technical Marketing Engineer (Converged Access) at Cisco. He was working as Wireless Escalation Engineer - Cisco TAC and holds many wireless certification like CCIE Wireless, CCNP Wireless, CCNA Wireless, CWNA, CWSP, CWAP, CWNE#146.

Below checks are a good starting point for troubleshooting issues with IOS-XE based controllers (5760, 3850, 3650). Its a good practice to keep an eye on some of these stats even during normal operation of the system. It will be handy to have the below info handy before opening TAC case.

CPU Utilization

  • Monitor the cpu utilization history and make sure that there are no CPU spikes for longer period (more than couple of minutes).
  • The hourly average CPU utilization of any core for any given minute going higher than ~70% needs further close monitoring. Identify the process that is taking most of the CPU and see the spike can be attributed to any kind of network activity.
  • NOTE: If the controller is added to the Prime Infrastructure(PI), then there could be periodic spikes of some of the processes listed below:
    • snmp_subagent, eicored, eicore_ipc(wcm) and SNMP ENGINE (iosd)
    • The SNMP ENGINE utilization should drop down if there is any other higher priority activity that requires the CPU.
  • Commands

show processes cpu sorted detailed | exclude 0.00 # Sorted list of processes to the thread level based on the CPU utilization.
show processes cpu history

Sample output with PI activity:

#sh proc cpu sort det | e 0.00
Core 0: CPU utilization for five seconds: 45%; one minute: 16%; five minutes: 13%
Core 1: CPU utilization for five seconds: 44%; one minute: 14%; five minutes: 10%
Core 2: CPU utilization for five seconds: 52%; one minute: 15%; five minutes: 11%
Core 3: CPU utilization for five seconds: 44%; one minute: 15%; five minutes: 11%
PID    T C  TID    Runtime(ms) Invoked uSecs  5Sec      1Min     5Min     TTY   Process
                                               (%)       (%)      (%)
5805   L           2621250     1821404 85     17.69     3.14    1.03    0     snmp_subagent
5805   L 2  5805   1074        1478    0      15.99     1.47    0.39    0     snmp_subagent
5805   L 0  6626   1389355     7685676 0      8.42      7.46    0.41    0     snmp_subagent
5805   L 2  6586   762745      9850229 0      3.94      2.63    0.18    0     CMI default xdm
5805   L 1  13461  874         33226   0      2.56      1.84    1.79    0     SNMP SA CACHE
5805   L 2  6497   276868      724659  0      0.13      0.09    0.09    0     snmp_subagent
8127   L           3936457     4684750 100    16.91     5.53    4.32    0     iosd
8127   L 2  8127   600155      2944832 0      13.54     3.74    2.66    0     iosd
8127   L 3  8742   2624104     1473118 0      2.47      1.69    1.62    0     iosd.fastpath
8127   L 2  8743   707123      2610018 0      0.90      0.11    0.03    0     CMI Thread
281    I           89336       1250203 0      24.44     4.11    1.11    0       SNMP ENGINE
276    I           859231      18049   0      11.66     0.99    0.22    0       SNMP Timers
22     I           1967404     1384683 0      8.55      1.66    0.44    0       CMI IOSd task
59     I           1908371     5190050 0      3.00      2.99    2.88    0       ARP Snoop
312    I           1547565     190253  0      2.77      1.33    1.33    0       QoS stats process
274    I           671863      5192881 0      0.88      0.88    0.88    0       DAI Packet Process
30     I           2510652     2756779 0      0.66      0.55    0.44    0       ARP Input
200    I           911297      1344167 0      0.66      0.22    0.11    0       Tunnel IOSd shim DB
195    I           1717254     4141977 0      0.33      0.22    0.22    0       IP Host Track Proce
232    I           1066616     5559004 0      0.33      0.22    0.22    0       UDLD
223    I           1590722     5162371 0      0.33      0.33    0.33    0       IP Input
219    I           977672      3720297 0      0.11      0.11    0.11    0       DHCP Snooping
8123   L           3603041     7383618 63     4.31      2.04    2.04    0     wcm
8123   L 1  8808   2457789     5526106 0      2.34      4.40    0.40    0     eicore_ipc

 

#show processes cpu history
History information for system:
     11111     11111     11111     11111     11111          1111
    811111999991111177777111118888800000888880000099999999990000
100
 90
 80
 70
 60
 50
 40
 30
 20
 10 ************************************************************
   0....5....1....1....2....2....3....3....4....4....5....5....
             0    5    0    5    0    5    0    5    0    5
               CPU% per second (last 60 seconds)

    164111111111111111454411116611111111111111135441113651111111
    157124331111111613114211143611111104010000091641006601002200
100
 90
 80
 70  *                         *                       *
 60  *                        **                       #
 50  **                *      **                **     #*
 40  #*               *##*    ##               *##*   *#*
 30  #*               ###*    ##               *##*   *#*
 20  #*            *  ####    ##               ####   *##
 10 ############################################################
   0....5....1....1....2....2....3....3....4....4....5....5....
             0    5    0    5    0    5    0    5    0    5
               CPU% per minute (last 60 minutes)
              * = maximum CPU%   # = average CPU%

    677776677677777766656666666565554555555555556575556556565555555655665644
    634359546523207032491111314839989215282111231924203084021809628696108257
100
 90
 80     *   *     *
 70 ****************                              *                *
 60 ********************************   * *      ***   * ** * * ** ********
 50 ************************************************************************
 40 ************************************************************************
 30 ************************************************************************
 20 #*##*#**#########**********************************************#********
 10 ########################################################################
   0....5....1....1....2....2....3....3....4....4....5....5....6....6....7.
             0    5    0    5    0    5    0    5    0    5    0    5    0
                       CPU% per hour (last 72 hours)
                    * = maximum CPU%   # = average CPU%

Memory Utilization

  • Monitor the RSS column of the individual processes. If this component is monotonically growing then it is alarming and further investigation needed based on the process that is growing.
  • Monitor the free memory component of the system level memory. Don’t be alarmed if it drops down suddenly as IOSd usage could spike up and the memory freed will be held back in IOSd itself. If the system level free memory goes down below 40% it is alarming and further investigation needed.
  • If the IOSd is process is growing, the run IOSd process specific command and look for Processor 8E2F3008 1610612736 344548600 1266064136 1114693628 1102149236 line in the output. Make sure the Lowest and Largest Columns are not too much apart.

Commands
show processes memory sorted
show memory detailed process iosd allocating-process totals # If the IOSd process RSS is growing.
Sample output

#show processes memory sorted
System memory  : 7995664K total, 2601088K used, 5394576K free, 2227952K kernel reserved
Lowest(b)      : 568393300
PID     Text         Data      Stack     Heap      RSS       Total     Process
8123    20048        1140084   88        34448     748636    1312796   wcm
8127    63252        1601300   104       932       740208    1924924   iosd
5240    9272         422520    92        115968    486208    783940    fed
5782    4260         157088    88        81200     139288    256792    ffm
5242    612          93880     88        1184      117536    306532    stack-mgr
5805    760          219200    88        22908     102992    283952    snmp_subagent  

 

#show memory detailed process iosd allocating-process totals
System memory  : 7995664K total, 2601040K used, 5394624K free, 2227952K kernel reserved
Lowest(b)      : 568393300

Process iosd, type L, PID = 8127
    1924924K total, 63252K text, 1601300K data, 104K stack, 932K heap

                Head    Total(b)     Used(b)     Free(b)   Lowest(b)  Largest(b)
Processor   8E2F3008   1610612736   344548600   1266064136   1114693628   1102149236
IOS Proce   8D2F2008    16777216     7691620     9085596     8954256     8466680

Allocator PC Summary for: Processor

    PC          Total   Count  Name
0xF4CFA0B7   43112624   19401  *Packet Data*
0xF4CFA036   23628624   19560  *Packet Header*
0xF5936761   20985680     320  AAA AttrL Sub
0xF5A01BE7   12586816  168267  Init

AP and Client stats

  • Monitor the number of APs, clients and make sure there is no sudden drop in the count at any point.
  • Monitor the APs up time to make sure there are no AP drop offs.
  • Monitor the client stats for learn-ip state and clients with self-assigned IP addresses.
  • Monitor the AP load and channel utilization of all the APs.

Commands
show ap uptime # Look at AP uptime and Association uptime columns and make sure those are not too much apart, otherwise there are AP flappings
show wcdb database all | include Num # Client count
show wcdb database all | count LEARN|169.254. # Clients with self assigned IP, there may be a problem with the DHCP
show ap dot11 24ghz load-info # Make sure the channel utilization is not high
show ap dot11 5ghz load-info
show wireless exclusion # Make sure the clients are not listed if disabled, Make sure the clients are not stuck if enabled with 0 timer value.
show wireless client summary | i Idle # Make sure the clients are not stuck in this state.

Sample output

#show ap uptime
Number of APs : 498
AP Name               Ethernet MAC       AP Up Time                                Association Up Time
---------------------------------------------------------------------------------------------------------------------------
tsim-01-1             00aa.0100.02f0     1 day 8 hours 42 minutes 15 seconds       1 day 8 hours 38 minutes 57 seconds
tsim-01-2             00aa.0100.03f0     1 day 8 hours 42 minutes 15 seconds       1 day 8 hours 38 minutes 57 seconds
tsim-01-3             00aa.0100.04f0     1 day 8 hours 42 minutes 14 seconds       1 day 8 hours 38 minutes 56 seconds
#show wcdb database all | include Number
        Total Number of Wireless Clients = 3486
#show ap dot11 24ghz load-info
AP Name                           Radio MAC       Slot  Channel Utilization (%)  Clients
----------------------------------------------------------------------------------------
ap-01-1                         00aa.0100.0200     0                       40        10
ap-01-2                         00aa.0100.0300     0                       30        5
#show wireless exclusionlist
Manually Excluded Clients

Dynamically Excluded Clients

Monitor System traffic / logs

  • Look for % in the logs to get all the syslog messages.
  • Look for the keywords like Trace, CAPWAP, AVL in the system logs
  • Monitor the traffic handled by the CPU and watch for any buffer exhaustion/suspension happening frequently.

Commands
show logging
show platform punt client # Look for failures column for any packets punted to CPU, should be zero. It should not increase fast if it is non-zero
show controllers ethernet-controller port-asic statistics exceptions # Look for any drop counter increments

Sample output

show logging | include Trace|%|AVL|CAPWAP
*May  8 11:50:53.940 PDT: %LINEPROTO-5-UPDOWN: Line protocol on Interface Capwap3, changed state to up
-Traceback= 1#b1a3e3fd1d3138ac635fe72eb77c44c3  :F39AF000+20DA8A3 :F39AF000+1380F83 :F39AF000+1382D17 :F39AF000+138C274

#show platform punt client
  tag      buffer        jumbo    fallback     packets   received   failures
                                            alloc   free  bytes    conv  buf
27       0/1024/2048     0/5       0/5        0     0          0     0     0
65536    0/1024/1600     0/0       0/512  2687494 2687494  171999616     0     0
65537    0/ 512/1600     0/0       0/512  174604 174604   34383056     0     0
65538    0/   5/5        0/0       0/5        0     0          0     0     0
65539    0/2048/1600     0/16      0/512  60657819 60657819 2060395227     0     0
65540    0/ 128/1600     0/8       0/0        0     0          0     0     0
65541    0/ 128/1600     0/16      0/32   1490044 1490729  340860488     0     0
65542    0/ 768/1600     0/4       0/0    5623288 5646925  337397580     0     0
65544    0/  96/1600     0/4       0/0        0     0          0     0     0
65545    0/  96/1600     0/8       0/32       0     0          0     0     0
65546    0/ 512/1600     0/32      0/512  2482982 2914147  240875710     0     0
65547    0/  96/1600     0/8       0/32       0     0          0     0     0
65548    0/ 512/1600     0/32      0/256      0     0          0     0     0
65551    0/ 512/1600     0/0       0/256      0     0          0     0     0
65556    0/  16/1600     0/4       0/0        0     0          0     0     0
65557    0/  16/1600     0/4       0/0        0     0          0     0     0
65558    0/  16/1600     0/4       0/0        0     0          0     0     0
65559    0/  16/1600     0/4       0/0        0     0          0     0     0
65560    0/  16/1600     0/4       0/0        0     0          0     0     0
65561    0/ 512/1600     0/0       0/256  100170460 129432612 1711903376     0     2
65562    0/ 512/1600     0/0       0/256  3740090 9643238 1266448617     0     0
65563    0/ 512/1600     0/0       0/256      0     0          0     0     0
65565    0/ 512/1600     0/16      0/256      0     0          0     0     0
65566    0/ 512/1600     0/16      0/256      0     0          0     0     0
65567    0/ 512/1600     0/16      0/256      0     0          0     0     0
65568    0/ 512/1600     0/16      0/256      0     0          0     0     0
65583    0/   1/1        0/0       0/0        0     0          0     0     0
131071    0/  96/1600     0/4       0/0        0     0          0     0     0
fallback pool: 0/1500/1600
jumbo pool:    0/128/9300

More Information

Getting Started

Find answers to your questions by entering keywords or phrases in the Search bar above. New here? Use these resources to familiarize yourself with the community: