High memory usage on Cisco MDS 9124e

TZwanikken
Level 1

Our monitoring tool is reporting high memory usage on an MDS 9124e switch.
The used memory percentage has exceeded 90% all the time for the last couple of months.

Most of the time the usage is around 95%. CPU usage, however, never exceeds 10%.

The same statistics are shown when the command "show system resources" is executed.

NX-OS version 5.0(4)

Is this normal behaviour? Please advise.

12 Replies

Brian Morrissey
Cisco Employee

Hi,

Have you tried running "show processes memory"?  This will give you a breakdown of the memory usage.

Well, I did, but I don't know whether or not these numbers are OK.

PID    MemAlloc  MemLimit    MemUsed     StackBase/Ptr      Process
-----  --------  ----------  ----------  -----------------  ----------------
    1    147456  0           1695744     7ffffe70/7ffff9a0  init
    2         0  0           0                  0/0         ksoftirqd/0
    3         0  0           0                  0/0         desched/0
    4         0  0           0                  0/0         events/0
    5         0  0           0                  0/0         khelper
   10         0  0           0                  0/0         kthread
   29         0  0           0                  0/0         kblockd/0
   66         0  0           0                  0/0         pdflush
   67         0  0           0                  0/0         pdflush
   68         0  0           0                  0/0         kswapd0
   69         0  0           0                  0/0         aio/0
  961         0  0           0                  0/0         kjournald
  966         0  0           0                  0/0         kjournald
1525         0  0           0                  0/0         kjournald
1532         0  0           0                  0/0         kjournald
1543         0  0           0                  0/0         kjournald
1791    167936  0           1945600     7ffffda0/7ffffcc0  portmap
1798    348160  0           2281472     7ffffd90/7ffffcb0  rpc.nfsd
1800    188416  0           2125824     7ffffd90/7ffffbc0  rpc.mountd
1812  31674368  0           43810816    7ffffd60/7fffeec0  sysmgr
2069         0  0           0                  0/0         mping-thread
2070         0  0           0                  0/0         mping-thread
2108         0  0           0                  0/0         redun_kthread
2109         0  0           0                  0/0         redun_timer_kth
2210         0  0           0                  0/0         sdip-mts-thread
2342    430080  31227468    10719232    7ffffa60/7ffff8e0  tftpd
2343   5926912  66837056    35303424    7ffffa30/7ffff4c0  syslogd
2344   1028096  47603987    18538496    7ffffa70/7fffe9b0  sdwrapd
2346   4943872  0           28487680    7ffffa70/7fff8350  platform
2365         0  0           0                  0/0         ls-notify-mts-t
2383   5926912  0           26914816    7ffffa30/100e5960  syslogd
2384   5926912  0           26914816    7ffffa30/30c58980  syslogd
2433    634880  30878707    18067456    7ffffa60/7fffdff0  pfm_dummy
2435    155648  0           1691648     7ffffa50/7ffff990  klogd
2446   2588672  244438374   23154688    7ffffa80/7ffff250  vshd
2447    475136  32296140    11272192    7ffff9e0/7fffe720  lmgrd
2448    921600  29666700    17326080    7ffffa50/7ffff620  licmgr
2449    868352  25817088    18362368    7ffffa80/7ffff2a0  fs-daemon
2450    782336  23514726    16302080    7ffffa50/7fff34c0  feature-mgr
2451    573440  16779488    14249984    7ffffa80/7ffff840  fcfwd
2452    663552  23239884    15880192    7ffffa60/7ffff8b0  confcheck
2455   1060864  31302233    18718720    7ffffa60/7fffea50  capability
2462    483328  0           3067904     7ffff9b0/7ffff880  cisco
2463   4501504  602120780   48562176    7ffffa20/7ffff3f0  clis
2464   1888256  70258636    24268800    7ffffa80/7fffe410  xbar
2466   1757184  121766796   20754432    7ffffa70/7ffff410  vsan
2467    905216  49245363    24764416    7ffffaa0/7ffff400  ttyd
2468    548864  36683750    13348864    7ffffa20/7ffff870  tcpudp_dummy
2469    770048  16234310    13762560    7ffffa70/7ffff680  sysinfo
2470   1495040  56379558    21528576    7ffffa80/7fffe620  span
2471    610304  29563072    16875520    7ffffa80/7ffff5b0  sksd
2472   1036288  30007296    22298624    7ffffa60/7fffe5f0  sensor_usd
2473   1167360  253699212   20770816    7ffffa60/7ffff4c0  scheduler
2474   1249280  32167308    19357696    7ffffa70/7fffe620  plugin
2475    598016  70859712    14413824    7ffffa70/7ffff3f0  plog_sup
2476    548864  36683750    13348864    7ffffa20/7ffff870  pktmgr_dummy
2477    548864  36683750    13348864    7ffffa10/7ffff860  netstack_dummy
2478   1056768  113815750   18612224    7ffffa50/7fffc0e0  mvsh
2479    548864  20450918    13348864    7ffffa50/7ffff850  mping_server
2480    778240  24348262    16887808    7ffffa40/7fffeb40  lun_zone
2481    548864  36683750    13348864    7ffffa30/7ffff880  ip_dummy
2483    700416  31227468    10989568    7ffffa60/7fffeae0  fcanalyzer
2484   1134592  30815628    18452480    7ffffa80/7fffe620  fc2d
2485   1171456  34086694    21118976    7ffffa60/7fffe610  evms
2486   1183744  31392345    18751488    7ffffa60/7fffe610  evmc
2487   1327104  31227468    24043520    7ffff990/7ffff780  dcos-xinetd
2488    749568  26366771    18731008    7ffffa80/7ffff4d0  core-dmon
2489    782336  107103744   17498112    7ffffa40/7fffd210  cimserver
2490    704512  32296140    18300928    7ffffa70/7fffdf30  bios_daemon
2491   1601536  307715161   31797248    7ffffa50/7ffff1d0  ascii-cfg
2493  10887168  81587545    34004992    7ffffa60/7fffe950  SystemHealth
2494   3153920  32066355    24199168    7ffffa40/7fffe640  securityd
2495   1531904  39428505    30609408    7ffffa50/7fffe860  cert_enroll
2496    950272  26384793    18931712    7ffffa50/7fffe680  aaa
2497    733184  76784576    19800064    7ffffa70/7fffed80  obfl
2498   1585152  76775564    20017152    7ffffa60/7fffe310  device-alias
2499    712704  23807590    16510976    7ffffa80/7ffff760  rdl
2500   1404928  32559296    19795968    7ffffa60/7fffe630  port-resources
2501   1196032  31550041    19083264    7ffffa80/7fffe860  epp
2502    700416  0           2600960     7ffffa60/7fffec30  rpcapd
2505   2670592  50086572    20467712    7ffffaa0/7ffff970  rib
2506   2019328  34532748    22200320    7ffffa70/7fffe620  acl
2509   1630208  141827558   29044736    7ffffa60/7ffff8f0  cdp
2510   2162688  60016915    30167040    7ffffa30/7fffe410  radius
2511    548864  36683750    13348864    7ffffa20/7ffff870  ipv6_dummy
2512   1761280  56803084    21729280    7ffffa60/7fffe3f0  port-channel
2513    688128  50432166    15933440    7ffffa70/7ffff730  dstats
2514   1318912  46068915    22016000    7ffffa60/7fffe630  ethport
2515   3158016  85210048    28676096    7ffffa70/7fffe3f0  port
2517    913408  26614579    19218432    7ffffa80/7fffea20  ismic
2521   1896448  59601062    24727552    7ffffa80/7fffd1f0  module
2522  10346496  103991219   55066624    7ffffa10/7fffe440  snmpd
2525   1630208  80280921    23244800    7ffffa70/7fffe7f0  ExceptionLog
2526    876544  33041395    20111360    7ffffa70/7ffff3a0  bootvar
2527    897024  25699942    18382848    7ffffa80/7fffea30  tcap
2529    466944  0           2818048     7ffffa10/7ffff6c0  dhcpd
2535  10887168  0           25616384    7ffffa60/1013e240  ohms_sup
2537         0  0           0                  0/0         wdpunch_thread
2539  17362944  0           24014848    7ffffa10/7fffec90  proc_mgr
2542  10887168  0           25616384    7ffffa60/3102ae40  ohms_sup
2553  17362944  0           24014848    7ffffa10/100a0890  proc_mgr
2554  17362944  0           24014848    7ffffa10/310289e0  proc_mgr
2555  17362944  0           24014848    7ffffa10/318289c0  proc_mgr
2562    147456  0           1671168     7ffffa00/0         insmod
2563    503808  0           4542464     7ffffe40/7ffff940  lc_core_client
2578    544768  0           6840320     7ffffe50/7ffff7d0  plog_lc
2579   1536000  138169011   25726976    7ffffa40/7ffff840  callhome
2581    585728  0           7327744     7ffffe50/7ffffc00  lc_cfg_mgr
2586   8908800  0           14921728    7ffffe50/7ffffbc0  atl_app
2590   8908800  0           14921728    7ffffe50/1001df40  atl_app
2591   8908800  0           14921728    7ffffe50/30a28cf0  atl_app
2592   8908800  0           14921728    7ffffe50/30c28cf0  atl_app
2593   8908800  0           14921728    7ffffe50/30e28cf0  atl_app
2594   8908800  0           14921728    7ffffe50/31028cf0  atl_app
2595   2641920  0           8310784     7ffffe40/7ffffb20  exp_logger_app
2596    520192  0           6623232     7ffffe50/7ffff890  led_mgr
2597    565248  0           7180288     7ffffe40/7ffffb30  lc_image_upgrad
2598   2863104  0           9363456     7ffffe50/7ffffc20  ohms_lc
2599    524288  0           7094272     7ffffe50/7ffff160  obfl_lc
2619   2863104  0           9363456     7ffffe50/100ce640  ohms_lc
2620   2863104  0           9363456     7ffffe50/30a28da0  ohms_lc
2651   1114112  0           8708096     7ffffe50/7ffff9d0  lc_port_cfg
2652    790528  0           7270400     7ffffe50/7ffff640  lc_pcm
2653    524288  0           6725632     7ffffe50/7ffffa40  lc_span
2654    708608  0           6742016     7ffffe50/7ffff9e0  dev_log_lc
2655    520192  0           6619136     7ffffe50/7ffffa30  qos_mgr
2656    520192  0           6533120     7ffffe50/7ffffb20  creditmon
2657    503808  0           6164480     7ffffe50/7ffffc20  memmon
2658    507904  0           6160384     7ffffe50/7ffff990  cpumon
2659   1601536  84398336    21094400    7ffffa90/7ffff100  fcdomain
2660   1974272  48883577    20144128    7ffffa80/7ffff490  fcns
2661   1085440  62135040    16764928    7ffffa80/7ffff550  fdmi
2662   6279168  136155008   25214976    7ffffaa0/7ffff0f0  fspf
2663    745472  35838035    17670144    7ffffa80/7ffff690  rlir
2664    950272  36797728    18735104    7ffffa80/7ffff690  rscn
2673   1339392  129049676   17661952    7ffffa80/7ffff400  zbm
2674   6279168  0           16826368    7ffffaa0/1005e5c0  fspf
2675   6279168  0           16826368    7ffffaa0/30a35e50  fspf
2676   6279168  0           16826368    7ffffaa0/30c1ee50  fspf
2680   1572864  56758028    22069248    7ffffa70/7fffe640  flogi
2681    622592  28378099    15798272    7ffffa90/7ffff130  mcast
2682   1196032  24213094    16900096    7ffffa50/7ffff4c0  scsi-target
2683   6361088  138038348   31080448    7ffffa70/7ffff330  zone
2684   1331200  0           8290304     7ffffe60/7ffffd10  fib
2685    823296  0           7090176     7ffffe60/7fffecf0  pmon
2686   4259840  48572691    22609920    7ffffa80/7ffff6a0  fcs
2687    716800  40729779    17108992    7ffffa80/7ffff410  ipfc
2693   1310720  32505228    19800064    7ffffa70/7fffe5f0  ipconf
2694   1044480  77099968    20185088    7ffffa70/7ffff710  qos
2700   9240576  260960409   37642240    7ffffa70/7fffe600  cfs
2701    782336  31257177    18546688    7ffffa80/7ffff430  vni
2703   1019904  44217113    20545536    7ffffa60/7fffe530  vrrp-eng
2710    704512  25069158    17543168    7ffffa60/7ffff6c0  fscm
2712   1769472  306471616   28696576    7ffffa40/7fffee60  ntp
2714    630784  34968454    16781312    7ffffa90/7ffff980  ipacl
2715   1056768  32099724    19316736    7ffffa70/7fffe5a0  vrrp-cfg
2744   1122304  0           15462400    7ffffee0/7ffffd70  ntpd
2819    442368  0           2240512     7ffffe90/7fffec70  thttpd
2820   2641920  0           8310784     7ffffe40/10021060  exp_logger_app
2821   2641920  0           8310784     7ffffe40/30a28840  exp_logger_app
2828  10346496  0           46678016    7ffffa10/1048af30  snmpd
2829  10346496  0           46678016    7ffffa10/308b5c00  snmpd
2845   4501504  0           40173568    7ffffa20/102f6b70  clis
2846   4501504  0           40173568    7ffffa20/308b3530  clis:clis-cli-t
3037    163840  0           1765376     7ffffd70/7ffffc40  getty
7590   1187840  40509004    16920576    7ffffa60/7ffff7f0  wwn
26520   1024000  0           10526720    7ffffa30/7ffff380  dcos_sshd
26527   1527808  0           30740480    7ffffd60/7fffa7c0  vsh
26580    196608  0           4157440     7ffffc80/7ffffa60  more
26581   1527808  0           30908416    7ffffd60/7fffa300  vsh
26582    651264  0           2363392     7ffffa00/7ffff3b0  ps

All processes: MemAlloc = 427778048
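
A table this long is easier to judge once it is sorted. The following is a minimal Python sketch, assuming the output of "show processes memory" has been saved as plain text in the column layout shown above. The names parse_processes_memory and top_by_alloc are hypothetical helpers written for this illustration, not part of NX-OS or any Cisco tool. Several rows above repeat with the same name and MemAlloc (for example proc_mgr and atl_app); the sketch does not try to deduplicate them.

```python
import re
from typing import List, NamedTuple


class ProcMem(NamedTuple):
    pid: int
    mem_alloc: int
    mem_limit: int
    mem_used: int
    name: str


# Assumed row layout: PID, MemAlloc, MemLimit, MemUsed, StackBase/Ptr, Process
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+\S+\s+(\S.*?)\s*$")


def parse_processes_memory(text: str) -> List[ProcMem]:
    """Return one ProcMem per data row; header and summary lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            pid, alloc, limit, used, name = m.groups()
            rows.append(ProcMem(int(pid), int(alloc), int(limit), int(used), name))
    return rows


def top_by_alloc(rows: List[ProcMem], n: int = 5) -> List[ProcMem]:
    """Return the n rows with the largest MemAlloc, largest first."""
    return sorted(rows, key=lambda r: r.mem_alloc, reverse=True)[:n]


# A few rows copied from the output above, used as sample input.
SAMPLE = """\
PID    MemAlloc  MemLimit    MemUsed     StackBase/Ptr      Process
-----  --------  ----------  ----------  -----------------  ----------------
    1    147456  0           1695744     7ffffe70/7ffff9a0  init
    2         0  0           0                  0/0         ksoftirqd/0
1812  31674368  0           43810816    7ffffd60/7fffeec0  sysmgr
2522  10346496  103991219   55066624    7ffffa10/7fffe440  snmpd
2463   4501504  602120780   48562176    7ffffa20/7ffff3f0  clis
"""

if __name__ == "__main__":
    for row in top_by_alloc(parse_processes_memory(SAMPLE), 3):
        print(f"{row.name:<12} {row.mem_alloc / 2**20:8.1f} MiB allocated")
```

Sorting by MemAlloc is only one possible view; changing the sort key to mem_used gives the other column.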

Hi,

Are you seeing any memory alerts on the switch itself or in show system internal memory-alert-log?

Check out this bug in 5.0.4 (http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCtn66877). 

Yes, got the following:

MINOR ALERT INFO

***** /proc/memory_events *****

Alert MINOR Reached at 1340205346.000991194

Alert MINOR_ALERT_PASSED Reached at 1337660667.000249375

Alert MINOR Reached at 1337660658.000897416

Alert MINOR_ALERT_PASSED Reached at 1337523175.000948159

Alert MINOR Reached at 1337523173.000831953
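
The timestamps in this log are hard to read as they stand. The sketch below assumes that the integer part of each "Reached at" value is a Unix epoch time in seconds, which is plausible for these numbers but is not confirmed anywhere in this thread. The fractional part is ignored because its meaning is unclear. The function name parse_alerts is a hypothetical helper for this illustration.

```python
import re
from datetime import datetime, timezone
from typing import List, Tuple

# Matches lines such as "Alert MINOR Reached at 1340205346.000991194"
ALERT = re.compile(r"Alert (\S+) Reached at (\d+)\.\d+")


def parse_alerts(text: str) -> List[Tuple[datetime, str]]:
    """Return (UTC time, alert level) pairs, oldest first.

    Assumption: the integer part is seconds since the Unix epoch.
    """
    events = []
    for m in ALERT.finditer(text):
        level, secs = m.group(1), int(m.group(2))
        events.append((datetime.fromtimestamp(secs, tz=timezone.utc), level))
    return sorted(events)


# Three entries copied from the alert log above.
SAMPLE_LOG = """\
Alert MINOR Reached at 1340205346.000991194
Alert MINOR_ALERT_PASSED Reached at 1337660667.000249375
Alert MINOR Reached at 1337660658.000897416
"""

if __name__ == "__main__":
    for when, level in parse_alerts(SAMPLE_LOG):
        print(when.strftime("%Y-%m-%d %H:%M:%S UTC"), level)
```

Under that assumption, the newest MINOR alert has no matching MINOR_ALERT_PASSED entry after it in the log as posted.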

Oops, sorry, I meant "show system internal memory-status". This will tell you if you have currently reached the memory threshold. If it says "MemStatus: OK" and your monitoring software shows 95% in use, it is probably the CSCtn66877 bug.

Switch# show system internal memory-status

MemStatus: OK

Looks like you are right on that 5.0.4 bug, thanks!

Brian, show system resources gives me the following output:

Memory usage:    516128K total,    492680K used,     23448K free

That makes a total of 95% memory utilization... Why does this counter tell me the same as our monitoring tool? Is it this counter that is causing the issues?
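
The 95% figure can be reproduced from the line above. This is a small Python sketch, assuming the "Memory usage:" line has exactly the format shown in this post; memory_utilization is a hypothetical helper name, not a Cisco API.

```python
import re

# Assumed format: "Memory usage:  516128K total,  492680K used,  23448K free"
MEM = re.compile(r"Memory usage:\s*(\d+)K total,\s*(\d+)K used,\s*(\d+)K free")


def memory_utilization(line: str) -> float:
    """Return used memory as a percentage of total memory."""
    m = MEM.search(line)
    if m is None:
        raise ValueError("no 'Memory usage:' line found")
    total, used, _free = (int(g) for g in m.groups())
    return 100.0 * used / total


LINE = "Memory usage:    516128K total,    492680K used,     23448K free"

if __name__ == "__main__":
    print(f"{memory_utilization(LINE):.1f}% used")
```

For the values posted here this gives about 95.5%, so the monitoring tool and the CLI counter agree with each other; that alone does not show which of them reflects real memory pressure.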

The "show system resources" command and the SNMP trap both give incorrect values for memory utilization.

It turns out to be a problem with the WWN manager process. When this process crashes, it writes error information to core files, which stay loaded in memory. This memory is not released, which in turn leads to high memory utilization.

Fix 1: Restart the switch to free the core files from memory.

Fix 2: Open a service request (SR) with Cisco. They will load a "Dplug" which removes the core files from memory without restarting the switch.

After this you have to upgrade to a newer NX-OS version to prevent the bug from claiming memory again.

Since this question was already answered, I don't know if it's appropriate to continue to use this particular discussion, so please let me know if I should start a new discussion.

I just noticed this same symptom on my 9124s, but I am running a much earlier version of NX-OS: 4.1(3a).

Is that same bug resolved in either the 4.1 or 4.2 series of NX-OS?

I'm always leery of moving to a different generation of firmware simply because the number is higher, so if that bug has been resolved in either of those series, is there any reason to move to 5.0 or 5.2?

By the same token, if it hasn't been resolved in 4.1 or 4.2, does it make sense to stick with 5.0, or is there any value to upgrading to 5.2?

Hi Grant,

This problem is fixed in 5.2(0.248)S0, 4.2(8.60)S0, 4.2(8)S17, 5.0(6.120)S0.

Here are the new features for NX-OS 5 (http://www.cisco.com/en/US/docs/switches/datacenter/mds9000/compatibility/matrix/featrlis_5x.html). If you are planning on upgrading, be sure to look at the release notes, as support for some switches and modules was dropped in 5.0+.

Thanks for the response, Brian

About the only advantage I see for myself in 5.x is LDAP authentication for the CLI, which was added in 5.0(1a).

Looks like support for the 9124 was skipped in the 5.2(1) release and included in the 5.2(2) release. Interesting...
