07-03-2012 03:52 AM
Our monitoring tool is reporting high memory usage on an MDS 9124e switch.
The used memory percentage has exceeded 90% continuously for the last couple of months.
Most of the time the usage is around 95%. CPU usage, however, never exceeds 10%.
The same statistics are shown when the command show system resources is executed.
NX-OS version 5.0(4)
Is this normal behaviour? Please advise.
07-03-2012 04:55 PM
Hi,
Have you tried running "show processes memory"? This will give you a breakdown of the memory usage.
07-04-2012 12:10 AM
Well, I did, but I don't know whether these numbers are OK.
PID MemAlloc MemLimit MemUsed StackBase/Ptr Process
----- -------- ---------- ---------- ----------------- ----------------
1 147456 0 1695744 7ffffe70/7ffff9a0 init
2 0 0 0 0/0 ksoftirqd/0
3 0 0 0 0/0 desched/0
4 0 0 0 0/0 events/0
5 0 0 0 0/0 khelper
10 0 0 0 0/0 kthread
29 0 0 0 0/0 kblockd/0
66 0 0 0 0/0 pdflush
67 0 0 0 0/0 pdflush
68 0 0 0 0/0 kswapd0
69 0 0 0 0/0 aio/0
961 0 0 0 0/0 kjournald
966 0 0 0 0/0 kjournald
1525 0 0 0 0/0 kjournald
1532 0 0 0 0/0 kjournald
1543 0 0 0 0/0 kjournald
1791 167936 0 1945600 7ffffda0/7ffffcc0 portmap
1798 348160 0 2281472 7ffffd90/7ffffcb0 rpc.nfsd
1800 188416 0 2125824 7ffffd90/7ffffbc0 rpc.mountd
1812 31674368 0 43810816 7ffffd60/7fffeec0 sysmgr
2069 0 0 0 0/0 mping-thread
2070 0 0 0 0/0 mping-thread
2108 0 0 0 0/0 redun_kthread
2109 0 0 0 0/0 redun_timer_kth
2210 0 0 0 0/0 sdip-mts-thread
2342 430080 31227468 10719232 7ffffa60/7ffff8e0 tftpd
2343 5926912 66837056 35303424 7ffffa30/7ffff4c0 syslogd
2344 1028096 47603987 18538496 7ffffa70/7fffe9b0 sdwrapd
2346 4943872 0 28487680 7ffffa70/7fff8350 platform
2365 0 0 0 0/0 ls-notify-mts-t
2383 5926912 0 26914816 7ffffa30/100e5960 syslogd
2384 5926912 0 26914816 7ffffa30/30c58980 syslogd
2433 634880 30878707 18067456 7ffffa60/7fffdff0 pfm_dummy
2435 155648 0 1691648 7ffffa50/7ffff990 klogd
2446 2588672 244438374 23154688 7ffffa80/7ffff250 vshd
2447 475136 32296140 11272192 7ffff9e0/7fffe720 lmgrd
2448 921600 29666700 17326080 7ffffa50/7ffff620 licmgr
2449 868352 25817088 18362368 7ffffa80/7ffff2a0 fs-daemon
2450 782336 23514726 16302080 7ffffa50/7fff34c0 feature-mgr
2451 573440 16779488 14249984 7ffffa80/7ffff840 fcfwd
2452 663552 23239884 15880192 7ffffa60/7ffff8b0 confcheck
2455 1060864 31302233 18718720 7ffffa60/7fffea50 capability
2462 483328 0 3067904 7ffff9b0/7ffff880 cisco
2463 4501504 602120780 48562176 7ffffa20/7ffff3f0 clis
2464 1888256 70258636 24268800 7ffffa80/7fffe410 xbar
2466 1757184 121766796 20754432 7ffffa70/7ffff410 vsan
2467 905216 49245363 24764416 7ffffaa0/7ffff400 ttyd
2468 548864 36683750 13348864 7ffffa20/7ffff870 tcpudp_dummy
2469 770048 16234310 13762560 7ffffa70/7ffff680 sysinfo
2470 1495040 56379558 21528576 7ffffa80/7fffe620 span
2471 610304 29563072 16875520 7ffffa80/7ffff5b0 sksd
2472 1036288 30007296 22298624 7ffffa60/7fffe5f0 sensor_usd
2473 1167360 253699212 20770816 7ffffa60/7ffff4c0 scheduler
2474 1249280 32167308 19357696 7ffffa70/7fffe620 plugin
2475 598016 70859712 14413824 7ffffa70/7ffff3f0 plog_sup
2476 548864 36683750 13348864 7ffffa20/7ffff870 pktmgr_dummy
2477 548864 36683750 13348864 7ffffa10/7ffff860 netstack_dummy
2478 1056768 113815750 18612224 7ffffa50/7fffc0e0 mvsh
2479 548864 20450918 13348864 7ffffa50/7ffff850 mping_server
2480 778240 24348262 16887808 7ffffa40/7fffeb40 lun_zone
2481 548864 36683750 13348864 7ffffa30/7ffff880 ip_dummy
2483 700416 31227468 10989568 7ffffa60/7fffeae0 fcanalyzer
2484 1134592 30815628 18452480 7ffffa80/7fffe620 fc2d
2485 1171456 34086694 21118976 7ffffa60/7fffe610 evms
2486 1183744 31392345 18751488 7ffffa60/7fffe610 evmc
2487 1327104 31227468 24043520 7ffff990/7ffff780 dcos-xinetd
2488 749568 26366771 18731008 7ffffa80/7ffff4d0 core-dmon
2489 782336 107103744 17498112 7ffffa40/7fffd210 cimserver
2490 704512 32296140 18300928 7ffffa70/7fffdf30 bios_daemon
2491 1601536 307715161 31797248 7ffffa50/7ffff1d0 ascii-cfg
2493 10887168 81587545 34004992 7ffffa60/7fffe950 SystemHealth
2494 3153920 32066355 24199168 7ffffa40/7fffe640 securityd
2495 1531904 39428505 30609408 7ffffa50/7fffe860 cert_enroll
2496 950272 26384793 18931712 7ffffa50/7fffe680 aaa
2497 733184 76784576 19800064 7ffffa70/7fffed80 obfl
2498 1585152 76775564 20017152 7ffffa60/7fffe310 device-alias
2499 712704 23807590 16510976 7ffffa80/7ffff760 rdl
2500 1404928 32559296 19795968 7ffffa60/7fffe630 port-resources
2501 1196032 31550041 19083264 7ffffa80/7fffe860 epp
2502 700416 0 2600960 7ffffa60/7fffec30 rpcapd
2505 2670592 50086572 20467712 7ffffaa0/7ffff970 rib
2506 2019328 34532748 22200320 7ffffa70/7fffe620 acl
2509 1630208 141827558 29044736 7ffffa60/7ffff8f0 cdp
2510 2162688 60016915 30167040 7ffffa30/7fffe410 radius
2511 548864 36683750 13348864 7ffffa20/7ffff870 ipv6_dummy
2512 1761280 56803084 21729280 7ffffa60/7fffe3f0 port-channel
2513 688128 50432166 15933440 7ffffa70/7ffff730 dstats
2514 1318912 46068915 22016000 7ffffa60/7fffe630 ethport
2515 3158016 85210048 28676096 7ffffa70/7fffe3f0 port
2517 913408 26614579 19218432 7ffffa80/7fffea20 ismic
2521 1896448 59601062 24727552 7ffffa80/7fffd1f0 module
2522 10346496 103991219 55066624 7ffffa10/7fffe440 snmpd
2525 1630208 80280921 23244800 7ffffa70/7fffe7f0 ExceptionLog
2526 876544 33041395 20111360 7ffffa70/7ffff3a0 bootvar
2527 897024 25699942 18382848 7ffffa80/7fffea30 tcap
2529 466944 0 2818048 7ffffa10/7ffff6c0 dhcpd
2535 10887168 0 25616384 7ffffa60/1013e240 ohms_sup
2537 0 0 0 0/0 wdpunch_thread
2539 17362944 0 24014848 7ffffa10/7fffec90 proc_mgr
2542 10887168 0 25616384 7ffffa60/3102ae40 ohms_sup
2553 17362944 0 24014848 7ffffa10/100a0890 proc_mgr
2554 17362944 0 24014848 7ffffa10/310289e0 proc_mgr
2555 17362944 0 24014848 7ffffa10/318289c0 proc_mgr
2562 147456 0 1671168 7ffffa00/0 insmod
2563 503808 0 4542464 7ffffe40/7ffff940 lc_core_client
2578 544768 0 6840320 7ffffe50/7ffff7d0 plog_lc
2579 1536000 138169011 25726976 7ffffa40/7ffff840 callhome
2581 585728 0 7327744 7ffffe50/7ffffc00 lc_cfg_mgr
2586 8908800 0 14921728 7ffffe50/7ffffbc0 atl_app
2590 8908800 0 14921728 7ffffe50/1001df40 atl_app
2591 8908800 0 14921728 7ffffe50/30a28cf0 atl_app
2592 8908800 0 14921728 7ffffe50/30c28cf0 atl_app
2593 8908800 0 14921728 7ffffe50/30e28cf0 atl_app
2594 8908800 0 14921728 7ffffe50/31028cf0 atl_app
2595 2641920 0 8310784 7ffffe40/7ffffb20 exp_logger_app
2596 520192 0 6623232 7ffffe50/7ffff890 led_mgr
2597 565248 0 7180288 7ffffe40/7ffffb30 lc_image_upgrad
2598 2863104 0 9363456 7ffffe50/7ffffc20 ohms_lc
2599 524288 0 7094272 7ffffe50/7ffff160 obfl_lc
2619 2863104 0 9363456 7ffffe50/100ce640 ohms_lc
2620 2863104 0 9363456 7ffffe50/30a28da0 ohms_lc
2651 1114112 0 8708096 7ffffe50/7ffff9d0 lc_port_cfg
2652 790528 0 7270400 7ffffe50/7ffff640 lc_pcm
2653 524288 0 6725632 7ffffe50/7ffffa40 lc_span
2654 708608 0 6742016 7ffffe50/7ffff9e0 dev_log_lc
2655 520192 0 6619136 7ffffe50/7ffffa30 qos_mgr
2656 520192 0 6533120 7ffffe50/7ffffb20 creditmon
2657 503808 0 6164480 7ffffe50/7ffffc20 memmon
2658 507904 0 6160384 7ffffe50/7ffff990 cpumon
2659 1601536 84398336 21094400 7ffffa90/7ffff100 fcdomain
2660 1974272 48883577 20144128 7ffffa80/7ffff490 fcns
2661 1085440 62135040 16764928 7ffffa80/7ffff550 fdmi
2662 6279168 136155008 25214976 7ffffaa0/7ffff0f0 fspf
2663 745472 35838035 17670144 7ffffa80/7ffff690 rlir
2664 950272 36797728 18735104 7ffffa80/7ffff690 rscn
2673 1339392 129049676 17661952 7ffffa80/7ffff400 zbm
2674 6279168 0 16826368 7ffffaa0/1005e5c0 fspf
2675 6279168 0 16826368 7ffffaa0/30a35e50 fspf
2676 6279168 0 16826368 7ffffaa0/30c1ee50 fspf
2680 1572864 56758028 22069248 7ffffa70/7fffe640 flogi
2681 622592 28378099 15798272 7ffffa90/7ffff130 mcast
2682 1196032 24213094 16900096 7ffffa50/7ffff4c0 scsi-target
2683 6361088 138038348 31080448 7ffffa70/7ffff330 zone
2684 1331200 0 8290304 7ffffe60/7ffffd10 fib
2685 823296 0 7090176 7ffffe60/7fffecf0 pmon
2686 4259840 48572691 22609920 7ffffa80/7ffff6a0 fcs
2687 716800 40729779 17108992 7ffffa80/7ffff410 ipfc
2693 1310720 32505228 19800064 7ffffa70/7fffe5f0 ipconf
2694 1044480 77099968 20185088 7ffffa70/7ffff710 qos
2700 9240576 260960409 37642240 7ffffa70/7fffe600 cfs
2701 782336 31257177 18546688 7ffffa80/7ffff430 vni
2703 1019904 44217113 20545536 7ffffa60/7fffe530 vrrp-eng
2710 704512 25069158 17543168 7ffffa60/7ffff6c0 fscm
2712 1769472 306471616 28696576 7ffffa40/7fffee60 ntp
2714 630784 34968454 16781312 7ffffa90/7ffff980 ipacl
2715 1056768 32099724 19316736 7ffffa70/7fffe5a0 vrrp-cfg
2744 1122304 0 15462400 7ffffee0/7ffffd70 ntpd
2819 442368 0 2240512 7ffffe90/7fffec70 thttpd
2820 2641920 0 8310784 7ffffe40/10021060 exp_logger_app
2821 2641920 0 8310784 7ffffe40/30a28840 exp_logger_app
2828 10346496 0 46678016 7ffffa10/1048af30 snmpd
2829 10346496 0 46678016 7ffffa10/308b5c00 snmpd
2845 4501504 0 40173568 7ffffa20/102f6b70 clis
2846 4501504 0 40173568 7ffffa20/308b3530 clis:clis-cli-t
3037 163840 0 1765376 7ffffd70/7ffffc40 getty
7590 1187840 40509004 16920576 7ffffa60/7ffff7f0 wwn
26520 1024000 0 10526720 7ffffa30/7ffff380 dcos_sshd
26527 1527808 0 30740480 7ffffd60/7fffa7c0 vsh
26580 196608 0 4157440 7ffffc80/7ffffa60 more
26581 1527808 0 30908416 7ffffd60/7fffa300 vsh
26582 651264 0 2363392 7ffffa00/7ffff3b0 ps
All processes: MemAlloc = 427778048
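A listing this long is easier to judge once it is sorted. The following is a minimal Python sketch, not a Cisco tool, that parses pasted `show processes memory` output and ranks rows by the MemUsed column. The function names are hypothetical, and the sample holds only a few rows copied from the output above. Rows that share a name and MemAlloc value (for example the three syslogd entries) look like threads of one process, so summing a column across all rows may double count; that reading is an assumption.

```python
def parse_process_memory(text):
    """Parse 'show processes memory' output into a list of dicts.

    Assumes the column order shown in the thread:
    PID MemAlloc MemLimit MemUsed StackBase/Ptr Process
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the header, the dashed separator, and the summary line.
        if len(parts) < 6 or not parts[0].isdigit():
            continue
        pid, alloc, limit, used = (int(p) for p in parts[:4])
        rows.append({
            "pid": pid,
            "mem_alloc": alloc,
            "mem_limit": limit,
            "mem_used": used,
            "process": " ".join(parts[5:]),
        })
    return rows


def top_by_mem_used(rows, n=5):
    """Return the n rows with the largest MemUsed value."""
    return sorted(rows, key=lambda r: r["mem_used"], reverse=True)[:n]


# A few rows copied from the output above, for illustration only.
sample = """\
PID MemAlloc MemLimit MemUsed StackBase/Ptr Process
----- -------- ---------- ---------- ----------------- ----------------
1812 31674368 0 43810816 7ffffd60/7fffeec0 sysmgr
2463 4501504 602120780 48562176 7ffffa20/7ffff3f0 clis
2522 10346496 103991219 55066624 7ffffa10/7fffe440 snmpd
7590 1187840 40509004 16920576 7ffffa60/7ffff7f0 wwn
All processes: MemAlloc = 427778048
"""

for row in top_by_mem_used(parse_process_memory(sample), n=3):
    print(f'{row["pid"]:>6} {row["mem_used"]:>12} {row["process"]}')
```

Sorting by MemAlloc instead only requires changing the sort key.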
07-04-2012 05:30 AM
Hi,
Are you seeing any memory alerts on the switch itself or in show system internal memory-alert-log?
Check out this bug in 5.0.4 (http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCtn66877).
07-04-2012 05:51 AM
Yes, got the following:
MINOR ALERT INFO
***** /proc/memory_events *****
Alert MINOR Reached at 1340205346.000991194
Alert MINOR_ALERT_PASSED Reached at 1337660667.000249375
Alert MINOR Reached at 1337660658.000897416
Alert MINOR_ALERT_PASSED Reached at 1337523175.000948159
Alert MINOR Reached at 1337523173.000831953
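The alert times above are hard to read as raw numbers. Assuming they are Unix epoch seconds followed by a fractional part (the switch output does not say so explicitly), a short Python sketch can convert them to UTC dates. The helper name `alert_time_utc` is made up for this example.

```python
from datetime import datetime, timezone


def alert_time_utc(stamp):
    """Convert a stamp such as '1340205346.000991194' to a UTC string.

    Assumption: the part before the dot is Unix epoch seconds.
    The fractional part is ignored.
    """
    seconds = int(stamp.split(".")[0])
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(
        "%Y-%m-%d %H:%M:%S"
    )


# Stamps copied from the alert log above.
for stamp in ("1340205346.000991194", "1337660667.000249375"):
    print(stamp, "->", alert_time_utc(stamp), "UTC")
```

Under that assumption the most recent MINOR alert dates from 20 June 2012, about two weeks before this post.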
07-04-2012 06:46 AM
Oops, sorry, I meant show system internal memory-status. This will tell you whether you have currently reached the memory threshold. If it says MemStatus: OK and your monitoring software shows 95% in use, it is probably the CSCtn66877 bug.
07-04-2012 07:38 AM
Switch# show system internal memory-status
MemStatus: OK
Looks like you are right on that 5.0.4 bug, thanks!
07-04-2012 07:57 AM
Brian, show system resources gives me the following output:
Memory usage: 516128K total, 492680K used, 23448K free
That makes a total of 95% memory utilization... Why does this counter tell me the same as our monitoring tool? Is it this counter that is causing the issues?
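For reference, the 95% figure follows directly from the counters in that output line. A minimal Python sketch of the arithmetic (the function name is only illustrative):

```python
def memory_used_percent(total_kb, used_kb):
    """Percentage of memory in use, from the 'total' and 'used' counters."""
    return 100.0 * used_kb / total_kb


# Values from the 'show system resources' line quoted above.
pct = memory_used_percent(total_kb=516128, used_kb=492680)
print(f"{pct:.1f}% used")
```

This only reproduces the calculation a monitoring tool would make from the same counters; it says nothing about whether the counters themselves are accurate.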
07-16-2012 06:24 AM
Both show system resources and the SNMP trap report incorrect values for memory utilization.
07-16-2012 07:18 AM
It turns out to be a problem with the WWN manager process. When this process crashes, it writes error information to core files, which stay loaded in memory. This memory is never released, which in turn leads to high memory utilization.
Fix 1: Restart the switch to free the core files from memory.
Fix 2: Open an SR with Cisco. They will load a "Dplug" that removes the core files from memory without restarting the switch.
After this, you have to upgrade to a newer NX-OS version to prevent the bug from claiming memory again.
07-13-2012 07:37 PM
Since this question was already answered, I don't know if it's appropriate to continue to use this particular discussion, so please let me know if I should start a new discussion.
I just noticed this same symptom on my 9124s, but I am running a much earlier version of NX-OS: 4.1(3a).
Is that same bug resolved in either the 4.1 or 4.2 series of NX-OS?
I'm always leery of moving to a different generation of firmware simply because the number is higher, so if that bug has been resolved in either of those series, is there any reason to move to 5.0 or 5.2?
By the same token, if it hasn't been resolved in 4.1 or 4.2, does it make sense to stick with 5.0, or is there any value to upgrading to 5.2?
07-16-2012 06:47 AM
Hi Grant,
This problem is fixed in 5.2(0.248)S0, 4.2(8.60)S0, 4.2(8)S17, 5.0(6.120)S0.
Here are the new features for NX-OS 5 (http://www.cisco.com/en/US/docs/switches/datacenter/mds9000/compatibility/matrix/featrlis_5x.html). If you are planning on upgrading, be sure to look at the release notes, as support for some switches and modules was dropped in 5.0+.
07-16-2012 11:33 AM
Thanks for the response, Brian.
About the only advantage in 5.x I see for myself is the addition of LDAP authentication for CLI added to 5.0(1a).
Looks like support for the 9124 was skipped in the 5.2(1) release and included in the 5.2(2) release. Interesting...