10-11-2021 08:15 AM - edited 10-11-2021 08:17 AM
Good day guys,
I am seeing a lot of error messages on my distribution switch and think they might be related to this bug, which I found in the discussions:
Cisco Bug: CSCvd45973 - Catalyst 3850/3650 - memory leak in platform_mgr process
But I am not sure, because that bug should not apply to me as my IOS version is different.
Can someone please help me with this?
Switch Ports Model              SW Version        SW Image              Mode
------ ----- -----              ----------        ----------            ----
*    1 38    WS-C3850-24XU      16.6.1            CAT3K_CAA-UNIVERSALK9 INSTALL
#sh ver
Cisco IOS XE Software, Version 16.06.01
Cisco IOS Software [Everest], Catalyst L3 Switch Software (CAT3K_CAA-UNIVERSALK9-M), Version 16.6.1, RELEASE SOFTWARE (fc2)
Log Buffer (4096 bytes):
8% exceeds critical level 95%
Oct 11 13:20:59: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 13:31:09: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 13:41:19: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 13:51:29: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 14:01:39: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 14:11:49: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 14:21:59: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 14:32:09: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 14:42:19: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 14:52:29: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 15:02:39: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 15:12:49: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 15:22:59: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 15:24:17: %EVENTLIB-3-CPUHOG: Switch 1 R0/0: fman_fp_image: read asyncon 0xaac6361a18: 1614ms, Traceback=1#2df98b66647654561dd2d2aa5a0b71a5 binos:FFEF693000+10CF0 binos:FFEF69300+10CF0 tdl_17b23f588d:FFC5543000+1DC3D33A tdl_17b23f588d:FFC5543000+1DC41790 tdl_17b23f588d:FFC5543000+1DC413E8 tdl_17b23f588d:FFC5543000+1DC9F7DC tdl_17b23f588d:FFC5543000+1DC9F75C uipeerFFEF1D9000+1B178 cdllib_pi:FFEED3F000+102364 cdllib_pi:FFEED3F000+10297C cdllib_pi:FFEED3F000+553E4
Oct 11 15:33:09: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 15:43:19: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 15:53:29: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 16:03:39: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 16:13:49: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 16:23:59: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 16:34:09: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 16:37:32: %EVENTLIB-3-CPUHOG: Switch 1 R0/0: smand: read asyncon 0xaaba19fea8: 1172ms, Traceback=1#d287fa6f80c2b16205d9ad957641a746 binos:FFED0D0000+10CF0 binos:FFED0D0000+10CF0 d:FFEDA04000+C57A ld:FFEDA04000+CBA4 ld:FFEDA04000+1A970
Oct 11 16:37:32: %EVENTLIB-3-CPUHOG: Switch 1 R0/0: smand: read asyncon 0xaaba19fea8: 2425ms, Traceback=1#d287fa6f80c2b16205d9ad957641a746 binos:FFED0D0000+10CF0 binos:FFED0D0000+10CF0 d:FFEDA04000+BCE6 ld:FFEDA04000+C5E8 ld:FFEDA04000+CBA4 ld:FFEDA04000+1A970
Oct 11 16:44:10: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
Oct 11 16:54:20: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 99% exceeds critical level 95%
Oct 11 16:54:54: %IOSXE_INFRA-6-PROCPATH_CLIENT_HOG: IOS shim client 'sman dc 0 bipc' has taken 1161 msec (runtime: 486 msec) to process a 'unknown' message
Oct 11 17:04:30: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%
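For anyone who wants to track how often these memory warnings recur, here is a minimal Python sketch, assuming the log buffer has been saved as plain text in the format shown above. The function name `parse_memory_alerts`, the regex, and the hard-coded year are my own assumptions for illustration (the log lines carry no year); none of this comes from a Cisco tool.

```python
import re
from datetime import datetime

# Hypothetical pattern, written only against the ELEMENT_CRITICAL lines in this post.
PATTERN = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2}): %PLATFORM-3-ELEMENT_CRITICAL: .*"
    r"Used Memory value (?P<used>\d+)% exceeds critical level (?P<limit>\d+)%"
)

def parse_memory_alerts(lines, year=2021):
    """Return (timestamp, used_pct, limit_pct) for each memory alert line."""
    alerts = []
    for line in lines:
        m = PATTERN.match(line.strip())
        if not m:
            continue  # skip CPUHOG and any other message types
        # The log has no year, so one is assumed here.
        ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        alerts.append((ts, int(m.group("used")), int(m.group("limit"))))
    return alerts

sample = [
    "Oct 11 16:44:10: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 98% exceeds critical level 95%",
    "Oct 11 16:54:20: %PLATFORM-3-ELEMENT_CRITICAL: Switch 1 R0/0: smand: 1/RP/0: Used Memory value 99% exceeds critical level 95%",
    "Oct 11 16:54:54: %IOSXE_INFRA-6-PROCPATH_CLIENT_HOG: IOS shim client 'sman dc 0 bipc' has taken 1161 msec (runtime: 486 msec) to process a 'unknown' message",
]

alerts = parse_memory_alerts(sample)
for ts, used, limit in alerts:
    print(ts.isoformat(), f"used={used}%", f"limit={limit}%")
```

Only the two ELEMENT_CRITICAL lines in the sample are picked up; the PROCPATH_CLIENT_HOG line is ignored.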
10-11-2021 09:26 AM - edited 10-11-2021 09:27 AM
Sometimes bugs reappear in other versions even though they are not listed for them.
#show processes memory
Check whether a newer version is available and upgrade. Try a good, stable release such as cat3k_caa-universalk9.16.09.06.SPA.bin (Denali 3.X is a good version, in my view).
10-11-2021 01:03 PM
Hi Balaji,
Thanks for your update.
I was thinking of updating the IOS but am not sure which version to use. I will test this on the backup switch and see what happens.
This is my process memory output:
Processor Pool Total: 873325200 Used: 268910264 Free: 604414936
lsmpi_io Pool Total: 6295128 Used: 6294296 Free: 832
https://software.cisco.com/download/home/286285429/type/282046477/release/Fuji-16.9.3
10-11-2021 03:26 PM
WARNING: Do NOT upgrade to 16.12.X.
Post the complete output to the following commands:
The switch/stack is running 16.6.1. Be prepared to upgrade to a later release in the 16.6.X train, like 16.6.10.
10-12-2021 03:13 AM
Hi Leo,
I was not thinking of going to the latest version from the current one.
Here is the output of the switch.
SW#sh platform software status control-processor brief
Load Average
Slot Status 1-Min 5-Min 15-Min
1-RP0 Healthy 0.46 0.30 0.25
Memory (kB)
Slot Status Total Used (Pct) Free (Pct) Committed (Pct)
1-RP0 Critical 3979944 3923308 (99%) 56636 ( 1%) 5423972 (136%)
CPU Utilization
Slot CPU User System Nice Idle IRQ SIRQ IOwait
1-RP0 0 1.60 2.40 0.00 93.80 0.00 0.10 2.10
1 1.89 0.99 0.00 96.00 0.00 0.09 0.99
2 0.99 0.69 0.00 96.30 0.00 0.00 1.99
3 1.00 1.50 0.00 94.90 0.00 0.00 2.60
4 1.30 1.10 0.00 97.60 0.00 0.00 0.00
5 3.50 0.30 0.00 96.19 0.00 0.00 0.00
SW#sh processes memory platform sorted location switch 1 R0
System memory: 3979944K total, 3927120K used, 52824K free,
Lowest: 52824K
Pid Text Data Stack Dynamic RSS Total Name
--------------------------------------------------------------------------------
21216 114 269488 136 89392 269488 3090508 fed main event
31454 177306 615084 136 120 615084 2331724 linux_iosd-imag
19824 302 112992 136 2436 112992 1692940 sif_mgr
18316 1003 114212 136 2344 114212 1424112 platform_mgr
18885 27 740564 136 721716 740564 1413560 tams_proc
29428 284 37584 136 7472 37584 971464 cli_agent
1920 646 77388 136 25580 77388 871128 smand
30329 170 38124 136 3484 38124 847704 dbm
28455 8771 41240 136 7764 41240 824796 fman_fp_image
2149 127 49524 0 268 49524 774308 smd
30728 8422 32900 136 2288 32900 730788 fman_rp
1150 430 25472 136 1288 25472 727644 repm
3138 251 26804 136 6348 26804 712704 tms
13906 38 22876 136 1340 22876 707848 bt_logger
15008 518 24584 136 2376 24584 706256 hman
16207 146 22500 136 2676 22500 704768 lman
687 40 20840 136 1144 20840 702436 psd
17179 198 19208 136 400 19208 701812 nif_mgr
21159 409 21572 136 1508 21572 701600 stack_mgr
29691 114 20472 136 1080 20472 701200 cmm
501 45 21156 136 1820 21156 698888 plogd
20891 61 20588 136 1020 20588 698644 epc_ws_liaison
15888 73 20056 136 432 20056 693280 keyman
18693 222 19736 136 836 19736 692520 tamd_proc
3406 604 2052 132 132 2052 212376 libvirtd
3375 754 1284 132 132 1284 17176 virtlogd
13554 7 1332 136 148 1332 16504 auto_upgrade_se
95 314 2256 132 132 2256 12432 systemd-journal
16380 974 7464 136 5760 7464 10104 ncd.sh
26049 974 6324 136 5628 6324 9972 issu_stack.sh
15410 974 7328 136 5628 7328 9972 issu_stack.sh
13074 974 7196 136 5500 7196 9844 auto_upgrade_cl
17751 974 6184 136 4452 6184 8796 periodic.sh
31122 7 1440 136 148 1440 8632 rotee
30300 7 1440 136 148 1440 8632 rotee
29677 7 1440 136 148 1440 8632 rotee
29008 7 1440 136 148 1440 8632 rotee
28554 7 1440 136 148 1440 8632 rotee
27879 7 1440 136 148 1440 8632 rotee
21112 7 1456 136 148 1456 8632 rotee
20664 7 1440 136 148 1440 8632 rotee
20233 7 1440 136 148 1440 8632 rotee
19516 7 1416 136 148 1416 8632 rotee
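If it helps anyone reading a long process list like the one above, here is a rough Python sketch that parses the rows and ranks them by a chosen column. It assumes the eight-column layout shown in this post (Pid, Text, Data, Stack, Dynamic, RSS, Total, Name); `parse_process_rows` is a name I made up, and the sample is just the first five rows copied from the output.

```python
def parse_process_rows(text):
    """Parse rows of the form: Pid Text Data Stack Dynamic RSS Total Name."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have seven numeric columns followed by the process name.
        if len(parts) < 8 or not all(p.isdigit() for p in parts[:7]):
            continue  # skips the header, separator and summary lines
        pid, _text, data, stack, dynamic, rss, total = map(int, parts[:7])
        rows.append({
            "pid": pid,
            "data": data,
            "stack": stack,
            "dynamic": dynamic,
            "rss": rss,
            "total": total,
            "name": " ".join(parts[7:]),  # names such as "fed main event" contain spaces
        })
    return rows

sample_output = """\
Pid Text Data Stack Dynamic RSS Total Name
--------------------------------------------------------------------------------
21216 114 269488 136 89392 269488 3090508 fed main event
31454 177306 615084 136 120 615084 2331724 linux_iosd-imag
19824 302 112992 136 2436 112992 1692940 sif_mgr
18316 1003 114212 136 2344 114212 1424112 platform_mgr
18885 27 740564 136 721716 740564 1413560 tams_proc
"""

rows = parse_process_rows(sample_output)
by_total = sorted(rows, key=lambda r: r["total"], reverse=True)
by_rss = sorted(rows, key=lambda r: r["rss"], reverse=True)

print("Largest Total:", by_total[0]["name"], by_total[0]["total"])
print("Largest RSS:  ", by_rss[0]["name"], by_rss[0]["rss"])
```

Sorting by a different column can give a different ordering, so it is worth checking more than one column, and comparing snapshots taken some time apart, before drawing conclusions.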
10-12-2021 03:54 AM
@MarcelSmal wrote:
Pid    Text    Data  Stack  Dynamic     RSS    Total  Name
--------------------------------------------------------------------------------
21216   114  269488    136    89392  269488  3090508  fed main event
Memory leak caused by the "fed main event".
@MarcelSmal wrote:
Memory (kB)
Slot   Status    Total    Used (Pct)     Free (Pct)   Committed (Pct)
1-RP0  Critical  3979944  3923308 (99%)  56636 ( 1%)  5423972 (136%)
99% of memory is used. I would give this switch another day or two before it crashes.
This is probably CSCvn46171 or CSCvi03924.
@MarcelSmal wrote:
I was not thinking of going to the latest version from the current one.
Not my call. My recommendation is to upgrade the firmware of the switch to 16.6.10, the latest/last version of the 16.6.X train.
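As a quick sanity check on the percentages quoted from the "Memory (kB)" line, the arithmetic can be reproduced with a few lines of Python. This is only a sketch using the numbers from the output in this thread; the variable names are mine.

```python
# Values copied from the "Memory (kB)" row for slot 1-RP0.
total_kb = 3979944
used_kb = 3923308
free_kb = 56636
committed_kb = 5423972

used_pct = 100 * used_kb / total_kb
free_pct = 100 * free_kb / total_kb
committed_pct = 100 * committed_kb / total_kb

print(f"Used:      {used_pct:.2f}%")
print(f"Free:      {free_pct:.2f}%")
print(f"Committed: {committed_pct:.2f}%")
print("Used + Free equals Total:", used_kb + free_kb == total_kb)
```

The rounded results match the (99%), ( 1%) and (136%) shown by the switch; a committed figure above 100% means more memory has been promised to processes than the box physically has.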
10-12-2021 04:51 AM
Hi Leo,
Thank you very much, I see this now! I will update the switch firmware and report back with an update.
01-19-2022 07:49 AM
Just to close this off. We did not update the switch. The switch was replaced by a new one.
 
					
				
				
			
		