02-17-2021 03:33 AM
Hello, Cisco community.
I have been running 2 stacked SG350GXs for about half a year.
I find it very disappointing that during each SNMP poll from Prometheus, CPU load spikes to 90-95%, making the switch almost unmanageable.
Web sessions literally stop, and I can clearly see degraded performance in the SSH console.
This is the regular CPU usage with no SNMP activity enabled (12%, 8%, 12%).
And this is what happens when I re-enable my Prometheus job (93%, 58%, 48%).
Software version is v2.5.5.47 / RTESLA2.5.5_930_364_286.
Are there any ways to tune the performance?
Thanks in advance!
02-17-2021 03:37 AM
It looks like other users in the community reported the same SNMP issue some time back. Do you really need to monitor all the ports using SNMP? If not, limit SNMP polling to only the required ports and check whether that resolves the issue.
02-17-2021 04:29 AM
Yes, I'll indeed try to make some adjustments to my poller, but I have another question: does CPU load affect my network performance as well?
Thanks!
02-17-2021 04:38 AM
Yes, it will. If there is no CPU headroom, how can the device process any other requests rather than crash?
02-17-2021 05:44 AM
Well, I sort of hoped that this particular CPU, which is completely incapable of performing the simplest of SNMP operations, is not used for switching purposes.
I've just adjusted my snmp-exporter to collect only interface counters and no other data, and it made no difference. The scrape is still 20-28 s long and the CPU is still spiking.
02-17-2021 05:53 AM
Well, I find it confusing and ridiculous.
# HELP snmp_scrape_duration_seconds Total SNMP time scrape took (walk and processing).
# TYPE snmp_scrape_duration_seconds gauge
snmp_scrape_duration_seconds 32.984104957
# HELP snmp_scrape_pdus_returned PDUs returned from walk.
# TYPE snmp_scrape_pdus_returned gauge
snmp_scrape_pdus_returned 9361
# HELP snmp_scrape_walk_duration_seconds Time SNMP walk/bulkwalk took.
# TYPE snmp_scrape_walk_duration_seconds gauge
snmp_scrape_walk_duration_seconds 32.968419379
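To put these numbers in perspective, here is a rough Python sketch that derives the walk throughput and the share of each polling cycle spent walking. The metric values are the ones quoted in this post; the 60-second scrape interval and the function names are assumptions made for illustration only.

```python
# Scrape metrics as quoted in the post (HELP/TYPE comment lines omitted).
METRICS = """\
snmp_scrape_duration_seconds 32.984104957
snmp_scrape_pdus_returned 9361
snmp_scrape_walk_duration_seconds 32.968419379
"""

# Assumption: the Prometheus job scrapes every 60 seconds.
SCRAPE_INTERVAL_SECONDS = 60.0


def parse_metrics(text):
    """Parse 'name value' lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, value = line.split()
        values[name] = float(value)
    return values


def summarize(values, interval=SCRAPE_INTERVAL_SECONDS):
    """Return PDUs answered per second and the fraction of the interval spent walking."""
    walk = values["snmp_scrape_walk_duration_seconds"]
    pdus = values["snmp_scrape_pdus_returned"]
    return {
        "pdus_per_second": pdus / walk,
        "busy_fraction": walk / interval,
    }


if __name__ == "__main__":
    summary = summarize(parse_metrics(METRICS))
    print(f"walk throughput: {summary['pdus_per_second']:.0f} PDUs/s")
    print(f"share of interval spent walking: {summary['busy_fraction']:.0%}")
```

Under that assumed interval, the switch answers fewer than 300 PDUs per second and spends more than half of every polling cycle serving the walk, which would be consistent with the sustained CPU load described above.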
02-24-2021 01:23 AM
I'm sorry for bumping my post, but I would really like to see more suggestions.
02-24-2021 02:40 AM
What is the outcome if you disable SNMP?
02-24-2021 02:42 AM
Well, as we expected: average CPU utilization falls to 5-10%.
02-24-2021 04:26 AM
This looks like a bug to me. Open a TAC case with the SMB team; they may offer a solution, or log it as a bug and work with you on a fix for a new release.
02-24-2021 04:46 AM
We do not have such a bug logged in our database. Please contact the Cisco STAC centre and raise a support ticket. Contact details are as follows:
https://www.cisco.com/c/en/us/support/web/tsd-cisco-small-business-support-center-contacts.html
Regards,
Martin
08-01-2023 11:11 PM
Case: 695938654
Ours is related to CBS350 stacks with a large VLAN count (~160) and SNMP.
Stacks with a high VLAN count and SNMP disabled are back at under 10% CPU.
Stacks with a low VLAN count (~16) and SNMP enabled are under 10% CPU.
08-21-2023 07:06 AM
Hello, Cisco community. Since I was unable to use the account I used to send the first message of this thread, I had to register another.
After oassupport's reply, I checked my stack again, only to discover it was at 100% CPU utilization. I tried disabling SNMP, but to no avail. That stack still serves as an aggregation-level switch and its configs are rarely changed. I can even add that there have been no significant changes in the configs since 02-17-2021.
So last Saturday we applied the latest firmware (2.5.9.16) and the problem seems to be gone. I even adjusted the SNMP poller to poll the stack every 15 seconds instead of every 60 seconds.
core-stack#sh cpu ut
CPU utilization service is on.
CPU utilization
---------------
five seconds: 11%; one minute: 32%; five minutes: 31%
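For anyone who wants to track these figures outside of SNMP, the summary line above can be parsed with a few lines of Python. This is only a sketch: the line format is taken from the output in this post, while the function names and the 50% threshold are hypothetical and chosen purely for illustration.

```python
import re

# Summary line copied from the CLI output in the post.
SAMPLE = "five seconds: 11%; one minute: 32%; five minutes: 31%"


def parse_cpu_utilization(line):
    """Map each averaging window label to its integer percentage."""
    pairs = re.findall(r"([a-z ]+):\s*(\d+)%", line)
    return {label.strip(): int(pct) for label, pct in pairs}


def exceeds(line, threshold_percent=50):
    """Return the windows whose utilization is above the (hypothetical) threshold."""
    usage = parse_cpu_utilization(line)
    return [label for label, pct in usage.items() if pct > threshold_percent]


if __name__ == "__main__":
    print(parse_cpu_utilization(SAMPLE))
    print(exceeds(SAMPLE))
```

How the line is collected (for example over an SSH session) is left out on purpose, since that depends on the tooling available.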