Hello,
I have a problem with the SNMP stack on a Nexus 9000. In our topology, 4 SNMP servers (Prometheus model) pull interface data every 30 seconds. It seems that for this type of Nexus 9000, the C9372PX (NXOS: version 7.0(3)I2(1a)), this is too much to process, and I'm starting to get SNMP timeouts. I tried running debug snmp errors
and saw messages like these:
2019 Jul 1 12:50:33.842322 snmpd: pm_cache_fill_phy_ethernet_entry_from_local_data: snmp filled up if_counter for if_index = 0x1a006400 num_phy_ethernet_cache_hits = 6151671
2019 Jul 1 12:50:33.842521 snmpd: pm_cache_get_phy_ethernet_port_stats: cache is valid for if_index = 0x1a006600 node_time = 45216597306123477, sys_time = 45216613650973431 eth_cache_hits = 6151670, cache_misses = 0, num_lc_errors = 0, po_cache_hits = 0 eth_port_channel_dstat_errors = 0
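In case it helps anyone compare counters between captures, here is a rough Python sketch for pulling the function name and the "name = value" counters out of debug lines like the two above. The helper name and the regexes are my own illustration, and the line layout is assumed only from these two samples, so other snmpd messages may not parse the same way.

```python
import re

# Matches "name = value" pairs such as "if_index = 0x1a006400" or "cache_misses = 0".
PAIR_RE = re.compile(r"(\w+)\s*=\s*(0x[0-9a-fA-F]+|\d+)")
# Matches the function name that snmpd prints right after "snmpd: ".
FUNC_RE = re.compile(r"snmpd:\s+(\w+):")


def parse_snmpd_debug_line(line):
    """Return (function_name, {field: int}) for one snmpd debug line.

    Hypothetical helper; the layout is assumed from the sample lines only.
    """
    func_match = FUNC_RE.search(line)
    func = func_match.group(1) if func_match else None

    fields = {}
    for name, value in PAIR_RE.findall(line):
        # Hex values (e.g. if_index) carry a 0x prefix; everything else is decimal.
        if value.lower().startswith("0x"):
            fields[name] = int(value, 16)
        else:
            fields[name] = int(value)
    return func, fields
```

Running this over a saved debug capture and watching whether cache_misses or num_lc_errors ever move away from 0 during a timeout would at least show whether the cache is really involved.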
It seems like the SNMP cache is filling up and the SNMP stack becomes unresponsive. Is there any way to increase the SNMP buffer, or something similar, so that the Nexus can handle 4 pollers at 30-second intervals? At first I thought there could be an issue with the control-plane policy (CoPP), but the dropped-packets counter for the management class doesn't increase when the problem occurs.
Also, I have newer Nexus switches in my topology (Cisco Nexus9000 93180YC-EX chassis). Those run fine without any problems.
Any help?
Thanks