07-18-2011 01:25 PM
Hello,
How can I troubleshoot poller errors that suddenly appeared in HUM? The poller had been running for only a few days.
This is an installation in our solution center, and I am comparing the results with Cacti and Nagios/PnP. Those tools show no problems with the same device and interfaces.
Steffen
SCSwitchB
07-21-2011 12:33 AM
Are Nagios and Cacti polling the same objects? Capture all SNMP packets between the LMS server and this device for two polling cycles. You can also try walking ifHCInOctets and ifHCOutOctets on this device to see if those counters are populated for those interfaces. The sniffer trace would be better, though, as that would show what HUM sees.
07-21-2011 12:58 AM
Nagios and Cacti are polling the same logical objects. When I walk ifHCInOctets and ifHCOutOctets on this device, I see them keyed by the raw interface indexes, not by ifName or ifDescr. How can I check whether the HUM poller still maps to the right raw SNMP indexes, so that I can compare? As mentioned, the measurement worked for a couple of days, and the results were displayed in the Histo-Graph-IT portlet. There were no changes to the poller configuration, but because I do not administer our solution center, I don't know whether the device has been rebooted.
Cacti and our Nagios traffic monitoring both have an ifIndex auto-remapping mechanism. Does HUM have one as well? If index mapping is the issue in HUM, is there a way to fix it with a small tweak, without recreating the poller and completely reconfiguring the Histo-Graph-IT portlets?
07-21-2011 01:06 AM
HUM works off of ifIndex. If the index is not being persisted, then HUM could be polling the wrong index values. That's why I suggested the sniffer trace. Alternatively, recreating the poller may fix this. Depending on how you created the poller, you can try editing it under Monitor > Performance Settings > Setup > Pollers to correct the instances being polled.
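One way to check whether the indexes have shifted is to compare two ifName walks taken at different times (for example, before and after a suspected reboot). The following is a minimal Python sketch, assuming net-snmp style output lines such as `IF-MIB::ifName.10 = STRING: Gi3/2`. The function names and the "after" sample are hypothetical and are not part of HUM.

```python
import re

# Matches net-snmp output lines such as "IF-MIB::ifName.10 = STRING: Gi3/2".
IFNAME_LINE = re.compile(r"ifName\.(\d+) = STRING: (.*)$")


def parse_ifname_walk(text):
    """Return {ifIndex: ifName} parsed from snmpwalk output of IF-MIB::ifName."""
    mapping = {}
    for line in text.splitlines():
        match = IFNAME_LINE.search(line.strip())
        if match:
            mapping[int(match.group(1))] = match.group(2).strip()
    return mapping


def moved_interfaces(before, after):
    """Return {ifName: (old_index, new_index)} for names whose ifIndex changed."""
    new_index_by_name = {name: index for index, name in after.items()}
    moved = {}
    for old_index, name in before.items():
        new_index = new_index_by_name.get(name)
        if new_index is not None and new_index != old_index:
            moved[name] = (old_index, new_index)
    return moved


if __name__ == "__main__":
    # "before" uses names from this thread; "after" is a hypothetical
    # post-reboot walk invented purely for illustration.
    before = parse_ifname_walk(
        "IF-MIB::ifName.10 = STRING: Gi3/2\n"
        "IF-MIB::ifName.11 = STRING: Gi3/3\n"
    )
    after = parse_ifname_walk(
        "IF-MIB::ifName.12 = STRING: Gi3/2\n"
        "IF-MIB::ifName.11 = STRING: Gi3/3\n"
    )
    print(moved_interfaces(before, after))
```

An empty result means every interface name still sits at the index it had in the earlier walk, which would point away from an ifIndex remapping problem.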
07-21-2011 07:00 AM
OK, the instances in question are 10 and 11. Wireshark shows the following for HUM:
838 240.706962 172.16.1.251 172.16.1.244 SNMP 188 get-request 1.3.6.1.2.1.31.1.1.1.15.11 1.3.6.1.2.1.31.1.1.1.10.11 1.3.6.1.2.1.31.1.1.1.6.11 1.3.6.1.2.1.31.1.1.1.15.10 1.3.6.1.2.1.31.1.1.1.10.10 1.3.6.1.2.1.31.1.1.1.6.10 1.3.6.1.2.1.1.3.0
839 240.713406 172.16.1.244 172.16.1.251 SNMP 184 get-response 1.3.6.1.2.1.31.1.1.1.15.11 1.3.6.1.2.1.31.1.1.1.10.11 1.3.6.1.2.1.31.1.1.1.6.11 1.3.6.1.2.1.31.1.1.1.15.10 1.3.6.1.2.1.31.1.1.1.10.10 1.3.6.1.2.1.31.1.1.1.6.10 1.3.6.1.2.1.1.3.0
In the Wireshark details you can see that HUM is querying with SNMPv1 instead of v2c, and the results are NULL:
snmpget -v 2c -c xxx 172.16.1.244 1.3.6.1.2.1.31.1.1.1.10.10 1.3.6.1.2.1.31.1.1.1.10.11
IF-MIB::ifHCOutOctets.10 = Counter64: 259090792138
IF-MIB::ifHCOutOctets.11 = Counter64: 57740247882
admin@netmon-v2-7:~> snmpget -v 1 -c xxx 172.16.1.244 1.3.6.1.2.1.31.1.1.1.10.10 1.3.6.1.2.1.31.1.1.1.10.11
Error in packet
Reason: (noSuchName) There is no such variable name in this MIB.
Failed object: IF-MIB::ifHCOutOctets.10
Error in packet
Reason: (noSuchName) There is no such variable name in this MIB.
Failed object: IF-MIB::ifHCOutOctets.11
It was already working without any change to the poller, so why did HUM switch to SNMPv1? That seems to be the cause.
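For reference, Counter64 is an SMIv2 type that SNMPv1 PDUs cannot carry, which is consistent with the noSuchName errors above. The following minimal Python sketch decodes the varbinds from the captured get-request and flags the ones that need SNMPv2c or later. It assumes only the standard IF-MIB ifXTable column numbers. The names `IFX_COLUMNS`, `describe`, and `needs_v2c` are illustrative and are not part of HUM.

```python
# OID prefix of IF-MIB::ifXEntry; columns are appended as ".<column>.<ifIndex>".
IFX_ENTRY = "1.3.6.1.2.1.31.1.1.1"

# Only the ifXTable columns that appear in the captured request are listed here.
IFX_COLUMNS = {
    6: ("ifHCInOctets", "Counter64"),
    10: ("ifHCOutOctets", "Counter64"),
    15: ("ifHighSpeed", "Gauge32"),
}


def describe(oid):
    """Return (name, syntax, ifIndex) for a known ifXTable OID, else None."""
    prefix = IFX_ENTRY + "."
    if not oid.startswith(prefix):
        return None
    parts = oid[len(prefix):].split(".")
    if len(parts) != 2 or not all(part.isdigit() for part in parts):
        return None
    column, instance = int(parts[0]), int(parts[1])
    if column not in IFX_COLUMNS:
        return None
    name, syntax = IFX_COLUMNS[column]
    return name, syntax, instance


def needs_v2c(oid):
    """True if the object is a Counter64, which SNMPv1 cannot return."""
    info = describe(oid)
    return info is not None and info[1] == "Counter64"


if __name__ == "__main__":
    # Varbinds copied from frame 838 of the capture in this post.
    request = [
        "1.3.6.1.2.1.31.1.1.1.15.11",
        "1.3.6.1.2.1.31.1.1.1.10.11",
        "1.3.6.1.2.1.31.1.1.1.6.11",
        "1.3.6.1.2.1.31.1.1.1.15.10",
        "1.3.6.1.2.1.31.1.1.1.10.10",
        "1.3.6.1.2.1.31.1.1.1.6.10",
        "1.3.6.1.2.1.1.3.0",  # sysUpTime, not in ifXTable
    ]
    for oid in request:
        print(oid, describe(oid), "needs v2c" if needs_v2c(oid) else "ok with v1")
```

In this request, four of the seven varbinds are Counter64 objects, so a v1 get-request cannot return usable values for the octet counters.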
snmpwalk -v 2c -c xxx 172.16.1.244 1.3.6.1.2.1.31.1.1.1 > scswitchb_ifhres.log
egrep '\.10 =|\.11 =' scswitchb_ifhres.log
IF-MIB::ifName.10 = STRING: Gi3/2
IF-MIB::ifName.11 = STRING: Gi3/3
IF-MIB::ifInMulticastPkts.10 = Counter32: 712704
IF-MIB::ifInMulticastPkts.11 = Counter32: 712673
IF-MIB::ifInBroadcastPkts.10 = Counter32: 885
IF-MIB::ifInBroadcastPkts.11 = Counter32: 1682
IF-MIB::ifOutMulticastPkts.10 = Counter32: 4958076
IF-MIB::ifOutMulticastPkts.11 = Counter32: 6376603
IF-MIB::ifOutBroadcastPkts.10 = Counter32: 136690
IF-MIB::ifOutBroadcastPkts.11 = Counter32: 4411400
IF-MIB::ifHCInOctets.10 = Counter64: 177198795376
IF-MIB::ifHCInOctets.11 = Counter64: 253258465167
IF-MIB::ifHCInUcastPkts.10 = Counter64: 261618773
IF-MIB::ifHCInUcastPkts.11 = Counter64: 392032656
IF-MIB::ifHCInMulticastPkts.10 = Counter64: 712704
IF-MIB::ifHCInMulticastPkts.11 = Counter64: 712673
IF-MIB::ifHCInBroadcastPkts.10 = Counter64: 885
IF-MIB::ifHCInBroadcastPkts.11 = Counter64: 1682
IF-MIB::ifHCOutOctets.10 = Counter64: 258945801205
IF-MIB::ifHCOutOctets.11 = Counter64: 57688650531
IF-MIB::ifHCOutUcastPkts.10 = Counter64: 303166706
IF-MIB::ifHCOutUcastPkts.11 = Counter64: 174156660
IF-MIB::ifHCOutMulticastPkts.10 = Counter64: 4958081
IF-MIB::ifHCOutMulticastPkts.11 = Counter64: 6376609
IF-MIB::ifHCOutBroadcastPkts.10 = Counter64: 136690
IF-MIB::ifHCOutBroadcastPkts.11 = Counter64: 4411404
IF-MIB::ifLinkUpDownTrapEnable.10 = INTEGER: enabled(1)
IF-MIB::ifLinkUpDownTrapEnable.11 = INTEGER: enabled(1)
IF-MIB::ifHighSpeed.10 = Gauge32: 1000
IF-MIB::ifHighSpeed.11 = Gauge32: 1000
IF-MIB::ifPromiscuousMode.10 = INTEGER: false(2)
IF-MIB::ifPromiscuousMode.11 = INTEGER: false(2)
IF-MIB::ifConnectorPresent.10 = INTEGER: true(1)
IF-MIB::ifConnectorPresent.11 = INTEGER: true(1)
IF-MIB::ifAlias.10 = STRING: #### SCFW1 FE0/0 Transfernetz Vlan 510 Rack Port A3.12 ####
IF-MIB::ifAlias.11 = STRING: #### SCFW2 FE0/1 Transfernetz Vlan 500 Rack Port A3.9 ####
07-24-2011 11:19 AM
That's weird. Obviously v1 is incorrect for 64-bit objects. Are you using a custom template or the built-in interface utilization template?
07-26-2011 02:18 AM
I used the built-in template
07-31-2011 11:46 PM
Sorry, I missed your reply. Looks like you may need to disable the SNMP fallback mechanism then reconfigure your pollers. I know you were trying to avoid that, but it may be required at this point.
First, shut down Daemon Manager. Then edit NMSROOT/hum/conf/upm-snmp.properties and set snmp.v2cTov1Fallback to false. Restart Daemon Manager, then delete and reconfigure the problematic poller(s).
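If you prefer to script the edit rather than make it by hand, here is a minimal Python sketch that sets one key in Java-style properties text. It only automates the text change. The helper name `set_property` and the sample input (including the starting value `true`) are assumptions for illustration. Daemon Manager should still be stopped before the real file is touched.

```python
import re


def set_property(text, key, value):
    """Return properties text with key set to value, appending it if absent."""
    # Match "key=..." or "key:..." at the start of a line; commented lines
    # (starting with "#") do not match and are left untouched.
    pattern = re.compile(r"^\s*" + re.escape(key) + r"\s*[=:]")
    lines = text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if pattern.match(line):
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file content; the real file lives under NMSROOT/hum/conf/.
    sample = "snmp.maxRetry=1\nsnmp.v2cTov1Fallback=true\n"
    print(set_property(sample, "snmp.v2cTov1Fallback", "false"), end="")
```

To apply it to the real file, read upm-snmp.properties into a string, pass it through `set_property`, and write the result back after keeping a backup copy.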
08-02-2011 10:10 AM
HUM is still using v1 only.
I stopped LMS, then edited upm-snmp.properties:
snmp.threads.min=5
snmp.threads.max=20
snmp.maxRetry=1
snmp.timeoutSecs=3
snmp.maxVbLimit=50
snmp.retryPolicy=com.cisco.nm.upm.dal.LinearRetryPolicy
snmp.v2cTov1Fallback=false
Then I started LMS again, deleted the poller, and created a new one. The problem remains; HUM seems to ignore this file.
All pollers that can live with v1 are working without errors.
All other LMS applications are using v2c and working properly.
08-03-2011 12:15 AM
This device may already be known to HUM as only being able to support SNMPv1. This is persisted in the database. If you can remove the device from DCR and re-add it at this point, the problem may go away. However, if you want a more tactical approach, you can open a TAC service request, and TAC can walk you through fixing the UPM database to make sure the SNMP version is v2c for this device.