
LMS4.0.1 - HUM Poller errors

STEFFEN NEUSER
Level 4

Hello,

How can I troubleshoot poller errors that suddenly appear in HUM? The poller had only been running for a few days.

This is an installation in our solution center, and I am comparing its results with Cacti and Nagios/PnP. Those tools have no problems with the same device and interfaces.

Steffen

SCSwitchB


MIB Variable | Instance | Failure Status | Failure Count | Last Failed Reason | Last Failed
ifHCInOctets | Gi3/2 | Permanent | 458 | No Such Instance - The specified instance is not available | Mon, Jul 18 2011, 22:02:01 CEST
ifHCOutOctets | Gi3/2 | Permanent | 458 | No Such Instance - The specified instance is not available | Mon, Jul 18 2011, 22:02:01 CEST
ifHCOutOctets | Gi3/3 | Permanent | 458 | No Such Instance - The specified instance is not available | Mon, Jul 18 2011, 22:02:01 CEST
ifHCInOctets | Gi3/3 | Permanent | 458 | No Such Instance - The specified instance is not available | Mon, Jul 18 2011, 22:02:01 CEST
9 Replies

Joe Clarke
Cisco Employee

Are Nagios and Cacti polling the same objects?  Capture all SNMP packets between the LMS server and this device for two polling cycles.  You can also try walking ifHCInOctets and ifHCOutOctets on this device to see if those counters are populated for those interfaces.  The sniffer trace would be better, though, as that would show what HUM sees.

Nagios and Cacti are polling the same logical objects. When I walk ifHCInOctets and ifHCOutOctets on this device, the counters are keyed by the raw interface index, not by ifName or ifDescr. How can I check which raw SNMP indexes the HUM poller is using, so that I can compare them? As mentioned, the measurement worked for a couple of days, and the results were displayed in the Histo-Graph-IT portlet. The poller configuration was not changed, but because I do not administer our solution center, I do not know whether the device has been rebooted.

Cacti and our Nagios implementation both have an automatic ifIndex remapping mechanism for traffic monitoring. Does HUM have one as well? If ifIndex mapping is the issue in HUM, can it be fixed there with a small tweak, without recreating the poller and completely reconfiguring the Histo-Graph-IT portlets?

HUM works off of ifIndex.  If the index is not being persisted, then HUM could be polling the wrong index values.  That's why I suggested the sniffer trace.  Alternatively, recreating the poller may fix this.  Depending on how you created the poller, you can try editing it under Monitor > Performance Settings > Setup > Pollers to correct the instances being polled.
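For anyone who wants to check for an ifIndex shift by hand, here is a minimal Python sketch. It assumes you have saved the text output of a net-snmp `snmpwalk` of ifName, and that you know which ifIndex each interface had when the poller was created. The function names and the `configured` dictionary are made up for illustration; nothing here reads HUM's own configuration.

```python
import re

# Matches net-snmp style lines such as "IF-MIB::ifName.10 = STRING: Gi3/2".
IFNAME_LINE = re.compile(r"^IF-MIB::ifName\.(\d+) = STRING: (.+)$")


def parse_ifname_walk(walk_output):
    """Return {ifName: ifIndex} parsed from snmpwalk text of ifName."""
    mapping = {}
    for line in walk_output.splitlines():
        match = IFNAME_LINE.match(line.strip())
        if match:
            mapping[match.group(2).strip()] = int(match.group(1))
    return mapping


def find_stale_instances(configured, current):
    """Return {ifName: (configured_index, current_index)} for every mismatch.

    current_index is None when the interface no longer appears in the walk.
    """
    stale = {}
    for name, old_index in configured.items():
        new_index = current.get(name)
        if new_index != old_index:
            stale[name] = (old_index, new_index)
    return stale


if __name__ == "__main__":
    # Sample text in net-snmp output format.
    walk = (
        "IF-MIB::ifName.10 = STRING: Gi3/2\n"
        "IF-MIB::ifName.11 = STRING: Gi3/3\n"
    )
    current = parse_ifname_walk(walk)
    # Hypothetical record of what the poller was configured with.
    configured = {"Gi3/2": 10, "Gi3/3": 11}
    print(find_stale_instances(configured, current))
```

An empty result means the indexes still line up, so the cause has to be something other than a remapped ifIndex.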

OK, the instances in question are 10 and 11. Wireshark shows the following for HUM:

838 240.706962 172.16.1.251 172.16.1.244 SNMP 188 get-request 1.3.6.1.2.1.31.1.1.1.15.11 1.3.6.1.2.1.31.1.1.1.10.11 1.3.6.1.2.1.31.1.1.1.6.11 1.3.6.1.2.1.31.1.1.1.15.10 1.3.6.1.2.1.31.1.1.1.10.10 1.3.6.1.2.1.31.1.1.1.6.10 1.3.6.1.2.1.1.3.0

839 240.713406 172.16.1.244 172.16.1.251 SNMP 184 get-response 1.3.6.1.2.1.31.1.1.1.15.11 1.3.6.1.2.1.31.1.1.1.10.11 1.3.6.1.2.1.31.1.1.1.6.11 1.3.6.1.2.1.31.1.1.1.15.10 1.3.6.1.2.1.31.1.1.1.10.10 1.3.6.1.2.1.31.1.1.1.6.10 1.3.6.1.2.1.1.3.0
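To make the numeric OIDs in that capture easier to read, here is a small Python sketch that translates ifXTable OIDs into object names and instances. The column numbers are taken from the standard IF-MIB ifXEntry definition; the function name is only for illustration.

```python
IFX_ENTRY = "1.3.6.1.2.1.31.1.1.1"  # IF-MIB::ifXEntry
SYS_UPTIME = "1.3.6.1.2.1.1.3.0"    # sysUpTime.0

# Column numbers of ifXEntry as defined in IF-MIB.
IFX_COLUMNS = {
    1: "ifName", 2: "ifInMulticastPkts", 3: "ifInBroadcastPkts",
    4: "ifOutMulticastPkts", 5: "ifOutBroadcastPkts", 6: "ifHCInOctets",
    7: "ifHCInUcastPkts", 8: "ifHCInMulticastPkts", 9: "ifHCInBroadcastPkts",
    10: "ifHCOutOctets", 11: "ifHCOutUcastPkts", 12: "ifHCOutMulticastPkts",
    13: "ifHCOutBroadcastPkts", 14: "ifLinkUpDownTrapEnable",
    15: "ifHighSpeed", 16: "ifPromiscuousMode", 17: "ifConnectorPresent",
    18: "ifAlias", 19: "ifCounterDiscontinuityTime",
}


def describe_oid(oid):
    """Translate an ifXTable or sysUpTime OID to 'name.instance'.

    Any OID this sketch does not recognise is returned unchanged.
    """
    if oid == SYS_UPTIME:
        return "sysUpTime.0"
    prefix = IFX_ENTRY + "."
    if oid.startswith(prefix):
        parts = oid[len(prefix):].split(".")
        if len(parts) == 2 and all(p.isdigit() for p in parts):
            column, instance = int(parts[0]), int(parts[1])
            name = IFX_COLUMNS.get(column, "ifXEntry.%d" % column)
            return "%s.%d" % (name, instance)
    return oid


if __name__ == "__main__":
    request = (
        "1.3.6.1.2.1.31.1.1.1.15.11 1.3.6.1.2.1.31.1.1.1.10.11 "
        "1.3.6.1.2.1.31.1.1.1.6.11 1.3.6.1.2.1.31.1.1.1.15.10 "
        "1.3.6.1.2.1.31.1.1.1.10.10 1.3.6.1.2.1.31.1.1.1.6.10 "
        "1.3.6.1.2.1.1.3.0"
    )
    for oid in request.split():
        print(oid, "->", describe_oid(oid))
```

Run against the get-request above, this shows HUM asking for ifHighSpeed, ifHCOutOctets and ifHCInOctets on instances 11 and 10, plus sysUpTime.0.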

The Wireshark details show that HUM is sending its requests with SNMPv1 instead of v2c, and the returned values are NULL.
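If you want to confirm the SNMP version without opening each packet in Wireshark, the version is the first field inside the BER-encoded SNMP message: 0 means v1, 1 means v2c, 3 means v3. Below is a minimal Python sketch that reads it from the raw UDP payload. It assumes you have already extracted the payload bytes from the capture yourself; the demo bytes are hand-built, not taken from the capture above.

```python
SNMP_VERSIONS = {0: "v1", 1: "v2c", 3: "v3"}


def read_length(data, pos):
    """Decode a BER length field starting at pos; return (length, next_pos)."""
    first = data[pos]
    pos += 1
    if first < 0x80:            # short form: the byte itself is the length
        return first, pos
    count = first & 0x7F        # long form: next `count` bytes hold the length
    if count == 0:
        raise ValueError("indefinite length is not valid in SNMP")
    return int.from_bytes(data[pos:pos + count], "big"), pos + count


def snmp_version(payload):
    """Return 'v1', 'v2c' or 'v3' for a raw SNMP message (UDP payload)."""
    if len(payload) < 5 or payload[0] != 0x30:
        raise ValueError("not a BER SEQUENCE, so not an SNMP message")
    _, pos = read_length(payload, 1)
    if payload[pos] != 0x02:
        raise ValueError("expected INTEGER tag for the version field")
    int_len, pos = read_length(payload, pos + 1)
    value = int.from_bytes(payload[pos:pos + int_len], "big")
    return SNMP_VERSIONS.get(value, "unknown(%d)" % value)


if __name__ == "__main__":
    # Hand-built header: SEQUENCE, version INTEGER, community "public".
    v1_header = bytes([0x30, 0x0B, 0x02, 0x01, 0x00, 0x04, 0x06]) + b"public"
    v2c_header = bytes([0x30, 0x0B, 0x02, 0x01, 0x01, 0x04, 0x06]) + b"public"
    print(snmp_version(v1_header), snmp_version(v2c_header))
```

The sketch stops after the version field and does not validate the rest of the message, so it is only a quick check, not a decoder.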

snmpget -v 2c -c xxx 172.16.1.244 1.3.6.1.2.1.31.1.1.1.10.10 1.3.6.1.2.1.31.1.1.1.10.11

IF-MIB::ifHCOutOctets.10 = Counter64: 259090792138

IF-MIB::ifHCOutOctets.11 = Counter64: 57740247882

admin@netmon-v2-7:~> snmpget -v 1 -c xxx 172.16.1.244 1.3.6.1.2.1.31.1.1.1.10.10 1.3.6.1.2.1.31.1.1.1.10.11

Error in packet

Reason: (noSuchName) There is no such variable name in this MIB.

Failed object: IF-MIB::ifHCOutOctets.10

Error in packet

Reason: (noSuchName) There is no such variable name in this MIB.

Failed object: IF-MIB::ifHCOutOctets.11

The poller was already working and was not changed, so why did HUM switch to SNMPv1? This seems to be the cause.

snmpwalk -v 2c -c xxx 172.16.1.244 1.3.6.1.2.1.31.1.1.1 > scswitchb_ifhres.log

egrep '\.10 =|\.11 =' scswitchb_ifhres.log

IF-MIB::ifName.10 = STRING: Gi3/2

IF-MIB::ifName.11 = STRING: Gi3/3

IF-MIB::ifInMulticastPkts.10 = Counter32: 712704

IF-MIB::ifInMulticastPkts.11 = Counter32: 712673

IF-MIB::ifInBroadcastPkts.10 = Counter32: 885

IF-MIB::ifInBroadcastPkts.11 = Counter32: 1682

IF-MIB::ifOutMulticastPkts.10 = Counter32: 4958076

IF-MIB::ifOutMulticastPkts.11 = Counter32: 6376603

IF-MIB::ifOutBroadcastPkts.10 = Counter32: 136690

IF-MIB::ifOutBroadcastPkts.11 = Counter32: 4411400

IF-MIB::ifHCInOctets.10 = Counter64: 177198795376

IF-MIB::ifHCInOctets.11 = Counter64: 253258465167

IF-MIB::ifHCInUcastPkts.10 = Counter64: 261618773

IF-MIB::ifHCInUcastPkts.11 = Counter64: 392032656

IF-MIB::ifHCInMulticastPkts.10 = Counter64: 712704

IF-MIB::ifHCInMulticastPkts.11 = Counter64: 712673

IF-MIB::ifHCInBroadcastPkts.10 = Counter64: 885

IF-MIB::ifHCInBroadcastPkts.11 = Counter64: 1682

IF-MIB::ifHCOutOctets.10 = Counter64: 258945801205

IF-MIB::ifHCOutOctets.11 = Counter64: 57688650531

IF-MIB::ifHCOutUcastPkts.10 = Counter64: 303166706

IF-MIB::ifHCOutUcastPkts.11 = Counter64: 174156660

IF-MIB::ifHCOutMulticastPkts.10 = Counter64: 4958081

IF-MIB::ifHCOutMulticastPkts.11 = Counter64: 6376609

IF-MIB::ifHCOutBroadcastPkts.10 = Counter64: 136690

IF-MIB::ifHCOutBroadcastPkts.11 = Counter64: 4411404

IF-MIB::ifLinkUpDownTrapEnable.10 = INTEGER: enabled(1)

IF-MIB::ifLinkUpDownTrapEnable.11 = INTEGER: enabled(1)

IF-MIB::ifHighSpeed.10 = Gauge32: 1000

IF-MIB::ifHighSpeed.11 = Gauge32: 1000

IF-MIB::ifPromiscuousMode.10 = INTEGER: false(2)

IF-MIB::ifPromiscuousMode.11 = INTEGER: false(2)

IF-MIB::ifConnectorPresent.10 = INTEGER: true(1)

IF-MIB::ifConnectorPresent.11 = INTEGER: true(1)

IF-MIB::ifAlias.10 = STRING: #### SCFW1 FE0/0 Transfernetz Vlan 510 Rack Port A3.12 ####

IF-MIB::ifAlias.11 = STRING: #### SCFW2 FE0/1 Transfernetz Vlan 500 Rack Port A3.9 ####

That's weird.  Obviously v1 is incorrect for 64-bit objects.  Are you using a custom template or the built-in interface utilization template?
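Some background on why v1 fails here: SNMPv1 has no Counter64 type, so an agent cannot return the ifHC* counters to a v1 request, which matches the noSuchName errors shown above. As a rough way to see which polled objects are affected, this Python sketch scans saved net-snmp `snmpwalk` text and lists every instance whose value type is Counter64. The function name is only for illustration.

```python
import re

# Matches lines such as "IF-MIB::ifHCInOctets.10 = Counter64: 177198795376"
# and captures the object instance and its value type.
VALUE_LINE = re.compile(r"^(\S+) = (\w+): ")


def counter64_objects(walk_output):
    """Return the object instances in snmpwalk text whose type is Counter64.

    These are the objects that cannot be retrieved with SNMPv1.
    """
    found = []
    for line in walk_output.splitlines():
        match = VALUE_LINE.match(line.strip())
        if match and match.group(2) == "Counter64":
            found.append(match.group(1))
    return found


if __name__ == "__main__":
    sample = (
        "IF-MIB::ifName.10 = STRING: Gi3/2\n"
        "IF-MIB::ifInMulticastPkts.10 = Counter32: 712704\n"
        "IF-MIB::ifHCInOctets.10 = Counter64: 177198795376\n"
        "IF-MIB::ifHCOutOctets.10 = Counter64: 258945801205\n"
    )
    for name in counter64_objects(sample):
        print(name, "needs SNMPv2c or v3")
```

Any poller that includes one of the listed objects will fail as long as the device is polled with v1, while pollers built only from 32-bit objects keep working.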

I used the built-in template.

Sorry, I missed your reply.  Looks like you may need to disable the SNMP fallback mechanism then reconfigure your pollers.  I know you were trying to avoid that, but it may be required at this point.

First, shut down Daemon Manager.  Then edit NMSROOT/hum/conf/upm-snmp.properties and set snmp.v2cTov1Fallback to false.  Restart Daemon Manager, then delete and reconfigure the problematic poller(s).
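If you would rather script the file change than edit it by hand, here is a minimal Python sketch that sets one key in a Java-style properties file and leaves every other line untouched. It assumes simple `key=value` lines. NMSROOT stands for your LMS installation directory, and the helper names are made up for illustration. Stop Daemon Manager before running it and keep a backup of the original file.

```python
from pathlib import Path


def set_property(text, key, value):
    """Return properties text with key set to value.

    An existing 'key=...' line is replaced; if the key is absent,
    a new line is appended. Comment lines (# or !) are left alone.
    """
    lines = text.splitlines()
    updated = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith(("#", "!")) or "=" not in stripped:
            continue
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = "%s=%s" % (key, value)
            updated = True
    if not updated:
        lines.append("%s=%s" % (key, value))
    return "\n".join(lines) + "\n"


def disable_v1_fallback(properties_path):
    """Rewrite the given upm-snmp.properties with the fallback disabled."""
    path = Path(properties_path)
    path.write_text(set_property(path.read_text(), "snmp.v2cTov1Fallback", "false"))


# Example call (replace NMSROOT with the real installation directory):
# disable_v1_fallback("NMSROOT/hum/conf/upm-snmp.properties")
```

The change only takes effect after Daemon Manager is restarted.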

HUM is still using v1 only.

I stopped LMS, then edited upm-snmp.properties:

snmp.threads.min=5

snmp.threads.max=20

snmp.maxRetry=1

snmp.timeoutSecs=3

snmp.maxVbLimit=50

snmp.retryPolicy=com.cisco.nm.upm.dal.LinearRetryPolicy

snmp.v2cTov1Fallback=false

Then I started LMS again, deleted the poller, and created a new one. The problem remains; HUM seems to ignore this file.

All pollers that can work with v1 are running without errors.

All other LMS applications are using v2c and working properly.

This device may already be known to HUM as only being able to support SNMPv1.  This is persisted in the database.  If you can remove the device from DCR and re-add it at this point, the problem may go away.  However, if you want a more tactical approach, you can open a TAC service request, and TAC can walk you through fixing the UPM database to make sure the SNMP version is v2c for this device.
