08-20-2024 07:28 AM - edited 08-26-2024 08:12 AM
Hey there!
I have set up snmp_exporter on my PC, and when I try to scrape through the UI it fails only on one OID, 1.3.6.1.2.1.47.1.1.1.1.7 (entPhysicalName), at a specific index.
The weird part is that I have this issue on two different switches, Switch A and Switch B (both are NXOS).
Each of them is in a vPC with another switch of the same device type and the same configuration: SwitchA2 and SwitchB2.
On both SwitchA2 and SwitchB2 the scrape works great, yet on Switch A and Switch B it gets stuck on a certain index (a different one on each switch).
When I run snmpbulkwalk on my PC without snmp_exporter, it works with zero problems. The only difference is that one is run as a command and the other through the exporter.
Checking with tcp, it seems like it tries a GetBulk(36) and gets no response.
Any leads? Anything I can try? I tried updating to snmp_exporter 0.26.0 and it still fails.
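One difference worth ruling out is the GETBULK max-repetitions value: net-snmp's snmpbulkwalk defaults to 10, while snmp_exporter modules default to 25 unless `max_repetitions` is overridden. The sketch below only builds snmpbulkwalk command lines with the exporter-style value so the two can be compared like for like. SNMP v2c, the `public` community, and the `192.0.2.10` address are placeholders assumed for illustration, not details from this thread.

```python
import shlex

ENT_PHYSICAL_NAME = "1.3.6.1.2.1.47.1.1.1.1.7"


def build_bulkwalk_cmd(host, oid, community="public",
                       max_repetitions=25, retries=3, timeout_s=5):
    """Return an snmpbulkwalk argv list using standard net-snmp options.

    Assumes SNMP v2c; community and host are placeholders.
    """
    return [
        "snmpbulkwalk",
        "-v2c",
        "-c", community,
        f"-Cr{max_repetitions}",  # max-repetitions field of each GETBULK
        "-r", str(retries),       # retries before giving up
        "-t", str(timeout_s),     # per-request timeout in seconds
        "-On",                    # print numeric OIDs
        host,
        oid,
    ]


if __name__ == "__main__":
    # Print one command per max-repetitions value to try by hand.
    for reps in (10, 25, 50):
        cmd = build_bulkwalk_cmd("192.0.2.10", ENT_PHYSICAL_NAME,
                                 max_repetitions=reps)
        print(shlex.join(cmd))
```

If the walk only stalls at the larger values, lowering `max_repetitions` in the exporter module would be the next thing to try.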
08-20-2024 11:41 PM
- Because it fails on two different models with that particular OID, I still consider it a problem related to snmp_exporter. You could, for instance, examine the logs on the switches after an attempt, check the logging from snmp_exporter (if available), or run it in debug mode (again, if available).
M.
08-21-2024 02:31 AM
I'm not sure how to debug the scrape on the switch side.
On the snmp_exporter side I receive a timeout:
```
An error has occurred serving metrics:
error collecting metic Desc{fqName: "snmp_error", help: "Error scrapping target", constLabels: {module="switch"}, variableLabels: {}}: error walking target <target-ip/hostname>: request timeout (after 2 retries)
```
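To reproduce the failing scrape outside the UI and see how long it takes before the timeout, the exporter's `/snmp` endpoint can be called directly. This is a minimal sketch assuming the exporter listens on its default port 9116 on localhost; the module name `switch` comes from the error above, and `192.0.2.10` is a placeholder target.

```python
import time
import urllib.error
import urllib.parse
import urllib.request


def build_scrape_url(target, module, exporter="http://localhost:9116"):
    """Build the /snmp URL for one target and module (exporter address assumed)."""
    query = urllib.parse.urlencode({"target": target, "module": module})
    return f"{exporter}/snmp?{query}"


def timed_scrape(url, http_timeout_s=120):
    """Fetch the URL and return (HTTP status, elapsed seconds, response body)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=http_timeout_s) as resp:
            status = resp.status
            body = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Error responses still carry a body with the error text.
        status = err.code
        body = err.read().decode("utf-8", errors="replace")
    return status, time.monotonic() - start, body


if __name__ == "__main__":
    url = build_scrape_url("192.0.2.10", "switch")
    print(url)
    # Uncomment on the host where the exporter is actually running:
    # status, elapsed, body = timed_scrape(url)
    # print(status, f"{elapsed:.1f}s", body[:300])
```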
There are so many `debug snmp` options on the switch that I don't even know what to look for.
It is important to say that, judging by some logs, the scrape seems to stop midway.
08-21-2024 02:52 AM
- Check out: https://www.robustperception.io/using-snmpbulkwalk-to-debug-snmp_exporter-issues/
M.
08-21-2024 03:52 AM
Nothing there applies to my case.
It is definitely the interaction between snmp_exporter and the switches.
When running snmpbulkwalk alone it works flawlessly. When running a scrape from snmp_exporter it gets stuck mid-scrape and then times out.
It is as if the switch refuses to answer after a certain number of responses.
Couldn't it be something in the switch configuration?
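To pin down where a walk stops, it can help to extract the last entPhysicalName index a walk returned and compare it across runs (for example, at different max-repetitions values). The sketch below parses numeric-OID snmpbulkwalk output (`-On`). The `SAMPLE` text is made-up illustration data, not output from these switches.

```python
import re

# Matches lines like: .1.3.6.1.2.1.47.1.1.1.1.7.10 = STRING: "..."
LINE_RE = re.compile(r"^\.?(?P<oid>\d+(?:\.\d+)+)\s+=\s+(?P<value>.*)$")


def returned_indexes(walk_output, base_oid):
    """Return the index suffixes under base_oid, in the order they were returned."""
    prefix = base_oid.strip(".") + "."
    indexes = []
    for line in walk_output.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank or non-varbind lines
        oid = match.group("oid")
        if oid.startswith(prefix):
            indexes.append(oid[len(prefix):])
    return indexes


# Hypothetical sample output, for illustration only.
SAMPLE = """\
.1.3.6.1.2.1.47.1.1.1.1.7.10 = STRING: "example chassis"
.1.3.6.1.2.1.47.1.1.1.1.7.22 = STRING: "example slot"
.1.3.6.1.2.1.47.1.1.1.1.7.149 = STRING: "example module"
"""

if __name__ == "__main__":
    found = returned_indexes(SAMPLE, "1.3.6.1.2.1.47.1.1.1.1.7")
    print("rows returned:", len(found), "last index:", found[-1])
```

If the last index is always the same row regardless of settings, that points at the row itself; if it moves with max-repetitions, the response size is the more likely suspect.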
08-21-2024 05:52 AM
- I don't think it is related to the configuration on the switches. It could be, for instance, that their software version is too old and needs an upgrade. Also check whether snmp_exporter has timeout settings you can tweak.
M.
08-21-2024 06:09 AM
It can't be either of those reasons.
I've tried setting the timeout to 60 seconds and it still fails.
A similar switch with the same IOS version, where everything is the same, works perfectly.
It is only 2 switches out of 300.
08-21-2024 06:27 AM
- I acknowledge the current situation, but I have no further alternatives to offer.
M.
08-21-2024 08:09 AM
- There is still something I was thinking of: resource exhaustion due to long uptimes on those two. A reboot of the switches could then be tried, but that of course may not be possible immediately.
M.
08-26-2024 08:10 AM
Sadly, it didn't help.
I was also wrong in the post.
Both of the switches are NXOS:
1. N9K C93180YC-FX, NXOS 9.3.7
2. N9K C93180YC-FX3, NXOS 10.1.1
08-26-2024 08:38 AM
- If we use tools in the public domain and, with almost everything tried, they still don't work, then we must consider another tool for what we need.
M.
08-26-2024 08:42 AM
I would usually agree, but I have 400 switches that work flawlessly and only two of the same kind that don't.
08-26-2024 09:48 AM
- Remaining alternatives are:
+ Contact the owner
+ Use the latest version, including checking for recent patches 'from yesterday'
+ Post in the user forum(s) for the tool (if available)
M.