The helpful folks at TAC have been trying to troubleshoot one of my last and biggest pending items: the perceived inability of DFM to manage the devices on our network. This was a rather puzzling issue, as the other LMS components (CS, CM, RME, etc.) had no apparent trouble doing everything I asked of them. After countless hours of troubleshooting DFM discovery errors (devices going "Questioned" with an SNMP timeout even though all other LMS modules manage the same devices perfectly well), an alert TAC engineer finally asked whether these devices were, in fact, supported by DFM 3.2 at all. Lo and behold, a can of worms opened up!
The best current guess is rather confusing to me: one Cisco document out there suggests that NONE of our devices are among those supported by DFM, while another Cisco document I found somewhat contradicts that notion. I'd like to think this must be confusing (or at least little known) to TAC as well, since nobody there considered it a potential culprit during the first almost three weeks of troubleshooting around the globe, over countless WebEx sessions. We basically went through everything imaginable, from process monitoring with full debugging, to a complete removal and fresh installation of DFM only, to a complete clean-up and re-initialization of all module databases (tearing down most of my configurations and settings in the process), to a midnight conference call with developers in India.
The end result appears to be that DFM functionality will not be available to me – please confirm. What are the alternatives? Any rhyme or reason to Cisco not supporting these device types? Any plans to ever do so?
I run a variety of devices on my network, most of them 3560G, 3560E and 6504E switches, pretty much the bread-and-butter variety of basic Cisco devices. Why on earth would there even be a question whether these are supported by all LMS modules?
Argument AGAINST support in DFM 3.2:
Argument IN FAVOR of support in DFM 3.2:
According to that list, our 6504E running IOS is fully supported by DFM 3.2 with LMS 3.2, and so are the 3560G and 2950 series switches and the 2500 series routers. However, the 3560E series switches are not listed as supported.
Are we seeing ghosts here or have other people had device support issues with DFM?
There appears to be a problem with the DFM engines. This is not an issue with device support. If your devices were unsupported, you would not be seeing the symptoms you are seeing. You'll need to go back to TAC, as I'm betting EMC will need to get involved to look into the DFM server operation.
DFM 3.2 supports a number of 3560 switches, including the 3560E. I would have to know the specific sysObjectID to confirm whether or not support exists, though. Typically, when we see devices in a Questioned state in DFM because of an SNMP timeout, DFM is using SNMPv3 and the engine ID has been duplicated across a number of devices. For SNMPv1 and v2c, if other applications on the same server are working, then so should DFM. Troubleshooting this is best done with a sniffer trace to confirm DFM is sending the requests and the devices are replying.
Hi Joseph -
Many thanks for your reply. We do use SNMP version v2c; all other modules use the same default credentials and work fine with the same devices, and we performed all connectivity tests (nslookup, dmctl -s DFM invoke SM_System::SM-System nameToAddr/addrToName, as well as snmpwalk from server to several devices) without problems - it is just DFM that has issues.
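For what it's worth, the forward/reverse name-resolution part of those checks (the nslookup and nameToAddr/addrToName steps) can be sketched in Python; this is only an illustration of the sanity check, not the tool we actually ran, and the hostname in the example is a placeholder:

```python
import socket

def check_name_resolution(hostname):
    """Forward-resolve a hostname, then reverse-resolve the result,
    mimicking the nslookup / nameToAddr / addrToName sanity check."""
    addr = socket.gethostbyname(hostname)    # forward lookup (name -> IP)
    name, _aliases, _addrs = socket.gethostbyaddr(addr)  # reverse lookup
    return addr, name

# Example against the local host; a real check would loop over the switch names.
addr, name = check_name_resolution("localhost")
```

If either direction fails or the two directions disagree, discovery tools that rely on DNS can misbehave even when raw SNMP works.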
Also, I'd love to take your word regarding support for the 3560 and 3560E, but if you look at the Cisco documents I referenced, you will find that that's potentially not the case.
In case you have access to TAC info: SR#615028875
Oh, and here are some representative sysObjectID responses:
RFC1213-MIB::sysObjectID.0 = OID: CISCO-PRODUCTS-MIB::catalyst3560G48PS
RFC1213-MIB::sysObjectID.0 = OID: CISCO-PRODUCTS-MIB::catalyst3560E24TD
RFC1213-MIB::sysObjectID.0 = OID: CISCO-PRODUCTS-MIB::ciscoWSC6504E
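Since the question keeps coming back to sysObjectIDs, here is a small sketch of how the model names can be pulled out of those snmpwalk responses for comparison against a supported-device list (the response strings are the ones above; the parsing helper is just an illustration):

```python
def model_from_sysobjectid(response):
    """Extract the CISCO-PRODUCTS-MIB model name from an snmpwalk line,
    e.g. '... = OID: CISCO-PRODUCTS-MIB::catalyst3560G48PS' -> 'catalyst3560G48PS'."""
    return response.rsplit("::", 1)[-1]

responses = [
    "RFC1213-MIB::sysObjectID.0 = OID: CISCO-PRODUCTS-MIB::catalyst3560G48PS",
    "RFC1213-MIB::sysObjectID.0 = OID: CISCO-PRODUCTS-MIB::catalyst3560E24TD",
    "RFC1213-MIB::sysObjectID.0 = OID: CISCO-PRODUCTS-MIB::ciscoWSC6504E",
]
models = [model_from_sysobjectid(r) for r in responses]
```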
All three of these device types are supported by DFM 3.2. You may need to go to Common Services > Software Center > Device Update to download the latest DFM device support update, though.
Yep, you should have support for all of them. Again, lack of device support would not cause a device to go to Questioned. Since you're using v2c, I would deploy a sniffer trace, and see if you can capture a rediscovery cycle where a device goes to Questioned.
I'll have to ask TAC for help on this; it's slightly beyond my skill/experience level. I'll let you know the results as soon as I have them...
Thanks again for taking the time to advise.
I ran a packet capture using the LMS Device Center tool for UDP ports 161 and 162. Just after beginning the capture process, initially set for 5 minutes, I submitted all 59 devices for rediscovery in DFM. All turned to status Learning, then back to Questioned, at which time I ended the packet capture. Attached is the output file.
Well, with all due respect, this is the Cisco LMS - Device Center - Tools - Packet Capture feature that I used.
Okay, so now I installed Wireshark on that server (mind you, I am not a gearhead who's very familiar with it) and tried to capture/sniff what's going on there. I simply start a live capture on the physical interface and set the filter to "SNMP present" - I do see a handful of incoming SNMP traps every once in a while when an interface on a switch goes up or down somewhere, but starting a rediscovery does not create any visible traffic with this filter. What should I specifically be looking for as a filter setting?
The built-in packet capture is fine. When you stop your capture, the window that shows the list of packet captures should display the new capture file. When you click on this, you'll get a file that ends with ".jet." What you posted was a log file from the Tomcat servlet engine. The .jet file will be a binary file that can be opened in something like Wireshark.
In Wireshark, you'll want to setup a capture filter of "udp port 161". That can be done under Capture > Options. When you do a rediscover in DFM, you should see a few packets (at least four) to/from the device being rediscovered.
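The distinction that filter is meant to surface is polling versus traps, which travel on different UDP ports. A tiny sketch of that classification, with made-up packet tuples standing in for what a real capture would contain:

```python
def classify_snmp(dst_port):
    """Classify SNMP traffic by destination UDP port:
    161 = polling (get/getnext requests toward the device),
    162 = traps (unsolicited notifications toward the manager)."""
    if dst_port == 161:
        return "polling"
    if dst_port == 162:
        return "trap"
    return "other"

# Hypothetical capture summary: (source, destination, destination port).
packets = [
    ("lms-server", "switch-01", 161),   # what a rediscovery request looks like
    ("switch-02", "lms-server", 162),   # link up/down trap
]
kinds = [classify_snmp(p[2]) for p in packets]
```

A healthy rediscovery capture should contain "polling" packets from the LMS server; a capture showing only "trap" traffic means DFM never sent its requests.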
All of these packets are SNMP traps. There are no polling packets here. Did you have a filter enabled for udp/161? What error do you see in DFM after the discovery fails?
As you can see in the attached screenshot, the filter was set to ports udp/161 and udp/162, so both polling and trap traffic should have been captured. DFM always does the same during rediscovery attempts... Questioned->Learning->Questioned.
I verified the results using Wireshark - with filter "SNMP present" (presumably that means any udp/161 and udp/162 traffic) all I see during rediscovery are unrelated traps sent by switches on the network (interface up/down, etc.) but no outbound polling packets.