DCNM-LAN [Version: 7.1(1)] - Device Discovery timeout

RajkumarG1
Level 1

Dear All,

I have DCNM 7.1 installed on Windows Server 2008 R2 (64-bit) as a VM, and I often see "Discovery timeout" on a Nexus 7710. This results in inconsistent report generation, and the switch becomes unmanageable. Can anyone shed some light on this?

 

Accepted Solutions

Eric Scott
Cisco Employee

Hi Raj,

The issue is usually caused by slow SNMP responses from the devices, which take longer than the timeout value in DCNM. You can check the average SNMP round-trip time in the fms_dump.1 log file. To generate the log file, run the techsupport script, located at $INSTALLDIR/dcm/fm/bin/techsupport.bat; the fms_dump.1 log then appears in $INSTALLDIR/dcm/fm/log/. Open fms_dump.1 and search for the string 'SNMP stats'. It will look something like this:

***************************** SNMP stats *****************************

Average Round Trip Delay, etc:
        14.2.26.76(tcp/v3/eriscott):    11 ms, Rx: 153900 / 210 = (732 bytes avg), 0 discards, 0 SETs
        14.2.26.79(tcp/v3/eriscott):    32 ms, Rx: 110895 / 147 = (754 bytes avg), 2 discards, 0 SETs

The average round trip should be no more than 100 ms; otherwise, you will see periodic timeouts.
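As a rough illustration of the check described above, here is a minimal Python sketch that scans the text of fms_dump.1 for the 'SNMP stats' section and lists devices whose average round trip exceeds 100 ms. The function name `slow_devices` and the regular expression are my own assumptions, based only on the two sample lines shown above; the real file may contain other line shapes that this does not handle.

```python
import re

# Assumed line shape, taken from the sample output above, e.g.:
#   14.2.26.76(tcp/v3/eriscott):    11 ms, Rx: 153900 / 210 = ...
STAT_LINE = re.compile(r"^\s*(?P<host>[^\s(]+)\([^)]*\):\s+(?P<rtt>\d+)\s*ms")


def slow_devices(dump_text, threshold_ms=100):
    """Return (host, rtt_ms) pairs whose average round trip exceeds threshold_ms."""
    in_section = False
    slow = []
    for line in dump_text.splitlines():
        if "SNMP stats" in line:
            # Start of the section we care about.
            in_section = True
            continue
        if in_section and line.lstrip().startswith("*****"):
            # Assumption: another banner of asterisks starts the next section.
            break
        if not in_section:
            continue
        match = STAT_LINE.match(line)
        if match and int(match.group("rtt")) > threshold_ms:
            slow.append((match.group("host"), int(match.group("rtt"))))
    return slow
```

To use it, read the file with something like `open(path, errors="replace").read()` and pass the text to `slow_devices`; the path depends on your install directory.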

In the DCNM web GUI, you can change the SNMP timeout and retry values on the [Admin > server properties] page. Use the search tool (magnifying glass in the upper right corner) to search for snmp.timeout and snmp.retries. I'd increase the timeout to 20000 ms and the retries to 2.
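To get a feel for what these two values mean together, here is a small Python sketch of the worst-case wait for one unanswered request. It assumes the common SNMP client behavior of one initial attempt plus `retries` further attempts, each waiting the full timeout. That is an assumption on my part, not confirmed DCNM behavior, and `worst_case_wait_ms` is a hypothetical helper name.

```python
def worst_case_wait_ms(timeout_ms, retries):
    """Upper bound on the wait for one unanswered request.

    Assumes 1 initial attempt plus `retries` retries, each waiting
    the full timeout (not confirmed for DCNM).
    """
    if timeout_ms < 0 or retries < 0:
        raise ValueError("timeout_ms and retries must be non-negative")
    return timeout_ms * (1 + retries)


# With the values suggested above: 20000 ms timeout, 2 retries.
print(worst_case_wait_ms(20000, 2))  # 60000
```

Under that assumption, a device that never answers would hold up a request for about a minute with these settings, so larger values trade faster failure detection for fewer false timeouts.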

Thanks,

Eric


6 Replies

Hi Eric,

Thanks for your response.

I have attached a screenshot, and I find the average response to be in the range of 80 ms. I have followed your advice and changed the SNMP retries and timeout values. I will get back to you with the progress. Thanks in advance.

Hi Eric,

The change has not helped. I can still see "Discovery timeout".

Thanks in advance.

The events page only shows the N7700s as switching between manageable and unmanageable.

Does this occur with the N3Ks? 

Are the N3Ks and the N7700s in different locations?

When DCNM shows the N7700s as unreachable, are the switches reachable via ping or snmp from DCNM?

If the device is in an unmanaged state, can you force a rediscovery to get the device to be reachable again?

In the $INSTALLDIR/dcm/fm/logs/fmserver.log file, do you see SNMP timeouts around the time that the device becomes unmanageable?
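One simple way to look for those entries is a plain text filter over the server log. The Python sketch below returns every line that mentions both "snmp" and "timeout", case-insensitively. It makes no assumption about the log's timestamp or message format, so you still have to compare the matching lines against the time of the event by eye; `snmp_timeout_lines` is a hypothetical helper name.

```python
def snmp_timeout_lines(lines):
    """Return (line_number, text) for lines mentioning both 'snmp' and 'timeout'."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        # Plain substring match; no assumption about the log layout.
        if "snmp" in lowered and "timeout" in lowered:
            hits.append((number, line.rstrip("\n")))
    return hits
```

A file object can be passed directly, for example `snmp_timeout_lines(open(path, errors="replace"))`, with `path` pointing at the server log in your install directory.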

Thanks

Dear Eric,

The N3Ks are in the internet segment behind a firewall and the N7Ks are in the intranet; all four devices are in the same location. The N3Ks do become unmanageable, but very rarely, and the reports generated are consistent.

When DCNM shows a discovery timeout for the N7Ks, they are reachable via ping, but SNMP timeouts are noted.

I did try to force a rediscovery, but in vain.

Thanks in advance.

Hi Eric,

The issue was resolved after increasing the snmp.timeout value to 70000 and snmp.retries to 3. Thanks a lot for your valuable input.
