Cannot backup running configurations in RME 4.3.2, but startup config fine

Aaron.Koves
Level 1

Platform: LMS 3.2.1 with RME 4.3.2 on Windows 2003

I'm having a problem with several devices that were backing up fine until this week. Suddenly their running configurations are no longer being backed up, although RME still fetches their startup configurations and the VTP backups without trouble.

At first I thought it might be timeouts, so I used inline edit to increase the telnet timeout for a device to 180s. However, the job fails well within this time period (debug shows an I/O error?).

My order of protocols is SSH, Telnet, TFTP.

I took a stab in the dark that this suggested a database problem, so I picked one device at random, deleted it from DCR and re-added it, and it worked. However, the same approach did not work for the other 48 affected devices.

I'm wondering if I need to do anything to the RME database to get things back to where they were? Do I need to reinitialize the RME database, and if I do that what do I lose?

Some relevant debug information for the interesting thread (Thread-517) is in an attached file, but here's the summary:

[ Fri Jan 18  14:02:31 GMT 2013 ],INFO ,[Thread-11],com.cisco.nm.rmeng.dcma.configmanager.ConfigManager,updateArchiveForDevice,1505,CM0060 PRIMARY STARTUP Config fetch SUCCESS for 3750-cg-4Central, version number 1 archived.

[ Fri Jan 18  14:02:48 GMT 2013 ],ERROR,[Thread-11],com.cisco.nm.rmeng.dcma.configmanager.ConfigManager,updateArchiveForDevice,1357,PRIMARY RUNNING Config fetch Failed for 3750-cg-4Central

[ Fri Jan 18  14:02:56 GMT 2013 ],INFO ,[Thread-11],com.cisco.nm.rmeng.dcma.configmanager.ConfigManager,updateArchiveForDevice,1505,CM0060 VLAN RUNNING Config fetch SUCCESS for 3750-cg-4Central, version number 1 archived.
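
With 48 devices affected, it may help to pull the failures out of the job log automatically. Below is a minimal Python sketch, assuming the log lines follow the same ConfigManager format as the three lines above; the function name `failed_running_fetches` and the regular expression are illustrative and not part of RME.

```python
import re

# Matches ConfigManager archive lines such as:
#   ...,PRIMARY RUNNING Config fetch Failed for 3750-cg-4Central
#   ...,CM0060 VLAN RUNNING Config fetch SUCCESS for 3750-cg-4Central, version ...
# The pattern is an assumption based only on the lines quoted in this post.
FETCH_RE = re.compile(
    r"(?P<kind>PRIMARY|VLAN) (?P<state>RUNNING|STARTUP) Config fetch "
    r"(?P<result>SUCCESS|Failed) for (?P<device>[^,\s]+)"
)


def failed_running_fetches(lines):
    """Return the sorted device names whose PRIMARY RUNNING fetch failed."""
    failed = set()
    for line in lines:
        m = FETCH_RE.search(line)
        if (
            m
            and m["kind"] == "PRIMARY"
            and m["state"] == "RUNNING"
            and m["result"] == "Failed"
        ):
            failed.add(m["device"])
    return sorted(failed)


# The three summary lines from this post, used as sample input.
SAMPLE = [
    "[ Fri Jan 18  14:02:31 GMT 2013 ],INFO ,[Thread-11],com.cisco.nm.rmeng.dcma.configmanager.ConfigManager,updateArchiveForDevice,1505,CM0060 PRIMARY STARTUP Config fetch SUCCESS for 3750-cg-4Central, version number 1 archived.",
    "[ Fri Jan 18  14:02:48 GMT 2013 ],ERROR,[Thread-11],com.cisco.nm.rmeng.dcma.configmanager.ConfigManager,updateArchiveForDevice,1357,PRIMARY RUNNING Config fetch Failed for 3750-cg-4Central",
    "[ Fri Jan 18  14:02:56 GMT 2013 ],INFO ,[Thread-11],com.cisco.nm.rmeng.dcma.configmanager.ConfigManager,updateArchiveForDevice,1505,CM0060 VLAN RUNNING Config fetch SUCCESS for 3750-cg-4Central, version number 1 archived.",
]

if __name__ == "__main__":
    print(failed_running_fetches(SAMPLE))
```

To run it against a real log, pass an open file object instead of `SAMPLE`; the log file path depends on the installation and is not shown here.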

Looks like it's shown here:

Sending the command: show running

[ Fri Jan 18  15:03:20 GMT 2013 ],DEBUG,[Thread-517],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,debug,31, > In LocalProxy.process( OpSend )

[ Fri Jan 18  15:03:20 GMT 2013 ],DEBUG,[Thread-517],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,debug,31,SessionContext(1) found

[ Fri Jan 18  15:03:20 GMT 2013 ],DEBUG,[Thread-6],com.cisco.nm.rmeng.dcma.configmanager.ConfigManager,getStatusOfRequest,937,Total devices due for reqId = 1358521395296 is 1

[ Fri Jan 18  15:03:21 GMT 2013 ],DEBUG,[Thread-517],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,debug,31,Setting Transport Value as : 2000

[ Fri Jan 18  15:03:21 GMT 2013 ],DEBUG,[Thread-517],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,debug,31,<SessionContext:1>_session._stack.size() == 2

[ Fri Jan 18  15:03:21 GMT 2013 ],DEBUG,[Thread-517],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,debug,31,<SessionContext:1>(local)_stack.size() == 2

[ Fri Jan 18  15:03:21 GMT 2013 ],DEBUG,[Thread-517],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,debug,31,<SessionContext:1>leveling stacks to index == 1

[ Fri Jan 18  15:03:21 GMT 2013 ],DEBUG,[Thread-517],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,debug,31,Exiting levelStacks()

[ Fri Jan 18  15:03:23 GMT 2013 ],ERROR,[Thread-517],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,error,19,IOException received during block() of Channel[UInt32[ 0 ]:UInt32[ 3 ]]

[ Fri Jan 18  15:03:23 GMT 2013 ],DEBUG,[Thread-517],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,printStackTrace,51,stacktracejava.net.SocketTimeoutException: Read timed out
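
The gap between the "Setting Transport Value as : 2000" line and the "Read timed out" line can be measured directly from the timestamps. The following Python sketch does that for the two relevant lines from the debug output; the helper names are hypothetical, and it assumes the value in the log is in milliseconds and that the day and month names are in English. The log has one-second resolution, so the result is only approximate.

```python
import re
from datetime import datetime

STAMP_RE = re.compile(r"^\[\s*(.*?)\s*\]")
TRANSPORT_RE = re.compile(r"Setting Transport Value as : (\d+)")


def parse_stamp(line):
    """Parse a leading '[ Fri Jan 18  15:03:21 GMT 2013 ]' timestamp."""
    m = STAMP_RE.match(line)
    if not m:
        return None
    # Collapse the double space the log puts between the date and the time.
    text = " ".join(m.group(1).split())
    return datetime.strptime(text, "%a %b %d %H:%M:%S GMT %Y")


def seconds_until_timeout(lines):
    """Return (transport value, seconds elapsed until 'Read timed out')."""
    start = None
    transport_ms = None
    for line in lines:
        m = TRANSPORT_RE.search(line)
        if m:
            start = parse_stamp(line)
            transport_ms = int(m.group(1))
        elif "Read timed out" in line and start is not None:
            return transport_ms, (parse_stamp(line) - start).total_seconds()
    return None


# Two lines copied from the Thread-517 debug output above.
SAMPLE = [
    "[ Fri Jan 18  15:03:21 GMT 2013 ],DEBUG,[Thread-517],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,debug,31,Setting Transport Value as : 2000",
    "[ Fri Jan 18  15:03:23 GMT 2013 ],DEBUG,[Thread-517],com.cisco.nm.xms.xdi.transport.cmdsvc.LogAdapter,printStackTrace,51,stacktracejava.net.SocketTimeoutException: Read timed out",
]

if __name__ == "__main__":
    print(seconds_until_timeout(SAMPLE))
```

For these two lines the elapsed time is about 2 seconds, which is consistent with a 2000 ms transport value and far shorter than a 180s telnet timeout.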

2 Replies

Joel Monge
Cisco Employee

Looks like the telnet/SSH timeout was set very low:

Telnet timeout : 2

Set this to 36 seconds (the default) and try a few devices again. If that doesn't help, a full job log with ArchiveMgmt-ConfigJob debugging enabled will be helpful.

Hi Joel,

Thanks for your reply. The telnet timeout was actually set to 180s for that device when the job was run. Because the job failed in well under 180s, I concluded that it wasn't telnet timing out but the Java socket, so I edited the NMS_ROOT\objects\cmf\data\cmdsvc.properties file as follows:

#################################################################################################
#            #
#  cmdsvc.properties - This file can be used to change the default timeout and delays in cmdsvc #
#            #
#################################################################################################

# max timeout value for waiting for login prompt from a device(in ms)

#LoginTimeout=2000

# Sets the initial value of the socket timeout while doing login (in ms)

InitialTransportTimeout=5000

# Set the timeout value of the transport.  This usually corresponds to the socket's timeout value.(in ms)

TransportTimeout=45000

# Set the delay to sleep before proceeding.(in ms)

TuneSleepMillis=500

# Set delay after creating socket and before doing any communication.(in ms)

DelayAfterConnect=300

# Set delay between read process (in ms)

#ReadDelay=10

This fixed my issue. Glad this file was introduced in 4.3.
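
As a quick sanity check on an edited file, the active (uncommented) entries can be listed and converted to seconds. This is a minimal Python sketch, assuming the file uses plain `key=value` lines with `#` comments as shown above; `load_properties` and `timeouts_in_seconds` are illustrative helpers, not part of LMS.

```python
def load_properties(text):
    """Parse simple key=value lines; blank lines and '#' comments are skipped."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def timeouts_in_seconds(props, keys=("InitialTransportTimeout", "TransportTimeout")):
    """Convert the millisecond timeout entries that are present to seconds."""
    return {k: int(props[k]) / 1000 for k in keys if k in props}


# The entries as posted above (comment text shortened).
SAMPLE = """\
#LoginTimeout=2000
InitialTransportTimeout=5000
TransportTimeout=45000
TuneSleepMillis=500
DelayAfterConnect=300
#ReadDelay=10
"""

if __name__ == "__main__":
    props = load_properties(SAMPLE)
    print(sorted(props))
    print(timeouts_in_seconds(props))
```

Commented-out entries such as `#LoginTimeout=2000` are deliberately ignored, so only settings that are actually in effect in the file are reported.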

regards,

Aaron