
Cannot Connect to ANIserver

rodrigopitta
Level 1

Hi.

After a reboot of the LMS 2.6 server, the CM home page started showing the following message: "Cannot connect to ANI Server since it is down". I followed the procedure in the thread "Ciscoworks ANI server cannot connect, Joe Clark please help", but in the end, after I restarted the Daemon Manager, all services came up except for the ANI database engine. After I started this service manually, it came up, but the ANI.log file reappeared and the message on the CM home page became "Please wait... ANIServer is still initializing."

The result for the pdreg command is below:

C:\>pdreg -l ANIDbEngine
        Process      = ANIDbEngine
        Path         = C:\PROGRA~1\CSCOpx\objects\db\win32\dbsrv9.exe
        Flags        =
        Startup      = Started automatically at boot.
        Dependencies = Not applicable

The result for the pdshow command is attached.

I hope Joe Clark or anyone else can help me too.

Thanks in advance.

29 Replies

What about the output of the HOSTNAME command from the DOS shell?

It is correct:

C:\>hostname

cwsctx03rjorp

Is it possible to restore a backup only for the JRM, without losing DCR or RME information?

It looks like there might be a timing issue.  Shutdown Daemon Manager, then empty out the contents of the jrm.log.  Restart Daemon Manager, then when pdshow returns valid data, post that output along with the new jrm.log.
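The stop, empty, restart sequence above can be scripted. The following is only a minimal sketch under assumptions that do not come from this thread: it assumes Daemon Manager runs as the Windows service `crmdmgtd`, and that the caller passes the real path to jrm.log (commonly under NMSROOT\log, but check your own installation). The function name `restart_with_clean_log` is made up for illustration.

```python
import subprocess
from pathlib import Path

# Assumption: Daemon Manager is registered as the Windows service "crmdmgtd".
SERVICE = "crmdmgtd"


def restart_with_clean_log(log_path, run=subprocess.run):
    """Stop Daemon Manager, empty the given log file, start Daemon Manager.

    `run` is injectable so the sequence can be rehearsed without touching
    the real service.
    """
    log = Path(log_path)
    run(["net", "stop", SERVICE], check=True)
    # Truncate instead of deleting, so the file itself stays in place.
    log.write_text("")
    run(["net", "start", SERVICE], check=True)
    return log
```

After the restart, wait until pdshow returns valid data before collecting the new jrm.log.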

Please find attached the jrm.log, the old jrm.log (as jrm.old.log) and the pdshow output.

After restarting the Daemon Manager, the sm_server.exe process started consuming a great amount of CPU time. This has happened before, and another restart of Daemon Manager solved it. Do you think this is somehow related to the jrm problem?

There is no jrm problem according to this pdshow and this log.  In fact, I see no process problem at all.  As for sm_server, it is expected for these processes to take a lot of CPU time on restart.  These are the DFMServer processes that have to rebuild the internal topology of the network.  They will also take considerable CPU time if you are sending lots of traps to LMS, or if they are currently doing polling of the devices.

DFM has a very particular profile when it comes to scalability.  You need to make sure you are not managing too many interfaces/ports and that you are not sending too many traps to DFM.

For the record:

I restored a backup on a new LMS installation and the problem persisted. I had no more time, so I reinitialized the database and set everything up from scratch manually. Now it is working fine.

Regarding DFM scalability, I have only 1600 devices. Do you think this number could be too many?

Thank you for the help.

Possibly.  Post the output of the following command:

NMSROOT/objects/smarts/bin/sm_tpmgr -s DFM --sizes

This command asks for a DFM user, and none of the users I created when installing LMS worked.  Is there any way I can retrieve or reset these credentials?

C:\>"c:\Program Files\CSCOpx\objects\smarts\bin\sm_tpmgr" -s DFM --sizes

Server DFM User: admin

admi's Password: XXXXXXXXXXXX

[23-Jul-2010 9:44:29 AM+836ms] t@11124

ASL-E-ERROR_INIT_BACKEND-While initializing server connection to 'DFM'

SM-EREFUSED-No process is connected to the specified location

The password is probably just admin.  However, it can be found in the NMSROOT/objects/smarts/conf/serverConnect.conf file.

That is what serverConnect.conf says. But the output for the command is still the same.

Post the output of the pdshow command.

Hi.

Although I did a fresh LMS installation, after rediscovering all devices the new installation is showing the same error on the RME home page: "JRM Service could be down. Check whether JRM  services are running.".

From the pdshow results, I can see JRM has been "Waiting to initialize" since last Friday. The sm_server process is no longer running and the CPU activity is low right now. The pdshow results and jrm.log are attached.

Do you think DFM is causing this problem? My great concerns are RME and CM, so I could reduce DFM functionality, or even disable it, if necessary.

Jrm is taking way too long to start.  It is only allowed 30 seconds, but it looks like it's taking over four minutes.  Assuming the server is idle, you can try doing:

pdterm jrm

pdexec jrm

To see if it starts.  If it does, it could be that the server is too busy during Daemon Manager start time, and the CPU load is preventing jrm from starting correctly.  What are the specs of this server?
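To check whether jrm actually came up after the pdterm/pdexec pair, the pdshow output can be scanned for processes that are not in a healthy state. This is a rough sketch, not a Cisco tool: it assumes pdshow prints blocks of `Key = value` lines with `Process` and `State` keys, and that the two state strings in `OK_STATES` are the healthy ones. The sample text is made up for illustration.

```python
# Assumption: these two state strings indicate a healthy process.
OK_STATES = ("Running normally", "Program started - No mgt msgs received")


def parse_pdshow(text):
    """Return {process_name: state} from pdshow-style 'Key = value' lines."""
    states = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a 'Key = value' line
        key, value = key.strip().lower(), value.strip()
        if key == "process":
            current = value
        elif key == "state" and current is not None:
            states[current] = value
    return states


def not_running(states, ok_states=OK_STATES):
    """Keep only the processes whose state does not start with an OK state."""
    return {name: state for name, state in states.items()
            if not state.startswith(ok_states)}


# Illustrative sample only, not taken from the attached pdshow output.
sample = """
    Process= jrm
    State  = Waiting to initialize.
    Pid    = 0
    Process= ANIDbEngine
    State  = Program started - No mgt msgs received
    Pid    = 1234
"""
problems = not_running(parse_pdshow(sample))
```

Saving the pdshow output to a file and passing its contents to `parse_pdshow` would list any process, such as jrm, that is still stuck.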

Before your last message, I had disabled all DFM polling and SNMP trap receiving. After that, the RME home page started working normally and I was able to manage jobs once again. You were right, the server couldn't handle all the traps and the polling, but even after disabling those, the CM home page still shows the JRM down message. I tried stopping and restarting JRM, as you suggested, but the problem persisted.

The system is Windows 2003 on VMWare ESXi. The hardware is a dual Xeon with 4 GB of RAM dedicated to the LMS server. I don't have the clock speed information now, but I can provide it tomorrow.

P.S.: Is there any clean way to prevent DFM from processing SNMP traps? For test purposes, I changed the SNMP port, but that makes Windows generate lots of ICMP port unreachable packets, and I don't intend to leave it this way.

Unfortunately, VMWare is not supported in LMS 2.6.  That could be contributing to your performance issues.  If you need to use VMWare, you will have to upgrade to LMS 3.x.  The only way to stop DFM from processing the traps is to stop sending them to DFM.  You can change your device configs to only send traps which DFM understands.  See http://www.cisco.com/en/US/partner/docs/net_mgmt/ciscoworks_device_fault_manager/2.0_IDU_2.0.6/user/guide/TrapFwd.html for the list.
