Cisco Prime Infrastructure 2.0 - NMS Service stopped

Dhwanit.vaidya · ‎12-19-2013

Hi,

Has anyone faced this issue with PI 2.0 in a VMware environment?

Even after a reimage, the status is the same:

CiscoPrime/admin# show application status NCS
Health Monitor Server is starting.
Ftp Server is running
Database server is running
Tftp Server is running
Matlab Server is running
NMS Server is stopped.
CNS Gateway with port 11011 is down
CNS Gateway SSL with port 11012 is down
CNS Gateway with port 11013 is down
CNS Gateway SSL with port 11014 is down
Plug and Play Gateway config, image and resource are down on https
Plug and Play Gateway config, image and resource are down on http
Plug and Play Gateway is stopped.
SAM Daemon is running ...
DA Daemon is running ...
Syslog Daemon is running ...
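
Editor's sketch: when comparing status output like the above across restarts, it can help to pull out only the services that are not running. The Python below is a minimal sketch, not a Cisco tool. It assumes the output has been pasted or saved as plain text, and that every status line has the shape "<name> is/are <state>" using only the state words seen in this thread (running, starting, stopped, down). The names `parse_status` and `not_running` are made up for this example.

```python
import re

# Abbreviated copy of the status output posted above.
SAMPLE = """\
Health Monitor Server is starting.
Ftp Server is running
Database server is running
Tftp Server is running
Matlab Server is running
NMS Server is stopped.
CNS Gateway with port 11011 is down
Plug and Play Gateway config, image and resource are down on https
Plug and Play Gateway is stopped.
SAM Daemon is running ...
"""

# Assumed line shape: "<name> is|are <state><optional trailing text>".
STATUS_RE = re.compile(
    r"^(?P<name>.+?)\s+(?:is|are)\s+"
    r"(?P<state>running|starting|stopped|down)\b(?P<rest>.*)$"
)


def parse_status(text):
    """Return (service, state) pairs for every line that looks like a status line."""
    pairs = []
    for line in text.splitlines():
        match = STATUS_RE.match(line.strip())
        if not match:
            continue  # ignore prompts, blank lines, and free-form messages
        name = match.group("name")
        # Keep trailing qualifiers such as "on https" so similar lines stay distinct.
        detail = match.group("rest").strip(" .,")
        if detail:
            name = f"{name} ({detail})"
        pairs.append((name, match.group("state")))
    return pairs


def not_running(text):
    """Return the names of services whose state is anything other than 'running'."""
    return [name for name, state in parse_status(text) if state != "running"]


for service in not_running(SAMPLE):
    print(service)
```

Lines that do not fit the assumed shape are skipped rather than guessed at, so an unexpected message is silently ignored and should still be read by eye.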

Rob Johnson · ‎12-19-2013

Prime has a patch for 2.0:

pi_update_2.0-3.zip

You can find it here:

http://software.cisco.com/download/release.html?mdfid=284422771&flowid=45323&softwareid=284272933&release=2.0.10&relind=AVAILABLE&rellifecycle=&reltype=all

Here's a link to the readme:

http://www.cisco.com/web/software/284272933/92867/README_for_PI_2_0_UBF_Patch.rtf

It fixes the bug below, which I'll bet is what is causing your problem:

CSCuh73208 PI 1.3 /2.0 server crashes when accessing maps on the server

The readme linked above tells you how to apply the patch.

jerry.larson · ‎02-06-2014

I just installed this patch and now I get "NMS Server is stopped."

powys · ‎02-13-2014

Ours is patched but we have two large site maps which consistently crash the system when viewed or edited.

Don't know if it's down to the number of APs or the number of obstacles.

Two maps with 29 and 32 APs respectively: both crash the web interface, and recovering requires running ncs stop and ncs start from an SSH session.

Another two maps with 25 and 26 APs respectively (our third and fourth biggest number of APs on a single map): no problem.

When we happened to be SSH'd into CPI at the time of the crash, we spotted this error:

*** glibc detected *** /opt/CSCOlumos/jre/bin/java: corrupted double-linked list: 0x0000000059aecfb0 ***

*** glibc detected *** /opt/CSCOlumos/jre/bin/java: free(): invalid pointer: 0x000000005a887b80 ***
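
Editor's sketch: messages of this form come from glibc's heap consistency checks and generally point to memory corruption in native code running inside the named process, here the bundled java binary. If the console output from several crashes has been captured as text, a small script can pull out which binary, which check, and which address were reported so the crashes can be compared. This is only a sketch under that assumption: the regular expression is fitted to the two lines quoted above, and `find_glibc_errors` is a made-up name.

```python
import re

# The two lines quoted in the post above.
SAMPLE = """\
*** glibc detected *** /opt/CSCOlumos/jre/bin/java: corrupted double-linked list: 0x0000000059aecfb0 ***
*** glibc detected *** /opt/CSCOlumos/jre/bin/java: free(): invalid pointer: 0x000000005a887b80 ***
"""

# Fitted to the shape "*** glibc detected *** <binary>: <error>: <address> ***".
GLIBC_RE = re.compile(
    r"\*\*\* glibc detected \*\*\* (?P<binary>\S+): "
    r"(?P<error>.+?): (?P<address>0x[0-9a-fA-F]+) \*\*\*"
)


def find_glibc_errors(text):
    """Return (binary, error, address) tuples for each glibc report found in text."""
    return [
        (m.group("binary"), m.group("error"), m.group("address"))
        for m in GLIBC_RE.finditer(text)
    ]


for binary, error, address in find_glibc_errors(SAMPLE):
    print(f"{binary}: {error} at {address}")
```

The script only extracts what was reported. It says nothing about which native library inside the Java process caused the corruption.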

These maps all worked correctly under CPI 1.3 (in fact we never experienced a crash under 1.3) but both 2.0 and 2.0.3 crash with the 29 AP and 32 AP maps. We've had to warn staff with system monitoring accounts not to view them.

Joshua Hall · ‎02-27-2014

Same issue here -- except these are brand new maps, and I've only got 2 APs on the map. It seems to crash maps at random. Exporting, deleting, and re-importing doesn't fix the problem either. I get a bug ID in the status bar when the issue happens if viewing in IE, but the bug toolkit says it's not a public bug ID and asks where I even saw the ID...

Frustrating that it happens randomly. These maps aren't exactly a 30 second deal to set up...

powys · ‎12-18-2014

Just a final update on this: we've just upgraded to 2.1.2 (in-place upgrade) and it's fixed the issue where maps with more than 28 APs crashed the web service. Our biggest map has 32 APs and over 1,500 obstacles and it's working perfectly once more.

Takes a while to reboot (about 15 minutes for all services to start) but at least we know that this is normal behaviour for our system and not to panic.

mikeya.herbert · ‎11-16-2015

I am running v2.1.2 and I am experiencing the same issue. Did you install a patch along with the upgrade to v2.1.2? What version did you upgrade from? I upgraded from v1.4.

m.o.andersson_2 · ‎04-23-2014

Not sure whether my problem is connected to this bug or not, but NCS fails to start. I get this output:

Health Monitor is running, with an error.
failed to start PI on startup Health Monitor
Ftp Server is running
Database server is running
Tftp Server is running
Matlab Server is running
NMS Server is stopped.
CNS Gateway with port 11011 is down
CNS Gateway SSL with port 11012 is down
CNS Gateway with port 11013 is down
CNS Gateway SSL with port 11014 is down
Plug and Play Gateway Broker with port 61617 is down
Plug and Play Gateway config, image and resource are down on https
Plug and Play Gateway config, image and resource are down on http
Plug and Play Gateway is stopped.
SAM Daemon is running ...
DA Daemon is running ...
Syslog Daemon is running ...

The startup takes forever, and launchout.log shows only the following error:

# cat /opt/CSCOlumos/logs/launchout.log
Starting Health Monitor as a primary
Checking for Port 8082 availability... OK
truststore used is /opt/CSCOlumos/conf/truststore
truststore used is /opt/CSCOlumos/conf/truststore
CERT MATCHED :
Updating web server configuration file ...
Starting Health Montior Web Server...
Health Monitor Web Server Started.
Starting Health Monitor Server...
Health Monitor Server Started.
Database server started.
HMMain: StartNCS method with kill stale
WCSAdmin::startServices

Processing Service Name: Ftp
Starting Remoting Service: Ftp Server

Processing Service Name: Database
Database is already running.
Stopping Xvfb...
Starting Xvfb...
system property before init instance: null
Starting Remoting Instance: Ftp Server
Checking for Port 20558 availability... OK
Starting up FTP server
Started FTP
FTP Server started
Starting Remoting Service Web Server Ftp Server...
Remoting Service Web Server Ftp Server Started.
Starting Remoting Service Ftp Server...

Processing Service Name: Tftp
Remoting 'Ftp Server' started successfully.
Starting Remoting Service: Tftp Server

Processing Service Name: Matlab
Starting Remoting Service: Matlab Server
Stopping Xvfb...
Starting Xvfb...
system property before init instance: null
Starting Remoting Instance: Tftp Server
Checking for Port 20559 availability... OK
Starting Remoting Service Web Server Tftp Server...
Remoting Service Web Server Tftp Server Started.
Starting Remoting Service Tftp Server...
Remoting 'Tftp Server' started successfully.

Processing Service Name: NMS Server
Stopping Xvfb...
Starting Xvfb...
system property before init instance: null
Starting Remoting Instance: Matlab Server
Checking for Port 20555 availability... OK
Starting Remoting Service Web Server Matlab Server...
Remoting Service Web Server Matlab Server Started.
Starting Remoting Service Matlab Server...
Remoting 'Matlab Server' started successfully.
DEPENDENCY CHECK: Database
Starting NMS Server
Checking for running servers.
Checking if DECAP is running.
00:00 DECAP is not running.
00:00 Check complete. No servers running.
00:11 DECAP setup complete.
Starting Server ...

Done waiting DB initialization
Starting SAM daemon...
Done.
Starting DA daemon...
Starting DA syslog daemon...
Creating Application Context
Failed to create Application context
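
Editor's sketch: in a long startup log like this one, the single failing step is easy to miss. The Python below is a minimal sketch that finds lines mentioning a failure and shows the few lines that preceded them. It assumes the log text has already been read into a string, for example from a copy of /opt/CSCOlumos/logs/launchout.log. The keyword list and the name `find_failures` are illustrative choices, not anything defined by Prime Infrastructure.

```python
import re

# Tail of the launchout.log excerpt posted above.
SAMPLE = """\
Starting SAM daemon...
Done.
Starting DA daemon...
Starting DA syslog daemon...
Creating Application Context
Failed to create Application context
"""

# Illustrative keywords; extend as needed for other logs.
FAILURE_RE = re.compile(r"\b(?:fail(?:ed|ure)?|error|exception)\b", re.IGNORECASE)


def find_failures(text, context=2):
    """Return (line_number, line, preceding_lines) for each line that mentions a failure."""
    lines = text.splitlines()
    hits = []
    for index, line in enumerate(lines):
        if FAILURE_RE.search(line):
            start = max(0, index - context)
            # Line numbers are 1-based to match what an editor or `cat -n` shows.
            hits.append((index + 1, line, lines[start:index]))
    return hits


for number, line, before in find_failures(SAMPLE):
    for earlier in before:
        print(f"      {earlier}")
    print(f"{number:>4}: {line}")
```

This only locates the failing step. It cannot explain why the application context failed to be created, which launchout.log itself does not say.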

I'm not sure whether upgrading will do the trick, and I want to be sure before I try it so that I don't break things even further. I cannot access the web GUI either, so I will need to do it from the CLI.

I'm also not sure how to get any more information about the problem. Is there any way to run a debug or something similar to see why creating the "Application context" fails?

Cisco Prime Infrastructure
------------------------------------------
Version : 2.0.0.0.294

Cheers // Mattias

Chris Murray · ‎08-29-2015

I'm having the same issue. Did you ever find a resolution?

m.o.andersson_2 · ‎08-30-2015

Backup, fresh install and upgrade to the latest version... I'm afraid I don't remember exactly how I solved it; it was a long time ago. :)

Giuliano Gerardi · ‎05-09-2014

Did anyone solve this?

Same issue here.