06-08-2015 12:50 AM - edited 07-05-2021 03:21 AM
Hello, experts!!!
We have two Prime NCS servers set up with HA.
Both suddenly went down, and after three restarts ncs stat shows the following:
Cisco-NCS-Pri/admin# ncs stat
Health Monitor is running, with an error.
initHealthMonitor(): can not start DB
Ftp Server is Stopped
Database server is stopped
Tftp Server is Stopped
Matlab Server is Stopped
NMS Server is stopped.
CNS Gateway with port 11011 is down
CNS Gateway SSL with port 11012 is down
CNS Gateway with port 11013 is down
CNS Gateway SSL with port 11014 is down
Plug and Play Gateway Broker with port 61617 is down
Plug and Play Gateway config, image and resource are down on https
Plug and Play Gateway config, image and resource are down on http
Plug and Play Gateway is stopped.
SAM Daemon is not running ...
DA Daemon is not running ...
Syslog Daemon is not running ...
Compliance engine is not running
Cisco-NCS-Pri/admin#
What can be done to restore them to their last known state?
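When the status dump is this long, it can help to sort the lines by state before deciding what to fix first. Below is a minimal sketch, assuming the output has been saved as text. The function name classify and the phrase lists are my own assumptions based on the output pasted above; this is not a Cisco tool.

```python
import re

# Phrases taken from the `ncs stat` output quoted in this thread (assumed, not exhaustive).
DOWN = re.compile(r"\b(is stopped|is down|are down|is not running)\b", re.IGNORECASE)
UP = re.compile(r"\b(is running|is up|are up)\b", re.IGNORECASE)


def classify(status_text):
    """Sort status lines into running, stopped, and degraded lists."""
    result = {"running": [], "stopped": [], "degraded": []}
    for raw in status_text.splitlines():
        # Drop surrounding whitespace and trailing dots such as "...".
        line = raw.strip().rstrip(". ")
        if DOWN.search(line):
            result["stopped"].append(line)
        elif "with an error" in line.lower():
            # e.g. "Health Monitor is running, with an error"
            result["degraded"].append(line)
        elif UP.search(line):
            result["running"].append(line)
        # Anything else (prompts, error detail lines) is ignored.
    return result


if __name__ == "__main__":
    sample = """Health Monitor is running, with an error.
initHealthMonitor(): can not start DB
Ftp Server is Stopped
Database server is stopped
SAM Daemon is not running ...
"""
    for state, lines in classify(sample).items():
        print(state, len(lines))
```

In the output above, the database is among the stopped services and the Health Monitor reports that it cannot start the DB, so the database is the first thing to look at.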
06-08-2015 07:03 AM
I found the following links; please check them.
http://www.cisco.com/c/en/us/td/docs/net_mgmt/prime/infrastructure/1-2/user/guide/prime_infra_ug/maint_sys_health.html#wp1070549
http://www.cisco.com/c/en/us/td/docs/net_mgmt/prime/infrastructure/2-0/administrator/guide/PIAdminBook/config_HA.html
06-09-2015 01:39 AM
We have backed up the *.gpg files, including the most recent ones.
If I run ncs db reinitdb:
1. Will the following stopped services be restored to operational states:
Health Monitor is running, with an error.
runHealthMonitor(): failed to start Database
Ftp Server is Stopped
Database server is stopped
Tftp Server is Stopped
Matlab Server is Stopped
NMS Server is stopped.
CNS Gateway with port 11011 is down
CNS Gateway SSL with port 11012 is down
CNS Gateway with port 11013 is down
CNS Gateway SSL with port 11014 is down
Plug and Play Gateway Broker with port 61617 is down
Plug and Play Gateway config, image and resource are down on https
Plug and Play Gateway config, image and resource are down on http
Plug and Play Gateway is stopped.
SAM Daemon is not running ...
DA Daemon is not running ...
Syslog Daemon is not running ...
Compliance engine is not running
2. Will the maps and the plotted APs be available again?
3. Will the discovered devices’ backup files still be intact?
BR’s,
Neil
06-09-2015 03:06 AM
Symptom:
NMS process not starting after restoring the PI 2.1 backup.
PI22-pro-234/admin# ncs status
Health Monitor is running, with an error.
failed to start PI on startup Health Monitor
Matlab Server Instance 1 is running
Ftp Server is running
Database server is running
Matlab Server is running
Tftp Server is running
NMS Server is stopped.
Matlab Server Instance 2 is running
CNS Gateway with port 11011 is down
CNS Gateway SSL with port 11012 is down
CNS Gateway with port 11013 is down
CNS Gateway SSL with port 11014 is down
Plug and Play Gateway Broker with port 61617 is down
Plug and Play Gateway config, image and resource are down on https
Plug and Play Gateway is stopped.
SAM Daemon is running ...
DA Daemon is running ...
Conditions:
Restoring the PI 2.1 backup
Workaround:
Contact the TAC to schedule a WebEx session to have a workaround implemented.
Further Problem Description:
log4j: Adding appender named [LogFileAppenderAEMSConfiguration] to category [com.cisco.aems.utils].
deviceStatusUpdateHook - Object: com.cisco.ifm.inventoryserviceimpl.DeviceStatusUpdateHook@688fbfea
[ResourceClassLoader@27524a91] warning at Type 'IfmConfigTemplatesRestVirtualDomainFilter' (no debug info available)::0 no match for this type name: com.cisco.ifm.template.importExport [Xlint:invalidAbsoluteTypeName]
[ResourceClassLoader@33d74da4] warning at Type 'IfmConfigTemplatesRestVirtualDomainFilter' (no debug info available)::0 no match for this type name: com.cisco.ifm.template.importExport [Xlint:invalidAbsoluteTypeName]
[ResourceClassLoader@46740ca0] warning at Type 'IfmConfigTemplatesRestVirtualDomainFilter' (no debug info available)::0 no match for this type name: com.cisco.ifm.template.importExport [Xlint:invalidAbsoluteTypeName]
Stopping
[ResourceClassLoader@45c3571c] warning at Type 'IfmConfigTemplatesRestVirtualDomainFilter' (no debug info available)::0 no match for this type name: com.cisco.ifm.template.importExport [Xlint:invalidAbsoluteTypeName]
Application context could not be created. Will now exit
###STARTUP FAILED###
org.springframework.beans.factory.access.BootstrapException: Unable to return specified BeanFactory instance: factory key [applicationContext-main], from group with resource name [classpath*:beanRefContext.xml]; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'applicationContext-main' defined in URL [file:/opt/CSCOlumos/conf/beanRefContext.xml]: Instantiation of bean failed; nested exception is org.springframework.beans.BeanInstantiationException: Could not instantiate bean class [org.springframework.context.support.ClassPathXmlApplicationContext]: Constructor threw exception; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'deploymentHandler': Invocation of init method failed; nested exception is java.lang.NullPointerException
Related cause: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'mapService.createInstance' defined in class path resource [META-INF/spring/rfm-application-context.xml]: Invocation of init method failed; nested exception is java.lang.NullPointerException
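The Spring error above is one long line, so the real cause is easy to miss. As a reading aid, here is a minimal sketch that splits such a message on "nested exception is" and lists the beans that failed. The helper names exception_chain, root_cause, and failing_beans are hypothetical, and the approach assumes the message follows the format shown in the log above.

```python
import re


def exception_chain(message):
    """Return (exception class, detail) pairs, outermost first."""
    chain = []
    for part in message.split("; nested exception is "):
        # Leading dotted identifier is the class; text after ':' is the detail.
        m = re.match(r"([\w.$]+)(?::\s*(.*))?$", part.strip(), re.DOTALL)
        if m:
            chain.append((m.group(1), (m.group(2) or "").strip()))
    return chain


def root_cause(message):
    """Return the innermost (class, detail) pair, or None if nothing parsed."""
    chain = exception_chain(message)
    return chain[-1] if chain else None


def failing_beans(message):
    """Return bean names mentioned in 'Error creating bean with name' clauses."""
    return re.findall(r"Error creating bean with name '([^']+)'", message)


if __name__ == "__main__":
    # Shortened version of the startup failure quoted above.
    msg = (
        "org.springframework.beans.factory.access.BootstrapException: "
        "Unable to return specified BeanFactory instance; nested exception is "
        "org.springframework.beans.factory.BeanCreationException: "
        "Error creating bean with name 'deploymentHandler': "
        "Invocation of init method failed; nested exception is "
        "java.lang.NullPointerException"
    )
    print(root_cause(msg))
    print(failing_beans(msg))
```

Applied to the log above, the innermost exception is java.lang.NullPointerException, raised while the 'deploymentHandler' bean was being initialized, with 'mapService.createInstance' listed as a related cause.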
06-10-2015 05:29 PM
Hello Neil,
The following issues can occur in a high-availability environment:
The primary or secondary Prime Infrastructure goes down during the high-availability registration process.
The primary or secondary Prime Infrastructure goes down during the failback process.
The secondary Prime Infrastructure goes down during the failover process.
A possible cause of these issues is that the database or the NMS server has failed to start.
Will the following stopped services be restored to operational states:
Failover should be considered temporary. The failed primary Prime Infrastructure should be restored to normal as soon as possible, and failback should then be initiated. The longer it takes to restore the failed primary Prime Infrastructure, the longer any other Prime Infrastructure sharing that secondary Prime Infrastructure must run without failover support.
Will the discovered devices’ backup files still be intact?
Yes, the backup files will be kept intact.
1. Make sure that you have a backup before starting the high-availability registration or initiating the failback process.
2. If there is any issue with starting the database or the process, complete the following in the primary Prime Infrastructure:
a. Run the following command to re-create a new database:
/opt/CSCOlumos/bin/dbmigrate.sh recreateDB
or run the following command in the admin console to re-create a new database:
ncs run reset db
b. Run the following command to remove the existing database:
rm /opt/CSCOlumos/.dbCreated
c. Stop all the processes.
d. Start all the processes.
e. Restore the backup and continue with the high-availability registration.
For more information, please go through the link below: