05-09-2021 04:06 PM - edited 05-09-2021 04:12 PM
The purpose of this document is to present the troubleshooting steps to take when one or more services on the Cisco IM & Presence (IM&P) Service server have not started gracefully.
The IM&P Services have the following states:
Started - The service is active and running
Starting - The service is in transition from Stopped to Started.
Stopped - The service is not running, either because it was stopped manually or because it is not activated.
Stopping - The service is in transition from Started to Stopped.
Always keep in mind that after a reboot of the IM&P node, the following alert will be generated:
The Cisco IM and Presence Data Monitor has detected that database replication is not complete, and/or that the Cisco Sync Agent sync from Cisco Unified Communications Manager is not complete. Some services will remain in the "Starting" state until replication and the Cisco Sync Agent sync are completed.
The message does not necessarily mean that the services have remained in the "Starting" state since the alert was generated.
This is expected, because the IM&P Data Monitor begins monitoring the services as soon as the IM&P node comes up from a boot or reboot. The first thing the monitor detects is that all the main services are in the process of starting, which triggers the message.
Run the command utils service list to confirm that the services are actually Started. If they are, feel free to delete the alert to keep the Notification Alerts clean.
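When the service list is long, it can help to filter the output down to the services that are not Started. The following is a minimal sketch, assuming each line of the utils service list output has the shape "Service Name[STATE]" optionally followed by a note. The sample text is made up for illustration, and the helper names are hypothetical.

```python
import re

# Assumed line shape: "<service name>[<STATE>]  <optional note>"
SERVICE_LINE = re.compile(r"^(?P<name>.+?)\[(?P<state>[A-Za-z]+)\]\s*(?P<note>.*)$")


def parse_service_list(output):
    """Return (name, state, note) tuples for each line that looks like a service entry."""
    services = []
    for line in output.splitlines():
        match = SERVICE_LINE.match(line.strip())
        if match:
            services.append(
                (match["name"].strip(), match["state"].upper(), match["note"].strip())
            )
    return services


def not_started(services):
    """Keep only the services whose state is anything other than STARTED."""
    return [service for service in services if service[1] != "STARTED"]


# Made-up sample; paste the real output of "utils service list" here.
sample_output = """\
A Cisco DB[STARTED]
Cisco Presence Engine[STARTING]
Cisco SIP Proxy[STARTED]
Cisco XCP File Transfer Manager[STOPPED]  Service Not Activated
"""

for name, state, note in not_started(parse_service_list(sample_output)):
    print(f"{name}: {state} {note}".rstrip())
```

Lines that do not match the assumed shape are ignored, so headers or prompts in the pasted output should not break the parsing.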
The first step in troubleshooting services that do not start is to identify which services are not started.
Whether most of the services are in the Starting state or only some of them, once you have identified the services, verify whether they depend on one another. We will see this in more detail later.
It is important to check the legend that appears to the right of the services that are stopped. Commonly we can identify:
One of the most common issues found on the IM&P Subscriber after a restart is that almost all of the services are in the STARTING state, while the IM&P Publisher shows all the services as started.
The common cause of this behaviour is a restart of the IM&P Subscriber without first disabling High Availability (HA) in the Presence Redundancy Groups (PRG).
To fix this problem:
Step 1. Disable HA
Step 2. Run the following command on both IM&P nodes
Step 3. Wait around 5 minutes and run utils service list again to confirm that the services are now Started.
Step 4. Once all the services are Started on the Subscriber, run the following command on both IM&P nodes:
Step 5. Re-enable High Availability in the Presence Redundancy Groups.
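The "wait and check again" part of the procedure above can be sketched as a simple polling loop. This is only an illustration under stated assumptions: fetch_output is a hypothetical callable that returns the text of utils service list (how you collect it is up to you), and the output is assumed to contain states in the form "Service Name[STATE]".

```python
import re
import time


def all_started(output):
    """True when at least one state is present and every bracketed state is STARTED."""
    states = re.findall(r"\[([A-Za-z]+)\]", output)
    return bool(states) and all(state.upper() == "STARTED" for state in states)


def wait_until_started(fetch_output, timeout_s=300, interval_s=30, sleep=time.sleep):
    """Poll fetch_output() until all services are STARTED or the timeout elapses."""
    waited = 0
    while True:
        if all_started(fetch_output()):
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s


# Demo with canned snapshots and no real sleeping.
snapshots = iter([
    "Cisco SIP Proxy[STARTING]\nCisco Presence Engine[STARTING]",
    "Cisco SIP Proxy[STARTED]\nCisco Presence Engine[STARTED]",
])
print(wait_until_started(lambda: next(snapshots), sleep=lambda seconds: None))
```

Note that this sketch treats every state other than STARTED as "not ready", so services that are intentionally not activated would need to be filtered out of the output first.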
Although uncommon, there have been scenarios where some network services do not start on the IM&P Publisher. These are:
Impact: The XCP, Presence Engine and SIP Proxy services won't start, as they depend on the network services listed. As a result, the IMDB does not replicate and Jabber users are unable to log in.
Solution:
Step 1. Disable HA
Step 2. Manually start each service in the following order:
Keep in mind
If the previous steps have not worked, open a TAC case for further troubleshooting. Keep in mind that the following outputs and logs will be required, and that some traces will need to be set to debug before the issue is reproduced, i.e., before you attempt to start the service manually.
A Cisco DB is one of the main services within the system.
Impact: Certain features on the server web page cannot be accessed, Jabber users and their features might be affected, and IDS DB replication breaks.
Causes: The most common causes identified for this issue are:
Solution: Unfortunately, there is no straightforward procedure for fixing this service when it does not start. The suggestions are:
Step 1. Disable HA
Step 2. Restart the A Cisco DB Replicator service.
Step 3. Restart the A Cisco DB service. If it remains in the starting state, try stopping it and then starting it.
The best approach here is to engage Cisco TAC for further investigation. They will require the following information:
Impact: The IM&P database won't be synchronized across the IM&P nodes and IM&P clusters (Inter-cluster peering)
If the ICSA service is not starting, there are three main possible reasons:
If restarting the node or addressing reason 1 does not bring the service up, open a TAC case for further troubleshooting. Keep in mind that the following outputs and logs will be required, and that some traces will need to be set to debug before the issue is reproduced, i.e., before you attempt to start the service manually.
For the Cisco Presence Engine (PE) service, there are several scenarios to take into account to understand why it is not starting and how to make it start.
Step 1. Keep the IM&P Sub in the PRG
Step 2. Disable HA
Step 3. Restart the Cisco SIP Proxy service first and wait until it starts.
Step 4. Restart the Cisco PE service and wait until it starts.
Step 5. Perform steps 3 and 4 on the IM&P Publisher first, then on the Subscriber.
2. If the IM&P Subscriber is already added to the PRG and the PE remains in the stopped or starting state, the cause could be a DB replication mismatch between the two IM&P nodes. Execute the following query: run sql select * from enterprisenode
The query displays the id of the node, the subclusterid of the node (which is the PRG id), the name or IP address, and other values. The point to focus on is that both IM&P nodes share the same subclusterid value.
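The subclusterid comparison can be sketched as follows. This is a minimal illustration, assuming the rows for the two IM&P nodes have already been copied from the query output into dictionaries with "name" and "subclusterid" keys. The values shown are made up, and the helper names are hypothetical.

```python
from collections import defaultdict


def group_by_subcluster(rows):
    """Map each subclusterid value to the node names that carry it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["subclusterid"]].append(row["name"])
    return dict(groups)


def same_subcluster(rows):
    """True when all rows share exactly one subclusterid value."""
    return len(group_by_subcluster(rows)) == 1


# Made-up rows for illustration; real values come from the query output.
rows = [
    {"id": "2", "subclusterid": "aaaa-1111", "name": "imp-pub.example.com"},
    {"id": "7", "subclusterid": "bbbb-2222", "name": "imp-sub.example.com"},
]

if not same_subcluster(rows):
    # Show which node reports which PRG id so the mismatch is easy to spot.
    for subclusterid, names in group_by_subcluster(rows).items():
        print(subclusterid, names)
```

With the sample rows above, the two nodes report different subclusterid values, which is the mismatch this check is meant to surface.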
3. If all services are started, except for the PE, and steps 1 and 2 were verified:
Step 1. Run set replication-sync monitor disable on both IM&P nodes.
Step 2. Wait around 5 minutes. If the service has not started, attempt to start it manually: utils service start Cisco Presence Engine
Step 3. Run set replication-sync monitor enable (whether or not the service started).
4. If steps 1, 2 or 3 did not make the PE service start, we might be facing one of two scenarios, both of which require remote account access to the server to validate.
Scenario 1: Validate the PE process.
Scenario 2: If you are running version 12.5, it is highly probable that you are hitting the following defect: CSCvg94247
Therefore, perform the following:
Step 1. Make sure the Cisco Presence Engine Service is set to debug.
Step 2. Attempt steps 1, 2 and 3.
- If you find discrepancies in step 2, you might want to have TAC on the call.
Step 3. If the service remains in the starting state after the above steps, collect the following logs for the timeframe of the attempt and engage Cisco TAC:
Impact: Synchronization of DB tables from CUCM to IM&P will not complete, which mainly impacts end-user synchronization across the cluster.
Solution: Review the following checklist.
If the above actions do not solve the problem, you will need to engage Cisco TAC for further troubleshooting. Keep in mind that the following outputs and logs will be required, and that some traces will need to be set to debug before the issue is reproduced, i.e., before you attempt to start the service manually.
The following services are disabled by default unless you use the feature associated with each one: Cisco XCP Directory Service, Cisco XCP File Transfer Manager, Cisco XCP Message Archiver and Cisco XCP XMPP Federation Connection Manager.
Even if your IM&P shows those services as activated, they won’t start unless you configure the feature for each service.
For instance:
The Cisco XCP Directory Service supports the integration of XMPP clients with the LDAP directory to allow users to search and add contacts from the LDAP directory.
To start this service, you need to configure LDAP search settings for third-party XMPP clients (choose Cisco Unified CM IM and Presence Administration > Application > Third-Party Clients > Third-Party LDAP Settings).
You use Cisco XCP Directory Service to allow users of a third-party XMPP client to search and add contacts from the LDAP directory.
If you turn on the Cisco XCP Directory Service but do not configure the LDAP server and the LDAP search settings for third-party XMPP clients, the service will start and then stop again.
To configure third-party XMPP directory:
This service allows you to use a server-side file transfer solution called managed file transfer.
MFT allows an IM and Presence Service client, such as Cisco Jabber, to transfer files to other users, ad hoc group chats and persistent chats.
The service will not start if the configuration for MFT is not in place.
To activate and use MFT:
The Cisco XCP Message Archiver service supports the IM Compliance feature. The IM Compliance feature logs all messages sent to and from the IM and Presence server, including point-to-point messages and messages from ad hoc (temporary) and permanent chat rooms for the Chat feature. Messages are logged to an external Cisco-supported database.
The service will not start if the configuration for compliance is not in place.
How to use message archiver:
The Cisco XCP XMPP Federation Connection Manager supports interdomain federation with third-party enterprises such as IBM Lotus Sametime, Cisco Webex Meeting Center, GoogleTalk, and another IM and Presence enterprise, over the XMPP protocol.
Again, this service won’t start until XMPP federation is configured.
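The relationship between these four optional services and the feature each one depends on can be kept as a small lookup, which is handy when reviewing a list of stopped services. This is only a sketch built from the descriptions in this document. The function name is hypothetical.

```python
# Prerequisites as described in this document, keyed by service name.
FEATURE_PREREQUISITES = {
    "Cisco XCP Directory Service": "LDAP server and LDAP search settings for third-party XMPP clients",
    "Cisco XCP File Transfer Manager": "managed file transfer (MFT)",
    "Cisco XCP Message Archiver": "IM Compliance",
    "Cisco XCP XMPP Federation Connection Manager": "XMPP federation",
}


def explain_stopped(service_name):
    """Return a hint for a stopped service, or None if it is not one of the four optional ones."""
    feature = FEATURE_PREREQUISITES.get(service_name)
    if feature is None:
        return None
    return f"{service_name} stays stopped until {feature} is configured."


print(explain_stopped("Cisco XCP File Transfer Manager"))
```

A result of None means the service is not one of the feature-dependent ones, so a stopped state there deserves a closer look.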
How to configure XMPP federation:
I had an issue where the Sync Agent would not start. It turned out that the root/intermediate certs had expired. I uploaded new ones and it started right away.