ISE 3.0 Patch 5 SLOW REPLICATION ERROR

jitendrac
Level 1

Hi all,

We have the following ISE setup for our customer:

DC: 1 PAN + MnT (Administration, Monitoring) and 2 PSNs (Policy Service), 3 nodes in total

DR: 1 PAN + MnT (Administration, Monitoring) and 2 PSNs (Policy Service), 3 nodes in total

Our DC node (PAN + MnT) is the Primary Administration and Primary Monitoring node.

ISE version 3.0.0.458, with Patch 5 installed.

Under Administration > System > Deployment, all nodes show a green tick.

We are continuously getting the following critical syslog message:

Event Name: SLOW_REPLICATION_ERROR

Event Description: Replication is slow

Category: Error

ObjectName=Slow Replication,

OperationMessageText=Node XXXXXXX has slow replication since this node is not consuming messages for past 11493 minutes.

The number of pending messages are 45410

Status of this node is SYNC COMPLETED

Below are the overrall details of the nodes

  1. Seq No in Primary : 306284
  2. Seq No in Secondary : 260874
  3. Current Time : 27600747
  4. Primary Seq Time: 27600745
  5. Secondary Seq Time: 27589254
  6. Time of first unconsumed message in Primary: 27589254

Threshold Values : Pending message count > 40000 or Node is not consuming messages for 5 hours,
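
The figures in the alarm can be cross-checked with simple arithmetic. The sketch below is only an illustration, not anything ISE ships: the class and field names are made up for this example, and it assumes the time fields are counted in minutes, which matches the "11493 minutes" in the message text. The two thresholds are taken from the "Threshold Values" line above.

```python
from dataclasses import dataclass

# Thresholds as quoted in the alarm text.
PENDING_THRESHOLD = 40_000       # "Pending message count > 40000"
STALL_THRESHOLD_MIN = 5 * 60     # "not consuming messages for 5 hours"


@dataclass
class ReplicationSnapshot:
    """Hypothetical container for the values listed in the alarm."""
    primary_seq: int
    secondary_seq: int
    current_time_min: int          # assumed to be in minutes
    first_unconsumed_min: int      # assumed to be in minutes

    @property
    def pending(self) -> int:
        # Messages the secondary has not consumed yet.
        return self.primary_seq - self.secondary_seq

    @property
    def stalled_minutes(self) -> int:
        # Age of the oldest unconsumed message.
        return self.current_time_min - self.first_unconsumed_min

    def is_slow(self) -> bool:
        # Either condition alone is enough to raise the alarm.
        return (self.pending > PENDING_THRESHOLD
                or self.stalled_minutes > STALL_THRESHOLD_MIN)


snap = ReplicationSnapshot(
    primary_seq=306284,
    secondary_seq=260874,
    current_time_min=27600747,
    first_unconsumed_min=27589254,
)
print(snap.pending, snap.stalled_minutes, snap.is_slow())
```

With the values from this alarm, the difference in sequence numbers is 45410 and the difference in times is 11493 minutes (roughly 8 days), so both thresholds are exceeded.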

 

Any idea what the issue could be?

4 Replies

What do your CPU/Memory/Disk allocations look like?  Are these appliances or VMs?  What is the network transport between the nodes?  https://www.cisco.com/c/en/us/td/docs/security/ise/performance_and_scalability/b_ise_perf_and_scale.html

So to be clear you have a total of 6 nodes in your deployment?  2 PAN+MnT and 4 PSNs?  

Which node is XXXXXXX in "OperationMessageText=Node XXXXXXX has slow replication since this node is not consuming messages for past 11493 minutes."? A PSN, or the PAN+MnT?

Hi @jitendrac ,

 1st, at Home > Alarms, click the Slow Replication Error link to open the Alarms: Slow Replication Error window, and check when the Slow Replication Error started during the day.

 2nd, at Operations > Reports > Reports > Endpoint and Users > Authentication Summary, set the Time Range filter to Today and check the Authentication by Device Name window for an unusually high number of authentications from any specific NAD(s).

 3rd, at Operations > Reports > Reports > Diagnostics > Key Performance Metrics, check the numbers, paying special attention to Avg TPS (see the Performance and Scalability Guide for ISE and search for "Cisco ISE Scenario-Based Performance").

 

Hope this helps!

Hi Marcelo Morais,

I checked the timing, and under Operations > Reports > Reports > Endpoint and Users > Authentication Summary (filtered by Time Range) the passed and failed authentications do not exceed 80, and Avg TPS is 0.
This is a brand-new setup with hardly any users.
However, after I ran a manual sync the error cleared, and we are no longer getting the "Replication is slow" alarm.
My customer told me there was a network outage in their environment during that interval. I think that is what caused this error.

Anyway, thanks for your assistance. It is really great to see how the community shares knowledge.

Thanks

Hi @jitendrac ,

 Excellent news!

 80 and 0 are excellent numbers.

 A network outage causes communication issues between nodes.

 

Regards
