Re: Post 3.4, Patch 4, replication stop with PAN and CLI failed to connect

anilraj_003 · ‎12-23-2025

After applying 3.4 Patch 4, replication between the PAN and the PSNs stopped with the error "Jediss replication failed", and CLI access fails with "failed to connect to the server". The debug log shows this error:

Error, Failed to connect to server, could not connect to test-ise-01.net.lab/198.XX.XX.01:12001

Replication error from the PSN debug log: -FullSync:- Primary address is null

Marcelo Morais · ‎12-23-2025

@anilraj_003

Please check port 12001 on both nodes:

ise/admin# show ports | include 12001
 ...
 198.xx.xx.01:12001, 
 ...

Check the services for any that are initializing or not running:

ise/admin# show application status ise

 ISE PROCESS NAME              STATE     PROCESS ID 
 --------------------------------------------------
 Database Listener             running   8436 
 Database Server               running   203 PROCESSES
 Application Server            running   27824 
 ...
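
If the status output can be captured to a file or pasted from a terminal, the services that are not healthy can be picked out with a short script. This is only a rough sketch: the function name `services_not_running` is made up for illustration, and it assumes the columns are separated by two or more spaces and that the state column reads "running", "not running", "initializing" or "disabled", as in the sample above.

```python
import re

# Matches lines such as " Database Listener   running   8436".
# Assumption: name and state are separated by two or more spaces.
STATE_LINE = re.compile(
    r"^\s*(?P<name>\S.*?)\s{2,}"
    r"(?P<state>running|not running|initializing|disabled)\b",
    re.IGNORECASE,
)

def services_not_running(status_output):
    """Return (service name, state) pairs that are initializing or not running."""
    problems = []
    for line in status_output.splitlines():
        match = STATE_LINE.match(line)
        if match is None:
            continue  # header, separator, or "..." lines
        state = match.group("state").lower()
        if state in ("not running", "initializing"):
            problems.append((match.group("name"), state))
    return problems
```

An empty list means every parsed service line reported "running" or "disabled".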

Check if there is anything blocking the communication between the Nodes.
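
The reachability of port 12001 can also be probed from any other host that has a network path to the nodes. A minimal Python sketch follows; the helper name `can_connect` is hypothetical, and a successful TCP connection only shows that something is listening and that no firewall drops the traffic, not that replication itself is healthy.

```python
import socket

def can_connect(host, port=12001, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `can_connect("test-ise-01.net.lab")` would test the default port 12001 on the node named in the error above.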

For Cisco ISE port reference:

Cisco ISE Installation Guide - Release 3.4 - Port Reference

Hope this helps !

anilraj_003 · ‎12-23-2025

Thanks for the input. In our case, CLI access is not possible on any node (SSH / CIMC / serial all drop immediately after the login prompt), so we are unable to run 'show application status ise' or verify ports locally. We have a total of 15 Physical nodes, SNS36XX, all of which we can say are technically dead.

Problem: This is not a replication problem. It is an OS / shell / service-layer collapse on all 15 nodes after 3.4 Patch 4: CLI sessions cannot start, PAN replication services are not responding, and the cluster is effectively brain-dead.

PSN logs confirm repeated connection attempts to PAN on port 12001, but the PAN replication service is not responding, and the Primary address becomes null. This is being investigated with TAC as an OS-level issue requiring recovery.

That’s it.
No back-and-forth needed.

Marcelo Morais · ‎12-24-2025

@anilraj_003 ,

interesting ... I'm a bit curious, no CLI access via SSH or Console, correct ?

Are you able to remove 2x Nodes from your 15-Node Cluster, the SPAN and a PSN, to create a Small Deployment, and test the replication of this new Cluster ?

Note 1: if the answer is yes, the SPAN will be the PPAN of the new Cluster.

Note 2: I'm thinking of testing whether the problem was specific to PPAN or also to SPAN. If SPAN is OK, then you can rebuild your entire Cluster using SPAN.

Please keep us posted about the TAC investigation !

Best regards

anilraj_003 · ‎12-24-2025

Thanks for the suggestion. Unfortunately, CLI access has been lost on every node (P-PAN, S-PAN, PSNs, pxGrid, MNT). SSH, console, and CIMC KVM all behave the same way: authentication succeeds, but the shell session closes immediately. I can't share a screenshot on this platform, but after the login credentials are accepted the CLI prints "failed to connect server".

We already attempted a PAN role switch, promoting the S-PAN to be the new P-PAN, but it did not help. This indicates the issue is not specific to the P-PAN role but is systemic across the cluster. The replication debug log shows: "org.jgroups.protocols.TUNNEL -:::::- Failed connecting to GossipRouter at pan-test.com/192.168.1.1:12001"
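
To see which GossipRouter endpoints the nodes keep failing to reach, lines like the one quoted above can be tallied from a collected debug log. This is a rough sketch that assumes every failure line follows the same "host/ip:port" format as the quoted message; the function name `count_gossip_failures` is made up for illustration.

```python
import re
from collections import Counter

# Assumption: failures are logged as
# "Failed connecting to GossipRouter at <host>/<ip>:<port>".
GOSSIP_FAIL = re.compile(
    r"Failed connecting to GossipRouter at "
    r"(?P<host>[^/\s]+)/(?P<ip>[^:\s]+):(?P<port>\d+)"
)

def count_gossip_failures(lines):
    """Count failed GossipRouter connections per (host, ip, port)."""
    counts = Counter()
    for line in lines:
        match = GOSSIP_FAIL.search(line)
        if match:
            key = (match.group("host"), match.group("ip"), int(match.group("port")))
            counts[key] += 1
    return counts
```

Passing an open log file to `count_gossip_failures` returns a counter keyed by endpoint, which shows at a glance whether all failures point at the same PAN address.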

We are currently working with Cisco TAC/BU/Engineering, who are investigating this as an OS-level / recovery scenario. We’ll share updates once TAC completes the analysis.

Marcelo Morais · ‎12-24-2025

@anilraj_003 ,

thanks for your feedback. Please keep us posted !

Note: I've had issues in the past that would occur only in a Distributed Deployment. When I removed the SPAN from the Cluster and it became a Standalone, the problem was fixed, and from that point on I could recreate the Cluster via this SPAN.
