
ISE NODE NOT REACHABLE when building distributed deployment

vancamt76
Level 1

I am trying to build a distributed deployment with the following personas:

2 policy admin nodes

2 monitoring nodes

4 policy service nodes

This was a project that was partially implemented but never went into production. It was a distributed deployment, but half the nodes were no longer working (HTTP errors, unreachable devices, or nodes that could not sync). I decided to start from scratch. All nodes were:

-de-registered

-reset to factory defaults (application reset)

-upgraded to 1.1.4.218 patch 1 (all 8 nodes)

-given new certificates and joined to the domain

-added to the DNS forward and reverse lookup zones

When I make 1 admin node primary and register the other nodes (secondary admin, monitoring, policy services) the nodes successfully register and show up in the deployment window of the primary; however, all the nodes show as NODE NOT REACHABLE. After registration, I've noticed that the registered nodes are still showing as STANDALONE if I access the GUI. I've tried rebooting them manually after registration and they are still unreachable. I have also tried resetting the database user password from the CLI on both admin nodes and the results are always the same.

7 Replies

Jatin Katyal
Cisco Employee

Have you already gone through this thread?

https://supportforums.cisco.com/thread/2220572

Jatin Katyal
- Do rate helpful posts -

~Jatin

Yes, I have. In fact, it's still open on my screen. Six of the nodes are on the same VLAN and the remaining two are on a separate subnet, but there is no firewall in between. I've also already reset the database user passwords. When I port scan the nodes, none of them are listening on TCP port 1521.
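For anyone who wants to repeat that port check across every node in one go, here is a minimal Python sketch. It only tests whether a TCP connection to port 1521 (the port mentioned above) can be opened; the function names and any hostnames you pass in are placeholders of mine, not anything taken from ISE itself.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def scan_nodes(hosts, port: int = 1521, timeout: float = 3.0) -> dict:
    """Map each host to whether the given TCP port accepted a connection."""
    return {host: is_port_open(host, port, timeout) for host in hosts}
```

Calling `scan_nodes(["ise-pan1.example.com", "ise-psn1.example.com"])` (hypothetical names) returns a dictionary of host to True/False. A False result does not say why the port is closed, only that the connection could not be made from where the script ran.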

On my last distributed deployment setup I had a few nodes that didn't join successfully. My gut feeling was that I had added too many nodes at one time and that syncing all of them at once upset ISE. I de-registered the non-syncing nodes from the admin node and then made sure the formerly failed nodes appeared OK in standalone mode with all services running. I then added them back one at a time, waiting for each sync to complete before moving on to the next.

But you are saying all of yours have failed to sync? Did you add them all at the same time? Are you sure the certificates are valid on all nodes?

Originally I had added them all at the same time. I thought that maybe I just wasn't waiting long enough for the sync, but I waited an entire day and all the nodes were still unreachable. At this point, I've de-registered all the nodes, rebooted all of them, and converted the primary back to standalone. (The remaining nodes never converted from standalone to distributed, even when I rebooted them after registering and despite a message that they were successfully registered.) I then converted one node back to primary and tried to register just the secondary admin node, giving it plenty of time to sync. This node is still not reachable from the primary.

I've quadruple-checked the certificates on all the nodes. These certs were all added on the same day (just last week) and the default self-signed certs were removed.

I had restored from a backup on the primary, so I might just reset the config on that node and try joining the other nodes before I restore again.

Venkatesh Attuluri
Cisco Employee

You can do the following:
• For out-of-sync issues, which are most likely due to time changes or NTP sync issues, you must correct the system time and perform a manual sync up through the UI.
• For certificate expiry issues, you must install a valid certificate and perform a manual sync up through the UI.
• For a node that has been down for more than six hours, you must restart the node, check for connectivity issues, and perform a manual sync up through the UI.
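To rule out the certificate expiry case quickly, something like the following Python sketch can report how many days a node's certificate has left. It assumes the node presents its certificate over HTTPS on port 443 and that you have the issuing CA's certificate in a PEM file (the `cafile` argument); both are assumptions on my part, and the function names are made up for illustration.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days remaining before a notAfter time such as 'Jan  5 09:34:43 2030 GMT'."""
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400.0


def fetch_not_after(host: str, port: int = 443,
                    cafile: Optional[str] = None,
                    timeout: float = 5.0) -> str:
    """Return the notAfter field of the certificate presented by host:port.

    The certificate is verified against cafile (or the system trust store if
    cafile is None), so an expired or untrusted certificate raises
    ssl.SSLCertVerificationError instead of returning a value.
    """
    context = ssl.create_default_context(cafile=cafile)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

For example, `days_until_expiry(fetch_not_after("ise-pan1.example.com", cafile="ca.pem"))` (hypothetical hostname and file) gives the remaining days as a float. A verification error from `fetch_not_after` is itself a hint that the certificate, the trust chain, or the local clock needs attention.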

I have seen the same behavior sometimes, and it often works if I enter the new node's IP address instead of its FQDN.

That has solved it for me a few times at least

/Kelvin Dam
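If registering by IP address works where the FQDN does not, it may be worth confirming that the forward and reverse DNS records for each node actually agree. Below is a small Python sketch of that check using only the standard library resolver calls; the helper names are mine, and it only looks at IPv4 records.

```python
import socket
from typing import Dict, List, Optional


def forward_lookup(fqdn: str) -> List[str]:
    """Return the IPv4 addresses the name resolves to (empty list on failure)."""
    try:
        infos = socket.getaddrinfo(fqdn, None, family=socket.AF_INET)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the primary name for the address, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def names_match(fqdn: str, ptr_name: Optional[str]) -> bool:
    """Compare two DNS names, ignoring case and any trailing dot."""
    if ptr_name is None:
        return False
    return fqdn.rstrip(".").lower() == ptr_name.rstrip(".").lower()


def check_dns(fqdn: str) -> Dict[str, bool]:
    """Map each forward-resolved address to whether its reverse name matches fqdn."""
    return {ip: names_match(fqdn, reverse_lookup(ip)) for ip in forward_lookup(fqdn)}
```

Running `check_dns("ise-psn1.example.com")` (a hypothetical name) against each node returns an empty dictionary if the forward lookup fails and a False value for any address whose reverse record points somewhere else. Note that the result reflects the resolver of the machine running the script, which may differ from what the ISE nodes themselves see.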

vancamt76
Level 1

Thank you everyone for the replies. Just an update after working with TAC:

We tried de-registering and re-registering the nodes to the primary and saw the same problem. We restarted all nodes and they still wouldn't sync. TAC advised that I should remove the application from the primary, download the .iso from cisco.com, and reinstall the application. After doing these steps on the primary, I was able to successfully join all nodes in a distributed deployment.
