ISE Secondary Node unable to join Primary Node

rick505d3
Level 1

Hi,

We had a working ISE 2.1 Patch 3 deployment of two nodes. The secondary node was showing some issues, so we dropped and rebuilt the VM. The new secondary runs the same software version and patch level. Forward and reverse DNS entries exist for both nodes. A public CA signed wildcard certificate is installed on both.
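Since registration depends on each node resolving the other by name, the forward and reverse DNS entries mentioned above can be double-checked from an admin workstation. The following is only a rough sketch: `check_dns` is a made-up helper name, the node names in the usage comment are placeholders, and it uses the workstation's resolver, which may differ from the one the ISE nodes are configured with.

```python
import socket


def check_dns(fqdn, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return (ip, reverse_name, consistent) for one node's FQDN.

    `forward` and `reverse` default to the standard resolver calls and can be
    replaced with stubs to exercise the logic without network access.
    """
    ip = forward(fqdn)                 # forward (A record) lookup
    reverse_name = reverse(ip)[0]      # reverse (PTR) lookup, primary name only
    # Compare case-insensitively and ignore a trailing root dot.
    consistent = reverse_name.rstrip(".").lower() == fqdn.rstrip(".").lower()
    return ip, reverse_name, consistent


# Example usage (hypothetical node names, replace with the real FQDNs):
#   for node in ("ise-pan.example.com", "ise-secondary.example.com"):
#       try:
#           print(node, check_dns(node))
#       except OSError as exc:   # socket.gaierror and socket.herror are OSErrors
#           print(node, "lookup failed:", exc)
```

A `consistent` value of `False` only means the PTR record returns a different name than the one queried; whether that matters for a given deployment is something to confirm with TAC or the ISE documentation.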

When the node register operation is initiated from the primary node, it asks for the new node's FQDN, username and password. It detects and authenticates the other node and shows the screen for selecting the services to be enabled on the new node. The same services (Admin, Monitoring, PSN and Device Admin) are enabled on both nodes. On clicking OK, the primary displays the message that node registration has been successful and data will now be synced to the secondary node. The ISE Deployment tab shows both nodes in the list. The status of the secondary is registration / data sync in progress. It takes unusually long (3 to 5 hours) for the primary to report the message "Node Registration or Sync failed. Please deregister and register the node again". Deregistering and registering again results in the same error.

Regards, 

Rick.

14 Replies

Marvin Rhoads
Hall of Fame

It sounds like you've done everything correctly.

In my experience the data sync should not take anywhere near that long. There may be some issues in the database (which may in turn have been related to the issues you were seeing on the failed unit, especially if it was the primary MnT node).

I'd definitely recommend that you take this up with the TAC.

I am having this very same problem on version 2.0 patch 3. I have a TAC case open. Has anyone had success in getting to the bottom of this?

I am not sure if you have the same issue as mine but I have managed to fix my issue.

After I installed patch 3 on ISE 2.1.0.474 on all ISE nodes, they could no longer synchronize.
After I clicked the "Syncup" button, the status of all secondary nodes remained "in progress" for a very long time. Eventually, they would turn to "Not in Sync".
When I registered a node, it would display the message "Node Registration or Sync failed. Please deregister and register the node again".

I fixed the issue by promoting the secondary administration node to become the new PAN and starting synchronization from the new PAN. I then noticed that it could synchronize the other ISE nodes and the original PAN. After I promoted the old PAN back to PAN, it could also synchronize the other ISE nodes. My guess is that there was a database fault somewhere on the old PAN, so none of the secondary ISE nodes accepted synchronization initiated from that PAN.

Yes, this sounds very similar to what I am seeing. Thanks for the information. TAC wants to reset the ISE application after de-registering the node and then re-join it to the deployment. One thing I would really like to know is whether you can see TCP port 12001 on your secondary node. This is the odd thing about my cluster deployment. On the primary node I can telnet and run show ports | include 12001 and see it listening. On the secondary node I cannot see TCP 12001 listening. I believe this port should be open on both devices for replication to work. Could someone verify whether this port is open on both primary and secondary?
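For anyone who wants to test this from another host rather than from the ISE CLI, a plain TCP connect attempt is one option. This is just a sketch: `tcp_port_open` is a made-up helper, the host names in the usage comment are placeholders, and a failed connect cannot distinguish "nothing is listening" from "a firewall or ACL is in the way".

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example usage (hypothetical node names):
#   for node in ("ise-pan.example.com", "ise-secondary.example.com"):
#       print(node, 12001, tcp_port_open(node, 12001))
```

A service bound only to 127.0.0.1 on the node will also show as closed from a remote host, so the node's own show ports output remains the more direct evidence.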

TCP 12001 is not listening on my secondary administration node (it also serves as the primary MnT node). Here's a show ports on that node:

Process : timestensubd (3396)
tcp: 127.0.0.1:13226
Process : redis-server (8248)
tcp: 0.0.0.0:6379
Process : ttcserver (3398)
tcp: 127.0.0.1:16876, 0.0.0.0:53385
Process : Decap_main (13549)
tcp: 0.0.0.0:2000
udp: 0.0.0.0:9993
Process : jsvc.exec (13391)
tcp: 10.49.78.12:49, 10.49.78.12:50, 10.49.78.12:51, 10.49.78.12:52, 127.0.0.1:8888, :::5514, :::9002, :::1099, :::8910, :::61616, :::80, :::9080, :::443, :::33404, :::9085, :::9090, 127.0.0.1:2020, :::9060, :::8905, :::8009
udp: 0.0.0.0:34042, 10.49.78.12:1700, 0.0.0.0:35845, 10.49.78.12:3799, 10.49.78.12:58438, :::63682, 10.49.78.12:20326, :::40095
Process : timestensubd (3393)
tcp: 127.0.0.1:36178
Process : timestend (3389)
tcp: 0.0.0.0:53396
Process : sshd (2475)
tcp: 0.0.0.0:22, :::22
Process : timestensubd (3395)
tcp: 127.0.0.1:20345
Process : master (818)
tcp: 127.0.0.1:25, ::1:25
Process : monit (17357)
tcp: 127.0.0.1:2812
Process : timestensubd (3394)
tcp: 127.0.0.1:19519
Process : jsvc.exec (28213)
tcp: 0.0.0.0:2560, 0.0.0.0:9444
Process : ora_d000_cpm10 (6414)
tcp: :::27758
udp: ::1:38165
Process : java (14895)
tcp: ::1:9200, 127.0.0.1:9200, 10.49.78.12:9300
Process : tnslsnr (6235)
tcp: :::1521, :::1528
Process : java (13436)
tcp: 127.0.0.1:7634, :::17370, :::20514, 127.0.0.1:20515
udp: 0.0.0.0:20514, :::14879, :::31364, :::50070, :::50955, :::20279, :::38275, :::24959, :::26314, :::26422, :::27457, :::44659, :::62098
Process : java (17423)
tcp: :::9086
Process : ntpd (2952)
udp: 10.49.78.12:123, 127.0.0.1:123, 0.0.0.0:123, fe80::250:56ff:fead:123, ::1:123, :::123
Process : ora_lreg_cpm10 (6406)
udp: ::1:48383
Process : ora_s000_cpm10 (6416)
udp: ::1:54851
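Scanning a listing like this by eye for one port is error-prone, so here is a small sketch that parses pasted show ports output and reports which processes listen on a given port. It assumes the "Process : name (pid)" / "tcp:" / "udp:" layout shown above; `parse_show_ports` and `listeners_on` are made-up helper names, not part of any Cisco tooling. It also rejoins entries that a terminal has wrapped mid-address, which can happen when output is pasted from an 80-column session.

```python
import re

PROCESS_RE = re.compile(r"^Process\s*:\s*(.+)$")
PROTO_RE = re.compile(r"^(tcp|udp):\s*(.*)$")


def parse_show_ports(text):
    """Map each 'name (pid)' entry to {'tcp': [...], 'udp': [...]}."""
    raw = {}        # process -> {"tcp": joined text, "udp": joined text}
    current = None  # dict for the process being read
    proto = None    # protocol of the most recent tcp:/udp: line
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = PROCESS_RE.match(line)
        if m:
            current = raw.setdefault(m.group(1).strip(), {"tcp": "", "udp": ""})
            proto = None
            continue
        if current is None:
            continue  # ignore anything before the first Process line
        m = PROTO_RE.match(line)
        if m:
            proto = m.group(1)
            sep = "," if current[proto] else ""
            current[proto] += sep + m.group(2)
        elif proto is not None:
            # Wrapped continuation: glue it straight onto the previous text
            # so an address split across two lines becomes whole again.
            current[proto] += line
    return {
        name: {p: [s.strip() for s in joined.split(",") if s.strip()]
               for p, joined in protos.items()}
        for name, protos in raw.items()
    }


def listeners_on(parsed, port, proto="tcp"):
    """Return the sorted process entries with a socket on the given port."""
    suffix = ":" + str(port)
    return sorted(name for name, protos in parsed.items()
                  if any(sock.endswith(suffix) for sock in protos[proto]))
```

With the output above saved to a string, `listeners_on(parse_show_ports(text), 12001)` would return an empty list, matching the observation that nothing on this node listens on TCP 12001.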

Yes this is how mine is. It appears to me that 12001 should only be enabled on the primary. How did you reset the M&T database? I am thinking this is my issue as well.

Thanks so much

No problem.

In my case, I went to the node where /opt was almost full (the PAN) and ran

application configure ise

option 4 (Reset M&T Database)

It was suggested to do that during a maintenance window, but I didn't wait for that since it only seems to affect administration/MnT and not clients. In my environment, that downtime is tolerable.

Thanks so much. I told TAC what you are seeing as well, and the engineer shared the admin guide with me, which speaks only generally of TCP 12001 needing to be open. I can't find anywhere that says it must be open on both nodes. Can anyone else confirm whether TCP port 12001 is open on both the primary and secondary?

Thanks 

TCP port 12001 is open on my PAN but not on the secondary nodes. I ran these tests while I was troubleshooting the issue, and the result is still the same now that I have resolved it. Therefore I believe that is not the cause.

You can go to "Operations>Troubleshoot>Download Logs" and have a look at your PAN. When I tried to access the logs to troubleshoot, I noticed that my PAN did not have the gold badge that identifies it as an administration node, while my secondary administration node did. So I promoted my secondary administration node to be PAN and started synchronisation from the new PAN onto the old PAN. After that, my issue was resolved.

Landon
Level 1

Did you ever resolve this? I'm having the same problem. I'm trying to register a node with just PSN/Device Admin, but I'm also running 2.1 patch 3.

Hi Landon, 

We opened a TAC case and the engineer narrowed it down to one or more of these bugs: CSCuv95664, CSCvc28417, CSCvc79739. We have been given a DB patch specific to our conditions, which we have yet to apply in a change window.

Regards, 

Rick. 

Thanks, Rick. I'll open up a TAC case and see where it goes. I appreciate the reply.

Were you able to get a resolution to this?

Yes. My problem was related to /opt being full on my PAN (it's also secondary MnT). I reset the M&T database and then I was able to register the new PSN.