07-10-2023 04:10 AM
NSO generates an error when I run ha-raft create-cluster. I don't know where the 'ncsd@' prefix comes from; it gets added in front of the hostname.
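The cluster was created from nso01 with the create-cluster action, along these lines (the exact member list is an assumption based on the node configs further down):
admin@ncs# ha-raft create-cluster member [ nso02.example.com nso03.example.com ]
Output of show ha-raft on the leader (nso01) afterwards: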
admin@ncs# show ha-raft
ha-raft status role leader
ha-raft status leader nso01.example.com
ha-raft status member [ nso01.example.com nso02.example.com nso03.example.com ]
ha-raft status connected-node [ nso02.example.com nso03.example.com ]
ha-raft status local-node nso01.example.com
SERIAL NUMBER EXPIRATION DATE FILE PATH
--------------------------------------------------------------------------------------------------
xxxx 2033-07-07T10:27:52+00:00 /etc/ncs/ssl/cert/nso01.crt
SERIAL NUMBER EXPIRATION DATE FILE PATH
-----------------------------------------------------------------------------------------------
xxxx 2033-07-07T08:09:57+00:00 /etc/ncs/ssl/cert/ca.crt
ha-raft status log current-index 0
ha-raft status log applied-index 0
ha-raft status log num-entries 11
NODE STATE INDEX LAG
---------------------------------------------------
ncsd@nso02.example.com requires-snapshot 0 0
ncsd@nso03.example.com requires-snapshot 0 0
On the seed node (nso02):
admin@ncs# show ha-raft
ha-raft status role stalled
ha-raft status leader nso01.example.com
ha-raft status connected-node [ nso01.example.com nso03.example.com ]
ha-raft status local-node nso02.example.com
SERIAL NUMBER EXPIRATION DATE FILE PATH
--------------------------------------------------------------------------------------------------
xxxx 2033-07-07T10:27:39+00:00 /etc/ncs/ssl/cert/nso02.crt
SERIAL NUMBER EXPIRATION DATE FILE PATH
-----------------------------------------------------------------------------------------------
xxxx 2033-07-07T08:09:57+00:00 /etc/ncs/ssl/cert/ca.crt
ha-raft status log current-index 0
ha-raft status log applied-index 0
ha-raft status log num-entries 0
admin@ncs#
info message:
<INFO> 10-Jul-2023::10:31:45.441 9ccc7c1bdb1d ncs[152]: Leader[raft_server_ha_raft_1, term 26] append failure for follower {raft_identity,raft_server_ha_raft_1,'ncsd@nso02.example.com'}. Follower reports local log ends at 0.
<INFO> 10-Jul-2023::10:31:45.442 9ccc7c1bdb1d ncs[152]: Leader[raft_server_ha_raft_1, term 26] append failure for follower {raft_identity,raft_server_ha_raft_1,'ncsd@nso03.example.com'}. Follower reports local log ends at 0.
NSO01 leader node:
<ha-raft>
  <enabled>true</enabled>
  <cluster-name>amsterdam</cluster-name>
  <listen>
    <node-address>nso01.example.com</node-address>
  </listen>
  <seed-nodes>
    <seed-node>nso02.example.com</seed-node>
  </seed-nodes>
  <ssl>
    <ca-cert-file>${NCS_CONFIG_DIR}/ssl/cert/ca.crt</ca-cert-file>
    <cert-file>${NCS_CONFIG_DIR}/ssl/cert/nso01.crt</cert-file>
    <key-file>${NCS_CONFIG_DIR}/ssl/cert/nso01.key</key-file>
  </ssl>
</ha-raft>
NSO02 seed node:
<ha-raft>
  <enabled>true</enabled>
  <cluster-name>amsterdam</cluster-name>
  <listen>
    <node-address>nso02.example.com</node-address>
  </listen>
  <seed-nodes>
    <seed-node>nso02.example.com</seed-node>
  </seed-nodes>
  <ssl>
    <ca-cert-file>${NCS_CONFIG_DIR}/ssl/cert/ca.crt</ca-cert-file>
    <cert-file>${NCS_CONFIG_DIR}/ssl/cert/nso02.crt</cert-file>
    <key-file>${NCS_CONFIG_DIR}/ssl/cert/nso02.key</key-file>
  </ssl>
</ha-raft>
NSO03:
<ha-raft>
  <enabled>true</enabled>
  <cluster-name>amsterdam</cluster-name>
  <listen>
    <node-address>nso03.example.com</node-address>
  </listen>
  <seed-nodes>
    <seed-node>nso02.example.com</seed-node>
  </seed-nodes>
  <ssl>
    <ca-cert-file>${NCS_CONFIG_DIR}/ssl/cert/ca.crt</ca-cert-file>
    <cert-file>${NCS_CONFIG_DIR}/ssl/cert/nso03.crt</cert-file>
    <key-file>${NCS_CONFIG_DIR}/ssl/cert/nso03.key</key-file>
  </ssl>
</ha-raft>
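A quick way to sanity-check the TLS setup is to confirm that each node certificate chains to the CA and that its subject/SAN matches the node-address configured above (commands assume OpenSSL 1.1.1 or newer for the -ext option; run on each node against its own certificate):
openssl verify -CAfile /etc/ncs/ssl/cert/ca.crt /etc/ncs/ssl/cert/nso01.crt
openssl x509 -in /etc/ncs/ssl/cert/nso01.crt -noout -subject -ext subjectAltName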
07-31-2023 06:55 AM
Hello BasharAziz,
Thanks for sharing. The 'ncsd@' part is added implicitly; together with the hostname it forms the internal representation of the node name.
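For example, with the node-address values from your ncs.conf:
node-address nso02.example.com -> internal node name ncsd@nso02.example.com
node-address nso03.example.com -> internal node name ncsd@nso03.example.com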
The create-cluster issue needs to be investigated further. Could you share the raft.log and devel.log?
From what you shared, it looks like the cluster was formed but later 'nso02.example.com' got stalled for some reason.
Regards,
Erdem
07-31-2023 07:23 AM
Hi Erdem,
The problem has been solved with the help of Cisco TAC.
We simply deleted the snapshot: rm -rf /nso/run/state/raft/ha_raft.1/snapshot.0.0
and now it works again.
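For anyone else who ends up with followers stuck in requires-snapshot, a cautious sketch of the same workaround (the systemd unit name and the run directory path are assumptions that depend on how NSO was installed):
# on the affected follower, stop NSO first (use whatever stop method your install uses)
systemctl stop ncs
# remove the stale Raft snapshot; path as reported in this thread, adjust to your run directory
rm -rf /nso/run/state/raft/ha_raft.1/snapshot.0.0
systemctl start ncs
# then verify from the leader with: show ha-raft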
08-01-2023 01:31 AM
I am glad that your issue was resolved.
Cheers!