01-16-2025 04:25 AM
Hello community,
I am using NDFC Version 12.2.2.241 with Nexus Dashboard Version 3.2.1i. When I try to restore NDFC, I get the error "cluster is not healthy". Checking my cluster, I see the message "cisco-ndfc-controller-elasticsearch: could not fetch component status". Searching for this error led me to https://bst.cisco.com/quickview/bug/CSCwm51621, which says "In such cases, follow WA's instructions to resolve the error and then proceed with the operation". What does "follow WA" mean? What is a "WA"? Can anyone explain it to me?
01-22-2025 07:05 AM
This can help you:
"WA" stands for workaround. Workaround: without TAC access, this requires a node reboot. To find out which node should be rebooted, do:
1. Log in as rescue-user on all primary nodes
2. Run `ls -la /logs/k8/clusterhealth-controller.log`
3. Look at the modification timestamps
4. Reboot the node whose log file has the latest modification timestamp.
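The comparison in steps 2–4 boils down to "pick the file with the newest modification timestamp". Here is a minimal local sketch of that logic; the `node*.log` files and their dates are invented stand-ins for `/logs/k8/clusterhealth-controller.log` as seen on each primary node, since the real check has to be run on the nodes themselves.

```shell
# Demo: find the newest file by modification time (hypothetical names/dates).
tmp=$(mktemp -d)
touch -d '2025-01-10' "$tmp/node1.log"
touch -d '2025-01-14' "$tmp/node2.log"   # newest -> this node would be rebooted
touch -d '2025-01-12' "$tmp/node3.log"

# ls -t sorts by modification time, newest first
newest=$(ls -t "$tmp"/node*.log | head -n 1)
basename "$newest"
```

The same `ls -t`/`head` pattern works if you collect the per-node timestamps into one place; otherwise just eyeball the `ls -la` output from each node as the steps describe.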
With TAC access:
1. Log in as root to one of the nodes
2. Run `kubectl get pods -n kube-system | grep mond` to list the pods
3. Run `kubectl delete pod <pod-name> -n kube-system` for each of the 3 pods
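Steps 2–3 can be chained so you don't have to copy pod names by hand. Live `kubectl` isn't available outside the cluster, so the sketch below filters a sample of `kubectl get pods` output (pod names are hypothetical); the commented command at the end shows how the same filter would drive the deletes on the real cluster.

```shell
# Sample `kubectl get pods -n kube-system` output (pod names are made up).
sample='cisco-mond-aaaa1 1/1 Running 0 5d
coredns-bbbb2    1/1 Running 0 5d
cisco-mond-cccc3 1/1 Running 0 5d'

# Extract the names (column 1) of the "mond" pods.
pods=$(echo "$sample" | awk '/mond/ {print $1}')
echo "$pods"

# On the cluster, as root, the equivalent one-liner would be:
#   kubectl get pods -n kube-system --no-headers | awk '/mond/ {print $1}' \
#     | xargs -r -n1 kubectl delete pod -n kube-system
```

The deleted pods are recreated automatically by their controllers, which is why deleting them (rather than the whole deployment) is a safe restart mechanism.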
In my case I had the same error, and the solution was to reload the OS VM.
Regards.
01-25-2025 09:10 PM
I am currently using NDFC in my lab. I did an `acs reboot clean` and then restored from a backup, and the error has disappeared. If I encounter the error again, I will try your method without TAC and report the result here. Thanks for the advice!