01-16-2025 04:25 AM
Hello community,
I am using NDFC Version 12.2.2.241 with Nexus Dashboard Version 3.2.1i. When I try to restore NDFC, I get the error "cluster is not healthy". Checking my cluster, I see the message "cisco-ndfc-controller-elasticsearch: could not fetch component status". Searching for this error led me to https://bst.cisco.com/quickview/bug/CSCwm51621, which says "In such cases, follow WA's instructions to resolve the error and then proceed with the operation". What does "follow WA" mean? What is a "WA"? Can anyone explain it to me?
01-22-2025 07:05 AM
This can help you:
"WA" stands for workaround. Workaround: without TAC access, this requires a node reboot. To find out which node should be rebooted, do:
1. Log in as rescue-user on all primary nodes
2. Run `ls -la /logs/k8/clusterhealth-controller.log`
3. Look at the modification timestamps
4. Reboot the node whose log file has the latest modification timestamp.
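The comparison in steps 2–4 boils down to "pick the file with the newest modification timestamp". Here is a minimal local sketch of that logic; the `node*.log` files and their dates are invented stand-ins for `/logs/k8/clusterhealth-controller.log` as seen on each primary node, since the real check has to be run on the nodes themselves.

```shell
# Demo: find the newest file by modification time (hypothetical names/dates).
tmp=$(mktemp -d)
touch -d '2025-01-10' "$tmp/node1.log"
touch -d '2025-01-14' "$tmp/node2.log"   # newest -> this node would be rebooted
touch -d '2025-01-12' "$tmp/node3.log"

# ls -t sorts by modification time, newest first
newest=$(ls -t "$tmp"/node*.log | head -n 1)
basename "$newest"
```

The same `ls -t`/`head` pattern works if you collect the per-node timestamps into one place; otherwise just eyeball the `ls -la` output from each node as the steps describe.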
With TAC access:
1. Log in as root to one of the nodes
2. Run `kubectl get pods -n kube-system | grep mond` to list the pods
3. Run `kubectl delete pod <pod-name> -n kube-system` for each of the 3 pods
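Steps 2–3 can be chained so you don't have to copy pod names by hand. Live `kubectl` isn't available outside the cluster, so the sketch below filters a sample of `kubectl get pods` output (pod names are hypothetical); the commented command at the end shows how the same filter would drive the deletes on the real cluster.

```shell
# Sample `kubectl get pods -n kube-system` output (pod names are made up).
sample='cisco-mond-aaaa1 1/1 Running 0 5d
coredns-bbbb2    1/1 Running 0 5d
cisco-mond-cccc3 1/1 Running 0 5d'

# Extract the names (column 1) of the "mond" pods.
pods=$(echo "$sample" | awk '/mond/ {print $1}')
echo "$pods"

# On the cluster, as root, the equivalent one-liner would be:
#   kubectl get pods -n kube-system --no-headers | awk '/mond/ {print $1}' \
#     | xargs -r -n1 kubectl delete pod -n kube-system
```

The deleted pods are recreated automatically by their controllers, which is why deleting them (rather than the whole deployment) is a safe restart mechanism.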
In my case I had the same error, and the solution was to reload the OS VM.
Regards.
01-25-2025 09:10 PM
I am currently using NDFC in my lab. I did an `acs reboot clean` and then restored from a backup, and the error has disappeared. If I encounter the error again, I will try your method without TAC and report the result here. Thanks for the advice!