02-02-2018 12:15 PM - edited 08-16-2019 09:54 AM
Some CIMC settings are crucial for fabric discovery to function properly. While these settings are usually set to the proper values from the factory, sometimes this is not the case. Follow these steps to confirm proper CIMC settings.
1. Log in to the CIMC GUI by entering its IP address in a web browser
2. Navigate to Admin > Network > Network Settings and confirm that the "NIC Mode" is set to "Dedicated"
3. Navigate to Server > Inventory > Cisco VIC Adapters. Under Adapter Card Properties, confirm that LLDP is listed as disabled. If this is not the case, please follow this guide to disable the VIC LLDP: https://supportforums.cisco.com/legacyfs/online/attachments/document/files/apic-vic-lldp-fn.pdf
Fabric Discovery relies heavily on the exchange and comparison of LLDP information. Go through the steps below to make sure that LLDP is working correctly and that each device has the correct information.
1. An APIC may be connected to up to two different leaves; however, its two interfaces run in active/standby mode. The APIC will only attempt to discover the leaf that is connected to its active interface, so the troubleshooting effort should be focused on that leaf. Run the command below to find out which interface is active.
apic1# cat /proc/net/bonding/bond0 | grep Active
Currently Active Slave: eth2-2
From the above output, we can tell that the active interface is eth2-2. Therefore, we should focus on the leaf that is connected to eth2-2.
2. Now we are ready to start checking the LLDP information. On the active leaf, copy/paste the following command:
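When several APICs need to be checked, the same lookup can be scripted. The following is a minimal sketch, assuming the bonding status text has already been collected from each APIC; the function name `active_bond_slave` and the sample text are illustrative, and only the "Currently Active Slave" line is taken from the output above.

```python
import re

def active_bond_slave(bonding_text):
    """Return the interface named on the 'Currently Active Slave' line, or None."""
    match = re.search(r"^Currently Active Slave:\s*(\S+)", bonding_text, re.MULTILINE)
    if not match or match.group(1) == "None":
        # The Linux bonding driver reports "None" when no slave is active.
        return None
    return match.group(1)

# Illustrative sample of /proc/net/bonding/bond0 content (not a full capture).
sample = """Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth2-2
MII Status: up
"""

print(active_bond_slave(sample))
```

A result of None would suggest that neither APIC interface is up, which is worth checking before looking at LLDP.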
moquery -c lldpIf | grep "wiringIssues : \S\|dn" | grep -B 1 wiring
If you do not see any output at all OR if you do not see any output for the leaf interface connected to APIC1, then you are good to move on to Part III.
Below are the possible wiringIssues values and a brief description of what each one signifies:
wiringIssues | Description |
fabric-domain-mismatch | Adjacent node belongs to a different fabric |
ctrlr-uuid-mismatch | APIC UUID mismatch (duplicate APIC ID) |
wiring-mismatch | Invalid connection - Leaf to Leaf, Spine to non-leaf, Leaf fabric port to non-spine etc. |
unknown-neighbor | Adjacent node is not part of fabric |
adjacency-not-detected | No LLDP adjacency on fabric port (normal if not connected) |
infra-vlan-mismatch | Mismatch of infra-vlan value |
Most of the wiringIssues values above are self-explanatory. The most commonly seen value is "infra-vlan-mismatch". See the Resolving Infra-VLAN Mismatch section to correct this issue.
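If the moquery output is saved to a file, the pairing of each interface with its wiring issue can also be done in a script. The sketch below is only an illustration: the function name `wiring_issues_by_interface` and the sample text are made up for this example, the sample is not a capture from a real switch, and the comma-separated handling of multiple values is an assumption. Like the grep pipeline above, it assumes the dn line of an object appears before its wiringIssues line.

```python
import re

# Matches "name : value" attribute lines, tolerating variable padding.
ATTR_LINE = re.compile(r"^\s*(\w+)\s*:\s*(.*?)\s*$")

def wiring_issues_by_interface(moquery_text):
    """Map each lldpIf dn to its non-empty list of wiringIssues values."""
    issues = {}
    current_dn = None
    for line in moquery_text.splitlines():
        match = ATTR_LINE.match(line)
        if not match:
            continue
        name, value = match.groups()
        if name == "dn":
            current_dn = value
        elif name == "wiringIssues" and value and current_dn is not None:
            # Assumption: multiple issues are separated by commas.
            issues[current_dn] = [v.strip() for v in value.split(",") if v.strip()]
    return issues

# Hypothetical sample in the general shape of moquery output.
sample = """# lldp.If
dn           : sys/lldp/inst/if-[eth1/1]
id           : eth1/1
wiringIssues : infra-vlan-mismatch

# lldp.If
dn           : sys/lldp/inst/if-[eth1/2]
id           : eth1/2
wiringIssues :
"""

print(wiring_issues_by_interface(sample))
```

An empty result corresponds to the "no output" case described earlier, meaning no wiring issue is reported on any interface.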
Resolving Infra-VLAN Mismatch
=========================
Infra-VLAN mismatch indicates that the two devices are advertising different infra VLANs to each other. This is often caused by a switch that was not properly wiped. To remedy an infra-vlan-mismatch issue, perform the following:
For an entry for the leaf switch to show up under the Fabric Membership folder, the APIC needs to properly process the DHCP Discover message that the leaf switch sends. A commonly hit bug in the APIC DHCP service (CSCvf12024) prevents this from happening. To check whether you are running into this issue, run this command on the APIC1 CLI:
grep "No subnet declaration for bond0" /var/log/dme/log/dhcpd.bin.log
5402||18-01-22 23:06:12.890+00:00||dhcp||ERROR||||ISC dhcpd: No subnet declaration for bond0. (no IPv4 addresses).||../svc/dhcpd/src/gen/ifc/beh/imp/./DhcpdSvc.cc||53
If you see output similar to the above, you are most likely hitting this bug. Apply the workaround by running this command on the APIC1 CLI:
acidiag restart dhcpd
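To see when the error was logged, for example to confirm that it stops appearing after the restart, the matching log lines can be pulled apart in a script. This is a rough sketch, assuming the log has been copied off the APIC and that every line uses the same "||"-delimited layout as the single sample above; the function name `find_dhcpd_bond0_errors` and the interpretation of the second field as a timestamp are assumptions.

```python
MARKER = "No subnet declaration for bond0"

def find_dhcpd_bond0_errors(log_lines):
    """Return (timestamp, message) pairs for lines carrying the marker text."""
    hits = []
    for line in log_lines:
        if MARKER not in line:
            continue
        fields = line.rstrip("\n").split("||")
        # Assumed layout, inferred from one sample line: the second field is
        # the timestamp; the field containing the marker is the message.
        timestamp = fields[1] if len(fields) > 1 else ""
        message = next((f for f in fields if MARKER in f), line.strip())
        hits.append((timestamp, message))
    return hits

# The sample line is the one quoted in the article.
sample_log = [
    "5402||18-01-22 23:06:12.890+00:00||dhcp||ERROR||||ISC dhcpd: No subnet "
    "declaration for bond0. (no IPv4 addresses).||"
    "../svc/dhcpd/src/gen/ifc/beh/imp/./DhcpdSvc.cc||53",
]

for timestamp, message in find_dhcpd_bond0_errors(sample_log):
    print(timestamp, message)
```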
This article covers the most commonly hit issues that prevent the first leaf from being discovered. If the first leaf is still not discovered after following the above troubleshooting steps, open an ACI TAC case and they will be more than happy to assist! I hope the article helped.
Great, Ramon!
It worked for me, and the first leaf was recognized in seconds.
Thank you so much for the article. Although it helped me in troubleshooting Fabric Discovery, it did not resolve the issue I was having. I think it would be helpful to others to add a few prerequisites that need to be met before even starting the discovery process, as follows:
1. Make sure every switch is running the same firmware version and is in ACI mode, not NX-OS
2. Make sure the APICs are running the matching firmware as well - e.g. switches 13.2 -> APIC 3.2, etc.
3. If you are using newly released transceivers, such as the QSFP 100/40-SR-BD, make sure the firmware you will be starting with supports them. We had to be on 13.1, which is the first release to support these transceivers.
4. Make sure the cabling is correct.
Hope this helps someone who is in the same boat I was in with my first ACI implementation. Great experience though, so now I know how to troubleshoot the initial discovery process.
Thank you for the feedback and very good points.
I have added those prerequisites to the article and have also linked a very useful matrix that will help in finding that information. Thanks!