
[GUIDE_BY_TAC] Fabric Discovery Troubleshooting: Discovering the First Leaf Switch in ACI


Introduction:

One of the first steps in building your ACI Fabric is to go through Fabric Discovery. While Fabric Discovery is usually a straightforward process, various issues may prevent you from discovering an ACI switch. This article covers possible reasons for not seeing the first leaf switch entry under Fabric > Inventory > Fabric Membership.

Prerequisites:

  1. Go through the initial setup script of APIC1
  2. APIC1, Spines, and Leaves are all running the same ACI version
    • The versions can differ slightly; however, it is best to run exactly the same version on every device to rule out version mismatch as a cause.
  3. Switch version meets minimum supported version for the hardware (check Hardware to Version support matrix here)
  4. Ensure that transceivers used are supported for the switch hardware and current version (check Transceiver Matrix here)
  5. Configured CIMC IP address for the APIC Controllers (Guide)
  6. APIC1 connected to at least 1 leaf switch

 

Solution:

Part I: Checking CIMC Configurations

Some CIMC settings are crucial for fabric discovery to function properly.  While these settings are usually set to the proper values from the factory, sometimes this is not the case. Follow these steps to confirm proper CIMC settings.

 

1. Log in to the CIMC GUI by entering its IP address in a web browser.

2. Navigate to Admin > Network > Network Settings and confirm that "NIC Mode" is set to "Dedicated".


 

 

3. Navigate to Server > Inventory > Cisco VIC Adapters. Under Adapter Card Properties, confirm that LLDP is listed as disabled. If it is not, follow this guide to disable the VIC LLDP: https://supportforums.cisco.com/legacyfs/online/attachments/document/files/apic-vic-lldp-fn.pdf


 

Part II: Checking LLDP Entry Information

Fabric Discovery relies heavily on the exchange and comparison of LLDP information. Work through the steps below to make sure that LLDP is working correctly and that each device has the correct information.

 

1. An APIC may be connected to up to two different leaves; however, the two links run in active/standby mode. The APIC will only attempt to discover the leaf that is connected to its active interface, so the troubleshooting effort should focus on that leaf. Run the command below to find out which interface is active.

 

apic1# cat /proc/net/bonding/bond0 | grep Active
Currently Active Slave: eth2-2

From the above output, we can tell that the active interface is eth2-2. Therefore, we should focus on the leaf that is connected to eth2-2.
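
The lookup above can be wrapped in a small helper so it can be reused in scripts. This is a minimal sketch, not an official tool: the function name active_bond_member is made up for illustration, and it assumes the bonding status file contains a "Currently Active Slave:" line like the one shown above.

```shell
# Print the active member of a bond interface.
# $1: path to the bonding status file (hypothetical parameter; defaults to
#     the path used in this article).
active_bond_member() {
    bond_file="${1:-/proc/net/bonding/bond0}"
    # The line looks like "Currently Active Slave: eth2-2".
    # Print the part after ": " and fail if no such line exists.
    awk -F': ' '
        /^Currently Active Slave:/ { print $2; found = 1 }
        END { exit !found }
    ' "$bond_file"
}

# Usage on the APIC (no arguments reads bond0):
#   active_bond_member
```

The non-zero exit status when the line is missing lets a calling script tell "no active interface reported" apart from an empty interface name.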

 

2. Now we are ready to start checking the LLDP information. On the leaf connected to the active interface, run the following command:

moquery -c lldpIf | grep "wiringIssues : \S\|dn" | grep -B 1 wiring

 

If you do not see any output at all OR if you do not see any output for the leaf interface connected to APIC1, then you are good to move on to Part III.

 

Below are the possible wiringIssues values and a brief description of what each one signifies:

wiringIssues              Description
fabric-domain-mismatch    Adjacent node belongs to a different fabric
ctrlr-uuid-mismatch       APIC UUID mismatch (duplicate APIC ID)
wiring-mismatch           Invalid connection (Leaf to Leaf, Spine to non-leaf, Leaf fabric port to non-spine, etc.)
unknown-neighbor          Adjacent node is not part of fabric
adjacency-not-detected    No LLDP adjacency on fabric port (normal if not connected)
infra-vlan-mismatch       Mismatch of infra-vlan value

 

Most of the wiringIssues values above are self-explanatory. The most commonly seen value is "infra-vlan-mismatch". See the Resolving Infra-VLAN Mismatch section to correct this issue.
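
To read the result more quickly, the wiringIssues values can be annotated with the descriptions from the table above. The sketch below is illustrative only: the function name explain_wiring_issues is made up, and it assumes the input has "dn : ..." and "wiringIssues : ..." lines in that order for each interface, which is what the grep pattern in step 2 implies. Splitting on commas is also an assumption, in case an interface reports more than one value.

```shell
# Read "moquery -c lldpIf" style text on stdin and print one line per
# reported wiring issue: "<dn>: <issue> (<description>)".
explain_wiring_issues() {
    awk '
    BEGIN {
        # Descriptions taken from the table in this article.
        desc["fabric-domain-mismatch"] = "adjacent node belongs to a different fabric"
        desc["ctrlr-uuid-mismatch"]    = "APIC UUID mismatch (duplicate APIC ID)"
        desc["wiring-mismatch"]        = "invalid connection"
        desc["unknown-neighbor"]       = "adjacent node is not part of fabric"
        desc["adjacency-not-detected"] = "no LLDP adjacency on fabric port"
        desc["infra-vlan-mismatch"]    = "mismatch of infra-vlan value"
    }
    # Remember the most recent dn so it can be paired with the issue line.
    $1 == "dn" && $2 == ":" { dn = $3 }
    # Lines with an empty wiringIssues value have only two fields and are skipped.
    $1 == "wiringIssues" && $2 == ":" && NF >= 3 {
        n = split($3, issues, ",")
        for (i = 1; i <= n; i++) {
            d = (issues[i] in desc) ? desc[issues[i]] : "not listed in this article"
            printf "%s: %s (%s)\n", dn, issues[i], d
        }
    }'
}

# Usage on the leaf:
#   moquery -c lldpIf | explain_wiring_issues
```

No output means no interface reported a wiring issue, which matches the "move on to Part III" condition above.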

 

Resolving Infra-VLAN Mismatch

=========================

An infra-VLAN mismatch indicates that the two devices are advertising different infra-VLAN values to each other. This is often caused by a switch that was not properly wiped. To remedy an infra-vlan-mismatch issue, perform the following:

  1. Physically disconnect all undiscovered leaf or spine switches from the APIC and from any other leaf or spine switches
  2. Log in to the CLI of the undiscovered leaf switch and clean the switch by running the command "setup-clean-config.sh"
  3. After the command completes, run vsh -c "reload"
  4. Repeat steps #2 and #3 for each remaining undiscovered leaf or spine switch
  5. After all the undiscovered switches have been cleaned and reloaded, physically reconnect the leaf/spine switches to the fabric

Part III: Check for issues with APIC DHCP

In order for an entry for the leaf switch to show up under the Fabric Membership folder, the APIC needs to properly process the DHCP Discover message that the leaf switch sends. There is a commonly hit bug (CSCvf12024) in the APIC DHCP service that prevents this. To check whether you are running into this issue, run this command on the APIC1 CLI:

 

grep "No subnet declaration for bond0" /var/log/dme/log/dhcpd.bin.log

5402||18-01-22 23:06:12.890+00:00||dhcp||ERROR||||ISC dhcpd: No subnet declaration for bond0. (no IPv4 addresses).||../svc/dhcpd/src/gen/ifc/beh/imp/./DhcpdSvc.cc||53

 

If you see output similar to the above, you are most likely hitting this bug. Apply the workaround by running this command on the APIC1 CLI:

acidiag restart dhcpd
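
The check above can also be expressed as a helper that returns a status instead of printing the matching log lines. This is a hedged sketch: the function name dhcpd_subnet_error_logged is hypothetical, and the search string and default log path are the ones used in this article. It only detects the log message; it does not confirm the bug or apply the workaround.

```shell
# Return 0 if the dhcpd log contains the error string checked above,
# non-zero otherwise (including when the log file cannot be read).
# $1: log path (hypothetical parameter; defaults to the path from this article).
dhcpd_subnet_error_logged() {
    log_file="${1:-/var/log/dme/log/dhcpd.bin.log}"
    grep -q "No subnet declaration for bond0" "$log_file"
}

# Usage on APIC1:
#   if dhcpd_subnet_error_logged; then
#       echo "Error found in dhcpd log; consider: acidiag restart dhcpd"
#   fi
```

Keeping the restart as a manual step leaves the decision to apply the workaround with the operator.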

 

Conclusion:

This article covers the most commonly hit issues that prevent the first leaf from being discovered. If the first leaf is still not discovered after following the above troubleshooting steps, open an ACI TAC case and they will be more than happy to assist! I hope the article helped.

 

 

 

Comments
Cisco Employee

Great Ramon !

It worked for me, and the 1st Leaf was recognized in seconds 

Beginner

Thank you so much for the article. Although it helped me in troubleshooting Fabric Discovery, it did not resolve the issue I was having. I think it would be helpful to others to add a few prerequisites that need to be met before even starting the Discovery process, as follows:

1. Make sure every switch is running the same firmware version and it is in ACI mode, not NXOS

2. Make sure the APICs are running the appropriate firmware as well - e.g. switches 13.2 -> APIC 3.2, etc.

3. If you are using newly released transceivers, such as the QSFP 100/40-SR-BD, make sure the firmware you will be starting with supports them. We had to be on 13.1 which is the first one to support these adapters. 

4. Make sure the cabling is correct.

 

Hope this helps someone who is in the same boat I was in with my first ACI implementation. Great experience though, so now I know how to troubleshoot the initial discovery process.

Cisco Employee

Thank you for the feedback and very good points.

 

I have added those prerequisites to the article and have also linked very useful matrices that will help in finding that information. Thanks!
