ACI - Leaf Discovery issue

ju.mahieu
Level 1

Hello,

I'm currently setting up my first ACI environment, but unfortunately my APIC is unable to discover the directly connected leaf.

- Affected devices : N9K-C93180YC-EX / APIC-SERVER-L2
- Software version : 2.0(2f)


I have followed the steps described in the "ACI troubleshooting book" without success.


You will find below the collected logs on the two devices:
=============================================
**** APIC Log
admin@apic1:sys> cat /mit/uni/fabric/compcat-default/swhw-*/summary | grep model
...
model        : N9K-93180YC-EX
...
admin@apic1:sys> acidiag fnvread
      ID   Pod ID                 Name    Serial Number         IP Address    Role        State   LastUpdMsgId
--------------------------------------------------------------------------------------------------------------

Total 0 nodes

admin@apic1:sys> acidiag verifyapic
openssl_check: certificate details
subject= CN=XXXX,serialNumber=PID:APIC-SERVER-L2 SN:XXXX
issuer= CN=Cisco Manufacturing CA,O=Cisco Systems
notBefore=Aug 11 10:22:54 2016 GMT
notAfter=Aug 11 10:32:54 2026 GMT
openssl_check: passed
ssh_check: passed
all_checks: passed

**** Leaf Log
(none) login: admin
********************************************************************************
     Fabric discovery in progress, show commands are not fully functional
     Logout and Login after discovery to continue to use show commands.
********************************************************************************
(none)# show int status
----------------------------------------------------------------------------------------
 Port           Name                Status     Vlan       Duplex   Speed    Type
----------------------------------------------------------------------------------------
Eth1/1         --                  connected  trunk      full     10G      10Gbase-SR
=========================================

(none)# show lldp neighbors
Capability codes:
  (R) Router, (B) Bridge, (T) Telephone, (C) DOCSIS Cable Device
  (W) WLAN Access Point, (P) Repeater, (S) Station, (O) Other
Device ID            Local Intf      Hold-time  Capability  Port ID
apic1                 Eth1/1          120                    eth2-2

(none)# cat /mit/sys/summary
# System
address      : 0.0.0.0
childAction  :
configIssues :
currentTime  : 2016-09-12T11:02:18.014+00:00
dn           : sys
fabricId     : 1
fabricMAC    : 00:22:BD:F8:19:FF
id           : 0
inbMgmtAddr  : 0.0.0.0
inbMgmtAddr6 : 0.0.0.0
lcOwn        : local
modTs        : 2016-09-12T10:52:41.921+00:00
mode         : unspecified
monPolDn     : uni/fabric/monfab-default
name         :
oobMgmtAddr  : 0.0.0.0
oobMgmtAddr6 : 0.0.0.0
podId        : 1
rn           : sys
role         : leaf
serial       : XXXXX
state        : out-of-service
status       :
systemUpTime : 00:00:11:07.000

Can anyone help? Thanks in advance.
Regards,
Ju

Accepted Solutions

Jason Williams
Level 1

Hi Ju, 

As a first step, please log in to the APIC GUI and check whether the Fabric -> Inventory -> Fabric Membership page is empty. If you see a serial number there, double-click it to assign it a node ID and node name.

If the page is empty, read on for further troubleshooting.

Verify that the date and time on the two devices are close to each other by running the date command on both the APIC and the leaf.

apic# date
Tue Sep 13 06:14:34 ART 2016

leaf# date
Tue Sep 13 06:14:55 ART 2016

If they are more than a few hours apart, this may cause issues with fabric discovery.
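
The comparison above can also be scripted. What follows is a minimal Python sketch, not part of the original reply, that takes the two date outputs as strings and reports how far apart they are. The function names are hypothetical, and the sketch assumes both devices print the default date format in the same time zone.

```python
from datetime import datetime, timedelta

# Month abbreviations as printed by `date` in an English locale (assumption).
MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def parse_date_output(line):
    """Parse output such as 'Tue Sep 13 06:14:34 ART 2016'.

    Returns a naive datetime plus the time zone abbreviation as printed.
    """
    _weekday, month, day, clock, zone, year = line.split()
    hour, minute, second = (int(part) for part in clock.split(":"))
    moment = datetime(int(year), MONTHS[month], int(day), hour, minute, second)
    return moment, zone


def clock_skew(apic_line, leaf_line):
    """Return the absolute difference between the two clocks as a timedelta."""
    apic_time, apic_zone = parse_date_output(apic_line)
    leaf_time, leaf_zone = parse_date_output(leaf_line)
    if apic_zone != leaf_zone:
        # Zone abbreviations are ambiguous, so this sketch refuses to guess.
        raise ValueError("outputs use different time zones; convert them first")
    return abs(leaf_time - apic_time)


if __name__ == "__main__":
    skew = clock_skew("Tue Sep 13 06:14:34 ART 2016",
                      "Tue Sep 13 06:14:55 ART 2016")
    print(skew)  # 0:00:21
```

With the sample values above, the two clocks are 21 seconds apart, which is well within range.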

Verify that the APIC and the leaf are running the same release, or at least releases from the same train [2.0(2) on the APIC, which corresponds to 12.0(2) on the leaf].
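
As a rough illustration of that check, here is a small Python sketch. It is not from the original reply, and the function names are hypothetical. It assumes the usual ACI numbering convention in which the switch image carries the APIC release number with the major version raised by 10, so that APIC 2.0(2f) pairs with n9000-12.0(2f). It compares only the major, minor and maintenance numbers and ignores the patch letter.

```python
import re

# Matches release strings such as "2.0(2f)" or "n9000-12.0(2f)".
RELEASE_PATTERN = re.compile(r"(\d+)\.(\d+)\((\d+)([a-z]*)\)")


def parse_release(version):
    """Return (major, minor, maintenance, patch) from a release string."""
    match = RELEASE_PATTERN.search(version)
    if match is None:
        raise ValueError(f"unrecognised release string: {version!r}")
    major, minor, maintenance, patch = match.groups()
    return int(major), int(minor), int(maintenance), patch


def same_train(apic_version, switch_version):
    """True if the switch release belongs to the same train as the APIC.

    Assumption: switch major version == APIC major version + 10.
    """
    apic = parse_release(apic_version)
    switch = parse_release(switch_version)
    return (apic[0] + 10, apic[1], apic[2]) == switch[:3]


if __name__ == "__main__":
    print(same_train("2.0(2f)", "n9000-12.0(2f)"))  # True
    print(same_train("2.0(2f)", "n9000-12.0(1q)"))  # False
```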

Fabric discovery also relies on LLDP first and then DHCP. According to your post, LLDP appears to be fine on the leaf side, but let's verify it on the APIC.

The APIC interfaces going to the leaf nodes are in an active-standby bond (bond0). We should find the active interface of the bond before troubleshooting further. To find it, run cat /proc/net/bonding/bond0

admin@APIC1:~> cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth2-2  <-- Eth2-2 is active
MII Status: up
MII Polling Interval (ms): 60
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth2-1
MII Status: down
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: b8:38:61:f7:05:b1
Slave queue ID: 0

Slave Interface: eth2-2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 1
Permanent HW addr: b8:38:61:f7:05:b2
Slave queue ID: 0
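
If you want to pull the relevant fields out of that output programmatically, the following Python sketch shows one way to do it. It is an illustration only: the function name is hypothetical, the sample text is trimmed from the output above, and the "<--" annotation is stripped because it was added by hand and does not appear in the real file.

```python
SAMPLE_BOND = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth2-2  <-- Eth2-2 is active
MII Status: up

Slave Interface: eth2-1
MII Status: down
Speed: Unknown

Slave Interface: eth2-2
MII Status: up
Speed: 1000 Mbps
"""


def parse_bond(text):
    """Return (active_slave, {slave_name: {field: value}})."""
    active = None
    slaves = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("<--")[0].strip()  # drop hand-written annotations
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "Currently Active Slave":
            active = value
        elif key == "Slave Interface":
            current = value
            slaves[current] = {}
        elif current is not None:
            # Fields after a "Slave Interface" line belong to that slave.
            slaves[current][key] = value
    return active, slaves


if __name__ == "__main__":
    active, slaves = parse_bond(SAMPLE_BOND)
    print(active, slaves[active]["MII Status"])  # eth2-2 up
```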

Check LLDP on the active interface of bond0 using show lldptool in ethx-y, where ethx-y is the active interface found above.

This will display information about the neighbor switch:

admin@apic:~> show lldptool in eth2-2
Chassis ID TLV
MAC: 10:05:ca:f5:bb:71
Port ID TLV
Local: Eth1/1
Time to Live TLV
120
Port Description TLV

System Name TLV

System Description TLV
System Capabilities TLV
System capabilities: Bridge, Router
Enabled capabilities: Bridge, Router
Management Address TLV
MAC: 10:05:ca:f5:bb:71
Ifindex: 83886080
Cisco 4-wire Power-via-MDI TLV
4-Pair PoE not supported
Spare pair Detection/Classification not required
PD Spare pair Desired State: Disabled
PSE Spare pair Operational State: Disabled
Cisco Port Mode TLV
0
Cisco Port State TLV
1
Cisco Serial Number TLV
xxxx               <--- Serial #
Cisco Model TLV    
N9K-C9396PX        <--- Model of neighbor
Cisco Firmware Version TLV
n9000-12.0(1q)
Cisco Node Role TLV
1
Cisco Infra VLAN TLV
xxxx
Cisco Node IP TLV
xxxxxxx
Cisco Name TLV
xxxxxxx
Cisco Fabric Name TLV
xxxxxxx
Cisco Node ID TLV
xxxxxxx
Cisco POD ID TLV
1
Cisco Appliance Vector TLV
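
The TLV listing above is long, so it can help to reduce it to the handful of fields that matter for discovery (model, firmware, node role, and so on). The Python sketch below is one hedged way to do that. The function name is hypothetical, the sample is a trimmed copy of the output above, and the parser simply treats every line ending in "TLV" as a header and the lines after it as that TLV's values.

```python
SAMPLE_LLDP = """\
Chassis ID TLV
MAC: 10:05:ca:f5:bb:71
Port ID TLV
Local: Eth1/1
Port Description TLV

Cisco Model TLV
N9K-C9396PX        <--- Model of neighbor
Cisco Firmware Version TLV
n9000-12.0(1q)
Cisco Node Role TLV
1
"""


def parse_lldp_tlvs(text):
    """Return {tlv_name: [value lines]} from lldptool-style output."""
    tlvs = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("<---")[0].strip()  # drop hand-written annotations
        if not line:
            continue
        if line.endswith(" TLV"):
            current = line[:-len(" TLV")]
            tlvs[current] = []
        elif current is not None:
            tlvs[current].append(line)
    return tlvs


if __name__ == "__main__":
    tlvs = parse_lldp_tlvs(SAMPLE_LLDP)
    for name in ("Port ID", "Cisco Model", "Cisco Firmware Version"):
        print(name, "=", tlvs[name])
```

An empty list, as for Port Description in the sample, means the neighbor sent the TLV without a value.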

If LLDP looks okay, then look into DHCP. 

admin@apic:~> moquery -c dhcpPool | grep -B 8 -A 5 2016-09
# dhcp.Pool
id : 2
childAction :
className : pod
dn : prov-1/net-[10.0.0.0/16]/pool-2
endIp : 10.0.128.95
freeIPs : 4294967295
lcOwn : local
modTs : 2016-09-10T13:09:08.203+00:00
rn : pool-2
startIp : 10.0.128.64
status :
type : normal

# dhcp.Pool
id : 3
childAction :
className : protectionchain
dn : prov-1/net-[10.0.0.0/16]/pool-3
endIp : 10.0.48.95
freeIPs : 4294967280
lcOwn : local
modTs : 2016-09-10T13:11:33.392+00:00
rn : pool-3
startIp : 10.0.48.64
status :
type : normal

--
# dhcp.Pool
id : 1
childAction :
className : vip
dn : prov-1/net-[10.0.0.0/16]/pool-1
endIp : 10.0.160.95
freeIPs : 4294967288
lcOwn : local
modTs : 2016-09-09T18:18:37.674+00:00
rn : pool-1
startIp : 10.0.160.64
status :
type : normal

===============================================
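
To compare pools between a working and a non-working APIC, the moquery output can be turned into records. The following Python sketch is only an illustration of that idea: the function name and the "_class" key are hypothetical, and the sample is a shortened copy of the output above.

```python
SAMPLE_POOLS = """\
# dhcp.Pool
id : 2
className : pod
dn : prov-1/net-[10.0.0.0/16]/pool-2
endIp : 10.0.128.95
modTs : 2016-09-10T13:09:08.203+00:00
startIp : 10.0.128.64

--
# dhcp.Pool
id : 1
className : vip
dn : prov-1/net-[10.0.0.0/16]/pool-1
endIp : 10.0.160.95
startIp : 10.0.160.64
"""


def parse_moquery(text):
    """Return one dict per object; '# <class>' lines start a new object."""
    objects = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# "):
            current = {"_class": line[2:]}
            objects.append(current)
        elif current is not None and ":" in line:
            # Split on the first colon only, so timestamps stay intact.
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    return objects


if __name__ == "__main__":
    for pool in parse_moquery(SAMPLE_POOLS):
        print(pool["className"], pool["startIp"], pool["endIp"])
```

Listing the className values this way makes it easy to see which pool types exist on a given APIC.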

Gotcha: make sure that your CIMC NIC mode is not set to Shared LOM Extended, since this impacts your VIC interfaces, which are the interfaces used to connect to the leaf ports.


3 Replies

Thank you, Jason, for your quick reply. You'll find my comments below.

The clocks are currently very different. How can I fix that?
(none)# date
Tue Sep 13 09:55:26 UTC 2016

apic1# date
Tue Sep 13 02:54:17 UTC 2016

The active interface is:
admin@apic1:~> cat /proc/net/bonding/bond0 | grep Current
Currently Active Slave: eth2-2

The LLDP process seems to be OK:
admin@apic1:~> show lldptool in eth2-2
This command is being deprecated on APIC controller, please use NXOS-style equivalent command
Chassis ID TLV
        MAC: 84:3d:c6:af:5b:4d
Port ID TLV
        Local: Eth1/1
Time to Live TLV
        120
Port Description TLV
        Ethernet1/1
System Name TLV
        switch
System Description TLV
        Cisco Nexus Operating System (NX-OS) Software 12.0(2f)
TAC support: http://www.cisco.com/tac
Copyright (c) 2002-2020, Cisco Systems, Inc. All rights reserved.
System Capabilities TLV
        System capabilities:  Bridge, Router
        Enabled capabilities: Bridge, Router
Management Address TLV
        MAC: 84:3d:c6:af:5b:4d
        Ifindex: 83886080
Cisco 4-wire Power-via-MDI TLV
        4-Pair PoE not supported
        Spare pair Detection/Classification not required
        PD Spare pair Desired State: Disabled
        PSE Spare pair Operational State: Disabled
Cisco Port Mode TLV
        0
Cisco Port State TLV
        1
Cisco Serial Number TLV
        FDO202XXX
Cisco Model TLV
        N9K-C93180YC-EX
Cisco Firmware Version TLV
        n9000-12.0(2f)
Cisco Node Role TLV
        1
Cisco Infra VLAN TLV
        4093
Cisco Node ID TLV
        0
End of LLDPDU TLV

The result for DHCP:
admin@apic1:~> moquery -c dhcpPool
Total Objects shown: 1
# dhcp.Pool
id           : 1
childAction  :
className    : vip
dn           : prov-1/net-[10.0.0.0/16]/pool-1
endIp        : 10.0.16.95
freeIPs      : 4294967288
lcOwn        : local
modTs        : 2016-09-12T03:16:22.907+00:00
rn           : pool-1
startIp      : 10.0.16.64
status       :
type         : normal

Maybe a DHCP problem?

Any advice would be appreciated. Thank you.

Ju

Nice job, Jason!!!

My issue has been solved by a CIMC configuration update.

The NIC mode wasn't set. With Dedicated mode, the DHCP process works fine and the discovery is successful.

Thank you so much for your help :-)

Regards,

Ju
