
First leaf active, spines discovered, registered, but stay discovered

Leo Gal

Hello guys

I was setting up 2 Fabrics today.

#1 is already set up and upgraded with a little bit of pain and reloading and manual help, but it is up and active and alive.

 

#2 Fabric behaves like this.

The APIC sees the first leaf. I registered it for the first time (running 4.1.1), and after registration it went inactive and stayed inactive.

(I fixed this on the first fabric by upgrading the APIC to 4.2.3.)

Ultimately I managed to register the leaf and upgrade both the APIC and the first leaf, thinking it would not go inactive or cause trouble once everything was on the same software. Good.

Now the first leaf is active and I see both spines discovered (even the second leaf). I register them correctly, but they stay in the discovered state and do not transition to Active.

The APIC is running 4.2(3l) and the first leaf 14.2(3l); the 2 spines and the other 3 leaves are running 14.1(1j).

 

Spines show this:

User Access Verification
(none) login: admin
********************************************************************************
Fabric discovery in progress, show commands are not fully functional
Logout and Login after discovery to continue to use show commands.
********************************************************************************
(none)#

so there is something going on

 

I already wiped the fabric completely: setup-clean-config.sh on the switches, then acidiag touch clean, acidiag touch setup, and a reboot on the APIC. I then set up the APIC again, but it behaves the same way.

Any hints please?

Thank you

 

PS: Any hints for a good, short technical blog or book about the technicalities of the Fabric, Multi-Site, and service insertion would be appreciated as well, so I do not need to read five 500-page books to understand it all. I'm a CCIE R&S but need to transition to BGP EVPN and ACI.

 

Thanks

Leo

 

 

8 Replies

Leo Gal

Update: that first leaf is inactive again

Hi,

 

First, make sure all fabric nodes are running the same version.
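A quick way to sanity-check this from a list of versions: ACI switch releases are named like the APIC release with a leading "1", so switch 14.2(3l) pairs with APIC 4.2(3l). The sketch below relies on that naming convention; the node names in the example are hypothetical, and the versions are the ones mentioned in this thread.

```python
def switch_matches_apic(switch_version: str, apic_version: str) -> bool:
    """Return True if an ACI switch release corresponds to the APIC release.

    Assumes the usual naming convention: switch release = "1" + APIC release,
    e.g. switch 14.2(3l) <-> APIC 4.2(3l). An optional "n9000-" prefix is dropped.
    """
    version = switch_version.strip()
    prefix = "n9000-"
    if version.startswith(prefix):
        version = version[len(prefix):]
    return version == "1" + apic_version.strip()


# Hypothetical node names; versions as reported in the thread.
nodes = {"leaf1": "14.2(3l)", "spine1": "14.1(1j)", "spine2": "14.1(1j)"}
mismatched = [name for name, ver in nodes.items()
              if not switch_matches_apic(ver, "4.2(3l)")]
print(mismatched)
```

Any node that shows up in `mismatched` is a candidate for a manual upgrade before you retry discovery.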

Also, try the following command on spine/leaf switches which are experiencing problems:

leaf101# show discoveryissues

This command will assist in the diagnosis of common discovery issues.

Depending on the output, you might want to take a different approach to the rest of the troubleshooting.

You can follow the troubleshooting steps in the ACI troubleshooting book, Chapter 1 (initial fabric setup).

https://www.booksprints.net/book/the-second-aci-troubleshooting-guide/ 

Regards,

Sergiu

Hi Sergiu
I manually brought all 4 leaves and 2 spines to the same software version.

I did a complete cleanup of the whole topology (APIC plus all the switches), set up the APIC again, and ended up in the same situation as before.
The spines are discovered, I register them, and they stay in that state.
The first leaf shows this:


DCB-L2101-Rxy# show discoveryissues
Checking the platform type................LEAF!
Check01 - System state - in-service [ok]
Check02 - DHCP status [ok]
TEP IP: 172.18.74.64 Node Id: 2101 Name: DCB-L2101-Rxy
Check03 - AV details check [ok]
Check04 - IP rechability to apic [ok]
Ping from switch to 172.18.64.1 passed
Check05 - infra VLAN received [ok]
infra vLAN:3967
Check06 - LLDP Adjacency [ok]
Found adjacency with SPINE
Found adjacency with APIC
Check07 - Switch version [ok]
version: n9000-14.2(3l) and apic version: 4.2(3l)
Check08 - FPGA/BIOS out of sync test [ok]
Check09 - SSL check [check]
SSL certificate details are valid
Check10 - Downloading policies [ok]
Check11 - Checking time [ok]
2020-03-31 08:32:40
Check12 - Checking modules, power and fans [FAIL]
Power supply state is shut
Ignore if it's a redudant power supply
Check13 - infra VLAN received [FAIL]
No isis adjacencies found
Ignore if first leaf/spine in an ACI fabric or remote leaf
DCB-L2101-Rxy#
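If you collect this output from several switches, a small script can pull out the checks that are not `[ok]`. This is only a sketch: the line format (`CheckNN - name [status]`) is taken from the paste above, and it is an assumption that other releases print it the same way. The sample text is a shortened copy of that paste.

```python
import re

# Matches lines such as: "Check12 - Checking modules, power and fans [FAIL]"
CHECK_RE = re.compile(r"^Check(\d+) - (.+?) \[(\w+)\]\s*$")


def parse_checks(output: str):
    """Return a list of (number, name, status) tuples found in the pasted text."""
    checks = []
    for line in output.splitlines():
        match = CHECK_RE.match(line.strip())
        if match:
            checks.append((int(match.group(1)), match.group(2), match.group(3)))
    return checks


def not_ok(checks):
    """Keep only the checks whose status is anything other than 'ok'."""
    return [check for check in checks if check[2].lower() != "ok"]


sample = """\
Check11 - Checking time [ok]
2020-03-31 08:32:40
Check12 - Checking modules, power and fans [FAIL]
Power supply state is shut
Check13 - infra VLAN received [FAIL]
No isis adjacencies found
"""

for number, name, status in not_ok(parse_checks(sample)):
    print(f"Check{number:02d}: {name} -> {status}")
```

Note that a status like `[check]` is also reported by `not_ok`, since it is not literally `ok`; whether that matters is still a human decision.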

 

What is weird is that the first leaf is shown as inactive after some time (I just noticed this); maybe that's why the spines are having trouble.

I will reload the APIC one more time and let's see.

Hi Leo,

The output looks OK, given that this is the first discovered leaf. And yes, if the first leaf goes into the inactive state, the rest of the fabric cannot be discovered correctly. How is the connectivity between the APIC and the first leaf? Which ports do you use on the controller to connect to the leaf? Is the interface up? Does it go down when the leaf goes down?

 

Regards,

Sergiu

I had a guy flap the interfaces of the APIC. Suddenly I saw the output below, and the spines went from "Nodes Pending Registration" to "Registered Nodes".

Subsequently I was able to register all the leaves.

 

Another weird thing: I checked the cabling. On the APIC, interface 2-1 is connected to 1/1 on leaf1, and 2-2 (this one is down) is connected to 1/1 on leaf2.

Shouldn't this interface be up?

 

And the last question from today's questionnaire:

I ran

cat /var/log/dme/log/dhcpd.bin.log

on the APIC and it just flooded me with millions of rows.

Is there any proper cleanup procedure for all the logging on APICs please?

 

Thank you

Leo

 

DCB-APIC1# [ 3828.268697] Modify tunnel received

[ 3828.273065] [tep0]: vers 4 prot 17 ihl 5 daddr 0x406612ac saddr 0x100000a 0x0 type 0x0

[ 3828.282955] Setting l3 proxy ip 0x406612ac

[ 3828.291674] Modify tunnel received

[ 3828.296052] [tep1]: vers 4 prot 17 ihl 5 daddr 0x406612ac saddr 0x200000a 0x0 type 0x0

[ 3828.305928] Setting l3 proxy ip 0x406612ac

[ 3828.314316] Modify tunnel received

[ 3828.318710] [tep2]: vers 4 prot 17 ihl 5 daddr 0x406612ac saddr 0x300000a 0x0 type 0x0

[ 3828.328597] Setting l3 proxy ip 0x406612ac

[ 3828.335534] Modify tunnel received

[ 3828.339917] [tep3]: vers 4 prot 17 ihl 5 daddr 0x406612ac saddr 0x400000a 0x0 type 0x0

[ 3828.349782] Setting l3 proxy ip 0x406612ac

[ 3828.358022] Modify tunnel received

[ 3828.362423] [tep4]: vers 4 prot 17 ihl 5 daddr 0x406612ac saddr 0x500000a 0x0 type 0x0

[ 3828.372290] Setting l3 proxy ip 0x406612ac

[ 3828.380595] Modify tunnel received

[ 3828.384982] [tep5]: vers 4 prot 17 ihl 5 daddr 0x406612ac saddr 0x600000a 0x0 type 0x0

[ 3828.394875] Setting l3 proxy ip 0x406612ac

[ 3828.403432] Modify tunnel received

[ 3828.407776] [tep6]: vers 4 prot 17 ihl 5 daddr 0x406612ac saddr 0x700000a 0x0 type 0x0

[ 3828.417171] Setting l3 proxy ip 0x406612ac

[ 3828.425285] Modify tunnel received

[ 3828.429425] [tep7]: vers 4 prot 17 ihl 5 daddr 0x406612ac saddr 0x800000a 0x0 type 0x0

[ 3828.438861] Setting l3 proxy ip 0x406612ac

[ 3828.453129] Modify tunnel received

[ 3828.457360] [teplo-1]: vers 4 prot 17 ihl 5 daddr 0x406612ac saddr 0x0 0x0 type 0x0

[ 3828.466509] Setting l3 proxy ip 0x406612ac

[ 3828.548086] Modify tunnel received

[ 3828.552226] [tep0]: vers 4 prot 17 ihl 5 daddr 0x416612ac saddr 0x100000a 0x0 type 0x1

[ 3828.561661] Setting l2 proxy ip 0x416612ac

[ 3828.568260] Modify tunnel received

[ 3828.572371] [tep1]: vers 4 prot 17 ihl 5 daddr 0x416612ac saddr 0x200000a 0x0 type 0x1

[ 3828.581768] Setting l2 proxy ip 0x416612ac

[ 3828.590094] Modify tunnel received

[ 3828.594220] [tep2]: vers 4 prot 17 ihl 5 daddr 0x416612ac saddr 0x300000a 0x0 type 0x1

[ 3828.603614] Setting l2 proxy ip 0x416612ac

[ 3828.611852] Modify tunnel received

[ 3828.616011] [tep3]: vers 4 prot 17 ihl 5 daddr 0x416612ac saddr 0x400000a 0x0 type 0x1

[ 3828.625484] Setting l2 proxy ip 0x416612ac

[ 3828.633780] Modify tunnel received

[ 3828.637913] [tep4]: vers 4 prot 17 ihl 5 daddr 0x416612ac saddr 0x500000a 0x0 type 0x1

[ 3828.647313] Setting l2 proxy ip 0x416612ac

[ 3828.655545] Modify tunnel received

[ 3828.659701] [tep5]: vers 4 prot 17 ihl 5 daddr 0x416612ac saddr 0x600000a 0x0 type 0x1

[ 3828.669154] Setting l2 proxy ip 0x416612ac

[ 3828.677537] Modify tunnel received

[ 3828.681671] [tep6]: vers 4 prot 17 ihl 5 daddr 0x416612ac saddr 0x700000a 0x0 type 0x1

[ 3828.691084] Setting l2 proxy ip 0x416612ac

[ 3828.699399] Modify tunnel received

[ 3828.703563] [tep7]: vers 4 prot 17 ihl 5 daddr 0x416612ac saddr 0x800000a 0x0 type 0x1

[ 3828.713034] Setting l2 proxy ip 0x416612ac

[ 3828.727568] Modify tunnel received

[ 3828.731807] [teplo-1]: vers 4 prot 17 ihl 5 daddr 0x416612ac saddr 0x0 0x0 type 0x1

[ 3828.740973] Setting l2 proxy ip 0x416612ac
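For anyone trying to read the hex values in these messages: they look like IPv4 addresses printed as raw 32-bit words in little-endian byte order. That is an assumption, not something documented in this thread, but decoding them that way gives addresses in the same 172.18.x.x range as the TEP address shown earlier. A minimal sketch of the conversion:

```python
import ipaddress
import struct


def le_hex_to_ip(value: str) -> str:
    """Decode a hex word as an IPv4 address stored in little-endian byte order.

    Assumption: the kernel message prints the address as a raw 32-bit word,
    so the lowest byte of the number is the first octet of the address.
    """
    raw = struct.pack("<I", int(value, 16))  # lowest byte first
    return str(ipaddress.IPv4Address(raw))


print(le_hex_to_ip("0x406612ac"))  # daddr from the "l3 proxy" lines
print(le_hex_to_ip("0x416612ac"))  # daddr from the "l2 proxy" lines
print(le_hex_to_ip("0x100000a"))   # saddr from the first tep0 line
```

Under that assumption the two proxy addresses decode to 172.18.102.64 and 172.18.102.65, and the tep0 source decodes to 10.0.0.1.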

Do you have an APIC M3 or L3?

 

If yes, note that the VIC 1455 has 4 ports, port-1, port-2, port-3, and port-4 from left to right.

Port-1 and port-2 are one pair, corresponding to eth2-1 on the APIC, and port-3 and port-4 are another pair, corresponding to eth2-2 on the APIC. Only one connection is allowed per pair. For example, you can connect one cable to either port-1 or port-2, and another cable to either port-3 or port-4 (please do not connect two cables to the same pair).

Reference: https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/server/M3-L3-server/APIC-M3-L3-Server/C220M5_chapter_00.html 
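To make the pairing rule concrete, here is a small sketch that checks a set of cabled VIC ports against it. The port-to-interface mapping is the one described above; the function name and the message wording are made up for illustration.

```python
# Port-to-interface mapping as described above for the VIC 1455.
PORT_TO_APIC_IF = {1: "eth2-1", 2: "eth2-1", 3: "eth2-2", 4: "eth2-2"}


def check_cabling(cabled_ports):
    """Return a list of problems for the given cabled VIC ports (1-4)."""
    problems = []
    used = {}  # APIC interface -> first cabled port seen for that pair
    for port in sorted(cabled_ports):
        if port not in PORT_TO_APIC_IF:
            problems.append(f"port-{port} does not exist on a 4-port VIC")
            continue
        iface = PORT_TO_APIC_IF[port]
        if iface in used:
            problems.append(
                f"port-{used[iface]} and port-{port} both map to {iface}; "
                "only one cable per pair"
            )
        else:
            used[iface] = port
    return problems


print(check_cabling({1, 2}))  # both cables in the same pair
print(check_cabling({1, 3}))  # one cable per pair
```

With cables in port-1 and port-2, both land on eth2-1 and eth2-2 has no link at all, which matches the "2-2 is down" symptom.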

 

The fabric clean is pretty simple:

On APIC:

 

    acidiag touch clean
    acidiag touch setup
    acidiag reboot

On leaf/spine switches:

 

 

    setup-clean-config.sh
    reload

 

 

Regards,

Sergiu

 

 

OH!!!!, oh, my fault! facepalm

I moved the cable to port-3 and it is up. The bond now has 2 interfaces up.

...and the problem with the spines was probably caused by this, because when those ports were flapped one after another it suddenly worked.

Thank you very much!

 

Any hint on the log cleanup on APICs please?

 

Appreciate your attention and speed of response.

Hi,

No worries. I have seen the miscabling for APIC M3/L3 several times, that's why I knew about it :-)

About log cleanup, sorry about suggesting the fabric clean reload. I misunderstood the question.

Normally, you should not worry too much about the logs on the APIC and the leaf/spine switches. First, some of the dme logs are read-only for the admin user, and second, the log files are rotated and archived by default:

-rw-r--r-- 2 ifc  admin 13069288 Mar 31 15:09 svc_ifc_policymgr.bin.log
-rw-r--r-- 1 root root   1678019 Mar 30 10:36 svc_ifc_policymgr.bin.log.1.gz
-rw-r--r-- 1 root root   1514324 Mar 30 11:39 svc_ifc_policymgr.bin.log.2.gz
-rw-r--r-- 1 root root   1436470 Mar 30 15:03 svc_ifc_policymgr.bin.log.3.gz

APIC will not run out of space because of the logs (unless there is a bug in the software).

If you really want to clear the logs, you can check which log files are owned by admin (ls -l) and run echo "" > filename on them. Alternatively, you can open a TAC case and ask them to log in as root and clear all of the log files. But again, I really think this is unnecessary.
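For illustration only, the same idea as a generic Python sketch: truncate just the log files the current user is allowed to write to, and skip the rest (for example root-owned archives). The function name, the `.log` suffix filter, and the dry-run default are my own choices, and the demo runs against a throwaway temporary directory, not a real APIC path.

```python
import os
import tempfile


def truncate_writable_logs(directory: str, suffix: str = ".log", dry_run: bool = True):
    """Truncate files ending in `suffix` that the current user may write to.

    Returns the list of matching paths. With dry_run=True nothing is modified.
    """
    touched = []
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if not (entry.endswith(suffix) and os.path.isfile(path)):
            continue
        if not os.access(path, os.W_OK):
            continue  # not writable for this user: leave it alone
        if not dry_run:
            with open(path, "w"):
                pass  # opening in "w" mode truncates the file to zero bytes
        touched.append(path)
    return touched


# Demo on a throwaway directory with a hypothetical log file name.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_log = os.path.join(demo_dir, "example.bin.log")
    with open(demo_log, "w") as fh:
        fh.write("old log lines\n")
    truncated = truncate_writable_logs(demo_dir, dry_run=False)
    demo_size_after = os.path.getsize(demo_log)
```

Unlike `echo "" > filename`, which leaves a single newline in the file, this leaves the file empty. Rotated `.gz` archives are not touched because of the suffix filter.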

 

Cheers,

Sergiu
