
ACI Spine refuses to join Fabric

RedNectar
VIP

Hi,

In a lab I help look after we have two leaves and two spines. One of the spines was RMA'd, and since it came back it has refused to join the fabric completely.

From a completely clean start (setup-clean-config.sh on all switches, and an eraseconfig setup on all the APICs), this is what happens:

Fabric discovery happens as normal, and each switch is discovered, including the recalcitrant spine.

But in other places, the second spine switch just doesn't show up, or only partially shows.  This is particularly annoying when trying to upgrade the firmware! (This switch is running 11.2(2h), everything else is 12.1(1h))

The only thing I can see is that Spine202 shows n/a under SSL Certificate. Does anyone know if this is my root cause and how to fix it?

At the console of the switch, I see:

User Access Verification
Spine202 login: admin
********************************************************************************
     Fabric discovery in progress, show commands are not fully functional
     Logout and Login after discovery to continue to use show commands.
********************************************************************************
Spine202#

Does anyone have any clues on how to get the switch online and upgraded?

Chris Welsh

RedNectar aka Chris Welsh.
Forum Tips: 1. Paste images inline - don't attach. 2. Always mark helpful and correct answers, it helps others find what they need.

Marcel Zehnder
Spotlight

Hi Chris

Because your APICs are running 2.1(1h) code, they won't be able to handle your spine running 1.2(2h) code correctly. The minimum release on your spine must be 1.2(3x).
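
The minimum-release rule can be checked mechanically. What follows is a minimal Python sketch, not a Cisco tool. It assumes that a switch release such as 11.2(2h) corresponds to the APIC-style release 1.2(2h) (switch major number minus 10), which is how the version pairs quoted in this thread line up, and it treats the trailing letter in 1.2(3x) as "any patch". The function names are hypothetical.

```python
import re

# Accepts "11.2(2h)", "1.2(3x)" or "n9000-12.1(2e)".
VERSION_RE = re.compile(r"^(?:n9000-)?(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def parse_release(text):
    """Return (major, minor, maintenance, patch_letters) for a release string."""
    match = VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised release string: {text!r}")
    major, minor, maint, patch = match.groups()
    return int(major), int(minor), int(maint), patch


def switch_to_apic_style(release):
    """Convert a parsed switch release (11.x, 12.x) to APIC-style numbering.

    Assumption: switch major = APIC major + 10, inferred from this thread.
    """
    major, minor, maint, patch = release
    return major - 10, minor, maint, patch


def meets_minimum(switch_release, minimum):
    """Compare major.minor(maintenance) only; the patch letters are ignored."""
    current = switch_to_apic_style(parse_release(switch_release))[:3]
    return current >= parse_release(minimum)[:3]


if __name__ == "__main__":
    print(meets_minimum("11.2(2h)", "1.2(3x)"))  # the release the spine came back with
    print(meets_minimum("12.1(1h)", "1.2(3x)"))  # the release on the rest of the fabric
```

Under those assumptions the first call reports that 11.2(2h) is below the minimum and the second that 12.1(1h) satisfies it.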

I would recommend that you decommission spine-202, copy the 2.1(1h) firmware to it and do the following on the spine via console:

setup-bootvars.sh aci-n9000-dk9.12.1.1h.bin
setup-clean-config.sh aci-n9000-dk9.12.1.1h.bin

After that you can reload the switch and rejoin the fabric.
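
The image name passed to the two scripts follows from the switch release. A small Python sketch of that mapping, assuming the naming pattern "aci-n9000-dk9.<major>.<minor>.<maintenance><letters>.bin" seen in the filenames quoted in this thread holds in general (the helper names are hypothetical):

```python
import re

# Assumed patterns, inferred from the release strings and filenames in this thread.
RELEASE_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")
IMAGE_RE = re.compile(r"^aci-n9000-dk9\.(\d+)\.(\d+)\.(\d+)([a-z]*)\.bin$")


def release_to_image(release):
    """Map a switch release such as '12.1(1h)' to its image filename."""
    match = RELEASE_RE.match(release.strip())
    if match is None:
        raise ValueError(f"unrecognised release: {release!r}")
    major, minor, maint, patch = match.groups()
    return f"aci-n9000-dk9.{major}.{minor}.{maint}{patch}.bin"


def image_to_release(filename):
    """Map an image filename back to the switch release it contains."""
    match = IMAGE_RE.match(filename.strip())
    if match is None:
        raise ValueError(f"unrecognised image name: {filename!r}")
    major, minor, maint, patch = match.groups()
    return f"{major}.{minor}({maint}{patch})"


if __name__ == "__main__":
    print(release_to_image("12.1(1h)"))
    print(image_to_release("aci-n9000-dk9.12.1.1h.bin"))
```

The reverse direction is handy when a console boot message shows only the filename.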

HTH

Marcel


Hi Marcel,

As you suggested, I decommissioned (removed) the spine, manually upgraded the software and ran the setup scripts, so it now boots with v12.1(1h).

But the situation hasn't changed.

I've done a little more research on the SSL angle (the SSL certificate column shows "n/a" in the Inventory > Fabric Membership list) and checked my certificate using the openssl asn1parse < /securedata/ssl/server.crt command.  (See exhibit below)

It would seem that there is a problem there, and this post indicates that a new certificate will be required to "ensure that your switches will successfully complete the ACI Fabric Discovery process."

So I guess I'll have to chase that up.

Thanks for your help anyway.

Chris

Exhibit: Output from the openssl asn1parse < /securedata/ssl/server.crt command.

Spine202# openssl asn1parse < /securedata/ssl/server.crt
    0:d=0  hl=4 l= 651 cons: SEQUENCE
    4:d=1  hl=4 l= 371 cons: SEQUENCE
    8:d=2  hl=2 l=   1 prim: INTEGER           :64
   11:d=2  hl=2 l=  13 cons: SEQUENCE
   13:d=3  hl=2 l=   9 prim: OBJECT            :sha1WithRSAEncryption
   24:d=3  hl=2 l=   0 prim: NULL
   26:d=2  hl=2 l=  66 cons: SEQUENCE
   28:d=3  hl=2 l=  11 cons: SET
   30:d=4  hl=2 l=   9 cons: SEQUENCE
   32:d=5  hl=2 l=   3 prim: OBJECT            :countryName
   37:d=5  hl=2 l=   2 prim: PRINTABLESTRING   :XX
   41:d=3  hl=2 l=  21 cons: SET
   43:d=4  hl=2 l=  19 cons: SEQUENCE
   45:d=5  hl=2 l=   3 prim: OBJECT            :localityName
   50:d=5  hl=2 l=  12 prim: UTF8STRING        :Default City
   64:d=3  hl=2 l=  28 cons: SET
   66:d=4  hl=2 l=  26 cons: SEQUENCE
   68:d=5  hl=2 l=   3 prim: OBJECT            :organizationName
   73:d=5  hl=2 l=  19 prim: UTF8STRING        :Default Company Ltd
   94:d=2  hl=2 l=  30 cons: SEQUENCE
   96:d=3  hl=2 l=  13 prim: UTCTIME           :140719230710Z
  111:d=3  hl=2 l=  13 prim: UTCTIME           :240716230710Z
  126:d=2  hl=2 l=  89 cons: SEQUENCE
  128:d=3  hl=2 l=  11 cons: SET
  130:d=4  hl=2 l=   9 cons: SEQUENCE
  132:d=5  hl=2 l=   3 prim: OBJECT            :countryName
  137:d=5  hl=2 l=   2 prim: PRINTABLESTRING   :US
  141:d=3  hl=2 l=  11 cons: SET
  143:d=4  hl=2 l=   9 cons: SEQUENCE
  145:d=5  hl=2 l=   3 prim: OBJECT            :stateOrProvinceName
  150:d=5  hl=2 l=   2 prim: UTF8STRING        :CA
  154:d=3  hl=2 l=  16 cons: SET
  156:d=4  hl=2 l=  14 cons: SEQUENCE
  158:d=5  hl=2 l=   3 prim: OBJECT            :localityName
  163:d=5  hl=2 l=   7 prim: UTF8STRING        :SanJose
  172:d=3  hl=2 l=  25 cons: SET
  174:d=4  hl=2 l=  23 cons: SEQUENCE
  176:d=5  hl=2 l=   3 prim: OBJECT            :organizationName
  181:d=5  hl=2 l=  16 prim: UTF8STRING        :Insieme Networks
  199:d=3  hl=2 l=  16 cons: SET
  201:d=4  hl=2 l=  14 cons: SEQUENCE
  203:d=5  hl=2 l=   3 prim: OBJECT            :commonName
  208:d=5  hl=2 l=   7 prim: UTF8STRING        :Insieme
  217:d=2  hl=3 l= 159 cons: SEQUENCE
  220:d=3  hl=2 l=  13 cons: SEQUENCE
  222:d=4  hl=2 l=   9 prim: OBJECT            :rsaEncryption
  233:d=4  hl=2 l=   0 prim: NULL
  235:d=3  hl=3 l= 141 prim: BIT STRING
  379:d=1  hl=2 l=  13 cons: SEQUENCE
  381:d=2  hl=2 l=   9 prim: OBJECT            :sha1WithRSAEncryption
  392:d=2  hl=2 l=   0 prim: NULL
  394:d=1  hl=4 l= 257 prim: BIT STRING
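
The dump above is easier to read once the name fields and validity dates are pulled out. The following Python sketch does that for text in the format shown; it is an illustration, not a Cisco or OpenSSL utility. It relies on the standard X.509 field order (issuer, then validity, then subject), so strings before the first UTCTIME are counted as issuer and strings after it as subject. The function names and the shortened SAMPLE are assumptions made for the example.

```python
import re
from datetime import datetime, timezone

# Matches lines such as "  37:d=5  hl=2 l=   2 prim: PRINTABLESTRING   :XX".
FIELD_RE = re.compile(r"prim:\s+(\w+)\s*:(.*)$")
STRING_TYPES = {"PRINTABLESTRING", "UTF8STRING"}


def parse_fields(asn1_text):
    """Return (asn1_type, value) for every primitive line that carries a value."""
    fields = []
    for line in asn1_text.splitlines():
        match = FIELD_RE.search(line)
        if match:
            fields.append((match.group(1), match.group(2).strip()))
    return fields


def summarize(asn1_text):
    """Split the dump into issuer names, subject names and validity dates."""
    issuer, subject, validity = {}, {}, []
    pending_attr = None
    for asn1_type, value in parse_fields(asn1_text):
        if asn1_type == "UTCTIME":
            stamp = datetime.strptime(value, "%y%m%d%H%M%SZ")
            validity.append(stamp.replace(tzinfo=timezone.utc))
        elif asn1_type == "OBJECT":
            pending_attr = value  # attribute name; its value is on the next line
        elif asn1_type in STRING_TYPES and pending_attr:
            target = subject if validity else issuer
            target[pending_attr] = value
            pending_attr = None
    return issuer, subject, validity


# Shortened excerpt of the exhibit, used only to exercise the parser.
SAMPLE = """\
   32:d=5  hl=2 l=   3 prim: OBJECT            :countryName
   37:d=5  hl=2 l=   2 prim: PRINTABLESTRING   :XX
   96:d=3  hl=2 l=  13 prim: UTCTIME           :140719230710Z
  111:d=3  hl=2 l=  13 prim: UTCTIME           :240716230710Z
  203:d=5  hl=2 l=   3 prim: OBJECT            :commonName
  208:d=5  hl=2 l=   7 prim: UTF8STRING        :Insieme
"""

if __name__ == "__main__":
    issuer, subject, validity = summarize(SAMPLE)
    print("issuer:", issuer)
    print("subject:", subject)
    print("valid:", [stamp.isoformat() for stamp in validity])
```

Run against the full exhibit, this would list XX / Default City / Default Company Ltd as the issuer fields and the Insieme Networks entries as the subject, with a validity window from July 2014 to July 2024.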


Hi Chris

Yes, it looks like a cert issue then. If you run "acidiag fnvread" on one of your APICs, is the reported state of the spine "inactive"? See https://supportforums.cisco.com/document/12268081/verifying-fabric-ssl-certificates

You have to contact TAC to generate new certs.

Marcel

Venkata Naveen Chapa
Cisco Employee

Hi Chris, 

Log in to the switch as rescue-user and check the clock with "date -u" to make sure the time is correct.

Verify the installed certificate and its validity. 

openssl asn1parse < /securedata/ssl/server.crt | grep PRINTABLESTRING

openssl asn1parse < /securedata/ssl/server.crt | grep UTF8STRING
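
The clock matters because a certificate is only accepted while the current time falls inside its validity window. A minimal Python sketch of that comparison follows. It assumes "date -u" prints the usual default format (for example "Tue Mar  7 01:23:45 UTC 2017"; that sample timestamp is made up) and takes the two UTCTIME values as printed by openssl asn1parse. The function name is hypothetical.

```python
from datetime import datetime

# Assumed format of `date -u` output; adjust if the switch prints something else.
DATE_U_FORMAT = "%a %b %d %H:%M:%S UTC %Y"
# Format of ASN.1 UTCTIME values as printed by `openssl asn1parse`.
UTCTIME_FORMAT = "%y%m%d%H%M%SZ"


def clock_within_validity(date_u_output, not_before, not_after):
    """Return True if the reported UTC clock lies inside the validity window."""
    now = datetime.strptime(date_u_output.strip(), DATE_U_FORMAT)
    start = datetime.strptime(not_before, UTCTIME_FORMAT)
    end = datetime.strptime(not_after, UTCTIME_FORMAT)
    return start <= now <= end


if __name__ == "__main__":
    # Validity values copied from the exhibit earlier in the thread.
    print(clock_within_validity("Tue Mar  7 01:23:45 UTC 2017",
                                "140719230710Z", "240716230710Z"))
```

A clock that has reset to a date before the start of the window would fail this check even with an otherwise good certificate.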

Log in to the APIC and test the SSL connection to the spine:

admin@APIC-B-1:~> openssl s_client -state -connect spine202:12515

N/A for the SSL certificate in the fabric inventory GUI indicates that the certificate is corrupt or invalid. You might have to open a TAC case to get a new certificate.

Hope this helps 

Thanks 

Naveen

Thanks & Regards Venkata Naveen Chapa

gsidhu
Level 3

Hi Chris

Did you manage to get this issue resolved? I believe I have come across the same problem. I initially suspected a faulty spine switch, so I replaced it, but got exactly the same issue.

Just for clarity:

APIC discovers the Spine and this shows up in the topology after I assign it with a node number and name.

After a short while the Spine continuously reboots and no longer displays in the topology.

I have removed the Leaf connections one at a time to rule out a particular Leaf connection as the cause.

I erased the configuration from the Spine using the 'setup-clean-config.sh' command, reloaded the switch, deleted it from the Fabric and then rediscovered it.

I even replaced the Spine but I get the same issue.

admin@FMC-Apic-1:~> show version
This command is being deprecated on APIC controller, please use NXOS-style equivalent command
node type   node id  node name      version      
----------  -------  -------------  --------------
controller  1        FMC-Apic-1     2.1(1h)      
leaf        101      FMC-Leaf-101   n9000-12.0(2f)
leaf        102      FMC-Leaf-102   n9000-12.0(2f)
leaf        103      FMC-Leaf-103   n9000-12.0(2f)
spine       202      FMC-Spine-202  n9000-12.1(2e)

From what I can make out from a console session, it's trying to boot a lower image:

Booting aci-n9000-dk9.11.1.4g.bin.

Thanks

gsidhu,

For future postings, if you run into an issue like this, you may want to start a new discussion so that the forum supporters can assist you faster and we can more easily follow your questions and issues.

First off, your APIC, leaf nodes and spine are all running different versions. It is recommended that all nodes be at the same firmware level to ensure successful operation.

apic = 2.1(1h)
leaf = n9000-12.0(2f)
spine = n9000-12.1(2e)
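
A mismatch like this can be spotted by grouping the "show version" rows by release. The Python sketch below is a rough illustration only. It assumes the four-column layout of the output quoted earlier in the thread, and it assumes that n9000-12.x switch code pairs with APIC 2.x code (switch major number minus 10), as the versions in this thread suggest. The function names are hypothetical.

```python
import re
from collections import defaultdict

# One row of the table: node type, node id, node name, version.
ROW_RE = re.compile(r"^(controller|leaf|spine)\s+(\d+)\s+(\S+)\s+(\S+)$")
SWITCH_RE = re.compile(r"^n9000-(\d+)\.(.+)$")


def normalize(version):
    """Express a switch version in APIC-style numbering (assumed: major - 10)."""
    match = SWITCH_RE.match(version)
    if match:
        return f"{int(match.group(1)) - 10}.{match.group(2)}"
    return version


def group_by_release(show_version_output):
    """Return {apic_style_release: [node names]} for all table rows."""
    groups = defaultdict(list)
    for line in show_version_output.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            _role, _node_id, name, version = match.groups()
            groups[normalize(version)].append(name)
    return dict(groups)


# Rows copied from the output posted in this thread.
SAMPLE = """\
node type   node id  node name      version
----------  -------  -------------  --------------
controller  1        FMC-Apic-1     2.1(1h)
leaf        101      FMC-Leaf-101   n9000-12.0(2f)
leaf        102      FMC-Leaf-102   n9000-12.0(2f)
leaf        103      FMC-Leaf-103   n9000-12.0(2f)
spine       202      FMC-Spine-202  n9000-12.1(2e)
"""

if __name__ == "__main__":
    groups = group_by_release(SAMPLE)
    for release, nodes in sorted(groups.items()):
        print(release, nodes)
    if len(groups) > 1:
        print("Nodes are not all on the same release.")
```

More than one group in the result means the fabric is running mixed releases.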

I would suggest upgrading all the nodes to the latest version of the 2.1 release train, which is 2.1(2g). Note that 2.1(1h) is deferred. Always refer to the release notes and the upgrade matrices before performing an upgrade, and always export a backup of the configuration first.

In regards to your spine reloads and not being able to join the fabric, we would be just guessing without looking at the logs and access to the spine itself.

Plan of action:

* upgrade the APIC to 2.1(2g) or later
* upgrade the leaves and spines to 2.1(2g) or later
* make sure all APICs, leaves and spines are at the same software release
* verify the ACI fabric topology is as desired
* monitor operations to see if the spine issue returns
* if you experience the spine issue again, please open a Cisco TAC case so an ACI TAC engineer can address it

Thank you for using the ACI Cisco Support Community!

T.
