ACI Spine discovery issues

DanDan
Level 1

Hello Everyone!

I have an issue with ACI Fabric discovery.

Fabric was wiped clean and reconfigured.

After logging into the APIC GUI, I was able to register the first Leaf, but the Spine never showed up.

I can see the spine via LLDP, but when I check the interface on the Leaf that connects to the spine, it shows as out-of-service.

Most likely that is why the spine's DHCP requests never reach the APIC, so it cannot register.

Running show discoveryissues on the spine gives the following (showing only the failed checks):

Check 5 System State

================================================================================
Test01 Check System State FAILED
[Warn] Top System State is : out-of-service
[Info] Node upgrade is in notscheduled state
================================================================================

 

Check 7 BootStrap Status

================================================================================
Test01 Check Bootstrap/L3Out config download FAILED
[Warn] BootStrap/L3OutConfig URL not found
[Info] Ignore this if this node is not an IPN attached device
================================================================================

 

Check 9 DHCP Status


================================================================================
Test01 Check Node Id FAILED
[Error] Valid Node Id not received via DHCP response
Test02 Check Node Name FAILED
[Error] Valid Node name not revevied via DHCP
Test03 Check TEP IP FAILED
[Error] Valid TEP IP not revevied via DHCP
Test04 Check Configured Node Role FAILED
[Error] Valid Node Role not received via DHCP response
Test05 DHCP Msg Stats FAILED
[Info] Total DHCP discover sent by switch : 626

[Error] Cannot retrive DHCP offer stats
[Error] Cannot retrive DHCP request stats
[Error] Cannot retrive DHCP ACK stats
[Fatal-Error] Please check DHCP issues...Aborting command execution

All the other checks have PASSED.
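
For anyone who wants to pull just the failures out of a long show discoveryissues capture, here is a minimal Python sketch. It assumes the output keeps the layout pasted above ("Check N title" headers, "TestNN name PASSED/FAILED" lines, and "[Level] message" lines). The function name failed_tests and the SAMPLE text are my own illustration, not anything shipped with ACI.

```python
import re

# Patterns inferred from the output pasted in this thread (an assumption,
# not a documented format).
CHECK_RE = re.compile(r"^Check\s+(\d+)\s+(.+?)\s*$")
TEST_RE = re.compile(r"^(Test\d+)\s+(.+?)\s+(PASSED|FAILED)\s*$")
MSG_RE = re.compile(r"^\[([A-Za-z-]+)\]\s*(.*)$")

# Shortened excerpt of the output above, used only as demo input.
SAMPLE = """\
Check 5 System State
================================================================
Test01 Check System State FAILED
[Warn] Top System State is : out-of-service
Check 9 DHCP Status
================================================================
Test01 Check Node Id FAILED
[Error] Valid Node Id not received via DHCP response
Test05 DHCP Msg Stats FAILED
[Info] Total DHCP discover sent by switch : 626
"""


def failed_tests(text):
    """Return one dict per FAILED test, with the messages printed under it."""
    results = []
    check = None      # (number, title) of the check section being read
    current = None    # the failed test that following messages belong to
    for raw in text.splitlines():
        line = raw.strip()
        if not line or set(line) == {"="}:
            continue  # skip blank lines and ===== separators
        m = CHECK_RE.match(line)
        if m:
            check = (int(m.group(1)), m.group(2))
            current = None
            continue
        m = TEST_RE.match(line)
        if m:
            current = None
            if m.group(3) == "FAILED":
                current = {"check": check, "test": m.group(1),
                           "name": m.group(2), "messages": []}
                results.append(current)
            continue
        m = MSG_RE.match(line)
        if m and current is not None:
            current["messages"].append((m.group(1), m.group(2)))
    return results


if __name__ == "__main__":
    for item in failed_tests(SAMPLE):
        print(item["check"], item["test"], item["name"])
        for level, message in item["messages"]:
            print("   ", level, "-", message)
```

Messages that follow a PASSED test are dropped on purpose, so the result only contains what needs attention.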

 

Has anyone experienced similar issues?

 

Thank you in advance!

11 Replies

Robert Burns
Cisco Employee

Is this a Multi-Pod fabric or a single pod?
Which version of ACI?
Which hardware platforms are involved?

Robert

Hi Robert,

It is a single pod (lab environment), running the latest release, 6.0(1j).

2 x Leaf N9K-C9348GC-FXP

1 x Spine N9K-C9332C

Robert Burns
Cisco Employee

Since this is a lab, I would wipe everything once more and restart setup. I've seen issues when devices aren't properly wiped, or when the timing of the wipe and reboot is off between the switches and the controllers. Tip: when you wipe the switches, don't reboot them right away. After the script finishes, leave them in a 'ready-to-reboot' state, then wipe the APIC (touch clean/setup). When ready, reboot them all together. If you still have the issue after this, let us know and we'll dig further.
Btw, between the previous configuration and the latest one, did you change any of the initial config parameters, i.e. TEP pool range, subnets, infra VLAN, etc.?

Robert

Thanks Robert.

I have wiped the Fabric maybe 4-5 times at this point.

I believed that the issue had something to do with an improper wipe of the Fabric.

On all the switches I ran: setup-clean-config.sh

Then on the APIC: acidiag touch clean, acidiag touch setup, acidiag reboot. I gave it a good 10 minutes and then reloaded all the switches at the same time.

The very first time, the switch rebooted so quickly that it grabbed the config from the APIC and reconfigured itself.

So the next time I made sure there was enough time in between.

As far as configuration goes, it is exactly the same as the previous config.

Thank you.

 

Robert Burns
Cisco Employee

Something is stale in the config. The procedure you're following sounds good. Try to kick off the switch reboots only once the APIC has completely finished shutting down and the POST screen comes up on the APIC console. If the switches bounce before the APIC restarts, they will pull stale config, as you've seen. Alternatively, you could disconnect all the switches from the APIC, wipe each of them individually, then once all are cleaned and rebooted, reconnect them and proceed with the APIC initial setup.

Robert

Thank you Robert.

I will go ahead and do that right now.

I will keep you posted.

 

Hello again,

 

I did exactly as you said.

I first wiped and reloaded the APIC, logged into the KVM in CIMC, and when I saw the setup script, that's when I reloaded the switches.

Once again I was able to register the Leaf, but the spine is nowhere to be seen.

 

Also, just to mention, it was working just fine before. I only wanted to go through the process to play with it.

And at this point I have been playing for a couple of days lol :).

 

 

Hi Robert,

 

I resolved the problem. Fabric is now fully discovered and registered.

The problem was that the Leaf port connecting to the Spine became a downlink after I wiped the Leaf and rebooted it. I was not aware of it at first.

 

So I converted the interface back to an uplink, reloaded, and from that point everything worked just fine.

(Lesson learned: always keep the role of the ports in mind.)
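
For what it's worth, the check I should have done first can be sketched in a few lines of Python: compare what each Leaf port is cabled to (from LLDP) against the role the port currently has. This is only an illustration with hand-entered data. The LeafPort class, the misrole_ports function, and the port names are made up for the example and do not come from any Cisco API or from my actual cabling.

```python
from dataclasses import dataclass


@dataclass
class LeafPort:
    name: str                # port label, hypothetical values below
    role: str                # "uplink" or "downlink", noted by hand
    lldp_neighbor_role: str  # "spine", "apic", "host", or "" if nothing seen


def misrole_ports(ports):
    """Return the ports that face a spine but are not in the uplink role."""
    return [p for p in ports
            if p.lldp_neighbor_role == "spine" and p.role != "uplink"]


if __name__ == "__main__":
    # Made-up inventory: one spine-facing port left as a downlink after a wipe.
    inventory = [
        LeafPort("eth1/1", "downlink", "apic"),
        LeafPort("eth1/49", "downlink", "spine"),
        LeafPort("eth1/53", "uplink", "spine"),
    ]
    for port in misrole_ports(inventory):
        print(f"{port.name}: faces a spine but is configured as {port.role}")
```

Any port this flags is one where the spine's DHCP discovers would never be relayed toward the APIC, which matches what I was seeing.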

 

Thanks for engaging!

Great to hear, and thanks for the closure. That DHCP failure usually shows up when links are moved around (which wasn't your case), but the output was bang on: a downlink can't relay DHCP from the Spine. I should have considered downlink conversion, but when you said it was working previously I didn't think to question it.


Cheers,
Robert

Great news @DanDan ,

You should now mark your own answer as correct - that way if someone else searches for the same problem, they'll see it marked as "Solved" and there will be a link to the solution at the top of the post.

RedNectar aka Chris Welsh.
Forum Tips: 1. Paste images inline - don't attach. 2. Always mark helpful and correct answers, it helps others find what they need.

Hi,

 

Thanks a lot, I have just marked it as solved.

 
