03-15-2025 06:38 AM - edited 03-17-2025 08:22 AM
I'd like some ideas on how to troubleshoot a leaf/spine upgrade: the firmware installed, but the switches then sat in "unreachable" status for about an hour before I called it a day and left. Here are the upgrade hops that I did.
ACI infrastructure just out of the box
Hardware is 93180 + 9332C
APIC firmware version 5.2(8) upgraded to 6.0.2(f) then to 6.0.8(f)
Spines were at 6.0.2(h) already so I just created an odds and evens group to get to 6.0.8(f)
Summary: 5.2.8 > 6.0.2 > 6.0.8 is all I did, starting with the APICs, and then 6.0.2(h) to 6.0.8(f) on the switches.
I installed both the 32-bit and 64-bit images. I left the "advanced" settings at their defaults, which is continue on error and no graceful insertion/removal or whatever.
I did notice that the TEP addresses are not pingable. I did not put management IPs on the switches before the upgrade (it's pretty much out of box), but I thought that the communication is handled via the infra network, aka TEP. I tested an even leaf by setting up an OOB address so NTP would work, but it also did not boot.
I did not have vPCs yet since it's out of box; however, everything is physically cabled correctly. The GitHub pre-validation article mentions that a vPC is not required for a successful upgrade. When the leafs went down, I noticed APIC1 bond0 eth2-1 (odd spine) went down and the bond is now active on eth2-2, probably because the TEP address is down; it has nothing to do with there not being a vPC.
Any ideas?
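The bond0 failover from eth2-1 to eth2-2 described above can be confirmed from the standard Linux bonding status file rather than inferred. The following is a minimal sketch, assuming the APIC shell exposes `/proc/net/bonding/bond0` in the usual Linux bonding format; the function name `parse_bond_status` and the `SAMPLE` text are illustrative and are not output captured from this fabric.

```python
from pathlib import Path

# Illustrative sample in the standard Linux bonding status format.
# The values are made up for the example, not taken from the APIC in this thread.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth2-2
MII Status: up

Slave Interface: eth2-1
MII Status: down

Slave Interface: eth2-2
MII Status: up
"""


def parse_bond_status(text):
    """Return (active_slave, {slave_name: mii_status}) from bonding status text."""
    active = None
    slaves = {}
    current = None  # name of the slave section being read, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Currently Active Slave":
            active = value
        elif key == "Slave Interface":
            current = value
            slaves[current] = "unknown"
        elif key == "MII Status" and current is not None:
            # Only record MII status lines that belong to a slave section;
            # the bond-level MII status appears before any slave and is skipped.
            slaves[current] = value
    return active, slaves


if __name__ == "__main__":
    path = Path("/proc/net/bonding/bond0")
    # Fall back to the sample when the file is not present on this machine.
    text = path.read_text() if path.exists() else SAMPLE
    active, slaves = parse_bond_status(text)
    print("active slave:", active)
    for name, status in slaves.items():
        print(f"{name}: {status}")
```

If the script cannot be run on the APIC itself, the output of `cat /proc/net/bonding/bond0` can be pasted into `SAMPLE` and parsed on any machine with Python 3.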
03-16-2025 08:14 PM
Hey @KVS7
Please check the points below as an initial troubleshooting step:
03-17-2025 06:34 AM - edited 03-17-2025 08:24 AM
Thanks Ash, I will try a power cycle, but that's about all I can do. It's out of box, so the 3-node APIC cluster was healthy (fully fit), meaning it has an infra VLAN.
The switch is in a different country so I can't access it until our Raritan orders come in. Any other ideas? I'll get back after power cycle.
03-17-2025 03:12 PM
Hey @KVS7
Until your Raritan orders come in (I am not fully clear what this order means), you can do the activities below:
1. Review APIC Configuration (Remotely):
Go to Fabric > Inventory > Pod 1 > Fabric Membership. Verify that the leaf switch is listed. If it's listed but showing as "unreachable," check the "Last Update" time. Is it recent, or does it indicate that the APIC hasn't heard from the leaf in a while?
2. Check APIC Health (Remotely):
Open the Faults tab. Filter by severity (critical, major, minor) and look for any faults related to the leaf switch or the infrastructure VLAN.
03-17-2025 11:25 PM
Thanks Ash. Yes, all that is good. It's a brand new build out of the box. Leafs were discovered automatically and registered, everything was fully fit, and software was installed via the correct upgrade path. Everything was fine. Management IPs and TEPs are installed and pingable. However, when the leafs reboot, they take down the APIC eth2-1 or eth2-2 ports just from a reboot, so I feel like the issue is something else and not the upgrade.
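The remote checks suggested earlier in the thread (Fabric Membership state and faults) can also be scripted against the APIC REST API when the GUI is awkward to reach. The following is a minimal sketch, assuming the JSON has already been retrieved from an authenticated session with `GET /api/node/class/fabricNode.json` and `GET /api/node/class/faultInst.json`. The helper names and the sample payloads, including the fault code `F0000`, are hypothetical. The sketch also assumes that only leaf and spine entries are worth flagging, since controller entries may not report `fabricSt` as `active`.

```python
import json

# Hypothetical sample responses in the usual APIC "imdata" envelope.
# Node names, IDs, and the fault code are invented for illustration.
SAMPLE_NODES = json.loads("""
{"imdata": [
  {"fabricNode": {"attributes": {"id": "1", "name": "apic1", "role": "controller", "fabricSt": "unknown"}}},
  {"fabricNode": {"attributes": {"id": "101", "name": "leaf101", "role": "leaf", "fabricSt": "active"}}},
  {"fabricNode": {"attributes": {"id": "102", "name": "leaf102", "role": "leaf", "fabricSt": "inactive"}}}
]}
""")

SAMPLE_FAULTS = json.loads("""
{"imdata": [
  {"faultInst": {"attributes": {"severity": "critical", "code": "F0000", "dn": "topology/pod-1/node-102/sys/fault-F0000"}}},
  {"faultInst": {"attributes": {"severity": "minor", "code": "F0000", "dn": "topology/pod-1/node-101/sys/fault-F0000"}}}
]}
""")


def switches_not_active(payload):
    """Return (id, name, role, fabricSt) for leaf/spine nodes that are not 'active'."""
    result = []
    for item in payload.get("imdata", []):
        attrs = item.get("fabricNode", {}).get("attributes")
        if not attrs or attrs.get("role") not in ("leaf", "spine"):
            continue  # skip controllers and anything that is not a fabricNode
        if attrs.get("fabricSt") != "active":
            result.append((attrs.get("id"), attrs.get("name"),
                           attrs.get("role"), attrs.get("fabricSt")))
    return result


def faults_with_severity(payload, severities=("critical", "major")):
    """Return (severity, code, dn) for faults whose severity is in `severities`."""
    result = []
    for item in payload.get("imdata", []):
        attrs = item.get("faultInst", {}).get("attributes")
        if attrs and attrs.get("severity") in severities:
            result.append((attrs.get("severity"), attrs.get("code"), attrs.get("dn")))
    return result


if __name__ == "__main__":
    for node in switches_not_active(SAMPLE_NODES):
        print("not active:", node)
    for fault in faults_with_severity(SAMPLE_FAULTS):
        print("fault:", fault)
```

Running the same two queries before and after a leaf reboot would show whether the switch ever returns to `active` or stays out of the fabric, which helps separate an upgrade problem from a reboot problem.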
03-18-2025 10:02 PM
Hello @KVS7 ,
I am not sure if I understood the current status of your fabric. You wrote earlier that the switches were unreachable after the upgrade. Then you wrote that everything is discovered and registered.
Does the question relate to the unregistered state or to the leaf reboot impacting the APIC reachability?
03-18-2025 11:20 PM - edited 03-19-2025 01:24 PM
We saw a bug report showing that reboots after upgrades send the switches into ROMMON mode, but we're still testing. We're also testing on our other network, which mirrors the bad network, and noticed the same issues there.