01-27-2023 06:25 AM
Looking through this bug, it matches the issue I am experiencing. However, I do not use the activate option because I activate the software later during a maintenance window, so the workaround does not apply in my case.
01-27-2023 03:28 PM
How many switches are you trying to upgrade?
01-27-2023 07:34 PM
a few hundred stacks
01-27-2023 07:40 PM
I do have a TAC case open regarding this. The engineer recommended not using the activate option. I explained that I already have this option disabled. His next recommendation was to select "convert to bundle mode" under the activation options dropdown but keep the activate checkbox unchecked. This is apparently another workaround.
I did try this and managed to get code pushed to a handful of switches tonight. However, I am running into some that complain about SSH connectivity, even though I know their AAA/TACACS config is exactly the same as on the others: both were pushed via a template a while back when we went from ACS to ISE. The error is:
"Flash Validation successfully completed.Loading Image File to Device : Copying image cat3k_caa-universalk9.16.12.07.SPA.bin to the flash flash: failed. Error occured while communicating with the device. Check device credentials and SSH/telnet reachability.."
01-27-2023 07:47 PM
@Noclss2000 wrote:
a few hundred stacks
Not a lot. I have about the same numbers but I never use PI nor DNA to upgrade the firmware of any stack.
Read the link I've provided. With my method, I can push the firmware across at any time during the day and then schedule the stack to reboot at any time I want.
With my method, I have upgraded several hundreds of stacks, several times, without losing a single switch member or a stack.
01-28-2023 08:50 AM
Yeah, I've learned not to activate using PI. I typically use it as the SCP server to push the software out to the switches I'm upgrading, and then I manually control which switches are upgraded and when. I've been burnt by PI doing the activate and bricking the switch stack, resulting in me driving in at 4am to rebuild a stack.
01-27-2023 07:52 PM
@Noclss2000 wrote:
Copying image cat3k_caa-universalk9.16.12.07.SPA.bin to the flash flash: failed. Error occured while communicating with the device.
Remote into this stack and manually copy the file, from a TFTP/HTTP server, into this switch. Does this error message come up?
01-28-2023 08:52 AM
It does not. I'm half wondering if these few that failed are just a weird quirk I've seen with PI and some switches: I will get a collection failure error due to SNMP or SSH connectivity, yet a connectivity test shows it is fine. If I remove and re-add the device with the same credential profile, it then works fine and the errors go away. I may try re-adding one of these 3 switches that failed and see if it then starts working.
01-28-2023 03:35 PM
@Noclss2000 wrote:
I've been burnt by PI doing the activate and bricking the switch stack, resulting in me driving in at 4am to rebuild a stack.
And don't you worry -- DNAC is bound to do the same thing.
01-29-2023 09:00 AM
Lol, lovely. I'm holding onto Prime kicking and screaming because there are features missing and a lot of my routers and VGs are no longer supported (I'm in a hospital environment so we don't get to upgrade as fast).
I did try removing the switch from Prime, re-adding it, and then pushing the software to it again. It still fails with the same error. I might toss that at TAC to see what they can make of it.
01-30-2023 07:58 AM
So it appears we found why these few switches were failing. Apparently some switches were put into production missing the 'ip scp server enable' command that is in our standard config template. I just enabled that on these 3 switches and code is now pushing to them via Prime fine.
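For anyone who wants to catch this before a push fails, here is a minimal sketch of an offline check. It assumes you already have each switch's running config saved as text (for example from a config archive). The function names and hostnames are made up for illustration, and none of this is part of Prime.

```python
def has_scp_server(running_config: str) -> bool:
    """Return True if the config contains the exact line 'ip scp server enable'."""
    for line in running_config.splitlines():
        # Exact match after trimming whitespace, so the negated form
        # 'no ip scp server enable' is not counted as enabled.
        if line.strip() == "ip scp server enable":
            return True
    return False


def switches_missing_scp(configs: dict[str, str]) -> list[str]:
    """Return the sorted names of switches whose config lacks the command."""
    return sorted(name for name, cfg in configs.items() if not has_scp_server(cfg))


if __name__ == "__main__":
    # Hypothetical hostnames and trimmed-down configs, for illustration only.
    sample = {
        "sw-floor1": "hostname sw-floor1\nip scp server enable\n",
        "sw-floor2": "hostname sw-floor2\nip ssh version 2\n",
    }
    print(switches_missing_scp(sample))
```

Any switch the check flags would then need 'ip scp server enable' added in global config, which is what fixed it here.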
So, I guess we are good now. To wrap up what I was running into (because I hate seeing these threads not state what the fix was):
Prime was failing to push code to 3850 switches due to the free space check: I had roughly 2.1x the image size in free space when I needed 2.2x. TAC stated this was a known bug (per title) and recommended changing the dropdown under activation options to 'install using bundle mode' but leaving the activation checkbox unchecked. This workaround allowed me to bypass the free space check. The few switches that were still failing were missing the 'ip scp server enable' command, and adding it fixed them.
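To make the 2.1x versus 2.2x numbers concrete, here is a rough sketch of that comparison. The 2.2 factor is the one mentioned above. I am assuming it is measured against the image size; the byte counts are invented, and this is not necessarily how Prime computes it internally.

```python
def free_space_ratio(free_bytes: int, image_bytes: int) -> float:
    """Return free flash space as a multiple of the image size."""
    return free_bytes / image_bytes


def passes_free_space_check(free_bytes: int, image_bytes: int, factor: float = 2.2) -> bool:
    """Return True if free flash is at least `factor` times the image size."""
    return free_bytes >= factor * image_bytes


if __name__ == "__main__":
    image = 500_000_000   # hypothetical image size in bytes
    free = 1_050_000_000  # hypothetical free flash, i.e. 2.1x the image
    print(free_space_ratio(free, image))
    print(passes_free_space_check(free, image))
```

With these made-up numbers the switch has room for the image twice over but still falls short of the 2.2x threshold, which is the situation I was in.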