08-14-2024 08:48 AM
This may be more of a general router product or platform question than one specific to SD-Routing/SD-WAN. Let me know if there is a more suitable sub-community to post in!
OK, so I'm testing SD-Routing using a Manager and Validator running 20.14.1. I have two physical 1100-series edge routers, one C1111-4P and one C1111-4PLTEEA. I was able to upgrade the -4P, but not the -4PLTEEA. The cause of the error is clear: there is not enough space available on the flash.
[13-Aug-2024 15:36:13 UTC] Software Install action submitted for execution
[13-Aug-2024 15:36:18 UTC] Executing device action Software Install
[13-Aug-2024 15:36:18 UTC] Installing and activating software image
[13-Aug-2024 15:36:22 UTC] Current active partition: 17.13.01a.0.1297
[13-Aug-2024 15:36:22 UTC] Upgrade Requested for SW version : 17.14.01a.0.1470.1714097672
[13-Aug-2024 15:36:26 UTC] Configuring upgrade confirm timer to 15 minutes
[13-Aug-2024 15:36:26 UTC] Software Image c1100-universalk9.17.14.01a.SPA.bin
[13-Aug-2024 15:36:26 UTC] Sending requested upgrade action to the device
[13-Aug-2024 15:36:26 UTC] Software image download once started may take upto 60 minutes
[13-Aug-2024 15:36:31 UTC] [in_progress] Started Pre-Upgrade Check: Pre-Upgrade Checks for 'Download and Upgrade' workflow
[13-Aug-2024 15:36:50 UTC] [success] Image Validity Check: Image is compatible with the platform
[13-Aug-2024 15:36:50 UTC] [success] Image compatibility with controller: Image controller-version 20.14 is compatible with vManage version 20.14
[13-Aug-2024 15:36:50 UTC] [failure] Disk Space Check: Not enough Disk space for download and expansion. Required space: 1576738088 Available space: 1480138752
[13-Aug-2024 15:36:50 UTC] Remediation Disk Space Check: Free up some disk space by removing extra files/images and try again
[13-Aug-2024 15:36:50 UTC] [success] System Load Check: System has healthy CPU levels
[13-Aug-2024 15:36:50 UTC] [success] Memory Usage Check: System has healthy Memory levels
[13-Aug-2024 15:36:50 UTC] [success] Config-register Check: Autoboot is enabled
[13-Aug-2024 15:36:50 UTC] [success] Control-connection status Check: Control-connection to vManage is UP
[13-Aug-2024 15:36:50 UTC] [failure] Finished Pre-Upgrade Check: Aborting requested workflow due to failures
[13-Aug-2024 15:36:50 UTC] Pre upgrade check failed.
The real question is: what can I delete from the flash without breaking something?
There is no image on the flash other than the currently running version, 17.13.1a, which I would like to upgrade to 17.14.1a using the software upgrade in the Manager GUI. Also, the Manager GUI does not give me any option to remove files (I guess because it, correctly, does not find any old images lingering on the flash). So now I'm connecting directly to the device to figure out what can be removed from the flash.
This is the content of the flash root folder:
Directory of bootflash:/
114017 drwx 4096 Aug 14 2024 13:42:50 +02:00 tracelogs
89585 drwx 4096 Aug 13 2024 10:47:30 +02:00 core
17 -rw- 30 Apr 23 2024 11:06:22 +02:00 throughput_monitor_params
97729 drwx 4096 Apr 23 2024 11:05:49 +02:00 .prst_sync
16289 drwx 4096 Apr 19 2024 09:01:14 +02:00 dbgd
146595 drwx 4096 Apr 19 2024 08:54:26 +02:00 .dbpersist
25 -rw- 711313200 Apr 19 2024 08:44:35 +02:00 c1100-universalk9.17.13.01a.SPA.bin
122163 drwx 4096 Apr 17 2024 12:57:39 +02:00 ctrl_mng
122161 drwx 4096 Apr 16 2024 16:33:53 +02:00 SHARED-IOX
122162 drwx 4096 Apr 16 2024 16:33:53 +02:00 pcap
73298 drwx 4096 Apr 16 2024 13:46:51 +02:00 pnp-tech
24 -rw- 215 Apr 16 2024 13:46:00 +02:00 .iox_dir_list
81441 drwx 4096 Feb 14 2024 15:49:47 +01:00 sdwan
22 drwx 4096 Feb 14 2024 15:48:38 +01:00 lost+found
65153 drwx 4096 Feb 8 2024 23:25:45 +01:00 syslog
21 -rw- 656 May 26 2023 13:52:10 +02:00 vlan.dat
171026 drwx 4096 May 26 2023 13:34:29 +02:00 vmanage-admin
154739 drwx 4096 Apr 26 2023 11:53:17 +02:00 pnp-info
18 -rw- 15971 Apr 26 2023 11:53:03 +02:00 server.crt
19 -rw- 107 Apr 21 2023 11:48:36 +02:00 pki_certificates
20 -rw- 6343 Apr 19 2023 14:12:45 +02:00 original-xe-config
73302 drwx 4096 Apr 19 2023 11:02:40 +02:00 .cdb_backup
105879 -rw- 6974 Apr 19 2023 11:01:22 +02:00 packages.conf
105877 drwx 4096 Apr 17 2023 18:43:06 +02:00 .sdwaninternal
97731 drwx 4096 Apr 17 2023 18:20:54 +02:00 .keys
105874 drwx 4096 Apr 17 2023 18:20:51 +02:00 fw_upgrade_sysinfo
105873 drwx 4096 Apr 17 2023 18:18:44 +02:00 iox_host_data_share
154738 drwx 4096 Apr 17 2023 18:18:38 +02:00 .attrib
89587 drwx 4096 Apr 17 2023 18:18:23 +02:00 guest-share
162883 drwx 4096 Apr 17 2023 18:18:02 +02:00 onep
171025 drwx 4096 Apr 17 2023 18:17:58 +02:00 .geo
97730 drwx 4096 Apr 17 2023 18:17:21 +02:00 virtual-instance
16 -rw- 1923 Apr 17 2023 18:17:14 +02:00 trustidrootx3_ca_092024.ca
15 -rw- 20109 Apr 17 2023 18:17:14 +02:00 ios_core.p7b
73297 drwx 4096 Apr 17 2023 18:16:35 +02:00 .rollback_timer
I have removed some files, folders, and subfolders here and there, but the space recovered has so far not been enough.
Would it be safe to remove everything except the image for the currently running version (and maybe just a few other folders and/or files)?
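To see where the space is going, the `dir` output above can be parsed and sorted by size. The following is a minimal Python sketch, assuming the listing has been saved as text in the format shown in this post; the function names `parse_dir` and `largest_files` are made up for illustration. Note that directories show up as 4096 bytes in the listing regardless of what they contain, so each directory has to be listed separately to find large files inside it.

```python
import re

# One line of the `dir bootflash:` output as pasted above, e.g.
# "25 -rw- 711313200 Apr 19 2024 08:44:35 +02:00 c1100-universalk9.17.13.01a.SPA.bin"
DIR_LINE = re.compile(
    r"^\s*(?P<inode>\d+)\s+(?P<perms>[d-][rwx-]{3})\s+(?P<size>\d+)\s+"
    r"\w{3}\s+\d{1,2}\s+\d{4}\s+\d{2}:\d{2}:\d{2}\s+[+-]\d{2}:\d{2}\s+"
    r"(?P<name>\S+)\s*$"
)


def parse_dir(text):
    """Return one dict per recognised listing line; other lines are skipped."""
    entries = []
    for line in text.splitlines():
        m = DIR_LINE.match(line)
        if m:
            entries.append({
                "name": m["name"],
                "size": int(m["size"]),
                "is_dir": m["perms"].startswith("d"),
            })
    return entries


def largest_files(entries, n=5):
    """Return the n largest regular files (directories are excluded)."""
    files = [e for e in entries if not e["is_dir"]]
    return sorted(files, key=lambda e: e["size"], reverse=True)[:n]


# A few lines copied from the listing in this post.
SAMPLE = """\
114017 drwx 4096 Aug 14 2024 13:42:50 +02:00 tracelogs
89585 drwx 4096 Aug 13 2024 10:47:30 +02:00 core
25 -rw- 711313200 Apr 19 2024 08:44:35 +02:00 c1100-universalk9.17.13.01a.SPA.bin
18 -rw- 15971 Apr 26 2023 11:53:03 +02:00 server.crt
"""

if __name__ == "__main__":
    for entry in largest_files(parse_dir(SAMPLE)):
        print(f"{entry['size']:>12}  {entry['name']}")
```

Run against the full root listing, this confirms that the running image is the only large file at the top level, so any other space worth reclaiming must sit inside the directories.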
08-16-2024 07:57 AM - edited 08-16-2024 07:59 AM
After some investigation I found some relatively large files in the /tracelogs/ directory that I suspected were not important for the system to run well.
After I removed them, I had just enough available space for the upgrade process to proceed, and this time it was successful!
This was just one router in the lab. However, the company I work for has several thousand routers in production that are in scope for SD-Routing/SD-WAN. If even a fraction of them encounter this issue, it would be a pain in the "back" to investigate each router to see which files, if any, can be removed.
Is there a more scalable way to solve this, or to make sure this situation does not happen in the first place?
I'm really starting to like SD-Routing mode; it is getting more and more useful and powerful in cases where going to full SD-WAN is not yet feasible. I'm looking forward to seeing what's next!
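One way to at least triage this at scale is to scan the pre-upgrade check output from many devices and list only those that failed the Disk Space Check, with the number of bytes each is short. The Python sketch below assumes the task logs have already been exported as text, one per device, in the format shown in the first post. How the logs are collected is left open, and the function names and hostnames are hypothetical.

```python
import re

# Matches the failure line from the pre-upgrade check, e.g.
# "[failure] Disk Space Check: ... Required space: 1576738088 Available space: 1480138752"
DISK_FAIL = re.compile(
    r"\[failure\] Disk Space Check:.*?"
    r"Required space:\s*(?P<required>\d+)\s+Available space:\s*(?P<available>\d+)"
)


def disk_shortfall(log_text):
    """Return the bytes missing for the upgrade, or None if the check did not fail."""
    m = DISK_FAIL.search(log_text)
    if m is None:
        return None
    return int(m["required"]) - int(m["available"])


def devices_needing_cleanup(logs_by_device):
    """Map device name -> shortfall in bytes, largest shortfall first."""
    short = {}
    for device, log_text in logs_by_device.items():
        missing = disk_shortfall(log_text)
        if missing is not None and missing > 0:
            short[device] = missing
    return dict(sorted(short.items(), key=lambda kv: kv[1], reverse=True))


if __name__ == "__main__":
    # Hostnames are invented; the failure line is the one from this thread.
    logs = {
        "lab-c1111-4plteea": (
            "[13-Aug-2024 15:36:50 UTC] [failure] Disk Space Check: Not enough "
            "Disk space for download and expansion. Required space: 1576738088 "
            "Available space: 1480138752"
        ),
        "lab-c1111-4p": (
            "[13-Aug-2024 15:36:50 UTC] [success] System Load Check: "
            "System has healthy CPU levels"
        ),
    }
    for device, missing in devices_needing_cleanup(logs).items():
        print(f"{device}: short by {missing} bytes ({missing / 2**20:.1f} MiB)")
```

This only identifies the affected routers and how much space each needs; deciding which files are safe to remove is still a separate question.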
08-16-2024 08:49 AM
BTW, I cannot say whether the same holds true for SD-WAN IOS updates, but usually you can also remove the running IOS image from flash to make space for an upgrade. (Of course, you hope the device doesn't reload during the upgrade process and that the newly installed software boots successfully.)
I also recall (???), again for non-SD-WAN software, that the installer might have an option to delete the currently running IOS.
08-19-2024 08:26 AM
@Joseph W. Doherty
I was thinking about removing the image of the currently running version but, as you point out, I think it is too risky. I bet a couple of routers would be lost until someone could go to the site and fix the issue. This was just in my lab, but in the production environment our routers are at our customers' sites, so it is not very easy to get there and fix the issue.
It is possible to remove unused images from devices directly in Catalyst Manager, which is nice! However, since there were no unused images taking up space in this case, that was not an option.
Something that just came to mind is that this available-disk-space check could probably be optimized a bit. The required available space (about 1.5 GB) is roughly twice the actual image size (about 700 MB).
[13-Aug-2024 15:36:50 UTC] [failure] Disk Space Check: Not enough Disk space for download and expansion. Required space: 1576738088 Available space: 1480138752
Also, it would be good to make sure there is always enough space on the flash for two images, like a reservation that cannot be used for anything else, at least not by default.
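The numbers behind the "about twice" estimate can be checked with a few lines of Python. This is only a sketch of the arithmetic: it uses the 17.13.01a image size from the directory listing as a stand-in, since the size of the 17.14.01a image is not shown in this thread, and the function name is made up.

```python
def space_summary(required, available, image_size):
    """Compare the pre-upgrade check's numbers against the image size."""
    return {
        # How many image sizes the check asks for.
        "required_to_image_ratio": required / image_size,
        # Space a plain "two images" reservation would need.
        "two_images": 2 * image_size,
        # Whether the flash already had room for two images of this size.
        "fits_two_images": available >= 2 * image_size,
    }


if __name__ == "__main__":
    summary = space_summary(
        required=1_576_738_088,    # "Required space" from the log
        available=1_480_138_752,   # "Available space" from the log
        image_size=711_313_200,    # c1100-universalk9.17.13.01a.SPA.bin (stand-in)
    )
    print(f"required / image: {summary['required_to_image_ratio']:.2f}")
    print(f"two images:       {summary['two_images']} bytes")
    print(f"fits two images:  {summary['fits_two_images']}")
```

With these figures the check asks for roughly 2.2 times the image size, and the flash already had room for two images of that size when the check failed, so a reservation of exactly two images would not by itself have satisfied the check.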