06-04-2024 09:28 AM
I've got a mix of 9120 and 9130 APs. We run EWC controllers on a handful of them. I think they would all work fine with all using EWC images, but I thought that it may be best to only allow the EWC functionality on a small subset of APs. All are running in Flex mode. I have one AP that I use at my home office (remote worker) that is running in flex mode as well. Whenever I do IOS upgrades at the HQ, all the upgrades complete fine, but I have issues with my AP at home. Connected via VPN of course. I understand that with the mix of 9120/9130 APs that I need an external tftp server, and I do have one. It works well, and is accessible to all of the APs.
My AP at home also works well, and runs the current IOS software fine. I recently had it replaced on RMA because I managed to brick it trying to get this image on the unit with brute force. When I go to the Administraton - Software Management page on my EWC, it shows all of the other APs have these versions:
Primary Image (AP/Controller)
17.9.5.47 /17.09.05.0.6450
Backup Image (AP/Controller)
17.9.4.27 /17.09.04a.0.6
I have 40 APs total, and maybe 5 of them are running EWC images. The rest have been converted to CAPWAP. Mine at home also seemed to work fine when running an EWC image but this issue appears after converting to CAPWAP. I was worried that it would not be good to leave a remote AP that should never be the controller, as an EWC because I didn't want it to end up ever accidentally becoming the controller for the HQ network.
When I look at this AP in my list of software, it shows this:
Primary Image (AP/Controller)
17.9.5.47 /
Backup Image (AP/Controller)
17.9.4.27 /
Note that the controller image is missing completely.
When I boot up this AP, it boots up, connects and registers and works fine, but, it then tries to download an image that I assume is the controller images, and it never succeeds. I'll paste below some of the relevant log lines to see if that helps make sense. In this case, I see it join the controller, which is 10.111.50.10. Then, for some reason, it selects the lowest IP address AP and calls it the "primary server", and sets that to 10.111.50.20. Not sure why it doesn't point it to the tftp server?
[*06/04/2024 15:08:18.5151] OOBImageDnld: OOB Image Download in ap_cap_bitmask(2)
[*06/04/2024 15:08:18.5161] Sending Join request to 10.111.50.10 through port 5259, packet size 1376
[*06/04/2024 15:08:23.2671] OOBImageDnld: OOB Image Download in ap_cap_bitmask(2)
[*06/04/2024 15:08:23.2671] Sending Join request to 10.111.50.10 through port 5259, packet size 896
[*06/04/2024 15:08:23.2911] Join Response from 10.111.50.10, packet size 917
[*06/04/2024 15:08:23.2911] AC accepted previous sent request with result code: 0
[*06/04/2024 15:08:23.7731] Received wlcType 2, timer 30
[*06/04/2024 15:08:23.7771] Primary-Replica: role 3, primary server '10.111.50.20', old 0
[*06/04/2024 15:08:23.7861] OOBImageDnld: Do common error handler for OOB image download..
[*06/04/2024 15:08:23.8681] status 'upgrade.sh: Script called with args:[CANCEL]'
[*06/04/2024 15:08:23.9111] do CANCEL, part2 is active part
[*06/04/2024 15:08:23.9351] status 'upgrade.sh: Cleanup tmp files ...'
[*06/04/2024 15:08:23.9601] Primary-Replica: Set to Replica, retry 44, set_tftp_server 0, Primary server 10.111.50.20, Join role Replica, tftp_timeoutCnt 3
[*06/04/2024 15:08:23.9661] Image pre-download request for version 17.9.5.47.
[*06/04/2024 15:08:24.0061] status 'upgrade.sh: Script called with args:[NO_UPGRADE]'
[*06/04/2024 15:08:24.0491] do NO_UPGRADE, part2 is active part
[*06/04/2024 15:08:24.0541] Resetting replica AP upgrade mode to local mode
It runs a script and sees that the primary image needs no upgrade, and I see this:
[*06/04/2024 15:08:24.1141]
[*06/04/2024 15:08:24.1141] CAPWAP State: Image Data
[*06/04/2024 15:08:24.1151] AP image version 17.9.5.47 backup 17.9.4.27, Controller 17.9.5.47
[*06/04/2024 15:08:24.1151] Version is the same, do not need update.
[*06/04/2024 15:08:24.1551] status 'upgrade.sh: Script called with arg
Then it joins the controller:
[*06/04/2024 15:08:24.6041] CAPWAP State: Run
[*06/04/2024 15:08:24.6581] AP has joined controller MY-EWC-C9800-1
[*06/04/2024 15:08:24.6581] wtpHandleImageSwapEwlcmeTimer: time=0
[*06/04/2024 15:08:24.6601] Flexconnect Switching to Connected Mode!
[*06/04/2024 15:08:24.9321] Previous AP mode is 2, change to 2
Now it does some additional comparing of EWC images and starts to try to download it. Again, from 10.111.50.20, which is not the tftp server:
[*06/04/2024 15:08:25.1961] Recvd Image identifier Payload Image type is 1
[*06/04/2024 15:08:25.1961]
[*06/04/2024 15:08:25.1961] Recvd Image identifier Payload version len is 15
[*06/04/2024 15:08:25.1961]
[*06/04/2024 15:08:25.1961] Recvd Image identifier Payload Image version is [17.09.05.0.6450]
[*06/04/2024 15:08:25.1961]
[*06/04/2024 15:08:25.2011] EWLC image primary len=4 IP=10.111.50.20
[*06/04/2024 15:08:25.2011] Controller eWLC image version len: 15 val[17.09.05.0.6450]
[*06/04/2024 15:08:25.2411] status 'upgrade.sh: Script called with args:[COMPARE_MEWLC_VERSION 17.09.05.0.6450]'
[*06/04/2024 15:08:25.2861] do COMPARE_MEWLC_VERSION, part2 is active part
[*06/04/2024 15:08:25.3121] status 'upgrade.sh: MEWLC version: Ctrl 17.09.05.0.6450, AP , src='
[*06/04/2024 15:08:25.3171] eWLC image_download deleted /part1/wlc/shared/me_controller_image_upgrade /part2/mewlc_info.ver
[*06/04/2024 15:08:25.3171] Downloading EWC image from primary 10.111.50.20
[*06/04/2024 15:08:25.3171] EWC capable
[*06/04/2024 15:08:25.3481] /usr/bin/cli_scripts/image_download.sh called with args:[-c]
[*06/04/2024 15:08:25.3491] Cleanup eWLC files ...
[*06/04/2024 15:08:25.4211] Got WSA Server config TLVs
[*06/04/2024 15:08:27.6563]
[*06/04/2024 15:08:27.6563] *** Unable to connect to: 127.0.0.1:4040 - [Errno 111] Connection refused
[*06/04/2024 15:08:27.6563]
[*06/04/2024 15:08:30.1303]
[*06/04/2024 15:08:30.1303] *** Unable to connect to: 127.0.0.1:4040 - [Errno 111] Connection refused
[*06/04/2024 15:08:30.1303]
[*06/04/2024 15:08:30.3573] Efficient join download did not start
[*06/04/2024 15:08:30.3573] Efficient join download did not start
[*06/04/2024 15:08:30.3573] Efficient join download did not start
[*06/04/2024 15:08:30.3573] Efficient join download did not start
[*06/04/2024 15:08:30.3573] Efficient join download did not start
[*06/04/2024 15:08:32.1803]
[*06/04/2024 15:08:32.1803] *** Unable to connect to: 127.0.0.1:4040 - [Errno 111] Connection refused
[*06/04/2024 15:08:32.1803]
[*06/04/2024 15:08:32.3453] Re-Tx Count=1, Max Re-Tx Value=5, SendSeqNum=17, NumofPendingMsgs=12
[*06/04/2024 15:08:32.3453]
[*06/04/2024 15:08:32.4043] AP tag change to MY-REMOTE-PT
[*06/04/2024 15:08:32.4043] Recvd FQDN Config Payload
[*06/04/2024 15:08:32.4043]
[*06/04/2024 15:08:35.3573] Efficient join download did not start
[*06/04/2024 15:08:40.3573] Efficient join download did not start
[*06/04/2024 15:08:45.3573] Efficient join download did not start
[*06/04/2024 15:08:50.3573] Efficient join download did not start
[*06/04/2024 15:08:53.8933] set cleanair [slot0][band0] enabled
[*06/04/2024 15:08:53.9093] set cleanair [slot0][band1] enabled
[*06/04/2024 15:08:53.9413] set cleanair [slot1][band1] enabled
[*06/04/2024 15:08:55.3573] Efficient join download did not start
I see all the notes for "Efficient join download did not start". Not sure why/what that's about.
Then it actually starts to download the image:
[*06/04/2024 15:08:55.4343] % Total % Received % Xferd Average Speed Time Time Time Current
[*06/04/2024 15:08:55.4343] Dload Upload Total Spent Left Speed
36 288M 36 105M 0 0 62684 0 1:20:30 0:29:25 0:51:05 0
36 288M 36 105M 0 0 62684 0 1:20:30 0:29:25 0:51:05 62684
This paste may be hard to read because of the spacing not showing up, but basically it knows theres a 288M file to download. It starts downloading it, and goes pretty slowly. It gets up to about 105M or so and then dies. Generally it dies around the same time.
Then, it starts a countdown to try again, and does the whole thing over again.
[*06/04/2024 15:38:25.3996] Error in transferring file 28
[*06/04/2024 15:38:25.3996] Failed command [tftp://10.111.50.20/mewlc.bin]
[*06/04/2024 15:38:25.4176] Retrying EWC image in 65 seconds
[*06/04/2024 15:39:27.0305] Controller eWLC image version len: 15 val[17.09.05.0.6450]
[*06/04/2024 15:39:27.0705] status 'upgrade.sh: Script called with args:[COMPARE_MEWLC_VERSION 17.09.05.0.6450]'
[*06/04/2024 15:39:27.1165] do COMPARE_MEWLC_VERSION, part2 is active part
[*06/04/2024 15:39:27.1425] status 'upgrade.sh: MEWLC version: Ctrl 17.09.05.0.6450, AP , src='
[*06/04/2024 15:39:27.1475] eWLC image_download deleted /part1/wlc/shared/me_controller_image_upgrade /part2/mewlc_info.ver
[*06/04/2024 15:39:27.1485] Downloading EWC image from primary 10.111.50.20
[*06/04/2024 15:39:27.1485] EWC capable
[*06/04/2024 15:39:27.1755] /usr/bin/cli_scripts/image_download.sh called with args:[-c]
[*06/04/2024 15:39:27.1755] Cleanup eWLC files ...
[*06/04/2024 15:39:32.2685] Efficient join download did not start
[*06/04/2024 15:39:32.2685] Efficient join download did not start
[*06/04/2024 15:39:32.2685] Efficient join download did not start
[*06/04/2024 15:39:32.2685] Efficient join download did not start
[*06/04/2024 15:39:32.2685] Efficient join download did not start
[*06/04/2024 15:39:37.2685] Efficient join download did not start
[*06/04/2024 15:39:42.2685] Efficient join download did not start
[*06/04/2024 15:39:47.2685] Efficient join download did not start
[*06/04/2024 15:39:52.2685] Efficient join download did not start
[*06/04/2024 15:39:57.2685] Efficient join download did not start
[*06/04/2024 15:39:57.3405] % Total % Received % Xferd Average Speed Time Time Time Current
[*06/04/2024 15:39:57.3405] Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:04 --:--:-- 0Efficient join download did not start
0 0 0 0 0 0 0 0 --:--:-- 0:00:09 --:--:-- 0Efficient join download did not start
0 0 0 0 0 0 0 0 --:--:-- 0:00:14 --:--:-- 0Efficient join download did not start
0 0 0 0 0 0 0 0 --:--:-- 0:00:19 --:--:-- 0Efficient join download did not start
0 0 0 0 0 0 0 0 --:--:-- 0:00:24 --:--:-- 0Efficient join download did not start
0 0 0 0 0 0 0 0 --:--:-- 0:00:29 --:--:-- 0Efficient join download did not start
0 0 0 0 0 0 0 0 --:--:-- 0:00:34 --:--:-- 0Efficient join download did not start
0 0 0 0 0 0 0 0 --:--:-- 0:00:39 --:--:-- 0Efficient join download did not start
0 0 0 0 0 0 0 0 --:--:-- 0:00:44 --:--:-- 0Efficient join download did not start
0 0 0 0 0 0 0 0 --:--:-- 0:00:49 --:--:-- 0Efficient join download did not start
0 0 0 0 0 0 0 0 --:--:-- 0:00:54 --:--:-- 0Efficient join download did not start
0 0 0 0 0 0 0 0 --:--:-- 0:00:59 --:--:-- 0Efficient join download did not start
0 0 0 0 0 0 0 0 --:--:-- 0:01:00 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:01:00 --:--:-- 0
[*06/04/2024 15:41:02.2704] Efficient join download did not start
[*06/04/2024 15:41:02.2704] Error in transferring file 28
[*06/04/2024 15:41:02.2704] Failed command [tftp://10.111.50.20/mewlc.bin]
[*06/04/2024 15:41:02.2724] Retrying EWC image in 65 seconds
I'm not sure what I can do about this. In an ideal world, I would log on to the AP or controller, and tell it to go get the EWC image off the tftp server and install it. This should work fine. The VPN generally works great for everything, and I'm connected through it all day long. It's a site-to-site hardware VPN and is real reliable. I don't know if I am just not finding a good command to use to force a single AP to download an image from the tftp server, but I'm just at a loss as to getting this AP to be fully upgraded.
The biggest issue isn't that the AP doesn't work. It's that it just continuously all day will keep retrying to download that image, never succeeding. At this point the only way I think I could surely get it upgraded is if I put together another local EWC, and temporarily join it to that one here at home, and then let it re-join the HQ controllers after that.
Any tips to help me overcome this? I'd love to hear that I can download the image direct to the AP from any tftp server, but I'm just not finding the magic combination of commands to get a local download completed.