
9130 APs failed to join controller after code upgrade 17.9.3

DATHOZ
Level 1
Several Catalyst 9130 APs failed to join the controller after the code upgrade. Here is the syslog error:

2023-06-20T07:47:59-06:00 <local7.notice> 172.28.3.50 Jun 20 07:47:59.654 MST: %CAPWAPAC_SMGR_TRACE_MESSAGE-5-AP_JOIN_DISJOIN: Chassis 1 R0/0: wncd: AP Event: AP Name: ap-0032-6-650a Mac: a488.7370.6880 Session-IP: 172.30.9.167[5248] 172.28.3.50[5246] Disjoined DTLS close alert from peer
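With 500+ APs it can help to pull the affected AP names and disjoin reasons out of the controller syslog rather than reading it line by line. The following is a minimal sketch, assuming every AP_JOIN_DISJOIN message follows the same layout as the single line quoted above; the function names and the regular expression are hypothetical and are not taken from any Cisco tool.

```python
import re
from collections import Counter

# Pattern derived only from the one sample line in this thread (assumption:
# other AP_JOIN_DISJOIN messages use the same field order).
AP_EVENT = re.compile(
    r"AP_JOIN_DISJOIN:.*?AP Name: (?P<name>\S+) Mac: (?P<mac>\S+) "
    r"Session-IP: (?P<ap_ip>[\d.]+)\[(?P<ap_port>\d+)\] "
    r"(?P<wlc_ip>[\d.]+)\[(?P<wlc_port>\d+)\] "
    r"(?P<event>\w+)\s*(?P<reason>.*)$"
)


def parse_ap_event(line):
    """Return the fields of one AP join/disjoin syslog line, or None."""
    match = AP_EVENT.search(line.strip())
    return match.groupdict() if match else None


def disjoin_counts(lines):
    """Count disjoin events per (AP name, reason) pair."""
    counts = Counter()
    for line in lines:
        event = parse_ap_event(line)
        if event and event["event"] == "Disjoined":
            counts[(event["name"], event["reason"])] += 1
    return counts


SAMPLE = (
    "2023-06-20T07:47:59-06:00 <local7.notice> 172.28.3.50 "
    "Jun 20 07:47:59.654 MST: %CAPWAPAC_SMGR_TRACE_MESSAGE-5-AP_JOIN_DISJOIN: "
    "Chassis 1 R0/0: wncd: AP Event: AP Name: ap-0032-6-650a "
    "Mac: a488.7370.6880 Session-IP: 172.30.9.167[5248] 172.28.3.50[5246] "
    "Disjoined DTLS close alert from peer"
)

if __name__ == "__main__":
    for (name, reason), count in disjoin_counts([SAMPLE]).items():
        print(name, reason, count)
```

Feeding the full syslog file through `disjoin_counts` gives a quick list of which APs keep dropping and why.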

On the AP I see the following:

[*06/29/2023 20:56:05.8684] systemd[1]: Failed to start capwapd.
[*06/29/2023 20:56:05.8684] systemd[1]: capwapd.service failed.
[*06/29/2023 20:56:05.9230]
[*06/29/2023 20:56:05.9230]
[*06/29/2023 20:56:05.9230] Critical process has crashed repeatedly, This will result in reboot of system
[*06/29/2023 20:56:05.9230] You have 20 seconds to run this to stop the reboot. Good luck
[*06/29/2023 20:56:05.9231]
[*06/29/2023 20:56:05.9231] freeze -w
[*06/29/2023 20:56:05.9231]
show capwap client config
IPC socket server not ready for capwapd. Try after few moments, Errno: 111 Msg_id: 31


AdminState : ADMIN_ENABLED(1)
Name : APCC7F.75AE.49D4
Location : default location
Primary controller name :
Secondary controller name :
Tertiary controller name :
ssh status : Disabled
ApMode : Local
ApSubMode : Not Configured
Link-Encryption : Disabled
OfficeExtend AP : Disabled
Discovery Timer : 10
Heartbeat Timer : 30
Syslog server : 255.255.255.255
Syslog Facility : 0
Syslog level : errors
AP join priority : 1
IP Prefer-mode : Unconfigured
CAPWAP UDP-Lite : Unconfigured
AP retransmit count : 5
AP retransmit timer : 3
AP lsc enable : 0
AP Policy Tag : UNKNOWN
AP RF Tag : UNKNOWN
AP Site Tag : UNKNOWN
AP Tag Source : 0
Slot 0 Config:
Error: Socket open failed
Error: Socket open failed
Slot 1 Config:
Error: Socket open failed
Error: Socket open failed
APCC7F.75AE.49D4#[ OK ] Stopped Cisco image/firmware updater service.
Stoppi Stopping H

 

Accepted Solutions

DATHOZ
Level 1

Just to update this thread: I worked with TAC on three different 9130 APs. Once we ran the command in U-Boot and rebooted, each AP got stuck in a boot loop and never loaded anything. The plan is to RMA 15 devices. Interestingly, I had 500+ 9130 APs in this migration, but only 15 of them hit this issue. Thanks for the feedback.


14 Replies

Hi

Factory reset one AP and test.

That is one of the first things I did; you can see the AP name changed to APCC7F.75AE.49D4. I have had a ticket open for about a couple of weeks without any success. Tonight I'm trying to move the ticket to the East region team. Hopefully I'll have better luck there.
Thanks, Flavio

Console into the AP and reboot the AP. 

Post the entire boot-up process.  

I suspect this is CSCwf42824.

If it is CSCwf42824 then it's fixed in 17.12.1 and 17.9.4 which are both out now.
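To check a fleet of controllers against those two fixed releases, a version comparison along these lines could be used. This is only a sketch: the helper names are hypothetical, it covers only the 17.9 and 17.12 trains named above, and it returns `None` for any other train because this thread does not say whether those carry the fix.

```python
# Fixed releases as stated in this thread: 17.9.4 and 17.12.1.
FIXED_IN = {
    (17, 9): (17, 9, 4),
    (17, 12): (17, 12, 1),
}


def parse_version(text):
    """Turn '17.9.3' into (17, 9, 3)."""
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(version):
    """True/False for the 17.9 and 17.12 trains, None when unknown."""
    parsed = parse_version(version)
    train = parsed[:2]
    if train not in FIXED_IN:
        return None  # not covered by the information in this thread
    return parsed >= FIXED_IN[train]


if __name__ == "__main__":
    for release in ("17.9.3", "17.9.4", "17.12.1", "17.6.4"):
        print(release, has_fix(release))
```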

JPavonM
VIP

I have run into that behaviour when upgrading to 17.6.4 and to 17.9.3, across all kinds of AP models. After I talked to the BU, they told me they had identified a defect where the new code on the AP is not verified.

Check this document the BU shared with me, related to bug CSCvx32806.


@JPavonM wrote:
C9800# install add file bootflash: C9800-80-universalk9_wlc.17.03.07.SPA.bin

Thanks for sharing that document. The document contains a bug: the syntax quoted above is incorrect.

Reading this document reminds me of my hack fix for FN-72524.

Rich R
VIP

Also check my answer here in case it's that: https://community.cisco.com/t5/wireless/c9130axi-e-stuck-downloading-image/m-p/4773527/highlight/true#M251543



@DATHOZ wrote:
The plan is to RMA 15 devices. The interesting part is that during the migration I had 500+ 9130 APs but only 15 of them got into this issue.

15 APs needing an RMA is a significant quantity in my book, and that is not good.

Has TAC provided a Bug ID for this behaviour?

Rich R
VIP

Did you try re-flashing any of those APs using the process at https://www.cisco.com/c/en/us/support/docs/wireless/catalyst-9120axi-access-point/217537-repairing-c9120-c9115-access-points-from.html (it talks about 9115 and 9120 but the firmware for 9130 is provided on the download link too)?

JPavonM
VIP

I have run into this too, and a few C9130 APs went to RMA after we tried multiple workarounds.

It would be interesting to see whether those APs would work on 17.9.4 or 17.12.1.

Right now I'm deploying a network refresh, and I hit this issue every time I turn on my devices: no AP is able to join the controller.
It's a time issue, and I don't know why. The clock shows around March 2023, when the AP was manufactured. Once I configure the correct time, all the APs start to join with no issue at all.

That will be a problem if something happens to the AP that holds the controller role and no standby AP is available to take over. If that happens, all APs will be down until I configure the correct time again.

Is NTP correctly configured and synced on the WLC?
This is extremely important and highlighted in the Best Practice guide:
https://www.cisco.com/c/en/us/products/collateral/wireless/catalyst-9800-series-wireless-controllers/guide-c07-743627.html#Wirelessmanagementinterface
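One way to check the sync state in bulk is to collect the output of `show ntp status` from each controller and test its first line. The sketch below assumes the usual IOS-style wording ("Clock is synchronized" or "Clock is unsynchronized"); the function name and the sample strings are illustrative and were not captured from the devices in this thread.

```python
def ntp_is_synced(show_ntp_status_output):
    """Return True only if the output reports a synchronized clock."""
    for line in show_ntp_status_output.splitlines():
        text = line.strip().lower()
        if text.startswith("clock is"):
            # "clock is unsynchronized" does not start with this prefix.
            return text.startswith("clock is synchronized")
    return False  # no status line found: treat as not synced


# Illustrative samples only (assumed wording, shortened).
SYNCED = "Clock is synchronized, stratum 3, reference is 192.0.2.1"
UNSYNCED = "Clock is unsynchronized, stratum 16, no reference clock"

if __name__ == "__main__":
    print(ntp_is_synced(SYNCED))
    print(ntp_is_synced(UNSYNCED))
```

Treating missing output as "not synced" is deliberate, so that a controller that returned nothing gets flagged for a manual look.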
