New AP not joining after WLC SSO

Hello all,

 

Setup: 3504 WLC HA pair running 8.5.140, with 2802i access points

 

This is a new setup. After configuring the WLC and adding the APs, a handful of APs did not join because of cabling problems. After these problems were resolved, the APs tried to join but failed.

 

These were messages we saw in the WLC logs:

 

*spamApTask3: Jan 02 16:21:28.462: %CAPWAP-3-ENCODE_ERR: [SA]capwap_ac_sm.c:3333 The system has failed to encode Image data request (Requested Ap Image not found) to AP 70:b3:17:4d:96:c0
*spamApTask3: Jan 02 16:21:28.462: %CAPWAP-3-IMAGE_DOWNLOAD_ERR3: [SA]capwap_ac_platform.c:1525 Refusing image download request from Unsupported AP 70:b3:17:4d:96:c0 - unable to open image file /mnt/images/ap.run/ap3g3

 

*spamApTask2: Jan 02 16:21:28.225: %CAPWAP-3-DISC_AP_MGR_ERR1: [SA]capwap_ac_sm.c:2109 The system is unable to process Primary discovery request from AP 70:b3:17:3b:94:a0 on interface (8), VLAN (510), could not get IPv6 AP manager
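For anyone sifting through a larger WLC log export, here is a minimal Python sketch that groups the error mnemonics by AP MAC address, so you can see at a glance which APs hit which errors. The function name `errors_by_ap` and the regular expressions are my own assumptions based only on the three lines above; other message formats may need different patterns.

```python
import re
from collections import defaultdict

# Matches the "%FACILITY-SEVERITY-MNEMONIC:" tag, e.g. %CAPWAP-3-ENCODE_ERR:
TAG_RE = re.compile(r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+):")
# Matches a colon-separated MAC address that directly follows the word "AP"
AP_MAC_RE = re.compile(r"\bAP ((?:[0-9a-f]{2}:){5}[0-9a-f]{2})", re.IGNORECASE)


def errors_by_ap(lines):
    """Return {ap_mac: [mnemonic, ...]} for lines that carry both a tag and an AP MAC."""
    result = defaultdict(list)
    for line in lines:
        tag = TAG_RE.search(line)
        mac = AP_MAC_RE.search(line)
        if tag and mac:
            result[mac.group(1).lower()].append(tag.group("mnemonic"))
    return dict(result)


# The three WLC log lines from this post, used as sample input.
SAMPLE = [
    "*spamApTask3: Jan 02 16:21:28.462: %CAPWAP-3-ENCODE_ERR: [SA]capwap_ac_sm.c:3333 "
    "The system has failed to encode Image data request (Requested Ap Image not found) "
    "to AP 70:b3:17:4d:96:c0",
    "*spamApTask3: Jan 02 16:21:28.462: %CAPWAP-3-IMAGE_DOWNLOAD_ERR3: [SA]capwap_ac_platform.c:1525 "
    "Refusing image download request from Unsupported AP 70:b3:17:4d:96:c0 - "
    "unable to open image file /mnt/images/ap.run/ap3g3",
    "*spamApTask2: Jan 02 16:21:28.225: %CAPWAP-3-DISC_AP_MGR_ERR1: [SA]capwap_ac_sm.c:2109 "
    "The system is unable to process Primary discovery request from AP 70:b3:17:3b:94:a0 "
    "on interface (8), VLAN (510), could not get IPv6 AP manager",
]

if __name__ == "__main__":
    for mac, mnemonics in errors_by_ap(SAMPLE).items():
        print(mac, mnemonics)
```

In our case this would have shown that the image download errors and the IPv6 AP manager error came from different APs.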

 

To be clear: IPv6 is not configured anywhere in the network at this location.

 

Disabling IPv6 did not fix anything.

 

The AP logs showed this:

[*01/03/2019 09:44:01.4577] CAPWAP State: Discovery
[*01/03/2019 09:44:01.4580] Got WLC address 10.113.10.240 from DHCP.
[*01/03/2019 09:44:01.5140] Discovery Request sent to 10.113.10.240, discovery type DHCP(2)
[*01/03/2019 09:44:01.5149] Discovery Request sent to 255.255.255.255, discovery type UNKNOWN(0)
[*01/03/2019 09:44:01.5150] Discovery Response from 10.113.10.240
[*01/03/2019 09:44:10.0002] Discovery Response from 10.113.10.240
[*01/03/2019 09:44:10.0000]
[*01/03/2019 09:44:10.0000] CAPWAP State: DTLS Setup
[*01/03/2019 09:44:10.0312] dtls_load_ca_certs: LSC Root Certificate not present
[*01/03/2019 09:44:10.0312]
[*01/03/2019 09:44:10.0345]
[*01/03/2019 09:44:10.0345] CAPWAP State: Join
[*01/03/2019 09:44:10.0357] Sending Join request to 10.113.10.240 through port 5248
[*01/03/2019 09:44:10.0397] Join Response from 10.113.10.240
[*01/03/2019 09:44:10.1148] HW CAPWAP tunnel is ADDED

 

[*01/03/2019 09:44:10.1338] CAPWAP State: Image Data
[*01/03/2019 09:44:10.1644] do PRECHECK, part1 is active part
[*01/03/2019 09:44:10.2981] Image Data Request sent to 10.113.10.240
[*01/03/2019 09:44:27.7254] Discarding msg CAPWAP_WTP_EVENT_REQUEST(type 9) in CAPWAP state: Image Data(10).
[*01/03/2019 09:44:28.2660] Discarding msg CAPWAP_WTP_EVENT_REQUEST(type 9) in CAPWAP state: Image Data(10).
[*01/03/2019 09:44:38.6979] Image download did not start for 30 seconds.
[*01/03/2019 09:44:38.6979] Restarting capwap - image download cannot start.
[*01/03/2019 09:44:38.6980]
[*01/03/2019 09:44:38.6980] Lost connection to the controller, going to restart CAPWAP...
[*01/03/2019 09:44:38.6980]
[*01/03/2019 09:44:38.6982] Restarting CAPWAP State Machine.
[*01/03/2019 09:44:38.7028] Discarding msg CAPWAP_WTP_EVENT_REQUEST(type 9) in CAPWAP state: Image Data(10).
[*01/03/2019 09:44:38.7040]
[*01/03/2019 09:44:38.7040] CAPWAP State: DTLS Teardown
[*01/03/2019 09:44:39.7144] Dropping dtls packet since session is not established. Peer 10.113.10.240-5246, Local 10.113.10.103-5248, conn (nil)
[*01/03/2019 09:44:44.4430] do ABORT, part1 is active part
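If you have console logs from several APs and want to check quickly whether they show the same pattern, something like the following Python sketch could help. It lists the CAPWAP state transitions and flags the image download timeout. The function name `summarize_ap_log` and the matched strings are assumptions taken from the log above; I have not checked them against other AP software versions.

```python
import re

# Matches state transition lines such as "CAPWAP State: Image Data".
# Case-sensitive on purpose: the "Discarding msg ... in CAPWAP state:" lines
# use a lowercase "state" and are not transitions.
STATE_RE = re.compile(r"CAPWAP State: (?P<state>.+?)\s*$")
TIMEOUT_MARKER = "Image download did not start"


def summarize_ap_log(lines):
    """Return (list of CAPWAP states in order, True if the image download timed out)."""
    states = []
    timed_out = False
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            states.append(match.group("state"))
        elif TIMEOUT_MARKER in line:
            timed_out = True
    return states, timed_out


# A subset of the AP log lines from this post, used as sample input.
SAMPLE = [
    "[*01/03/2019 09:44:01.4577] CAPWAP State: Discovery",
    "[*01/03/2019 09:44:10.0000] CAPWAP State: DTLS Setup",
    "[*01/03/2019 09:44:10.0345] CAPWAP State: Join",
    "[*01/03/2019 09:44:10.1338] CAPWAP State: Image Data",
    "[*01/03/2019 09:44:27.7254] Discarding msg CAPWAP_WTP_EVENT_REQUEST(type 9) "
    "in CAPWAP state: Image Data(10).",
    "[*01/03/2019 09:44:38.6979] Image download did not start for 30 seconds.",
    "[*01/03/2019 09:44:38.7040] CAPWAP State: DTLS Teardown",
]

if __name__ == "__main__":
    states, timed_out = summarize_ap_log(SAMPLE)
    print(" -> ".join(states))
    print("image download timed out:", timed_out)
```

An AP that reaches Image Data and then goes straight to DTLS Teardown with the timeout flag set matches what we saw here.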

 

After finding this bug: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvm69246 I remembered that we had to trigger an SSO failover on the WLC to change the power input of the primary unit. Afterwards we had to fail over again so that the primary was active again. What seemed to help was to reset both units again with a reset self and force another round of SSO:

 

Steps:

pri: reset self -> failover to sec

sec: after pri is back again, reset self -> failover to pri

 

After the last reset the remaining APs were able to download the images.

 

Hope this might be useful to anyone bumping into this problem. Based on the logs we weren't able to find anything except the bug in Bug Search.

 

 
