Cisco AIR-AP1852I Not Connecting to WLC

KGH0511
Level 1

I'm managing a network with 266 of the above WAPs with dual 5520 WLCs. Up until yesterday all WAPs were connected to the controller and everything was humming along nicely. The switch stack that the WLCs connect to had a wobbly and needed to be rebooted. After the reboot I noticed that about 10 WAPs had not come back online. I physically checked each unit and the status LED is flashing red on all 10 of them.

I took one WAP to the workbench, connected it to the appropriate VLAN (power is via PoE from the switch), and connected over the console with a laptop to see what was going on. I could see it wouldn't connect to the controller: discovery failed, and it reported being unable to open the SSH daemon. The WAP was not getting an IP via DHCP. I assigned a static IP and manually configured the primary controller. Below is a snapshot of what happened after that:

[*08/19/2023 09:37:33.6517]
[*08/19/2023 09:37:33.6517] CAPWAP State: Discovery
[*08/19/2023 09:37:33.6517] Discovery Request sent to 10.61.XX.XX, discovery type STATIC_CONFIG(1)
[*08/19/2023 09:37:33.6617] Discovery Request sent to 10.61.XX.XX, discovery type STATIC_CONFIG(1)
[*08/19/2023 09:37:33.6617] Discovery Request sent to 255.255.255.255, discovery type UNKNOWN(0)
[*08/19/2023 09:38:03.1125]
[*08/19/2023 09:38:03.1125] CAPWAP State: Discovery
[*08/19/2023 09:38:03.1125] Discovery failed 5 times. Check Release/Renew DHCP AP CAPWAP MODE:[1] controller previously connected:[0]
[*08/19/2023 09:38:03.1125] CAPWAPd forces DHCP restart.
[*08/19/2023 09:38:03.1225] Discovery Request sent to 10.61.XX.XX, discovery type STATIC_CONFIG(1)
[*08/19/2023 09:38:03.1225] Discovery Request sent to 255.255.255.255, discovery type UNKNOWN(0)
[*08/19/2023 09:38:22.5664] WTP IP address changed from 0.0.0.0 to 10.61.XX.XX, restart CAPWAP.
[*08/19/2023 09:38:22.5664]
[*08/19/2023 09:38:22.5664]
[*08/19/2023 09:38:22.5664] Going to restart CAPWAP (reason : WTP IP address changed)...
[*08/19/2023 09:38:22.5664]
[*08/19/2023 09:38:22.5664] Restarting CAPWAP State Machine.
[*08/19/2023 09:38:22.5664] Discarding msg CAPWAP_WTP_EVENT_REQUEST(type 9) in CAPWAP state: Discovery(2).
[*08/19/2023 09:38:22.6064]
[*08/19/2023 09:38:22.6064] CAPWAP State: DTLS Teardown
[*08/19/2023 09:38:22.7564] upgrade.sh: Script called with args:[ABORT]
[*08/19/2023 09:38:22.7964] do ABORT, part2 is active part
[*08/19/2023 09:38:22.8263] upgrade.sh: Cleanup tmp files ...
[*08/19/2023 09:38:22.8563] Discarding msg CAPWAP_WTP_EVENT_REQUEST(type 9) in CAPWAP state: DTLS Teardown(4).
[*08/19/2023 09:38:22.8563] Discarding msg CAPWAP_WTP_EVENT_REQUEST(type 9) in CAPWAP state: DTLS Teardown(4).
[*08/19/2023 09:38:37.4318]
[*08/19/2023 09:38:37.4318] CAPWAP State: Discovery
[*08/19/2023 09:38:37.4718] Discovery Request sent to 10.61.XX.XX, discovery type STATIC_CONFIG(1)
[*08/19/2023 09:38:37.4718] Discovery Request sent to 10.61.XX.XX, discovery type STATIC_CONFIG(1)
[*08/19/2023 09:38:37.4718] Discovery Request sent to 255.255.255.255, discovery type UNKNOWN(0)
[*08/19/2023 09:39:06.9825]
[*08/19/2023 09:39:06.9825] CAPWAP State: Discovery
[*08/19/2023 09:39:06.9925] Discovery Request sent to 10.61.XX.XX, discovery type STATIC_CONFIG(1)
[*08/19/2023 09:39:06.9925] Discovery Request sent to 10.61.XX.XX, discovery type STATIC_CONFIG(1)
[*08/19/2023 09:39:07.0225] Discovery Request sent to 255.255.255.255, discovery type UNKNOWN(0)
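
For anyone triaging a similar console capture, here is a minimal Python sketch (not from the original post) that tallies the discovery requests and failure messages in a log like the one above. It relies only on the two line formats visible in this capture; the function name `summarize_capwap_log` is made up for illustration. A long run of requests with nothing else in between, as seen here, suggests the AP is sending but never hearing back from the controller.

```python
import re
from collections import Counter

# Matches lines such as:
#   Discovery Request sent to 10.61.XX.XX, discovery type STATIC_CONFIG(1)
# The pattern is based only on the lines shown in this thread.
REQUEST_RE = re.compile(
    r"Discovery Request sent to ([^,\s]+), discovery type (\w+)\(\d+\)"
)


def summarize_capwap_log(text):
    """Count discovery requests per (destination, type) and 'Discovery failed' lines."""
    requests = Counter()
    failures = 0
    for line in text.splitlines():
        match = REQUEST_RE.search(line)
        if match:
            requests[(match.group(1), match.group(2))] += 1
        elif "Discovery failed" in line:
            failures += 1
    return {"requests": requests, "failures": failures}


if __name__ == "__main__":
    sample = (
        "[*08/19/2023 09:37:33.6517] Discovery Request sent to 10.61.XX.XX, "
        "discovery type STATIC_CONFIG(1)\n"
        "[*08/19/2023 09:37:33.6617] Discovery Request sent to 255.255.255.255, "
        "discovery type UNKNOWN(0)\n"
    )
    summary = summarize_capwap_log(sample)
    for (dest, kind), count in summary["requests"].items():
        print(f"{dest} {kind}: {count}")
    print("failure messages:", summary["failures"])
```

Paste the full console capture into a file and pass its contents to the function to get the same tally for a real AP.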

On the WLC under security and AP policies, Accept Manufactured Installed Certificate is selected and nothing else. I have tried manually adding the MAC address of the AP's primary ethernet interface under the AP Authorization list and it's made no difference.

I have noticed that the time on the AP is not the same as the time on the WLC. However, I do not know the command to manually change the time when consoled into the AP. I've used ? to search through all the menus and I don't see any option for manually adjusting the time. All the other APs in the plant have successfully come back, barring the 10 units that are all behaving in the same manner. I have taken a spare AP from inventory and connected it in place of one of the misbehaving units, and the new unit connects to the WLC and comes online right away. I have also done a full factory reset on the unit I'm working with on the bench.


Here is the switchport config for the test port I configured with a working WAP from spares connected;

Name: Gi4/0/28
Switchport: Enabled
Administrative Mode: static access
Operational Mode: static access
Administrative Trunking Encapsulation: dot1q
Operational Trunking Encapsulation: native
Negotiation of Trunking: Off
Access Mode VLAN: 18 (mgmt_ap)
Trunking Native Mode VLAN: 1 (default)
Administrative Native VLAN tagging: disabled
Voice VLAN: none
Administrative private-vlan host-association: none
Administrative private-vlan mapping: none
Administrative private-vlan trunk native VLAN: none
Administrative private-vlan trunk Native VLAN tagging: enabled
Administrative private-vlan trunk encapsulation: dot1q
Administrative private-vlan trunk normal VLANs: none
Administrative private-vlan trunk associations: none
Administrative private-vlan trunk mappings: none
Operational private-vlan: none
Trunking VLANs Enabled: ALL
Pruning VLANs Enabled: 2-1001
Capture Mode Disabled
Capture VLANs Allowed: ALL

This currently has a new WAP connected and is operational.  All other switchports with WAPs connected are configured in the same manner, including the ones that have stopped talking to the controller.

Sorry, I thought that you were referring to the DHCP option on the WLC, under Controller => Advanced => DHCP. That is set to option 82. On the DHCP server it is set to option 43.

Still have 10 units which do not connect to the WLC. One or two I can accept as hardware issues. But 10 pieces right after a switch reboot seems like an excessive amount to simultaneously develop hardware issues, after a single event caused all WAPs to disconnect from the WLC.

 

> ...Still have 10 units which do not connect to the WLC
- Could you try a reboot of the controller too, and check whether that changes anything?

 M.



-- Each morning when I wake up and look into the mirror I always say ' Why am I so brilliant ? '
    When the mirror will then always respond to me with ' The only thing that exceeds your brilliance is your beauty! '

It would be better to wait until the weekend, just in case I lose any more APs. Presently I can get by with 10 out of action. The coverage is patchy in some locations, but it's not critical.

 

> It would be better to wait until the weekend, just in case I lose any more APs. Presently I can get by with 10 out of action. The coverage is patchy in some locations, but it's not critical.
Good plan.

 M.




Agreed, 10 sounds like a lot even for 18xx, although we do RMA around 20 x 1832 per month (installed base of many thousands). The failure rate is considerably higher than for any other model of AP we use.

I meant the running-config on the port - can you do "sh run int Gi4/0/28"?

Your AP logs do not seem to show any indication of receiving option 43 from DHCP.  If it received option 43 then I would expect to see "Got WLC address x.x.x.x from DHCP." in the logs.
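
If option 43 on the DHCP server needs checking, the value Cisco lightweight APs expect is a short TLV: type 0xf1, a length byte equal to 4 times the number of controllers, then each controller management IP in hex. Below is a minimal Python sketch for building that string so it can be compared against what the DHCP scope is handing out. The function name `cisco_option43_hex` is made up for illustration, and the addresses are documentation placeholders, not the controllers in this thread.

```python
import ipaddress


def cisco_option43_hex(controller_ips):
    """Return the option 43 hex string: f1, length (4 bytes per controller), then the IPs."""
    addresses = [ipaddress.IPv4Address(ip) for ip in controller_ips]
    if not addresses:
        raise ValueError("at least one controller IP is required")
    payload = b"".join(addr.packed for addr in addresses)
    if len(payload) > 255:
        raise ValueError("too many controllers for a one-byte length field")
    return (bytes([0xF1, len(payload)]) + payload).hex()


if __name__ == "__main__":
    # Placeholder addresses from the 192.0.2.0/24 documentation range.
    print(cisco_option43_hex(["192.0.2.10"]))                # f104c000020a
    print(cisco_option43_hex(["192.0.2.10", "192.0.2.11"]))  # f108c000020ac000020b
```

How the hex string is entered depends on the DHCP server in use, so treat this only as a way to sanity-check the configured value.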

Normally I have a failure rate of 1 a month on average. I've had the controller offline on a number of occasions in the past and never experienced anything like this.

The running config on the port is below;

Current configuration : 309 bytes
!
interface GigabitEthernet4/0/28
description AV-OFFICE 70802-1
switchport access vlan 18
switchport mode access
authentication event fail action authorize vlan 888
authentication open
authentication order mab
authentication port-control auto
mab
dot1x pae authenticator
spanning-tree portfast
end

That's my test port. I have compared it with other ports that APs are currently running on and ones that APs are not running on, and they are all the same. Using one of the test APs from spares (the new unit without the static address), I can ping the DHCP server while logged in via the console.
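
With this many ports, comparing configs by eye is error-prone. Here is a minimal Python sketch, assuming the per-port output of "sh run int" has been saved as text, that reports lines present on one port but not the other. The helper names `config_lines` and `compare_ports` are made up for illustration, and the interface name and description are ignored since they always differ.

```python
# Lines that are expected to differ between ports or carry no configuration.
IGNORED_PREFIXES = ("!", "interface ", "description ", "Current configuration")


def config_lines(block):
    """Return the set of meaningful config lines in one interface block."""
    lines = set()
    for raw in block.splitlines():
        line = raw.strip()
        if not line or line == "end" or line.startswith(IGNORED_PREFIXES):
            continue
        lines.add(line)
    return lines


def compare_ports(reference, candidate):
    """Return (lines only in reference, lines only in candidate), each sorted."""
    ref = config_lines(reference)
    cand = config_lines(candidate)
    return sorted(ref - cand), sorted(cand - ref)


if __name__ == "__main__":
    working = """interface GigabitEthernet4/0/28
 description AV-OFFICE 70802-1
 switchport access vlan 18
 switchport mode access
 spanning-tree portfast
end"""
    # Hypothetical second port, shown only to illustrate the comparison.
    suspect = """interface GigabitEthernet4/0/29
 description EXAMPLE
 switchport access vlan 18
 switchport mode access
end"""
    only_working, only_suspect = compare_ports(working, suspect)
    print("only on working port:", only_working)
    print("only on suspect port:", only_suspect)
```

Two empty lists mean the ports match apart from name and description, which would point away from the port config as the cause.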

One thing I have noticed: of the 265 WAPs, the controller is showing 255 at any given time. I put two units from spares inventory on the network and they connect fine, but it still shows 255 APs. It should be 257. I take the two spares off, and I've still got 255. There may be something going on with the license.

 

> ...Of the 265 WAP's the controller is showing 255 at any given time. I...
- This could be a bug too. In that context, consider https://www.cisco.com/c/en/us/support/docs/wireless/wireless-lan-controller-software/200046-tac-recommended-aireos.html , referring to https://software.cisco.com/download/home/286284738/type/280926587/release/8.10.185.0
  As the AireOS platforms are getting older, it becomes more and more advisable to use the last recommended release for a particular model.

- You can also check the controller configuration using https://community.cisco.com/t5/networking-knowledge-base/show-the-complete-configuration-without-breaks-pauses-on-cisco/ta-p/3115114#toc-hId-1039672820
  Have the output analyzed with: https://cway.cisco.com/wireless-config-analyzer/

  You may get insights concerning current issues too.

 M.




Yes, a limit of 255 does sound suspicious!

That port config looks a whole lot more complicated than ours. Are you sure the switch is not putting the port into blocking or err-disabled? Maybe try a simpler config and see whether it helps:
interface GigabitEthernet4/0/28
switchport access vlan 18
switchport mode access
spanning-tree portfast
spanning-tree bpdufilter enable
spanning-tree bpduguard disable
speed auto
duplex auto
cdp enable
end
The bpduguard and bpdufilter lines are config that TAC advised us to use, because the APs randomly send BPDUs at boot time, which can cause the port to be shut down.
You might also want to take a look at https://bst.cisco.com/bugsearch/bug/CSCwf45495 expected to be fixed in forthcoming 8.10MR10.

> I've had the controller offline on a number of occasions in the past and never experienced anything like this.
We found the failures increased exponentially when the APs reached 2-3 years old.
