06-21-2023 06:45 AM
I'm running into an issue where about half of our 41 brand-new Cisco CW9166I-B access points are not registering on our WLC. When consoled into the APs, we noticed that after the initial boot phase, the wired0 Ethernet port constantly goes link up/link down. Image is attached. Strangely, this issue follows the AP around, which leads us to believe it is NOT an issue with the WLC or switch stack: I can take a working AP down, plug one of the problem ones into the same port, and still see the same problem. The path the traffic takes is AP > Cisco Catalyst 9300UN switch > Cisco Catalyst 9800-L Controller. We have DHCP option 43 configured on our Bluecat DNS, on the DHCP scope for the APs. Since around half of the APs joined with no issue, we believe that is not our problem.
The APs are on their own VLAN. Oddly, we've been able to plug a few APs into the user data ports at random desks, and somehow the AP will register with a DHCP address (from the incorrect scope, of course) and pop up in the WLC. If I then move that AP and plug it back into its respective port in the ceiling, it never rejoins the WLC.
The APs have been reset, we've tried multiple different ports around the office designated for APs, and we've removed switch interface configurations on a port-by-port basis. Nothing is working, and I find it hard to believe I have 18 brand-new access points right out of the box that are defective. Cisco support can't even make sense of this. Has ANYONE ever run into a similar issue?
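For anyone double-checking the option 43 value on their own DHCP scope: for Cisco lightweight APs the value is commonly entered as a hex string made of type `f1`, a length byte (4 bytes per controller address), and the controller IPv4 addresses in hex. The Python sketch below builds that string. The function name and the example addresses are made up for illustration and are not taken from this thread.

```python
import ipaddress


def cisco_option43_hex(controller_ips):
    """Build a DHCP option 43 hex string: type f1 + length + WLC IPv4 addresses.

    controller_ips is a list of dotted-quad strings (hypothetical values here).
    """
    addrs = [ipaddress.IPv4Address(ip) for ip in controller_ips]
    if not addrs:
        raise ValueError("at least one controller IP is required")
    length = 4 * len(addrs)  # each IPv4 address is 4 bytes
    if length > 255:
        raise ValueError("too many controller addresses for a one-byte length")
    payload = b"".join(addr.packed for addr in addrs)
    return "f1" + format(length, "02x") + payload.hex()


# Example addresses are placeholders, not the poster's controllers.
print(cisco_option43_hex(["192.168.10.5"]))
print(cisco_option43_hex(["192.168.10.5", "192.168.10.20"]))
```

Comparing the generated string against what is actually configured on the scope is a quick way to rule out a typo in the length byte or an address.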
06-21-2023 06:57 AM
Hi
Just to make sure I understood: you have interfaces, cabling, and connectors in the building dedicated to APs, and those interfaces are on the proper VLAN for the APs. If you plug an AP in right there, you see the problem of the Ethernet port going down.
If you plug the AP into a different interface, dedicated to something else, the AP works fine? Is that correct?
What about if you connect another device to the AP interfaces? Does it work?
06-21-2023 07:02 AM
Have you tried plugging the APs directly into the switch using a known good patch cable? Are the cable runs all CAT5e or better and within 100 meters? Have you done TDR tests?
It is normal for the wired port to cycle up and down periodically (every couple of minutes) if the access point is not able to contact a controller, but judging by your screenshot, it's happening so often that it must be bad hardware or cabling.
06-21-2023 07:08 AM
Yes, I forgot to mention I've taken the AP directly to the switch stack and plugged it into a properly configured port. No dice, even though another identical AP registered without issue on that exact same port. All cabling is brand-new Cat6 and well within length requirements on the same floor.
The issue follows the APs around the office. No matter where I take a few of the "defective" ones, they all act the same, even in locations with working APs.
06-21-2023 07:14 AM
I would start by checking Layer 1: cable, RJ45 connectors, and sufficient PoE availability on the switch.
If those check out, make sure the DHCP pool is not exhausted.
Eliminate the Ethernet cable run and connect the AP directly to the switch with a good patch cable.
Also, console into the AP, capture the AP boot process and share the output to analyze.
CJ
06-21-2023 08:20 AM
- This may not be directly involved, but also check the 9800-L configuration with the CLI command: show tech wireless; have the output analyzed with https://cway.cisco.com/wireless-config-analyzer/
As others are indicating: do you also have this problem if you put the AP on a switch 'near' the controller? Or what happens if you connect an older but supported model at the ceiling connection?
M.
06-21-2023 08:26 AM
I can move the same non-working AP from the ceiling jack, plug it directly into the switch (same port), and experience the exact same problems. We have some older 9120AXI access points, and we placed 3 of them in spots where the 9166s weren't working. Those took quite some time before registering again in the WLC (they were used in our old space). It took a solid day before they registered. Strangely, some of the other 9120s aren't coming up.
06-21-2023 08:40 AM
>...Those took quite some time before registering again in the WLC (they were used in our old space). It took a solid day before they registered. Strangely, some of the other 9120's aren't
- Could you have a look at the duplex/speed status of the problematic port(s), and also check the interface (error) counters?
M.
06-21-2023 09:03 AM
APs are connected to five gig interfaces. Their speeds are left at 5gig and a-full duplex. I even attempted to set the interface speed to 1gig and the APs had the same issue with their eth port going link up/link down.
I believe Cisco tech support had me monitor the switch port for errors but nothing would show up during the AP boot cycle. I can double check on that though.
06-21-2023 09:58 AM
>...Their speeds are left at 5gig and a-full duplex.
5g is an unfamiliar speed for me; usually we talk of 1g or 10g, for example.
>... with their eth port going link up/link down.
- You may try on the switch: show cdp neighbors detail, to check whether the APs have already obtained an IP address
>...monitor the switch port for errors
Just issue the command: show int gix/y (for example) and look at the counters; take several samples, and look at multiple problematic ports/connections.
M.
06-21-2023 11:30 AM
Speed: 100,1000,2500,5000,auto
AP ports are 5g. Below is a switch interface with an AP connected that is not registering with the WLC.
FiveGigabitEthernet4/0/22 is up, line protocol is up (connected)
Hardware is Five Gigabit Ethernet, address is 8024.8fbd.ae16 (bia 8024.8fbd.ae16)
Description: WAP
MTU 9198 bytes, BW 2500000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 2500Mb/s, media type is 100/1000/2.5G/5GBaseTX
input flow-control is on, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:10, output 00:00:00, output hang never
Last clearing of "show interface" counters never
Input queue: 0/2000/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
30 second input rate 0 bits/sec, 0 packets/sec
30 second output rate 0 bits/sec, 0 packets/sec
348044 packets input, 56329790 bytes, 0 no buffer
Received 347976 broadcasts (320102 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 320102 multicast, 0 pause input
0 input packets with dribble condition detected
985751 packets output, 128301915 bytes, 0 underruns
Output 2232 broadcasts (885078 multicasts)
0 output errors, 0 collisions, 3 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out
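As a follow-up to the suggestion of sampling the counters more than once, here is a minimal Python sketch that pulls a few fields out of pasted `show interface` text and reports which ones changed between two samples. The regular expressions are written against the output above only; other platforms or software versions may format these lines differently, and the function names are made up for illustration.

```python
import re

# Patterns written against the output pasted above (an assumption, not a spec).
PATTERNS = {
    "bw_kbit": r"BW (\d+) Kbit/sec",
    "speed_mbps": r"(\d+)Mb/s",
    "input_errors": r"(\d+) input errors",
    "crc": r"(\d+) CRC",
    "output_errors": r"(\d+) output errors",
    "interface_resets": r"(\d+) interface resets",
}


def parse_show_interface(text):
    """Return selected fields as integers, or None when a field is not found."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        fields[name] = int(match.group(1)) if match else None
    return fields


def counter_deltas(before, after):
    """Return only the fields whose value changed between two samples."""
    return {
        name: after[name] - before[name]
        for name in before
        if before[name] is not None
        and after.get(name) is not None
        and after[name] != before[name]
    }


SAMPLE = """\
MTU 9198 bytes, BW 2500000 Kbit/sec, DLY 10 usec,
Full-duplex, 2500Mb/s, media type is 100/1000/2.5G/5GBaseTX
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 output errors, 0 collisions, 3 interface resets
"""

first = parse_show_interface(SAMPLE)
# Second sample is fabricated here purely to show the delta logic.
second = parse_show_interface(SAMPLE.replace("3 interface resets", "5 interface resets"))
print(first)
print(counter_deltas(first, second))
```

A growing `interface_resets` or error count between samples taken during the AP boot cycle would be worth noting alongside the negotiated speed reported on the same port.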
06-21-2023 04:52 PM
Those error messages are Layer 2 issues.
What firmware are the APs loaded with?
06-21-2023 11:38 PM
>... with their eth port going link up/link down.
- You may try on the switch: show cdp neighbors detail, to check whether the APs have already obtained an IP address
Also check the port counters of the Wireless Management Interface on the controller. It would help if a console could be connected to one of the APs when this problem occurs.
M.
02-21-2024 04:42 PM
I have experienced this with a few of our new 9166 WAPs on a 9800 WLC. Connecting to the console shows ***Firmware Crashed***. They repeat the same behaviour even after a factory reset and on different switches. The same switchport and the same cable have allowed other 9166s from the same delivery batch to connect without issue.
[*11/13/2023 00:04:07.8320] ++++ Radio Firmware Crashed ! ++++
[*11/13/2023 00:04:07.8320]
[*11/13/2023 00:04:07.8321] Fatal error received from wcss software!:
[*11/13/2023 00:04:07.8321] QC Image Version: QC_IMAGE_VERSION_STRING=WLAN.HK.2.4-02142-QCAHKSWPL_SILICONZ-1
[*11/13/2023 00:04:07.8321] Image Variant : IMAGE_VARIANT_STRING=8074.wlanfw.eval_v2Q
[*11/13/2023 00:04:07.8321]
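If you capture the AP console to a text file, a short script can show how often the crash marker repeats. This is only a sketch: it assumes the marker line and the timestamp prefix look exactly like the excerpt above, and that the date is in month/day/year order. The function name is made up for illustration.

```python
import re
from datetime import datetime

# Matches lines like: [*11/13/2023 00:04:07.8320] ++++ Radio Firmware Crashed ! ++++
CRASH_RE = re.compile(
    r"\[\*(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\.\d+\]\s+"
    r"\+{4} Radio Firmware Crashed ! \+{4}"
)


def find_radio_crashes(log_text):
    """Return the timestamp of every radio firmware crash marker in the text."""
    return [
        datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S")
        for match in CRASH_RE.finditer(log_text)
    ]


CONSOLE_EXCERPT = """\
[*11/13/2023 00:04:07.8320] ++++ Radio Firmware Crashed ! ++++
[*11/13/2023 00:04:07.8320]
[*11/13/2023 00:04:07.8321] Fatal error received from wcss software!:
"""

crashes = find_radio_crashes(CONSOLE_EXCERPT)
print(len(crashes), "crash marker(s) found")
for stamp in crashes:
    print(stamp.isoformat())
```

Running this over a longer capture gives a list of crash times that can be handed to support along with the raw log.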
02-22-2024 07:02 AM
I was able to resolve this last year after about a month of troubleshooting. Oddly, I never encountered firmware crashes. What I narrowed the issue down to was latency between where the WAPs were located and one of our DHCP servers.
To explain further, we have a DHCP server in NY and one in Chicago. At the time, we were going through an office move, so our new office space in Chicago was tunneling traffic to our old space via a 1gb link and then on to NY if needed (Chicago new office > 1gb link > Chicago old office > NY). When you have DHCP helper addresses within your environment, the client's broadcast is relayed to each configured server and, from what I understand, whichever server answers first (normally the closest or lowest-latency one) is the one the client uses. Evidently our NY DHCP server had some play in this, even though the access points that did register came up with Chicago-based IP addresses.
When we cut ties with the 1gb link back to our old Chicago office and turned on the 10gb link directly from the new Chicago office to NY, the remaining 20 or so access points registered with our WLC. Their uptime coincided with when the 10gb circuit was turned on. For whatever reason, the access points were very sensitive to latency from Chicago to NY, even though we had a local DHCP server to hand out IPs. After numerous hours with Cisco support, there was no evidence pointing to anything else that resolved the issue.
This was certainly a bizarre and most likely one-off scenario. Still, it may help others: if there are latency issues within an environment, they could potentially affect the AP join process.