07-10-2024 06:23 AM
Hello Cisco-WLAN-Community,
I would like to report a WLAN AP join problem that arose here during a software upgrade of a 9800-80 WLC to version 17.9.5.
Although this hospital is running an HA SSO 9800 solution, more than 1200 APs were no longer able to join after the software upgrade. The horror shows up in the AP join statistics as the following error message:
DTLS cert-chain not available
The planned staggered AP image upgrade even shows success in the GUI, with 0 percent completion!
It got even worse: the predownloaded image was overwritten when the APs moved to the 5520 WLCs used as backup.
Several hours of confusion and outages ended after I found a hint about a missing trustpoint on the wireless management interface
in the 9800 best practice guide, chapter "Dealing with trustpoints".
I don't know why this trustpoint silently disappeared during the software update.
To me it looks like a clear bug and shame on Cisco for the outages created by this mess!
After adding the trustpoint name CISCO_IDEVID_CMCA3_SUDI to the wireless management interface, the APs started joining the 9800 WLC again. Sadly, this required another software download and more downtime to overwrite the 5520 image the APs had pulled in the meantime.
You can configure and check it in CLI:
wireless management trustpoint CISCO_IDEVID_CMCA3_SUDI
Unfortunately, the trustpoint configuration on the wireless management interface does not show up in the running config.
What a mess on this Cisco-WLAN-High-Availability-solution.
Is this a bug or a feature in SW-Version 17.9.5 ?
Kind regards
Wini
07-10-2024 08:05 AM
>...DTLS cert-chain not available
- FYI : https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwj58547
I don't know whether or when this bug also applies to previous versions.
In any case, whenever a controller is upgraded it is advisable to check
the current state of the configuration with the CLI command show tech wireless and feed the
output into Wireless Config Analyzer.
You will discover such issues faster that way; perhaps it should even be considered mandatory after upgrading.
M.
07-10-2024 08:30 AM
I always review the IOS-XE release notes to identify any potential issues or gotchas before starting the upgrade.
In your case - "Modified Trustpoints for Secure Unique Device Identity (SUDI) Certificates". Also, look at the following point from "Behavior changes" section --
"If you have configured CISCO_IDEVID_SUDI trustpoint in your configuration, you will need to replace it with CISCO_IDEVID_CMCA3_SUDI to avoid client connection and AP join issues. The reason for this change being the CISCO_IDEVID_SUDI changed from SW-SUDI certificate in previous releases to HW-SUDI certificate. The processing of HW-SUDI certificate is much slower than the SW-SUDI. Here, CISCO_IDEVID_CMCA3_SUDI is the new SW-SUDI certificate."
IOS-XE release notes: https://www.cisco.com/c/en/us/td/docs/wireless/controller/9800/17-9/release-notes/rn-17-9-9800.html
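As a rough pre-upgrade check, a saved copy of the configuration could be scanned for the old trustpoint name. The sketch below is only an illustration: it assumes the configuration has been exported to plain text and that the trustpoint line appears in that text at all (the original post suggests it may not always be visible in the running config). The function name and the sample text are hypothetical.

```python
import re

OLD_TRUSTPOINT = "CISCO_IDEVID_SUDI"        # name the release notes say to replace
NEW_TRUSTPOINT = "CISCO_IDEVID_CMCA3_SUDI"  # replacement named in the release notes


def find_old_trustpoint_lines(config_text):
    """Return (line_number, line) pairs that mention the old trustpoint name."""
    # \b keeps the match to the exact name rather than a longer identifier.
    pattern = re.compile(r"\b" + re.escape(OLD_TRUSTPOINT) + r"\b")
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        if pattern.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical sample text, not taken from a real controller.
sample = "hostname wlc-example\nwireless management trustpoint CISCO_IDEVID_SUDI\n"
for number, line in find_old_trustpoint_lines(sample):
    print(f"line {number}: '{line}' should reference {NEW_TRUSTPOINT}")
```

An empty result only means the old name is absent from the text that was scanned; it does not prove the management interface has a valid trustpoint.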
Jagan Chowdam
/**Pls rate useful responses**/
07-10-2024 01:01 PM - edited 07-10-2024 01:09 PM
Hello Jagan,
thank you for your advice to always read the release notes carefully.
To be honest, I did read them, but did not recognize the potential problem caused by a modified trustpoint for SUDI.
I focus on the open and resolved caveats chapters and check the supported hardware.
The chapter about modified trustpoints does not warn about a potential problem with AP join.
In our case, the trustpoint info on the wireless management interface was clearly lost and
not updated during the ISSU process as described in the 17.9.5 release notes.
To recover, we did a redundancy force failover to make the original unit active again, and a colleague added and deleted another trustpoint for testing purposes, in vain. Some minutes later, to our surprise, the field was filled in automatically with the correct SUDI certificate. Maybe this change happens too late in the Cisco ISSU update procedure.
It looks like a problem during the ISSU phase in which the standby has been updated to the new code and the switchover to it occurs so that the former active unit can be updated as well. In that phase everything went wrong. No rolling AP software update happened.
I also do not understand the following info there:
show wireless management trustpoint command output
If Cisco Catalyst 9300 Series Switch is used with a Cisco Catalyst 9800 Series Wireless Controller for wireless deployments, the trustpoint name in the output of show wireless management trustpoint command is updated to the modified trustpoint name as mentioned previously.
What does that mean? What is the purpose of a 9300 switch in conjunction with trustpoint certificates?
Is there a dependency between the 9800 WLC and the switch it is connected to with regard to this damned trustpoint?
Kind regards
Wini
07-10-2024 03:40 PM
Console into the AP and reboot the AP.
Post the entire bootup process, specifically the point where the AP is attempting to download the 17.9.5 AP firmware and subsequently failing.
07-12-2024 12:59 AM - edited 07-12-2024 01:00 AM
Hello Cisco-WLAN-experts,
the root of this certificate problem is apparently a lack of knowledge and insufficient reading of the release notes on my side!
But I never configured this manufacturer certificate myself! In my opinion it is entirely Cisco's responsibility to roll out a new manufacturer certificate carefully.
How can I now check on our HA 9800-80 WLCs that the failing standby unit has loaded the correct new certificate?
On the active machine I have:
Marce1000 also pointed to the following bug, which applies only to the cloud version of this WLC, not to our failed 9800-80 model.
SSC certificate/trustpoint fails to generate when challenge password contains special characters CSCwj58547
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwj58547
How can I check the private key info of this new Cisco trustpoint certificate in this regard?
How can I now check my logs for this failed installation of a manufacturer certificate on the standby unit?
Please check and come back with information.
Have a nice weekend
Wini
07-12-2024 02:35 AM
Hello Cisco WLAN -experts,
we recently upgraded our central 9800-80 HA pair in this hospital and fought with the change in the trustpoint certificate of the wireless management interface. Info from the release notes:
If You have CISCO_IDEVID_SUDI trustpoint in your configuration, you will need to replace it with CISCO_IDEVID_CMCA3_SUDI to avoid client connection and AP join issues.
Now we have another problem, that is possibly tied to this change.
Some of our 3802i APs are no longer joining the 9800-80 WLC.
What shall we do in this case?
Shall we power-cycle the connected switch ports to force the APs to start all over?
Thank You for any good advice.
Kind regards
Wini
07-12-2024 04:16 AM
From one of the affected APs, post the complete output of the AP command "sh version".
07-12-2024 04:44 AM - edited 07-12-2024 04:52 AM
Hello Leo,
here is the requested output of show version:
ZOM-AP665 uptime is 0 days, 1 hours, 42 minutes
Last reload time : Fri Jul 12 09:58:07 UTC 2024
Last reload reason : Image Upgrade
cisco AIR-AP3802I-E-K9 ARMv7 Processor rev 1 (v7l) with 1028192/533672K bytes of memory.
Processor board ID FCW2329PEMT
AP Running Image : 17.9.5.47
Primary Boot Image : 17.9.5.47
Backup Boot Image : 17.9.3.50
Primary Boot Image Hash: c45a036194812821985c18856c025c88e94ea09ab7ab99c0bf445dd32e48f956a6081454b207779f56441bc3f5f672ab7bfda4a7f65624bddb8242d1c2276507
Backup Boot Image Hash: dc23d41992f206deb1a408c01ce88f5fe501ccbcc403deffb1ff51552e165570f1368ab82bddf73a7a66d3de2ec43114f52f5c18c69f2f77c78a300ba5a8a484
1 Multigigabit Ethernet interfaces
1 Gigabit Ethernet interfaces
2 802.11 Radios
Radio Driver version : 9.0.5.5-W8964
Radio FW version : 9.1.8.1
NSS FW version : 2.4.32
Base ethernet MAC Address : 08:4F:A9:50:C4:DE
Part Number : 0-000000-00
PCA Assembly Number : 800-105811-01
PCA Revision Number : A0
PCB Serial Number : FOC2328769S
Top Assembly Part Number : 800-105811-01
Top Assembly Serial Number : FCW2329PEMT
Top Revision Number : A0
Product/Model Number : AIR-AP3802I-E-K9
When I compare this with APs that joined without problems, I can see the trace of the trustpoint problem we had. On those APs the backup image is the 5520 image, which they loaded while the 9800-80 WLC was suffering from the trustpoint join problem.
cisco AIR-AP3802I-E-K9 ARMv7 Processor rev 1 (v7l) with 1028192/529080K bytes of memory.
Processor board ID FCW2329PEC3
AP Running Image : 17.9.5.47
Primary Boot Image : 17.9.5.47
Backup Boot Image : 8.10.190.0
The still missing 3802i APs only have the two 9800 image versions in their flash. This shows that they were not even able to join the 5520 WLCs during the outage phase.
The problem has been solved in the meantime.
In all cases I was successful with shut/no shut on the switch port of the missing AP, using the CDP info from our good old Prime.
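The comparison above can be sketched in a few lines of Python. This is only an illustration, assuming the AP's show version output has been saved as text and assuming that a backup image with a major version below 16 (such as 8.10.190.0) is an AireOS image from a 5520, while 16 and above is an IOS-XE image from a 9800. The function name and that threshold are my own choices, not anything Cisco defines.

```python
import re


def backup_image_family(show_version_text):
    """Return (version, family) for the 'Backup Boot Image' line, or (None, None)."""
    # The ' Hash' line is skipped because the colon must directly follow 'Image'.
    match = re.search(
        r"^\s*Backup Boot Image\s*:\s*(\d+(?:\.\d+)+)",
        show_version_text,
        re.MULTILINE,
    )
    if match is None:
        return None, None
    version = match.group(1)
    major = int(version.split(".")[0])
    # Assumption: major versions below 16 are AireOS, 16 and above are IOS-XE.
    family = "AireOS" if major < 16 else "IOS-XE"
    return version, family


joined_5520 = "Primary Boot Image : 17.9.5.47\nBackup Boot Image : 8.10.190.0\n"
never_joined_5520 = "Primary Boot Image : 17.9.5.47\nBackup Boot Image : 17.9.3.50\n"
print(backup_image_family(joined_5520))
print(backup_image_family(never_joined_5520))
```

An AP whose backup image is classified as AireOS most likely joined a 5520 during the outage; one with two IOS-XE images did not.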
The 9800-WLC does not show cdp-info for not joined APs unfortunately.
Kind regards
Wini
07-12-2024 05:50 AM
>....The 9800-WLC does not show cdp-info for not joined APs unfortunately.
Does the CDP info show up when the command show cdp neighbors detail is used on the local switch the APs are connected to? If not, the AP may, for instance, not have booted completely, or this could be a PoE-related problem.
M.
07-14-2024 04:04 AM
debug capwap packets <<- share this
MHm