Solved: Re: Cannot join 2700 series access point to 1830 mobility express controller

IsteniucMihai85603 · ‎07-30-2020

Hi everyone,

This is my first post here so if I'm not posting on the right area maybe an admin move this to correct area. Also keep that in mind that english is not my native language so I might sound strange sometime.

I was trying for the past couple days, to join one AIR-CAP2702E-E-K to a controller based on AIR-AP1832I-E-K9 with Mobility Express Solution installed on it.

I have successfully joined to this controller a second AIR-AP1832I-E-K9 access point (with capwap software installed on it), so I presume that the LAN has no limitations regarding protocol(s) involved, also that the controller has the right type of software that is functioning correctly.

Before trying this, I have done some research and from what I understand, it can be done but with some limitations regarding management of the 2700 series access point. This is no concern for me. Based on compatibility matrix provided by Cisco I have tried several different software versions for the controller and the 2702 access point, but all failed for me on a successful join. Basically after a successful boot of the 2700 access point, (see the attached 2700 complete_boot.txt), the 2700 ends in a communication loop, (same on all software versions tried), with the controller that look like this:

*Jul 26 18:40:26.855: %SYS-6-LOGGINGHOST_STARTSTOP: Logging to host 255.255.255.255 port 0 CLI Request Triggered
*Jul 26 18:40:27.855: %SYS-6-LOGGINGHOST_STARTSTOP: Logging to host 255.255.255.255 port 514 started - CLI initiate d%No matching route to delete
Translating "CISCO-CAPWAP-CONTROLLER"...domain server (192.168.183.222) [OK]

*Jul 26 18:40:48.867: AP has SHA2 MIC certificate - Using SHA2 MIC certificate for DTLS.

*Jul 30 06:49:49.000: %CAPWAP-5-DTLSREQSEND: DTLS connection request sent peer_ip: 192.168.183.11 peer_port: 5246
*Jul 30 06:49:49.255: %CAPWAP-5-DTLSREQSUCC: DTLS connection created sucessfully peer_ip: 192.168.183.11 peer_port: 5246
*Jul 30 06:49:49.255: %CAPWAP-5-SENDJOIN: sending Join Request to 192.168.183.11
*Jul 30 06:49:54.255: %CAPWAP-5-SENDJOIN: sending Join Request to 192.168.183.11
*Jul 30 06:49:54.255: %DTLS-5-ALERT: Received WARNING : Close notify alert from 192.168.183.11
*Jul 30 06:49:54.259: %DTLS-5-SEND_ALERT: Send FATAL : Close notify Alert to 192.168.183.11:5246
*Jul 30 06:49:54.259: AP has SHA2 MIC certificate - Using SHA2 MIC certificate for DTLS.

*Jul 30 06:49:54.000: %CAPWAP-5-DTLSREQSEND: DTLS connection request sent peer_ip: 192.168.183.11 peer_port: 5246
*Jul 30 06:49:54.259: %CAPWAP-5-DTLSREQSUCC: DTLS connection created sucessfully peer_ip: 192.168.183.11 peer_port: 5246
*Jul 30 06:49:54.259: %CAPWAP-5-SENDJOIN: sending Join Request to 192.168.183.11
*Jul 30 06:49:59.259: %CAPWAP-5-SENDJOIN: sending Join Request to 192.168.183.11
*Jul 30 06:49:59.259: %DTLS-5-ALERT: Received WARNING : Close notify alert from 192.168.183.11
*Jul 30 06:49:59.259: %DTLS-5-SEND_ALERT: Send FATAL : Close notify Alert to 192.168.183.11:5246
*Jul 30 06:49:59.259: AP has SHA2 MIC certificate - Using SHA2 MIC certificate for DTLS.

... then repeat

Time stamps of the events are a little strange ... log is from July 30.

Before the message: "Translating "CISCO-CAPWAP-CONTROLLER"...domain server (192.168.183.222) [OK]" date is showed July 26. After that, date is July 30, but time is wrong ... it was not 06:49 but 09:49. So i presume that date/time are taken after discovery of the controller from it ... but with this 3 hours difference. I read on same posts that correct date and time should be on both controller and AP for a successful handshake, but if so, I don' know the next step to correct this. Also I'm not sure if this is to blame. On the logs taken from controller (see attached controller_debug_commands), date and time stamps are displayed correctly.

Questions:

1. Should I remove from time page of the controller DST and GMT settings?

2. Is there a command on capwap side for correctly synchronize time on AP with the controller?

3. If I'm not on the right track can someone look on the logs and tell me what I'm doing wrong? I have no other ideas for now...

One thing i don't understand also ... looks like AP it's making two join request at same time stamp ... is it normal?

Thank's in advance for looking into this.

IsteniucMihai85603 · ‎07-31-2020

Problem Solved!!! (I did an honest mistake I believe)

I performed a reset to factory defaults for controller and after re-configuring it with just the basic settings (anyway I didn't use any advance settings on it) voila!

The 2700 series access point joined immediately!

A simple explanation can be that somewhere down the road of upgrading or downgrading the controller some settings got messed up and they are not on plain sight I'm afraid. I always saved the configuration while upgrading, cause controller was somehow in production and only small disruptions of connectivity where allowed. But weekend has come and I could reset it to factory defaults and do a fresh re-initialization.

I highly recommend on situations like mine if something is wrong after an upgrade to reset the controller to factory defaults and re-program the parameters. After all it is ONLY a 15 minutes job that can save you many hours of searching reading and debugging (Mobility Express is a very good SOHO solution that can be deployed in minutes if you know what are u doing, BTW great job Cisco). For me was anyway a nice journey from what I have learned a lot.

Please read this document: https://www.ciscolive.com/c/dam/r/ciscolive/apjc/docs/2015/pdf/BRKEWN-3011.pdf that helped a lot to eliminate possible causes of malfunction (starting with page 10 from 105 - lucky me!). For reference i will post a piece of log maybe will help others too:

*spamApTask0: Jul 31 18:27:15.461: 00:6b:f1:b1:ec:00 Join Request from 192.168.183.109:36063

*spamApTask0: Jul 31 18:27:15.461: 00:6b:f1:b1:ec:00 Unable to get Ap mode in Join request

*spamApTask0: Jul 31 18:27:15.462: 00:00:00:00:00:00 Deleting AP entry 192.168.183.109:36063 from temporary database.
*spamApTask0: Jul 31 18:27:15.462: 00:6b:f1:b1:ec:00 Dropping request or response packet to AP :192.168.183.109 (36063) by Controller: 192.168.183.11 (5246), message Capwap_wtp_event_response, state Capwap_no_state

*spamApTask0: Jul 31 18:27:15.462: 00:6b:f1:b1:ec:00 Message type Capwap_wtp_event_response is not allowed to send in state Capwap_no_state for AP 192.168.183.109

*spamApTask0: Jul 31 18:27:15.462: 00:6b:f1:b1:ec:00 Join Request failed!

*spamApTask0: Jul 31 18:27:15.462: 00:6b:f1:b1:ec:00 State machine handler: Failed to process msg type = 3 state = 0 from 192.168.183.109:36063

*spamApTask0: Jul 31 18:27:15.462: 00:6b:f1:b1:ec:00 Failed to parse CAPWAP packet from 192.168.183.109:36063

If you look at page 16 of previously mentioned document you will see it's like a carbon copy of the above error. Or vice-versa.

I will try to upgrade the controller to a more recent firmware (along with upgrade of the AP) because right now it"s running an old 8-3-111. I will keep you updated.

Many thanks to all who tried to help me.

Mihai

View solution in original post

Leo Laohoo · ‎07-30-2020

Join failed as dtls connection not found

CSCuu35882

IsteniucMihai85603 · ‎07-30-2020

Hi Leo,

Thank you for the quick reply.

I have read about the bug (I wasn't aware of it), but ... looks like there are not any plans to repair this, and i quote: "No release planned to fix this bug".

Any workaround? I see that it's a known bug since 15.2(2)E1. Prior to that release could this setup work? Should I try to downgrade controller and ap software and try again? (is that even possible?, I didn't check the compatibility matrix yet to see if it is doable...).

Thank you,

Mihai

Rich R · ‎07-30-2020

You didn't actually mention what code your ME is running?
I doubt that CSCuu35882 is actually the problem as that seems to be for very old code which explains why it won't be fixed.
Have you tried resetting your AP to factory default?

------------------------------
Please click Helpful if this post helped you and Select as Solution (drop down menu at top right of this reply) if this answered your query.
------------------------------
TAC recommended codes for AireOS WLC's and TAC recommended codes for 9800 WLC's
Best Practices for AireOS WLC's, Best Practices for 9800 WLC's and Cisco Wireless compatibility matrix
Check your 9800 WLC config with Wireless Config Analyzer using "show tech wireless" output or "config paging disable" then "show run-config" output on AireOS and use Wireless Debug Analyzer to analyze your WLC client debugs
Field Notice: FN63942 APs and WLCs Fail to Create CAPWAP Connections Due to Certificate Expiration
Field Notice: FN72424 Later Versions of WiFi 6 APs Fail to Join WLC - Software Upgrade Required
Field Notice: FN72524 IOS APs stuck in downloading state after 4 Dec 2022 due to Certificate Expired
- Fixed in 8.10.196.0, latest 9800 releases, 8.5.182.12 (8.5.182.13 for 3504) and 8.5.182.109 (IRCM, 8.5.182.111 for 3504)
Field Notice: FN70479 AP Fails to Join or Joins with 1 Radio due to Country Mismatch, RMA needed
How to avoid boot loop due to corrupted image on Wave 2 and Catalyst 11ax Access Points (CSCvx32806)
Field Notice: FN74035 - Wave2 APs DFS May Not Detect Radar After Channel Availability Check Time
Leo's list of bugs affecting 2800/3800/4800/1560 APs
Default AP console baud rate from 17.12.x is 115200 - introduced by CSCwe88390

IsteniucMihai85603 · ‎07-30-2020

Hi mudling,

Thank you for you're interest on subject.

The controller is running now 8.10.113.0 version. According with compatibility matrix looks like for 2900 series i need ap3g2-k9w8-153-3.JK1t (the files ap1g4 delivered for controller and ap2g3 via tftp, are taken from zip archive available for different versions).

Tried also with 8-5-135-0, and i believe also with some 8.3 version (can't remember exactly), and every time I end with the loop mentioned in logs (of course respecting every time the correct version for both controller and ap). I believe it's 2900 to blaime but not really sure ...

Regarding reset for 2900 done by 2 different methods (mode buton pressed for more then 20 seconds and by cli command: clear capwap ap all-config), with no positive result.

Any other ideas I can try?

Thank you

Rich R · ‎07-31-2020

No I'm afraid no other ideas from me.

Maybe you are hitting that old bug but either way I think you're at the point where you need a TAC case and possibly an RMA.

------------------------------
Please click Helpful if this post helped you and Select as Solution (drop down menu at top right of this reply) if this answered your query.
------------------------------
TAC recommended codes for AireOS WLC's and TAC recommended codes for 9800 WLC's
Best Practices for AireOS WLC's, Best Practices for 9800 WLC's and Cisco Wireless compatibility matrix
Check your 9800 WLC config with Wireless Config Analyzer using "show tech wireless" output or "config paging disable" then "show run-config" output on AireOS and use Wireless Debug Analyzer to analyze your WLC client debugs
Field Notice: FN63942 APs and WLCs Fail to Create CAPWAP Connections Due to Certificate Expiration
Field Notice: FN72424 Later Versions of WiFi 6 APs Fail to Join WLC - Software Upgrade Required
Field Notice: FN72524 IOS APs stuck in downloading state after 4 Dec 2022 due to Certificate Expired
- Fixed in 8.10.196.0, latest 9800 releases, 8.5.182.12 (8.5.182.13 for 3504) and 8.5.182.109 (IRCM, 8.5.182.111 for 3504)
Field Notice: FN70479 AP Fails to Join or Joins with 1 Radio due to Country Mismatch, RMA needed
How to avoid boot loop due to corrupted image on Wave 2 and Catalyst 11ax Access Points (CSCvx32806)
Field Notice: FN74035 - Wave2 APs DFS May Not Detect Radar After Channel Availability Check Time
Leo's list of bugs affecting 2800/3800/4800/1560 APs
Default AP console baud rate from 17.12.x is 115200 - introduced by CSCwe88390

Leo Laohoo · ‎07-31-2020

Ok, my turn ...
On the AP, post the complete output to the command "dir".

IsteniucMihai85603 · ‎07-31-2020

Hi Leo,

This is the output of the Dir command as instructed:

Directory of flash:/

2 -rwx 95008 Jan 19 2018 09:22:54 +00:00 lwapp_reap.cfg.bak
3 -rwx 86178 Jul 30 2020 21:27:58 +00:00 event.log
42 drwx 2496 Jul 30 2020 21:27:32 +00:00 ap3g2-k9w8-mx.153-3.JD4
86 -rwx 1950 Feb 7 2017 09:51:12 +00:00 atf_override_config_slot0.txt
6 -rwx 355 Jul 30 2020 21:27:51 +00:00 env_vars
78 drwx 512 Mar 1 1993 00:00:17 +00:00 configs
79 -rwx 64 Jul 30 2020 21:27:51 +00:00 sensord_CSPRNG0
80 -rwx 136147 Feb 16 2017 13:11:47 +00:00 event.r1
7 -rwx 0 Jul 26 2020 18:40:03 +00:00 config.txt
8 -rwx 6168 Jul 26 2020 18:40:11 +00:00 private-multiple-fs
90 -rwx 64 Jul 30 2020 21:27:51 +00:00 sensord_CSPRNG1
9 -rwx 2092 Jul 26 2020 18:39:41 +00:00 config.txt.bak
94 -rwx 1950 Feb 7 2017 09:51:12 +00:00 atf_override_config_slot1.txt
161 -rwx 60762 Jan 19 2018 09:14:31 +00:00 event.capwap

Thank you

Mihai

@Rich R Don't know if really matter but I have installed an autonomous firmware on 2700 machine and was able to boot and setup the ap. I presume then that the machine it's ok.

Mihai

Leo Laohoo · ‎07-31-2020

Dang. The RCV folder is missing.

Rich R · ‎07-31-2020

Explain Leo?

------------------------------
Please click Helpful if this post helped you and Select as Solution (drop down menu at top right of this reply) if this answered your query.
------------------------------
TAC recommended codes for AireOS WLC's and TAC recommended codes for 9800 WLC's
Best Practices for AireOS WLC's, Best Practices for 9800 WLC's and Cisco Wireless compatibility matrix
Check your 9800 WLC config with Wireless Config Analyzer using "show tech wireless" output or "config paging disable" then "show run-config" output on AireOS and use Wireless Debug Analyzer to analyze your WLC client debugs
Field Notice: FN63942 APs and WLCs Fail to Create CAPWAP Connections Due to Certificate Expiration
Field Notice: FN72424 Later Versions of WiFi 6 APs Fail to Join WLC - Software Upgrade Required
Field Notice: FN72524 IOS APs stuck in downloading state after 4 Dec 2022 due to Certificate Expired
- Fixed in 8.10.196.0, latest 9800 releases, 8.5.182.12 (8.5.182.13 for 3504) and 8.5.182.109 (IRCM, 8.5.182.111 for 3504)
Field Notice: FN70479 AP Fails to Join or Joins with 1 Radio due to Country Mismatch, RMA needed
How to avoid boot loop due to corrupted image on Wave 2 and Catalyst 11ax Access Points (CSCvx32806)
Field Notice: FN74035 - Wave2 APs DFS May Not Detect Radar After Channel Availability Check Time
Leo's list of bugs affecting 2800/3800/4800/1560 APs
Default AP console baud rate from 17.12.x is 115200 - introduced by CSCwe88390

IsteniucMihai85603 · ‎07-31-2020

Is RCV (recovery, I believe) necessary?

I understand it's useful only if running image gets somehow corrupted or deleted. Then the ap will boot in recovery mode connect @ controller and download a new working image reboot, join ... and so on.

Remember that my controller is 1830 and I'm under the impression it cannot provide a working image for 2700. The only thing loaded in 1830 is ap1g4 image that doesn't contain any reference about ap2g3 running image of the ap.

Apologizes if I'm wrong,

Mihai

IsteniucMihai85603 · ‎07-31-2020

Hi all,

More testing done. I have downgraded the controller image (and also the corresponding image for AP) from 8.10 to 8.8, then 8.5, then 8.3. I wasn't able to join the AP. For every release logs look quite different but never joined. Bottom line: I failed.

Running at the moment 8-3-112 on controller side.

Deleted from AP flash the following files:

lwapp_reap.cfg.bak

atf_override_config_slot0

config.txt

private-multiple-fs

config.txt.bak

atf_override_config_slot1

deleted nvram

After nvram was deleted got this message (only once at 1st boot):

*Jul 31 14:40:01.000: %CAPWAP-5-DTLSREQSEND: DTLS connection request sent peer_ip: 19 by 2.168.183.11 peer_port: 5246
*Jul 31 14:40:30.999: DTLS_CLIENT_ERROR: ../capwap/base_capwap/dtls/base_capwap_dtls_connection_db.c:2214 Max retransmission count reached for Connection 0xA1F76C8!

As you can see after DTLS request the AP waited 30 seconds until throwing ERROR

After that DTLS request are send from 16 to 16 seconds:

Jul 31 15:01:20.000: %CAPWAP-5-DTLSREQSEND: DTLS connection request sent peer_ip: 192.168.183.11 peer_port: 5246
*Jul 31 15:01:20.251: %CAPWAP-5-DTLSREQSUCC: DTLS connection created sucessfully peer_ip: 192.168.183.11 peer_port: 5246
*Jul 31 15:01:20.251: %CAPWAP-5-SENDJOIN: sending Join Request to 192.168.183.11
*Jul 31 15:01:25.251: %CAPWAP-5-SENDJOIN: sending Join Request to 192.168.183.11
*Jul 31 15:01:25.251: %DTLS-5-ALERT: Received WARNING : Close notify alert from 192.168.183.11
*Jul 31 15:01:25.251: %DTLS-5-SEND_ALERT: Send FATAL : Close notify Alert to 192.168.183.11:5246
*Jul 31 15:01:35.251: AP has SHA2 MIC certificate - Using SHA2 MIC certificate for DTLS.

*Jul 31 15:01:36.000: %CAPWAP-5-DTLSREQSEND: DTLS connection request sent peer_ip: 192.168.183.11 peer_port: 5246
*Jul 31 15:01:36.255: %CAPWAP-5-DTLSREQSUCC: DTLS connection created sucessfully peer_ip: 192.168.183.11 peer_port: 5246
*Jul 31 15:01:36.255: %CAPWAP-5-SENDJOIN: sending Join Request to 192.168.183.11
*Jul 31 15:01:41.255: %CAPWAP-5-SENDJOIN: sending Join Request to 192.168.183.11
*Jul 31 15:01:41.255: %DTLS-5-ALERT: Received WARNING : Close notify alert from 192.168.183.11
*Jul 31 15:01:41.255: %DTLS-5-SEND_ALERT: Send FATAL : Close notify Alert to 192.168.183.11:5246

What I don't understand why there are two join request at 5 seconds apart. Somehow looks like the second one ruin everything cause at the exact time stamp the second request it's issued there is an alert message, followed by a fatal one.

Is this normal behavior?

I will start looking more careful on logs, cause I'm out of ideas. Until now i was trying to do some trial by error debugging but this doesn't work. IMHO there is something "wrong" with the AP, but can't figure what for now.

Any advice will be highly appreciated.

Mihai

Rich R · ‎07-31-2020

Presume you've checked the switch port configuration and looked for errors on the port? Have you tried it in a switch port which works with the other AP? And tried changing the cable?

------------------------------
Please click Helpful if this post helped you and Select as Solution (drop down menu at top right of this reply) if this answered your query.
------------------------------
TAC recommended codes for AireOS WLC's and TAC recommended codes for 9800 WLC's
Best Practices for AireOS WLC's, Best Practices for 9800 WLC's and Cisco Wireless compatibility matrix
Check your 9800 WLC config with Wireless Config Analyzer using "show tech wireless" output or "config paging disable" then "show run-config" output on AireOS and use Wireless Debug Analyzer to analyze your WLC client debugs
Field Notice: FN63942 APs and WLCs Fail to Create CAPWAP Connections Due to Certificate Expiration
Field Notice: FN72424 Later Versions of WiFi 6 APs Fail to Join WLC - Software Upgrade Required
Field Notice: FN72524 IOS APs stuck in downloading state after 4 Dec 2022 due to Certificate Expired
- Fixed in 8.10.196.0, latest 9800 releases, 8.5.182.12 (8.5.182.13 for 3504) and 8.5.182.109 (IRCM, 8.5.182.111 for 3504)
Field Notice: FN70479 AP Fails to Join or Joins with 1 Radio due to Country Mismatch, RMA needed
How to avoid boot loop due to corrupted image on Wave 2 and Catalyst 11ax Access Points (CSCvx32806)
Field Notice: FN74035 - Wave2 APs DFS May Not Detect Radar After Channel Availability Check Time
Leo's list of bugs affecting 2800/3800/4800/1560 APs
Default AP console baud rate from 17.12.x is 115200 - introduced by CSCwe88390

IsteniucMihai85603 · ‎07-31-2020

I was thinking to try that, but forgot about it during my tests ... Glad you reminded me.

I will use a unmanaged switch and will power both controller and ap from poe power injector.

I will get back with info.

Mihai

IsteniucMihai85603 · ‎07-31-2020

Test done. Nothing new unfortunately.

Mihai