
CAPWAP State: DTLS Teardown

Jeza-925
Level 1

Hi all,

I'm facing a strange problem at one of our sites: the APs are constantly disconnecting from the WLC. They stay associated for some time, then for some reason they disconnect and enter a DTLS loop. They are stuck in the loop for a random amount of time (sometimes 2, 5, or 10 hours; there is no pattern), then they join the WLC and, after some time (again, no pattern), disconnect. All other sites work like a charm.

The WLC is running the 8.5.140.0 image and all APs are 1852s. Logs from the WLC and an AP are attached; I ran a debug on the AP as well. A reboot and a factory reset didn't help.

I think the problem is with the WAN link: there is some delay that is causing this issue. Are there any timers that can be modified so that the CAPWAP discovery process lasts longer?


marce1000
VIP

 

 - When using WAN connections, note that link latency for a CAPWAP connection may not exceed 300 ms. When FlexConnect is used, however, you can adjust the latency settings:

   https://www.cisco.com/c/en/us/td/docs/wireless/controller/8-5/config-guide/b_cg85/managing_aps.html?bookSearch=true#ID4174

 M.



-- ' 'Good body every evening' ' this sentence was once spotted on a logo at the entrance of a Weight Watchers Club !

patoberli
VIP Alumni
Oct  2 07:28:26 kernel: [*10/02/2020 07:28:26.0000] DTLS connection created sucessfully local_ip: 10.130.X.X local_port: 5264 peer_ip: X.X.X.X peer_port: 5246
Oct  2 07:28:51 kernel: [*10/02/2020 07:28:51.0022] Discarding msg CAPWAP_WTP_EVENT_REQUEST(type 9) in CAPWAP state: DTLS Setup(3).
Oct  2 07:28:51 kernel: [*10/02/2020 07:28:51.0022] Discarding msg CAPWAP_WTP_EVENT_REQUEST(type 9) in CAPWAP state: DTLS Setup(3).
Oct  2 07:29:23 kernel: [*10/02/2020 07:29:23.0522] Wait DTLS timer has expired
Oct  2 07:29:23 kernel: [*10/02/2020 07:29:23.0522] Dtls session establishment failed
Oct  2 07:29:23 kernel: [*10/02/2020 07:29:23.0522] dtls_disconnect: ERROR shutting down dtls connection ...

This sounds like the connection was overloaded or the latency was too high, as @marce1000 mentioned.

Also make sure NTP is set up correctly on the remote site and the WLC; the times in the two logs don't fully match.
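
To see how long each failed handshake lasts across a longer capture, the AP log can be scanned for the two messages shown above. This is only a minimal sketch: it assumes the bracketed timestamp format seen in this excerpt (month/day/year), and the function name `dtls_timeouts` is made up for illustration.

```python
import re
from datetime import datetime

# Matches the bracketed timestamp the AP prints, e.g. [*10/02/2020 07:28:26.0000]
STAMP = re.compile(r"\[\*(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\.\d+\]")

def dtls_timeouts(lines):
    """Seconds from each 'DTLS connection created' to the next timer expiry."""
    started = None
    durations = []
    for line in lines:
        match = STAMP.search(line)
        if not match:
            continue  # skip lines without an AP timestamp
        ts = datetime.strptime(match.group(1), "%m/%d/%Y %H:%M:%S")
        if "DTLS connection created" in line:
            started = ts
        elif "Wait DTLS timer has expired" in line and started is not None:
            durations.append((ts - started).total_seconds())
            started = None
    return durations

# Lines copied from the log excerpt in this post (addresses already masked).
sample = [
    "Oct  2 07:28:26 kernel: [*10/02/2020 07:28:26.0000] DTLS connection created sucessfully local_ip: 10.130.X.X local_port: 5264 peer_ip: X.X.X.X peer_port: 5246",
    "Oct  2 07:28:51 kernel: [*10/02/2020 07:28:51.0022] Discarding msg CAPWAP_WTP_EVENT_REQUEST(type 9) in CAPWAP state: DTLS Setup(3).",
    "Oct  2 07:29:23 kernel: [*10/02/2020 07:29:23.0522] Wait DTLS timer has expired",
]

print(dtls_timeouts(sample))
```

In this excerpt the AP waited 57 seconds between creating the DTLS connection and giving up, which suggests handshake packets were being lost or delayed rather than the WLC rejecting the AP outright.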

Jeza-925
Level 1

Hi @marce1000 , @patoberli 

Thank you for the prompt reply. All APs are in FlexConnect mode, and I enabled the link latency option on the 3 APs that are currently connected (3 are stuck in the loop). This is the output for them:

 

AP Link Latency.................................. Enabled
Current Delay................................... 4 ms
Maximum Delay................................... 11 ms
Minimum Delay................................... 10 ms
Last updated (based on AP Up Time).............. 15 days, 14 h 46 m 56 s

AP Link Latency.................................. Enabled
Current Delay................................... 4 ms
Maximum Delay................................... 11 ms
Minimum Delay................................... 10 ms
Last updated (based on AP Up Time).............. 15 days, 14 h 46 m 55 s

AP Link Latency.................................. Enabled
Current Delay................................... 0 ms
Maximum Delay................................... 13 ms
Minimum Delay................................... 0 ms
Last updated (based on AP Up Time).............. 15 days, 14 h 48 m 06 s

 

The results are nowhere near 300 ms. Here is the ping from an AP that is joined:

Sending 5, 100-byte ICMP Echos to X.X.X.X, timeout is 2 seconds
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 8.591/9.253/9.841 ms

 

And from an AP that is trying to join:

Sending 5, 100-byte ICMP Echos to X.X.X.X, timeout is 2 seconds
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 9.153/9.465/9.934 ms

If latency is not the problem, do you have any idea how to solve this? All APs are connected to the same switch.

Kind regards,

Jeza
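
A five-packet ping only samples a few seconds of the link, so it can look clean even when the path drops traffic in bursts. One way to check for that is to collect probe results over a longer period and summarize loss and outage length. The sketch below is an assumption-laden illustration: the function name `summarize_probes` and the sample values are hypothetical, not measurements from this site.

```python
def summarize_probes(results):
    """Summarize a list of RTTs in ms, where None marks a lost probe."""
    total = len(results)
    lost = sum(1 for r in results if r is None)

    # Track the longest run of consecutive lost probes.
    longest = current = 0
    for r in results:
        current = current + 1 if r is None else 0
        longest = max(longest, current)

    rtts = [r for r in results if r is not None]
    return {
        "loss_pct": 100.0 * lost / total if total else 0.0,
        "longest_outage": longest,
        "max_rtt_ms": max(rtts) if rtts else None,
    }

# Hypothetical probe results, for illustration only.
probes = [9.2, 9.4, None, None, None, 9.1, 250.0, 9.3, None, 9.5]
print(summarize_probes(probes))
```

A low average RTT combined with a non-zero `longest_outage` would point at intermittent loss on the WAN path, which the link latency counters and a short ping would not show.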

 


 - Reconnect one of the problematic APs with a cold start, and check whether anything abnormal shows up in the switch logs after it comes up.

 M.




Jeza-925
Level 1

Hi @marce1000 ,

Thank you again for your reply. At this site we have an HP switch. I turned debugging on but couldn't see anything except that the port went offline/online and a DHCP address was assigned; there are also no errors or packet drops on the interface. At the moment only one AP is still trying to connect. Here is the info for one of the APs, 8 hours to join...

AP Up Time....................................... 19 days, 12 h 23 m 05 s
AP LWAPP Up Time................................. 0 days, 04 h 39 m 53 s
Join Date and Time............................... Tue Oct 6 06:02:07 2020
Join Taken Time.................................. 0 days, 08 h 33 m 28 s

I opened a ticket with the SP to check whether everything is OK with the link.
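
For the ticket it can help to give the provider a concrete time window. The sketch below works backwards from the join statistics above, assuming that "Join Taken Time" is the interval that ended at "Join Date and Time" (that interpretation is an assumption); the helper `parse_duration` is made up for illustration.

```python
import re
from datetime import datetime, timedelta

def parse_duration(text):
    """Parse a WLC-style duration such as '0 days, 08 h 33 m 28 s'."""
    match = re.search(r"(\d+) days, (\d+) h (\d+) m (\d+) s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    days, hours, minutes, seconds = map(int, match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)

# Values copied from the AP statistics in this post.
joined = datetime.strptime("Tue Oct 6 06:02:07 2020", "%a %b %d %H:%M:%S %Y")
join_taken = parse_duration("0 days, 08 h 33 m 28 s")

# Assumed start of the join attempts: join time minus the time it took.
join_started = joined - join_taken
print(join_started)
```

Under that assumption the AP started trying to join at about 21:28:39 on Oct 5 (WLC clock), so the provider could be asked to look at the link between then and 06:02 the next morning.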

Hi Jeza,

Did you ever sort out this problem? I know it's an old post, but I am having the same problem.

Chris.

Hi Chris,

This is a remote location, and the problem was the internet link, which was terrible at the time. I had to contact the ISP to fix the issue.
