
Client connection Issue with 702W AP on 8.5.131.0

charlietweet
Level 1

We are seeing clients with random and unpredictable connection issues in housing facilities where we have Cisco 702Ws.  We have debugs and a TAC case open, but determining the root cause that way has been slow.  We are running 8540s and WiSM2s on Cisco controller code 8.5.131.0.  This looks like a bug, but we have not found a matching bug from Cisco, and TAC has not been able to provide any analysis.  Here is the "debug mac" of one client seeing the issue (the specific debugs immediately after successful authentication completes and the problem begins):

*apfReceiveTask: Sep 07 14:39:02.898: 0c:54:15:49:34:9b 0.0.0.0 DHCP_REQD (7) DHCP Policy timeout. Number of DHCP Discover 0, DHCP Request 0 from client
*apfReceiveTask: Sep 07 14:39:02.898: 0c:54:15:49:34:9b Interface Group was NULL.Number of DHCP Discovery 0 from client
*apfReceiveTask: Sep 07 14:39:02.898: 0c:54:15:49:34:9b 0.0.0.0 DHCP_REQD (7) Pem timed out, Try to delete client in 10 secs.
*apfReceiveTask: Sep 07 14:39:02.898: 0c:54:15:49:34:9b Scheduling deletion of Mobile Station: (callerId: 12) in 10 seconds
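
For anyone collecting a lot of these debugs, the timeout line above is the useful one: it shows the controller counted zero DHCP Discovers and zero DHCP Requests from the client before the policy timer expired. Below is a minimal sketch, assuming the debug output has been saved as plain text, that counts such "silent" DHCP timeouts per client MAC. The function name `silent_dhcp_timeouts` is hypothetical, and the regex is based only on the single line format shown in this post, so other code versions may word the message differently.

```python
import re
from collections import Counter

# Pattern for the "DHCP Policy timeout" debug line as it appears in this thread.
# \s+ between fields also tolerates the line being wrapped after the IP address.
TIMEOUT_RE = re.compile(
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s+\S+\s+"
    r"DHCP_REQD \(7\) DHCP Policy timeout\. "
    r"Number of DHCP Discover (?P<discover>\d+), DHCP Request (?P<request>\d+)",
    re.IGNORECASE,
)


def silent_dhcp_timeouts(log_text):
    """Count, per client MAC, timeouts where no Discover and no Request were seen."""
    counts = Counter()
    for match in TIMEOUT_RE.finditer(log_text):
        if int(match["discover"]) == 0 and int(match["request"]) == 0:
            counts[match["mac"].lower()] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "*apfReceiveTask: Sep 07 14:39:02.898: 0c:54:15:49:34:9b 0.0.0.0 "
        "DHCP_REQD (7) DHCP Policy timeout. Number of DHCP Discover 0, "
        "DHCP Request 0 from client\n"
    )
    for mac, hits in silent_dhcp_timeouts(sample).items():
        print(mac, hits)
```

Running this over debugs from several affected clients would at least show whether every report follows the same pattern (nothing from the client reaching the controller), which may help narrow the problem to the AP side.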

 

 


See whether you see any symptoms of AP radio resets 

https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvi42919/

 

HTH

Rasika

I checked the AP logs of a 702W that has clients who have had this issue on the AP and those symptoms are not found.  It must be something else.

The AP doesn't seem to be receiving any DHCP request.

If you run a capture at a PC having the issue do you see DHCP requests going out?
Does this happen on wireless clients, wired clients or both?
** Please rate helpful posts **

CCIE #58023

Thanks for confirming the debug output.  That is what we were thinking on our end.  To answer your questions, this is only happening with wireless clients associated with 702Ws.  The issue is very rare and random though.  We have not been able to duplicate this in a lab with a few client devices.  However, we are getting dozens of users reporting the issue while thousands of others are working just fine.  Therefore, it is nearly impossible to capture the issue via a live packet capture.  We have also confirmed we are not even close to running out of addresses on our DHCP scopes.  And for every user report of this issue, rebooting the 702W resolves it.  Any other analysis or suggestions are appreciated.

If the user turns their wifi off and on, does it resolve the issue (without rebooting the 702W)? This will make the device redo DHCP.

Also, when a user is affected, can other devices connect to the 702W via wireless?

I don't see any way to determine what's going on other than a simultaneous OTA capture (unless it is 802.1x) + SPAN capture + AP radio debugs, which are hard to get in your scenario.

Also try to narrow down if the affected device type is always the same.
** Please rate helpful posts **

CCIE #58023

If the user turns their wifi off and on, does it resolve the issue?

No

 

Also, when a user is affected, can other devices connect to the 702W via wireless?

Yes

 

Also try to narrow down if the affected device type is always the same.

No, we are seeing this with all client types.

 

Keep in mind, this issue is being reported multiple times a day, but we have about 1100 x 702w APs throughout several buildings.  Therefore, this issue is hard to catch or "capture".  We cannot predict when it will happen.  We are guessing this is a bug with the 702w on the new code, but nobody from TAC or this community has provided us with any bug that has these symptoms.  If you have any other analysis or suggestions, please share.  Thx

 

 

 

At our university we have around 5000 702W access points in numerous residence hall buildings.  We have been experiencing what appears to be the same type of issue.  We are getting numerous calls a day from students saying "I can't connect to wifi" or "I keep getting disconnected".  If we reboot the AP the problem seems to go away.  Students say they can go on campus or to other students' rooms and everything works fine, then go back to their own room and start experiencing issues again.  As for device types, it appears to be all types of devices (smart phones and laptops).  We are running code version 8.5.131.6 (due to an HA bug in 8.5.131) on 8510 controllers.  We have not done a debug on a client connection at this point in time.  It was interesting to find this thread: we have had enough students calling in with the same type of connectivity issues that I was curious whether anyone else was experiencing something similar.

Mike, you are not alone.  Everything you wrote has also been taking place at our University since we upgraded to 8.5.131.x.  However, we only have about 1000 x 702w.  I was 98% sure this issue is a bug.  After reading your post, I'm now 99% sure...ha ha.  We actually just had a webex with TAC and we are waiting on his analysis to confirm what code we need to upgrade or downgrade to.  I will post the bug-id after TAC gives it to me (hopefully soon).  If I were you, I would start planning an upgrade or downgrade unless you want to spend all your time rebooting APs manually.

Thanks for your reply.  I'm very curious what Cisco identifies the issue as.  

We ran 8.5.131 and 8.5.131.6 for quite a while, but didn't really have students in the dorms until the week of Aug 13th.  We seemed to have a good run for a while, then the calls started to come in.  Since the issues were so spread apart and random, we just thought we were seeing local issues or device issues, but they kept coming in with the same description of the problem (can't connect or keep getting disconnected).  I'll wait a while to see if you have any luck with a solution from Cisco TAC; otherwise I will begin some more detailed information gathering and give them a call myself.

Similar symptoms here; another University with HA 8510s running 8.5.131 and a few hundred 702Ws in specific student residence locations.

 

Only a handful of reports so far, but all exhibiting the same symptoms as reported above: the WLC shows the device as unable to connect due to DHCP failure. A typical report is that a student can connect to the wireless fine in their room with their laptop but is unable to connect with their phone. No issues connecting elsewhere on campus (2702s for the most part). Reports only started after the students returned after the summer, so it is impossible to say when the fault(s) may have started.

 

As documented above, resetting the 702W clears the problem. I've just run through pre-emptive resets on a number of buildings to try to alleviate the problem, and I'm planning to upgrade from 8.5.131 to 8.5.135 early next week. The release notes don't mention anything matching this problem, but if nothing else that will reset all the 702Ws.

 

I'm assuming a bug that only appears after a certain length of uptime, but that's just a guess.

To Mike, Alan and anyone else running Cisco 702w's on 8.5.131.x ...

 

Can you guys please respond to this to let us know what was the last stable code you were on prior to 8.5.131.x (when you had campus in session without any of these 702w issues) ???  

 

Thanks for adding to the conversation.  We have sent many debugs and logs from controllers and AP's to TAC, but still have not received any useful response.  I'm still 99% sure this is a bug and Cisco will eventually suggest a newer code version.  However, since our 702w's are only located in student housing which is only 10% of our campus, we are considering separating out all of the housing AP's to a couple WiSM2 controllers we still have online and downgrading to a more stable code while keeping all the other controllers without 702w's on 8.5.131.0.

 

To repeat:

Can you guys please respond to this to let us know what was the last stable code you were on prior to 8.5.131.x (when you had campus in session without any of these 702w issues) ???  

Hi Charlie,

 

We ran our 8510s on 8.3.140 from February to June this year without any reported issues with the 702Ws; then a straight upgrade from 8.3.140 to 8.5.131.

 

However I'm reluctant to roll back to 8.3.x during term due to the impact of the double reboot required on APs between these software versions which affects the 2702s - these make up the bulk of our wireless network:

 

The image format of Cisco Aironet 1700, 2700, 3700, and IW3702 APs has been changed from ap3g2 to c3700. Therefore, if you are upgrading to Release 8.5 or a later release from Release 8.3 or an earlier release, these APs will download the image twice and reboot twice

(https://www.cisco.com/c/en/us/td/docs/wireless/controller/release/notes/crn85mr3.html#concept_A19A968740184C429334003B89BE6F1E)

 

Alan

We were previously running 8.2.166 before upgrading to 8.5.131, with no reports of the issue we are currently experiencing.

Hi all,

 

Have you experienced issues with any other type of AP, like the 2700, 2600 or 3500? I have not moved yet from 8.3.120.0, which is relatively stable, but we need 8.5 for the 800 AP series. I will follow this post.
