08-01-2008 06:28 AM - edited 07-03-2021 04:16 PM
Hello NetPro gurus!
I am currently troubleshooting an issue we are having with our Guest (completely open) WLAN in which it seems certain clients are losing their layer 3 connectivity while staying 'connected' to the LWAP(s). These certain clients lose their layer 3 configuration and are not able to access internal or external resources until they disable/enable their wireless connection.
I specifically have this problem, and it's only on the Guest WLAN that this occurs. I am using a Lenovo T61 with an Intel 4965AG internal wireless chipset. I know this chipset is relatively new and I have tried multiple drivers, all with the same result. Not all machines have this issue. MacPro laptops do not seem to have this issue nor do machines with Intel Pro 2200BG chipsets. I tested with a Netgear PCMCIA card and did not have this issue either.
Here's some more background information:
We have 5 WLCs (2 WiSM blades each in a Catalyst 6509 and 1 WLC 4402) and 7 WLANs. The 4 WiSM controllers have each WLAN configured on it, and the 4402 WLC only knows about one Guest wireless network (it is a completely open WLAN i.e. no security). This is the particular network we see this issue with. We have approximately 200 LWAP 1131AG's (47 in one building, 154 in another) all broadcasting the Guest SSID. Our server core Catalyst 6509's each have separate VLANs (with Port-channels in them) for the WiSM blades. The Guest WLC 4402 is in the DMZ in its own VLAN. Each WLC is providing DHCP for each of the WLANs.
What seems to be occurring is that, during our troubleshooting, I lose all layer 3 connectivity. I stay "connected" to the AP and signal strength is excellent; however, my continuous pings to the Guest WLC (192.168.0.x network) time out and I cannot get out to the Web. I noticed the following error on my laptop (Lenovo T61 w/ an Intel 4965AG wireless chipset) in the system event viewer:
Description:
The system detected that network adapter Intel(R)...Link 4965AG - Packet Scheduler Miniport was disconnected from the network, and the adapter's network configuration has been released. If the network adapter was not disconnected, this may indicate that it has malfunctioned. Please contact your vendor for updated drivers.
This occurred at the exact time I lost my layer 3 connectivity. A co-worker and I did some research and determined that this was exactly halfway through my 1-hour DHCP lease from the Guest WLC (the 4402). The DHCP leases are set to expire after 1 hour because we have a lot of clients on the Guest WLAN that come and go, and we only have one network configured for the Guest WLAN w/ 229 available IP's to be handed out. We were wondering if it was an issue with the DHCP renewal process from the WLC. This does not occur on the Internal WLANs configured with strict authentication security.
We tested with a few machines, such as an Apple laptop, an older laptop with an Intel Pro 2200BG chipset, and even my same laptop with a Netgear PCMCIA WiFi card, none of which exhibited this problem. Connectivity at layer 3 was not interrupted. I have tried multiple drivers as well, all with the same result.
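For anyone following along, the "halfway through the lease" timing can be sketched in a few lines of Python. This is only an illustration and assumes the client uses the RFC 2131 default timers (T1 at 50% of the lease, T2 at 87.5%); a server can override these with DHCP options 58 and 59, and the lease start time below is made up.

```python
from datetime import datetime, timedelta

def renewal_times(lease_start, lease_seconds):
    """Return (T1, T2, expiry) assuming the RFC 2131 default timers."""
    lease = timedelta(seconds=lease_seconds)
    t1 = lease_start + lease * 0.5    # client starts unicast renewal (RENEWING)
    t2 = lease_start + lease * 0.875  # client falls back to broadcast (REBINDING)
    return t1, t2, lease_start + lease

# Hypothetical lease start; the 3600 s lease matches the 1-hour guest lease.
start = datetime(2008, 8, 1, 9, 0, 0)
t1, t2, expiry = renewal_times(start, 3600)
print("renew at ", t1.time())
print("rebind at", t2.time())
print("expires  ", expiry.time())
```

With a 1-hour lease the first renewal attempt lands 30 minutes in, which lines up with when the event log entry appeared.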
Now, we are not sure if it is an issue with the WLC itself or a chipset issue. The Intel 4965AG chipset is rather new but we have a lot of WLAN clients with this chipset on the network. That also doesn't explain why this issue ONLY occurs on the Guest WLAN.
We were thinking of placing a small DHCP server on the network to take over DHCP responsibilities from the Guest WLC to see if that makes a difference. Another idea we had was to expand the DHCP scope to a /23, i.e. two contiguous /24 (Class C-sized) networks (192.168.0.0 - 192.168.1.255), which gives us 510 hosts and lets us extend the DHCP lease time.
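As a quick sanity check on the /23 math, here is a small Python sketch using the standard ipaddress module. It assumes the guest range is 192.168.0.0/23, in line with the 192.168.0.x guest network mentioned earlier; adjust it to the real addressing.

```python
import ipaddress

# Assumed guest range (a /23 covering 192.168.0.0 through 192.168.1.255).
net = ipaddress.ip_network("192.168.0.0/23")

# Subtract the network and broadcast addresses to get usable hosts.
usable = net.num_addresses - 2
halves = [str(s) for s in net.subnets(new_prefix=24)]

print("network:  ", net.network_address)
print("broadcast:", net.broadcast_address)
print("usable hosts:", usable)
print("made up of:", halves)
```

Note that the usable count is before any addresses are set aside for the controller interface, gateway, or exclusions, so the real pool will be a bit smaller.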
I plan on doing further testing today by placing a few more machines on the Guest WLAN with multiple chipsets and taking note of which ones exhibit the problem.
Any and all help is MUCH appreciated. Thanks!
Shane
08-04-2008 12:40 PM
1. A DHCP client will always try to renew its lease after half of the lease time has expired (the same as running ipconfig /renew if you don't want to wait half an hour).
2. The client tries to renew its lease (192.168.0.164)
Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP siaddr: 0.0.0.0, giaddr: 192.168.0.164
3. The DHCP server replies and tells the client to keep its current address.
Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP ciaddr: 192.168.0.164, yiaddr: 192.168.0.164
4. The response is sent to the client via the EoIP tunnel
Mon Aug 4 13:17:46 2008: 00:13:02:24:ca:77 DHCP sending packet in EoIP tunnel to foreign 10.50.111.11 (len 346)
Now the strange part:
Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 Clearing Address 192.168.0.164 on mobile
Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 192.168.0.164 RUN (20) Change state to DHCP_REQD (7) last state RUN (20)
For some reason, the controller decides to change the client state from RUN to DHCP_REQD. From that point on, the client's connections are dropped until a full DHCP cycle is done (e.g. ipconfig /release followed by ipconfig /renew).
Could you post the debug messages for a working wireless chipset?
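If the debug output is long, something like the following Python sketch can pick out the RUN to DHCP_REQD transitions automatically. It is only a rough filter: the regular expression is my assumption based on the two debug lines quoted above, and other controller releases may format these lines differently.

```python
import re

# Pattern inferred from the quoted debug lines; treat it as an assumption.
STATE_CHANGE = re.compile(
    r"^(?P<ts>\w{3} \w{3}\s+\d+ \d{2}:\d{2}:\d{2} \d{4}): "
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) "
    r"(?P<ip>\d+\.\d+\.\d+\.\d+) "
    r"(?P<old>\w+) \(\d+\) Change state to (?P<new>\w+) \(\d+\)"
)

def find_run_drops(lines):
    """Return (timestamp, mac, ip) for each RUN -> DHCP_REQD transition."""
    drops = []
    for line in lines:
        m = STATE_CHANGE.match(line.strip())
        if m and m.group("old") == "RUN" and m.group("new") == "DHCP_REQD":
            drops.append((m.group("ts"), m.group("mac"), m.group("ip")))
    return drops

# The two lines quoted in this post, used as sample input.
debug_lines = [
    "Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 Clearing Address 192.168.0.164 on mobile",
    "Mon Aug 4 13:17:55 2008: 00:13:02:24:ca:77 192.168.0.164 RUN (20) Change state to DHCP_REQD (7) last state RUN (20)",
]
drops = find_run_drops(debug_lines)
for ts, mac, ip in drops:
    print(f"{ts}: {mac} ({ip}) dropped from RUN to DHCP_REQD")
```

Comparing the timestamps it reports against the lease start time should show whether every drop falls on the half-lease renewal.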
08-05-2008 06:55 AM
08-05-2008 08:10 AM
08-06-2008 06:20 AM
What's interesting is that if I force a /release and then a /renew on the client just before the DHCP renewal process occurs on its own, the problem doesn't seem to appear. The client is leased the same IP for another hour...
08-06-2008 08:13 AM
I see a difference between my capture and yours. When an IP renewal happens on its own (naturally, at half the lease time), the PEM mechanism is not triggered; DHCP is seen as normal traffic. When I do an ipconfig /renew, the PEM is triggered and changes the connection state from RUN to DHCP_REQD, so all traffic is blocked until a new DHCP query is done.
Does it work again if you do "ipconfig /renew" twice? Does the PEM state change from DHCP_REQD to RUN?
08-12-2008 06:48 AM
You might try entering this command on the wireless lan controller. I had DHCP issues on our open network and this fixed it.
config dhcp proxy disable
08-12-2008 01:07 PM
If I disable DHCP proxying, then we can't use the internal WLC DHCP capability. I have plans to implement an external DHCP server tomorrow evening, which means I will need to disable DHCP proxying anyway.
08-15-2008 11:33 AM
I am seeing similar things on my guest network as well.
08-16-2008 02:05 AM
I am seeing similar things on my guest network as well. My anchor controller is in the dmz, port 1 is on the private network, and port 2 is in the DMZ, is trunked, and I am using a dynamic interface for the DMZ subnet. The DHCP server is the DMZ 6509 with the SVI for the guest network.
It appears that when I am using web auth (passthrough), my clients are being put back into a DHCP_REQD state, which leaves them stuck in the web auth required state.
I don't know why the clients are losing their IP address to begin with - the WCS only captures "Client Moved to DHCP Required State. 8...
The wlan to which client is connecting does not require 802 1x authentication. 7?.0
08/15/2008 15:35:49 EDT INFO 10.2.254.150 Client does not have an IP address yet. 7?.0
08/15/2008 15:35:49 EDT INFO 10.2.254.150 Client L3 authentication is required 7?.0
08/15/2008 15:35:49 EDT INFO 10.2.254.150 Client Moved to DHCP Required State. 7?.0
08/15/2008 15:35:49 EDT INFO 10.2.254.150 Client Moved to DHCP Required State. 7?.
08/15/2008 15:35:50 EDT INFO 10.2.254.150 DHCP successful. ;...
08/15/2008 15:35:50 EDT ERROR 10.2.254.150 Client got an IP address successfully and the WLAN requires Web Auth or Web Auth pass through. ;... "
I'm not sure whether some of these messages are just INFO about the L3 auth; this WLAN is configured for no security, just web auth passthrough.
The anchor controller is running 4.2.99 while my inside controllers are 4.2.130.
I had a previous problem when the anchor was on 4.2.130 where DHCP messages would get dropped by the inside controller... I may play around and put it back to 4.2.130 just to see.
08-16-2008 09:23 AM
Brian,
If you configure a DHCP scope on the DMZ controller, do you still see errors and do the clients still hang? I have multiple clients with DMZ controllers running various code and a couple running 4.2.130 with no issues. I do have the DHCP scope on the DMZ WLC though.
08-16-2008 10:21 AM
I've never tested long enough with the controller as the server to see this problem.
I have tried it before and it works as the DHCP server, but I've never tried staying connected for the 15 or 20 minutes it would take to get dropped.
A few people who spoke on wireless at Cisco Networkers this year said not to use the WLC as a DHCP server because it is not an enterprise-type DHCP server, so I've been afraid to use the WLC as the DHCP server.
How big is your clients' guest user base with the WLC as the DHCP server?
08-16-2008 10:52 AM
Guest is the only time I would use the WLC as a DHCP server. We have implemented this in various environments, but the biggest is probably in hospitals. The scopes we have run there range from /24 to /23. Internal wireless has always been done on an enterprise DHCP server. I have also implemented this in a retail mall area that has public wifi on 8 floors. No issue with the WLC being a DHCP server.
08-16-2008 10:53 AM
Ok I was using a /23 for now - which should cover my guest needs.
I'll give that a shot and see if that's where the problem lies.
08-17-2008 08:52 AM
Unfortunately, no luck.
I moved the DHCP server to the anchor controller, and enabled DHCP proxy (which works to provide my clients an IP address) but the disconnects continue.
More debugging this weekend has provided me some additional info to consider.
It seems the problem happens on some kind of timer because 2 test machines I'm working with go into the web_auth required state at exactly the same times when they've both been connected at the same time.
The controller gives messages about Mobility role update requests.
"Mobility role update request. from Anchor to Handoff Peer = 10.1.254.150, Old Anchor = 10.2.254.150, New Anchor = 0.0.0.0"
Right at that time the client no longer passes traffic as web auth is required.
If I shut web auth off, the clients are OK, although I'm not positive the problem is gone. It may just be that the short time spent changing states doesn't bother the client when web_auth isn't required.
I've upgraded all of the controllers to 5.1.151, and the exact same symptoms are still with me.
I don't believe it's DHCP related, as this occurs with DHCP on the WLC or on the 6509 in the DMZ.
I believe it has something to do with mobility and/or web auth.
08-17-2008 09:13 AM
Here is some monitoring output with web_auth disabled. This is why I believe the problem still exists; it is just not as noticeable with web_auth gone because the client gets (or keeps) its IP address. With web_auth w/passthrough enabled, the client has to refresh a web browser to get traffic passing again.
08/17/2008 13:05:13 EDT INFO 10.2.254.150 Mobility role update request. from Export Anchor to Handoff Peer = 10.1.254.150, Old Anchor = 10.2.254.150, New Anchor = 0.0.0.0
08/17/2008 13:05:13 EDT INFO 10.2.254.150 Client Moved to DHCP Required State.
08/17/2008 13:05:16 EDT INFO 10.2.254.150 Mobility role update request. from Unassociated to Export Anchor Peer = 0.0.0.0, Old Anchor = 0.0.0.0, New Anchor = 10.2.254.150
08/17/2008 13:05:16 EDT INFO 10.2.254.150 The wlan to which client is connecting does not require 802 1x authentication.
08/17/2008 13:05:16 EDT INFO 10.2.254.150 Client does not have an IP address yet.
08/17/2008 13:05:16 EDT INFO 10.2.254.150 Client Moved to DHCP Required State.
08/17/2008 13:05:16 EDT INFO 10.2.254.150 Mobility role changed. State Update from Mobility-Incomplete to Mobility-Complete, mobility role=ExpAnchor
08/17/2008 13:05:16 EDT INFO 10.2.254.150 Client Moved to DHCP Required State.
08/17/2008 13:05:17 EDT INFO 10.2.254.150 DHCP successful.
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Client has got IP address, no L3 authentication required.
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Client IP address is assigned.
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. transmitting DHCP REQUEST (3)
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. op: BOOTREQUEST, htype: Ethernet, hlen: 6, hops: 1
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. xid: 0x5faa5e58 (1605000792), secs: 0, flags: 0
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. chaddr: 00:1a:73:9d:96:cb
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. ciaddr: 10.99.0.10, yiaddr: 0.0.0.0
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. siaddr: 0.0.0.0, giaddr: 10.99.0.1
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Received DHCP ACK ,dhcp server set.
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. transmitting DHCP ACK (5)
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. op: BOOTREPLY, htype: Ethernet, hlen: 6, hops: 0
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. xid: 0x5faa5e58 (1605000792), secs: 0, flags: 0
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. chaddr: 00:1a:73:9d:96:cb
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. ciaddr: 10.99.0.10, yiaddr: 10.99.0.10
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. siaddr: 0.0.0.0, giaddr: 0.0.0.0
08/17/2008 13:05:17 EDT INFO 10.2.254.150 Dhcp Information. server id: 4.4.4.4 rcvd server id: 10.2.254.150
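To put a number on how long the client sits in the DHCP Required state, here is a rough Python sketch that parses events like the ones above. The field layout (date, time, timezone, severity, controller IP, message) is my reading of this WCS output, so treat the parser as an assumption; the sample lines are copied from the log above.

```python
from datetime import datetime

def parse_event(line):
    """Split one event line into (timestamp, severity, controller, message)."""
    ts = datetime.strptime(line[:19], "%m/%d/%Y %H:%M:%S")
    # Remainder is assumed to be "<tz> <severity> <controller IP> <message>".
    _tz, severity, controller, message = line[20:].split(" ", 3)
    return ts, severity, controller, message

def outage_seconds(lines):
    """Seconds from the first 'DHCP Required' event to the next 'DHCP successful'."""
    start = None
    for line in lines:
        ts, _sev, _ctrl, msg = parse_event(line)
        if start is None and msg.startswith("Client Moved to DHCP Required State"):
            start = ts
        elif start is not None and msg.startswith("DHCP successful"):
            return (ts - start).total_seconds()
    return None

events = [
    "08/17/2008 13:05:13 EDT INFO 10.2.254.150 Mobility role update request. from Export Anchor to Handoff Peer = 10.1.254.150, Old Anchor = 10.2.254.150, New Anchor = 0.0.0.0",
    "08/17/2008 13:05:13 EDT INFO 10.2.254.150 Client Moved to DHCP Required State.",
    "08/17/2008 13:05:17 EDT INFO 10.2.254.150 DHCP successful.",
]
gap = outage_seconds(events)
print("seconds in DHCP Required state:", gap)
```

For the excerpt above this works out to a few seconds, which fits the idea that the state change still happens with web_auth off but is short enough to go unnoticed.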