
IOS XE 16.12.5 switches not reliably forwarding DHCP Offers to Clients

Might Ncube
Level 1

Hi all.

I have been searching everywhere for a solution to this issue. While the conditions discussed here are not 100% similar to mine, I am convinced that the issue is the same.

1) My environment runs Cisco 3850 stacks as access switches with two VLANs only, Voice and Data. They are running IOS XE 16.12.5b. The DHCP servers are Windows Server 2016 and are tucked away in the datacenter.

2) We are using NAC supplied by Ivanti (formerly Juniper), Pulse Policy Server, with dot1x and MAB.

3) We do not have DHCP snooping enabled.

The issue started on the third day of the NAC implementation. The client would send a Discover for an IP address, the switch would successfully forward it to the server, and the server would send back an Offer. For some odd reason, the Offer would not make it to the client, effectively cutting the DHCP process short.

Another indication that the issue was on the switch was that the ARP table had the IP-to-MAC mapping even though the client remained on an APIPA address. In the access-session table, the endpoints were MAB authorised, none on dot1x.

We have been all over everything trying to understand why the switch would not forward the Offers to the clients. We focused on DHCP options without success, did a lot of packet captures in different places, and suspected rate limiting and cpp policies.

We stumbled on this bug, which seems to describe what my team and I couldn't fix:

https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvs91593 

Now that we have found this bug disclosure, we would like to upgrade the code and observe. I will report back to this forum on whether that solves the issue.

Accepted Solutions

ip device tracking probe auto-source
ip device tracking probe delay 10

I asked you about IP tracking and you said no, but I see IP device tracking in your config. That is what causes the issue here. Check this link:

 https://www.cisco.com/c/en/us/support/docs/ios-nx-os-software/8021x/116529-problemsolution-product-00.html


14 Replies

Are you running any IP device tracking?

Might Ncube
Level 1

No.
No ip tracking, no dhcp snooping, just basic NAC.

ammahend
VIP

Interesting, a few questions:

When you captured the Offer packets, what did you see as the L2 source and destination MAC addresses?

Do you have any broadcast storm control configured?

Does the issue happen when the PC is directly connected to the port, or only when the PC is connected via a phone?

-hope this helps-

Hello,

The bug seems to be applicable when DHCP snooping is enabled (which you don't have). Can you post the exact port config (or better yet, the entire switch config)? Are these Windows (10/11) clients?

Might Ncube
Level 1

Hi Georg, 

The switch config is very long, so I have attached only samples of the interface configs along with the rest of the config. I also had to clean out all information that would pose a security risk to myself. For MAB, I used one policy that is a clone of the policy list that could have been added to the switch.

There is currently a temporary pool that I added as the DHCP server is still basically unusable.

ip device tracking probe auto-source
ip device tracking probe delay 10

I asked you about IP tracking and you said no, but I see IP device tracking in your config. That is what causes the issue here. Check this link:

 https://www.cisco.com/c/en/us/support/docs/ios-nx-os-software/8021x/116529-problemsolution-product-00.html

Thank you so much for pointing that out.

That could potentially be the problem. I will review and come back to you.

Thank you once more

I have checked.

From the article you shared, I see that the current config on my switch is what the article recommends. I will attempt to use the SVI option and test it out, and will remove the temporary pool currently in place.

PROD-S01#show ip device tr
PROD-S01#show ip device tracking all
PROD-S01#
PROD-S01#
PROD-S01#show run | inc device
ip device tracking probe auto-source
ip device tracking probe delay 10
PROD-S01#

I will double check all info you share and update you

I have tested by removing the Temp scope I had, and managed to bounce a Windows 10 machine and a Cisco AP, both got addresses from the Microsoft DHCP and are both reachable on the network as shown on the attached image.

I modified the tracking command to the one below. I am not 100% sure about the 0.0.0.1 address; I just followed the documentation and will have to research it more. All I can say is that it worked, and I will mark your responses as Accepted Solutions after full confirmation tomorrow.

ip device tracking probe auto-source fallback 0.0.0.1 255.255.254.0
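
On the 0.0.0.1 question: as far as I understand the linked Cisco article, when no SVI exists for the VLAN the switch derives the ARP probe source from the client's own address, keeping the network bits selected by the mask and taking the host bits from the fallback value. Below is a minimal Python sketch of that arithmetic only. The function name `probe_source` and the client addresses are made up for illustration, and it does not model the switch's full source selection order.

```python
import ipaddress


def probe_source(client_ip: str, fallback_host: str, mask: str) -> str:
    """Combine the client's network bits with the fallback host bits.

    Illustrative only: mirrors the "compute source from client IP, host
    bits and mask" step as described in the Cisco article, not IOS itself.
    """
    client = int(ipaddress.IPv4Address(client_ip))
    host = int(ipaddress.IPv4Address(fallback_host))
    netmask = int(ipaddress.IPv4Address(mask))
    # Network part comes from the client, host part from the fallback value.
    combined = (client & netmask) | (host & ~netmask & 0xFFFFFFFF)
    return str(ipaddress.IPv4Address(combined))


if __name__ == "__main__":
    # Hypothetical client in a /23, matching the 255.255.254.0 mask above.
    print(probe_source("10.1.3.77", "0.0.0.1", "255.255.254.0"))
```

With the mask 255.255.254.0 used above, a hypothetical client at 10.1.3.77 would be probed from 10.1.2.1 rather than from 0.0.0.0, which is the probe source the article ties to the Windows duplicate address detection problem.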

 

 

Cisco recommends four solutions, and you have applied one of them, which is probe delay 10, so I need more info.
You mentioned:
"The issue started on the 3rd day of NAC implementation, the client would send a Discover for an IP, the switch would successfully forward the message to the server, server would send back the Offer, and for an odd reason, the offer would not make it to the client, effectively cutting short the dhcp process."
Please share the Wireshark capture of the PC's Discover and the Offer from the DHCP server.
I need the full packets.

Hello,

To be honest, I am a bit confused about certain parts of your config. First of all, you have a layer 3 access list applied to your layer 2 interfaces (ip access-group BLOCK_ROGUE_DHCP in). The access group allows everything, so it is completely redundant either way, and you might as well remove it from your config.

Then, you have a local DHCP pool for VLAN 230, and a voice VLAN 248. Which VLAN are the helper addresses for? I assume the voice VLAN only (since all other DHCP requests will always be serviced by the router first). You have multiple IP helper addresses; why that many? Are you making sure that the DHCP servers behind these addresses do not have overlapping address pools?

Hi @Georg Pauwen 

I admit there is a huge number of IP helpers, and I will need to remove them. They don't have overlapping scopes.

The scope you saw is a temporary scope put in place a few days ago to hand out addresses, and that is what is keeping the environment working. The voice VLAN was not affected by this, although I was ready to add a temporary scope for it as well.

The ACL had an entry for blocking rogue DHCP. Removing it was going to cause issues or force me to remove it on each and every interface, and I didn't want to make hard config changes.

I added the command below, removed the temporary scope, and the PCs started getting IPs:

ip device tracking probe auto-source fallback 0.0.0.1 255.255.254.0

That led me to believe it is not bug-related but caused by IP device tracking and the duplicate address detection done by Windows machines.

Thank you all for your time and assistance

 

 

Might Ncube
Level 1

@ammahend I am not comfortable posting the captures, but if you looked at them you would see that there are many Discover messages and only a few Offers and ACKs.

For a moment I thought perhaps the clients do not honor the addresses, or perhaps they discard them, but I couldn't find the reason why. The primary suspect was DHCP options, but I haven't been successful in proving they are the issue.

It behaves as if there is a rate limiter. The sad part is that the ARP table has the leased IPs while the clients, 90% of them Windows 10, have APIPA addresses.
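
For anyone who wants to quantify that pattern without sharing a capture, here is a rough Python sketch that tallies DHCP message types per client MAC and lists clients that sent Discovers but never received an ACK. It assumes you have already exported (client MAC, DHCP option 53 value) pairs from your capture tool. The function names and the sample MACs are hypothetical; only the option 53 numbering comes from RFC 2132.

```python
from collections import Counter, defaultdict

# DHCP message types carried in option 53 (RFC 2132).
DHCP_TYPES = {
    1: "Discover", 2: "Offer", 3: "Request", 4: "Decline",
    5: "ACK", 6: "NAK", 7: "Release", 8: "Inform",
}


def tally_dhcp(events):
    """Count DHCP message types per client MAC.

    events: iterable of (client_mac, option_53_value) pairs.
    """
    per_client = defaultdict(Counter)
    for mac, msg_type in events:
        name = DHCP_TYPES.get(msg_type, f"type-{msg_type}")
        per_client[mac.lower()][name] += 1
    return per_client


def stalled_clients(per_client):
    """Return MACs that sent at least one Discover but never got an ACK."""
    return sorted(
        mac for mac, counts in per_client.items()
        if counts["Discover"] > 0 and counts["ACK"] == 0
    )


if __name__ == "__main__":
    # Made-up sample data: the first client never completes the exchange.
    sample = [
        ("AA:BB:CC:00:00:01", 1), ("AA:BB:CC:00:00:01", 1),
        ("AA:BB:CC:00:00:02", 1), ("AA:BB:CC:00:00:02", 2),
        ("AA:BB:CC:00:00:02", 3), ("AA:BB:CC:00:00:02", 5),
    ]
    print(stalled_clients(tally_dhcp(sample)))
```

Running the same tally on captures taken on the server side and on the client port would show at which hop the Offers go missing. A nonzero Decline count for a client would point toward the client rejecting the address rather than the switch dropping the Offer.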

 

 
