
%SW_DAI-4-DHCP_SNOOPING_DENY after dhcp server migration

Johann
Level 1

Hello,

Some weeks ago, we migrated our DHCP server from Windows 2003 to Windows 2012, using the new failover feature in active/passive mode.

On our switches, we have both ARP inspection and DHCP snooping enabled. Since the migration, ARP inspection has not been working correctly: as soon as I activate ARP inspection on our client VLAN (96), we get errors like:

Sep  1 11:50:39: %SW_DAI-4-DHCP_SNOOPING_DENY: 1 Invalid ARPs (Req) on Fa0/29, vlan 96.([d4c9.efdf.710e/10.0.96.89/0000.0c07.ac60/10.0.127.254/11:50:39 GMT+1 Mon Sep 1 2014])
Sep  1 11:50:40: %SW_DAI-4-DHCP_SNOOPING_DENY: 2 Invalid ARPs (Req) on Fa0/7, vlan 96.([d485.64b4.0068/10.0.97.214/0000.0000.0000/10.0.127.254/11:50:40 GMT+1 Mon Sep 1 2014])

If I look at the DHCP snooping binding table on the same switch:

NUKUH052#sh ip dhcp snooping binding
MacAddress          IpAddress        Lease(sec)  Type           VLAN  Interface
------------------  ---------------  ----------  -------------  ----  --------------------
18:A9:05:F5:28:2B   10.0.97.101      418236      dhcp-snooping   96    FastEthernet0/40
6C:3B:E5:0D:B3:B2   10.0.96.184      2936        dhcp-snooping   96    FastEthernet0/36
10:60:4B:7C:A3:14   10.0.97.17       678739      dhcp-snooping   96    FastEthernet0/42
00:1F:29:02:AA:6B   10.0.98.53       678938      dhcp-snooping   96    FastEthernet0/37
88:51:FB:80:1B:E1   10.0.97.252      680212      dhcp-snooping   96    FastEthernet0/3
64:31:50:A3:F8:52   10.0.96.96       341484      dhcp-snooping   96    FastEthernet0/20
64:31:50:A3:D7:5A   10.0.97.209      677205      dhcp-snooping   96    FastEthernet0/6
6C:3B:E5:1A:8D:05   10.0.96.255      677165      dhcp-snooping   96    FastEthernet0/8
00:1F:29:02:AA:EF   10.0.96.207      678365      dhcp-snooping   96    FastEthernet0/1
00:23:7D:2F:72:E7   10.0.98.152      680376      dhcp-snooping   96    FastEthernet0/16
Total number of bindings: 10

 

Strangely, interface FastEthernet0/7 is not in the table, and the same is true for a lot of computers (they all use DHCP, of course, not static IP addresses).
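The comparison described above (is the sender MAC/IP of each denied ARP present in the snooping binding table?) can be automated. Below is a minimal Python sketch, not a Cisco tool: the function names are hypothetical, and it assumes the syslog lines and the `show ip dhcp snooping binding` output have exactly the layout pasted in this post.

```python
import re

# Matches the start of a %SW_DAI-4-DHCP_SNOOPING_DENY message and captures
# port, VLAN, sender MAC and sender IP (the first two fields of the tuple).
DENY_RE = re.compile(
    r"%SW_DAI-4-DHCP_SNOOPING_DENY: \d+ Invalid ARPs \((?:Req|Res)\) "
    r"on (\S+), vlan (\d+)\.\(\[([0-9a-f.]+)/([\d.]+)/"
)

def normalize_mac(mac):
    """Reduce d485.64b4.0068 or D4:85:64:B4:00:68 to 12 lowercase hex digits."""
    return re.sub(r"[^0-9a-f]", "", mac.lower())

def parse_bindings(show_output):
    """Return a set of (mac, ip, vlan) from the binding table text."""
    bindings = set()
    for line in show_output.splitlines():
        fields = line.split()
        # Data rows have six columns and 'dhcp-snooping' in the Type column.
        if len(fields) == 6 and fields[3] == "dhcp-snooping":
            bindings.add((normalize_mac(fields[0]), fields[1], fields[4]))
    return bindings

def unmatched_denies(log_text, bindings):
    """List (port, mac, ip) for denied ARPs whose sender has no binding."""
    missing = []
    for match in DENY_RE.finditer(log_text):
        port, vlan, mac, ip = match.groups()
        if (normalize_mac(mac), ip, vlan) not in bindings:
            missing.append((port, mac, ip))
    return missing
```

Feeding it the two log lines and the table from this post would list both Fa0/29 and Fa0/7, since neither sender appears among the ten bindings.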

Extract of the switch configuration:

Standard port configuration 

interface FastEthernet0/7
 switchport access vlan 96
 switchport mode access
 switchport nonegotiate
 switchport voice vlan 192
 switchport port-security maximum 3
 switchport port-security
 switchport port-security aging time 1
 switchport port-security violation restrict
 ip arp inspection limit rate 256 burst interval 10
 no logging event link-status
 mls qos trust dscp
 no snmp trap link-status
 storm-control broadcast level bps 1m
 storm-control multicast level bps 1m
 storm-control action shutdown
 spanning-tree portfast
 spanning-tree bpduguard enable

General switch settings

ip dhcp snooping vlan 96
ip dhcp snooping information option allow-untrusted
no ip dhcp snooping information option
ip dhcp snooping

>> ip arp inspection vlan 96 : as soon as I add this command, I get the error messages.

I have already:

* tested several software versions

* enabled only a single DHCP server as helper address

 

But I cannot find the cause. The problem appeared when we started up the two new DHCP servers (with the new 2012 DHCP failover feature). We have the same issue on all the switches of this LAN (the same config is running fine at the other factory we own).

Can you help me solve this issue?

 

 

 

 

11 Replies

educruz
Cisco Employee

Hi Johann,

 

New versions of Windows servers are known to have certain incompatibility issues with DAI and DHCP snooping. Can you check whether this is your case, please?

http://support.microsoft.com/kb/2978225

 

Kind regards,

- Ed

Hello Ed,

Thanks for your response.

 

We do not fall into this case, as we are using the servers in hot-standby mode (active/passive).

However, I suspect that the switches are not correctly interpreting the DHCP negotiation packets...

Last Friday, I removed the passive server from the two IP helper addresses on our core switch, but this does not seem to help (I'll keep the experiment running until the maximum DHCP lease expires, 8 days on our system).

Any other ideas?

Johann,

 

 

I agree with you. I would suggest taking a packet capture that includes a couple of hosts and the DHCP servers, to see exactly how the DHCP negotiation proceeds.

I suspect that the switch receives two DHCP messages per DHCP client request (even though the servers are in active/passive mode), so DAI treats one IP-to-MAC binding as invalid. If you would like to share the outcome, I will be happy to analyse it.

 

- Ed

 

Hi,

I have just attached the file containing the filtered captures, taken simultaneously on both DHCP servers. As we are using HSRP, it looks like the DHCP messages are not doubled but quadrupled.

Below is the HSRP configuration on one of our two core switches:

interface Vlan96
 ip address 10.0.127.252 255.255.224.0
 ip helper-address 10.0.9.33
 ip helper-address 10.0.9.32
 no ip redirects
 standby 96 ip 10.0.127.254
 standby 96 timers 1 4
 standby 96 priority 80
 standby 96 preempt
 arp timeout 720
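One possible reading of the factor of four, assuming both core switches relay each client broadcast (the thread does not confirm this, and the second switch's configuration is not shown): every relay forwards the request to every configured `ip helper-address`, so two relays times two helpers gives four copies across both servers. A small Python sketch of that arithmetic follows; `relayed_copies` and the `core-2` placeholder are hypothetical.

```python
from itertools import product

def relayed_copies(relays, helpers):
    """Every (relay, server) pair that would forward one client broadcast."""
    return list(product(relays, helpers))

# 10.0.127.252 and the helper addresses come from the configuration above.
# "core-2" is a placeholder: the second core switch's address is not shown.
relays = ["10.0.127.252", "core-2"]
helpers = ["10.0.9.33", "10.0.9.32"]

copies = relayed_copies(relays, helpers)
print(len(copies))  # 4 copies in total, 2 arriving at each server
```

With a single helper address the same reasoning gives two copies, one per relay.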

 

I also discovered that there are some Microsoft bugs related to DHCP failover. Links here:

http://blogs.technet.com/b/teamdhcp/archive/2014/02/26/dhcp-failover-patch-to-address-a-reservation-issue-and-another-issue-related-to-failover-partner-not-accepting-state-transition-from-bad-address-gt-active-has-been-released.aspx

and

http://support.microsoft.com/kb/2831920

The active Windows DHCP server has not been updated since January 2013, so these updates have not been applied. I'll ask my colleague in charge of the servers to update it ASAP...

After the server update, it is still not working...

I took another network capture; it seems to show the same traffic.

Any ideas on how to make progress on this?

I think we are running into behaviour that is by design on both sides.

The Windows servers will always respond to DHCP packets, even if they are duplicated. This is something Microsoft modified for the 2012 edition as opposed to the 2003 DHCP behaviour.

DAI will always block one received DHCP packet and mark the other as invalid. This is something Cisco has never changed. 

I know about two exact situations from past experience, in which even cases with Microsoft were opened but they specified their servers act as designed. Usually applying certain patches resolved the issue; otherwise, DAI was turned off.

I will do some further research and come back to you.

Is this currently impacting users? Is applying the patches from the previous links, or turning off DAI, feasible in this specific scenario?

Thanks for your analysis.

We applied the specified patches on both servers some hours ago, but this did not correct the problem.

Of course, I have disabled DAI for the moment to avoid impacting users...

 

Oh wait...

It is working on my first switch now. I just did a renew on one client which did not reboot last night.

So maybe the patch did the job. I will try to deploy to the other switches...

That is great news Johann! Let me know how the rest of the switches cope with the patched servers. Hopefully that is the fix.

Hello Ed,

Thanks for your help. I confirm it is working on all the switches.

For future readers:

* update the Windows 2012 DHCP server

* version 12.2(58)SE3 was not working at all. I read some other threads mentioning a bug in that version.

In my case, DAI is working fine with versions 12.2(58)SE2 and 15.02(SE5), with Windows Server 2012 in failover hot-standby (= active/passive) mode.

Johann,

Our analysis was correct. As the capture was taken on the two servers simultaneously, the DHCP messages appeared quadrupled. Under normal circumstances (a single server), there would have been two messages.

By design, Windows 2012 servers respond to every DHCP packet they receive, even duplicates carrying the same Transaction ID.

This can cause a lease mismatch, as the host accepts the first DHCP ACK whilst the switch honors the second DHCP ACK received, per the expected behaviour of DAI.
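To look for this pattern in a capture, one approach is to group DHCP ACKs by Transaction ID and flag any transaction that was acknowledged more than once. The Python sketch below assumes the capture has already been decoded into a list of dicts with the keys `xid`, `type`, `server` and `yiaddr`; that record format and the function names are hypothetical, not the output of any particular tool.

```python
from collections import defaultdict

def duplicate_acks(records):
    """Map each transaction ID to its ACK records when more than one ACK was seen.

    `records` is an assumed format: dicts with 'xid', 'type', 'server', 'yiaddr'.
    """
    acks = defaultdict(list)
    for rec in records:
        if rec["type"] == "ACK":
            acks[rec["xid"]].append(rec)
    return {xid: recs for xid, recs in acks.items() if len(recs) > 1}

def conflicting_leases(records):
    """Transaction IDs whose duplicate ACKs disagree on the assigned address."""
    return sorted(
        xid
        for xid, recs in duplicate_acks(records).items()
        if len({rec["yiaddr"] for rec in recs}) > 1
    )
```

A transaction that shows up in `duplicate_acks` but not in `conflicting_leases` was answered twice with the same address; one listed by `conflicting_leases` is a candidate for the first-ACK versus second-ACK mismatch described above.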

In addition, I reviewed the links you posted, and those scenarios could well apply here. My suggestion would be to bring the server fully up to date, apply the known workarounds in the links and, as a final option, only if truly necessary, turn off DAI on the switch.

Let me know if I can be of any help after the server update.

- Ed
