09-28-2017 11:15 AM - edited 03-08-2019 12:12 PM
I have a central DHCP server that provides leases to clients at multiple remote sites over a GRE Tunnel.
If a site has a WAN link flap and its EIGRP neighbors re-establish, the clients can no longer get DHCP leases and fall back to 169.254.x.x (APIPA) addresses, even after the link and connectivity are fully restored.
If I remove the "ip helper-address" command, wait a few minutes and re-add it or reboot the router, the clients will start getting DHCP again, and I can see the bootrequests reaching the DHCP Server.
Any idea what might be going on here?
09-28-2017 01:28 PM
Does the router's log show anything when this happens?
Have you checked the release notes for the version of software running on the router for any bugs related to this?
HTH
09-28-2017 02:57 PM
Hey Reza,
I don't remember seeing anything other than the EIGRP neighbor changes, and unfortunately the log is now flooded with DHCP Debugs. I've disabled the DHCP debugging and will manually flap the tunnels tonight and see if anything shows in the log.
We are running IOS version 15.1(4)M6 on our 2951. I've looked at the release notes, and unfortunately nothing stands out to me.
https://www.cisco.com/c/en/us/td/docs/ios/15_1/release/notes/15_1m_and_t/151-4MCAVS.html#59107
09-28-2017 01:29 PM
Please share config, logs and debug results for both ends of the GRE tunnel.
-Austin
09-29-2017 10:03 AM
Hi Austin,
Attached is a copy of the log and the dhcp debug.
Below are the interface configurations:
interface Tunnel0
 description ***Connected to ASR-1**
 bandwidth 50000
 ip address 10.0.1.25 255.255.255.0
 no ip redirects
 ip authentication mode eigrp 10 md5
 ip authentication key-chain eigrp 10 NAME
 ip nhrp authentication primary
 ip nhrp map 10.0.1.1 [HUB PUBLIC]
 ip nhrp map multicast [HUB PUBLIC]
 ip nhrp network-id 10
 ip nhrp nhs 10.0.1.1
 delay 1
 tunnel source GigabitEthernet0/0
 tunnel mode gre multipoint
 tunnel key 100000
 tunnel protection ipsec profile [IPSECPROFILENAME] shared
!
interface Tunnel1
 description ***Connected to ASR-2**
 bandwidth 50000
 ip address 10.0.2.25 255.255.255.0
 no ip redirects
 ip authentication mode eigrp 10 md5
 ip authentication key-chain eigrp 10 NAME
 ip nhrp authentication backup
 ip nhrp map 10.0.2.1 [HUB 2 PUBLIC]
 ip nhrp map multicast [HUB 2 PUBLIC]
 ip nhrp network-id 20
 ip nhrp nhs 10.0.2.1
 delay 1000
 tunnel source GigabitEthernet0/0
 tunnel mode gre multipoint
 tunnel key 10000
 tunnel protection ipsec profile [IPSECPROFILENAME] shared
!
interface Embedded-Service-Engine0/0
 no ip address
 shutdown
!
interface GigabitEthernet0/0
 description ***Connected to ISP***
 ip address [PUBLIC IP REMOVED]
 ip nat outside
 ip virtual-reassembly in max-reassemblies 1000
 load-interval 30
 duplex auto
 speed auto
!
interface GigabitEthernet0/1
 description ***LAN-DATA***
 ip address 10.100.65.253 255.255.248.0
 ip authentication mode eigrp 10 md5
 ip authentication key-chain eigrp 10 NAME
 ip flow monitor NPMMonitor input
 ip flow monitor NPMMonitor output
 ip flow ingress
 ip flow egress
 ip helper-address 10.200.68.15
 ip nat inside
 ip virtual-reassembly in
 load-interval 30
 duplex auto
 speed auto
!
One thing to note is that connectivity between the DHCP Server and the remote site once the link comes back *SEEMS* fully functional. It seems the only thing we notice is that the DHCP Requests are not making it to the DHCP server, unless we remove and re-add the ip helper command.
09-29-2017 11:18 AM
I don't see any issues in either the log or the debug. Would it be possible to share the configuration of the 2951 at the remote site, since the one above is for the hub side? I'd like to verify that the ip helper-address command is on the interface facing the LAN. Also, once the tunnel has bounced and DHCP fails, try to SSH to the remote-site 2951 and start troubleshooting there without reloading the router or re-applying ip helper-address, for example by checking the keepalive status (it needs to match on both ends). One more thing: is this impacting all remote sites or just this one?
Note: to see whether this is a server-related issue, build a lab and test it internally, or configure DHCP on the router, without impacting your production.
Good luck!
-Austin
09-29-2017 12:21 PM
Hey Austin,
This issue affects multiple remote sites, not just a single site. I don't believe this is an issue with the DHCP server: it keeps providing DHCP to other remote sites while the issue is occurring at another, and no changes are made to the DHCP server at all when we re-apply the ip helper-address command.
I will try to do some troubleshooting at the hub location once the tunnel flaps. Any suggested debug commands?
I want to figure out whether the BOOTREQUESTs ever make it to the hub site. Is matching the traffic with an ACL the best way to do this?
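For reference, one way to check this is an extended ACL whose first entry matches relayed DHCP traffic and whose hit counter can then be read back. The following is only a minimal sketch under stated assumptions: the ACL name DHCP-WATCH is hypothetical, the hub-side interface name Tunnel0 is assumed (the hub configuration has not been posted in this thread), and the ACL is applied inbound so that it sees traffic arriving from the spokes. The final permit keeps all other traffic flowing.

```
! Hypothetical name DHCP-WATCH; hub interface name assumed, adjust to the real one
ip access-list extended DHCP-WATCH
 ! Relayed BOOTREQUESTs are unicast to the DHCP server on UDP port 67 (bootps)
 permit udp any host 10.200.68.15 eq bootps
 ! Count only; do not block anything else
 permit ip any any
!
interface Tunnel0
 ip access-group DHCP-WATCH in
```

After flapping the tunnel, show ip access-lists DHCP-WATCH should show whether the match count on the first entry increases while a client at the affected site is requesting a lease. If it stays flat, the requests are probably not leaving the spoke.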
09-29-2017 02:34 PM
Try Router# debug dhcp detail. Please post show version on ASR.
See bug CSCsm86039, which has the same symptoms but with a VRF. It looks like DHCP renewal fails after the GRE tunnel bounces; specifically, the DHCP relay fails to forward the DHCP REQUEST to the ASR.
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCsm86039/?referring_site=bugquickviewredir
09-30-2017 10:31 AM
Hey Austin,
GA-DC1-ASR1001-PRI#show ver
Cisco IOS XE Software, Version 03.13.02.S - Extended Support Release
Cisco IOS Software, ASR1000 Software (X86_64_LINUX_IOSD-UNIVERSALK9-M), Version 15.4(3)S2, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2015 by Cisco Systems, Inc.
Compiled Fri 30-Jan-15 14:23 by mcpre
Unfortunately, debug dhcp detail doesn't show anything on the ASR.
I can switch this DHCP relay to be configured under the ip dhcp pool command. I've never used it as a relay; would this work?
ip dhcp pool DATALAN
 relay source 10.100.65.0/23
 relay destination 10.200.68.15
09-30-2017 11:56 AM
If it were a router-based DHCP server, show ip dhcp server statistics would help, but not here. I am not sure whether using the relay commands will help. I would file a TAC case to get this resolved, especially if it suddenly stopped working, which would point to a bug. Again, the best way to isolate the issue is to troubleshoot while clients are failing to renew their IPs after the tunnel has bounced, without re-applying ip helper-address or reloading the remote router.
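As a rough starting point for that kind of in-place troubleshooting, the commands below could be run on the remote 2951 while DHCP is failing. This is a sketch, not a confirmed procedure: it assumes the LAN interface and helper address from the configuration posted earlier in this thread, and it uses debug ip dhcp server packet on the assumption that relay-agent activity is logged by that debug on this IOS release (debug dhcp detail is generally a DHCP client debug, which may be why it showed nothing).

```
! Run on the remote 2951 while clients are failing, before touching ip helper-address
! Confirm the helper address is still applied to the LAN interface
show ip interface GigabitEthernet0/1 | include Helper
! Confirm the routing table and CEF both have a path to the DHCP server
show ip route 10.200.68.15
show ip cef 10.200.68.15
! Watch whether BOOTREQUESTs are received and forwarded by the relay agent
debug ip dhcp server packet
```

Comparing the show ip cef output before and after the flap may help show whether the relay is using a stale next hop toward 10.200.68.15.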
Good luck!
09-30-2017 12:34 AM
Hello,
In addition to the other posts, what are your route-caching settings on the tunnel interfaces? Try explicitly enabling route caching with 'ip route-cache cef', and also try it in combination with globally disabling CEF (no ip cef):
no ip cef
!
interface Tunnel 1
ip route-cache cef
09-30-2017 10:37 AM
Hey Georg,
IP CEF was globally enabled. However, if I turn it off, I am unable to turn on 'ip route-cache cef' on the tunnel interface; it states that CEF is globally disabled.
I've checked another router that experiences the same issue: 'ip route-cache cef' is enabled on the interface and 'ip cef' is enabled globally.
09-30-2017 10:52 AM
Can you post the full configs of both ends so we can lab this ?