Solved: DMVPN - ISP Stopping EIGRP Adjacency

simon.patenaude · ‎11-30-2016

I have a DMVPN network with EIGRP routing between all routers. The system has two hubs and four spokes at the moment. The spoke routers are set up at a site for no more than 12 hours at a time, and they travel the world quite a bit. They connect through many different ISPs, modems, LTE networks... If a site has internet access, the spoke needs to connect home.

Recently one of the spokes established IPsec without any issues; however, EIGRP would not establish an adjacency. The following day the spoke was set up at a new site, where it remained solid, and it has been good at all new sites since.

This is the log from HUB1:

%DUAL-5-NBRCHANGE: EIGRP-IPv4 101: Neighbor 10.0.0.12 (Tunnel0) is up: new adjacency
%DUAL-5-NBRCHANGE: EIGRP-IPv4 101: Neighbor 10.0.0.12 (Tunnel0) is down: retry limit exceeded

This is from the spoke:

%DUAL-5-NBRCHANGE: EIGRP-IPv4 101: Neighbor 10.0.0.1 (Tunnel0) is up: new adjacency
%DUAL-5-NBRCHANGE: EIGRP-IPv4 101: Neighbor 10.0.0.1 (Tunnel0) is down: Peer Termination received

After some research, I narrowed the issue down to an MTU mismatch on the ISP side. I recreated the setup in my lab and I get the same error messages when I throttle the MTU down to 613 on my "ISP router" interface with the command mtu 613.

Is there a way to have the spoke dynamically determine the MTU of the path to the hub? Is there a workaround I should be looking at? Am I looking at the completely wrong thing?

This needs to be as hands-free as possible because the routers do not travel with technicians who can make changes on the fly.

I tried tunnel path-mtu-discovery but that doesn't seem to do what I need it to. I will admit I am new to this, so my apologies for any newbie mistakes.

###SPOKE Information###

!
interface Tunnel0
 description DMVPN SPOKE2
 ip address 10.0.0.12 255.255.255.0
 no ip redirects
 ip mtu 1436
 ip authentication mode eigrp 101 md5
 ip authentication key-chain eigrp 101 [KEY-CHAIN]
 ip hello-interval eigrp 101 60
 ip hold-time eigrp 101 180
 no ip next-hop-self eigrp 101
 ip nhrp authentication [NHRP AUTHENTICATION]
 ip nhrp map multicast [HUB1 PUBLIC IP]
 ip nhrp map 10.0.0.1 [HUB1 PUBLIC IP]
 ip nhrp map multicast [HUB2 PUBLIC IP]
 ip nhrp map 10.0.0.2 [HUB2 PUBLIC IP]
 ip nhrp network-id [NHRP NET-ID]
 ip nhrp holdtime 1200
 ip nhrp nhs 10.0.0.1
 ip nhrp nhs 10.0.0.2
 ip nhrp registration no-unique
 ip nhrp shortcut
 zone-member security IN-ZONE
 tunnel source GigabitEthernet8
 tunnel mode gre multipoint
 tunnel key [KEY]
 tunnel protection ipsec profile [PROFILE]
!
interface GigabitEthernet8
 description WAN
 ip address dhcp
 ip flow ingress
 ip flow egress
 ip nat outside
 ip virtual-reassembly in
 zone-member security OUT-ZONE
 duplex full
 speed auto

simon.patenaude · ‎01-21-2017

Any help from anyone would be appreciated.

I ran into this issue again, and it looks like the spoke's hellos are making it to the hubs but the hubs' hellos are not making it to the spoke. This is what I got:

SPOKE1#show ip eigrp traffic
EIGRP-IPv4 Traffic Statistics for AS(101)
Hellos sent/received: 20351/0
Updates sent/received: 0/0
Queries sent/received: 0/0
Replies sent/received: 0/0
Acks sent/received: 0/0
SIA-Queries sent/received: 0/0
SIA-Replies sent/received: 0/0
Hello Process ID: 412
PDM Process ID: 411
Socket Queue: 0/10000/0/0 (current/max/highest/drops)
Input Queue: 0/2000/0/0 (current/max/highest/drops)

xthuijs · ‎01-24-2017

There could be an issue with MTU, but since the hellos are only going one way, it looks more like a problem with multicast forwarding on the tunnel link.

EIGRP hellos are sent to 224.0.0.10, and I think we need to do some debugging on the hub router to see why they are not making it out. It also seems possible that they are not even being created.

It would be best to debug at the EIGRP level on the hub router to check hello reception and creation, and to see if there is a reason for dropping or ignoring the hellos.

If the EIGRP process is creating hellos but they are not making it out, we need to do an IP packet debug to see whether they are created and can be forwarded.

It could be that the established link has an issue forwarding mcast, or that the tunnel encap is where it goes wrong.
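
As a rough sketch of that debugging on the hub (assuming classic-mode EIGRP AS 101 on Tunnel0, as the spoke config suggests; the hub config was not posted, and ACL number 199 is an arbitrary choice):

```
! Hub, exec mode: check the NHRP multicast replication entries and current neighbors
show ip nhrp multicast
show ip eigrp neighbors
!
! Watch EIGRP hellos being sent and received
debug eigrp packets hello
!
! Optionally trace EIGRP at the IP layer, limited by an ACL to keep the output small
configure terminal
 access-list 199 permit eigrp any any
 end
debug ip packet 199 detail
!
! Turn all debugging off when done
undebug all
```

On a DMVPN hub, multicast is normally replicated to dynamically registered spokes only when ip nhrp map multicast dynamic is configured on the tunnel interface, so that is worth confirming as well.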

A few options to consider:

- make a p2p (unicast) EIGRP neighbor relationship to skip discovery via mcast and peer directly

- if hub-to-hub routing is not necessary, possibly use a static default route on the hub instead of EIGRP/dynamic routing, to simplify the design.
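
A minimal sketch of the first option, assuming classic-mode EIGRP AS 101 and the tunnel addresses from the posted spoke config (the hub-side lines are an assumption, since the hub config was not shown):

```
! Hub side: unicast peering with spoke 10.0.0.12 over Tunnel0
router eigrp 101
 neighbor 10.0.0.12 Tunnel0
!
! Spoke side: unicast peering with both hubs
router eigrp 101
 neighbor 10.0.0.1 Tunnel0
 neighbor 10.0.0.2 Tunnel0
```

Keep in mind that once a static neighbor is configured on an interface, EIGRP stops sending and processing multicast hellos on that interface, so every spoke reached through Tunnel0 would then need its own neighbor statement on the hub. Hellos are still exchanged, as unicast, so the hold timer still detects a spoke going offline.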

I would possibly also set the MTU of the tunnel interface to the normal interface MTU minus the encap headers for the tunnel, to instruct apps to lower their packet size, as we can't, or don't want to, rely on internet fragmentation (firewalls might block that).

cheers

xander

simon.patenaude · ‎01-28-2017

Hi Xander,

Thank you for taking the time to look at this!

I know that the hub is creating and sending hellos via the tunnel, as it is connecting to other spokes. When I moved the spoke that was not connecting to a new ISP, it connected perfectly.

Could the ISP affect Multicast traffic inside a GRE tunnel? If that's the case, I agree that I need static EIGRP neighbours.

By creating static EIGRP neighbours within the AS, will multicast discovery stop for all other networks within the AS? If I understand what I read, when setting an EIGRP Neighbor the "Hellos" and "ACK" stop. Will the HUB router know when the Spoke is offline?

This design is intended to be a Phase 2 DMVPN where traffic can flow spoke to spoke. I feel that static routes would not be an option, as we plan to grow this network. I've attached the topology for the network. Note that all spokes are temporary and connect for 12 hours at a time.

Thank you,

Simon

xthuijs · ‎01-30-2017

The intermediate network would only see GRE, and would not look inside the tunnel per se.

It could be that the mcast forwarding is broken with the tunnel on either endpoint. However, based on your test showing that switching SPs solves the issue, that can be eliminated.

What I suspect is happening is that, due to the addition of GRE headers (and possibly some others), the packet size gets too large and the packet either:

- requires fragmentation and the SP's policy is to not fragment

- requires fragmentation but the packets are sent with the DF bit set, so they can only be dropped

- requires fragmentation and gets fragmented, but some filter somewhere is dropping fragmented packets as a security policy.

You could, without the tunnel, send a few pings with various packet sizes, first without DF set, and see if there are fragments along the line, and/or with the DF bit set and look for ICMP responses (which will not necessarily be received).
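
Something along these lines from the spoke, outside the tunnel, would do it (a sketch; [HUB1 PUBLIC IP] is the placeholder from the posted config and the sizes are arbitrary starting points):

```
! Without DF: a datagram larger than the path MTU should get fragmented in transit
ping [HUB1 PUBLIC IP] size 1500
!
! With DF set: step the size down until the pings start to succeed
ping [HUB1 PUBLIC IP] size 1500 df-bit
ping [HUB1 PUBLIC IP] size 1400 df-bit
ping [HUB1 PUBLIC IP] size 1300 df-bit
```

On IOS the size value is the whole IP datagram, headers included. An "M" in the ping output means a router on the path reported that it could not fragment; plain timeouts with DF set suggest the ICMP responses are being filtered somewhere.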

In VPN scenarios like this, endpoints are generally set to a much smaller MTU, like 1200 or so, so that if packets need to be fragmented, that already happens at the tunnel start point, and they are reassembled by the receiver.

It is always better to encap 2 fragments into 2 unfragmented GRE packets than to have one packet with a GRE header fragmented into 2 pieces.
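
A sketch of what that could look like on the spoke tunnel, using the 1200 figure above as an example value rather than a tuned one:

```
interface Tunnel0
 ! Fragment oversized packets before GRE/IPsec encapsulation
 ip mtu 1200
 ! Clamp TCP MSS to ip mtu minus 40 bytes (20 IP + 20 TCP) so TCP avoids fragmentation altogether
 ip tcp adjust-mss 1160
```

The same values would normally be applied on the hub tunnel interfaces so both ends agree.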

cheers!

xander

simon.patenaude · ‎02-18-2017

I should update this issue in case it is useful to anyone else. Turns out I was chasing the wrong issue.

I noticed that the EIGRP was flapping at the HUB while the SPOKE was not receiving any EIGRP requests from the HUB. I also noticed that the SPOKE had issues establishing EIGRP only when it was connected to a modem set in bridge mode. The HUB logged this message:

%CRYPTO-5-IKEV2_SESSION_STATUS: Crypto tunnel v2 is UP. Peer [SPOKE DHCP PUBLIC IP]:4500 Id: [SPOKE DHCP PUBLIC IP]

This is where EIGRP was flapping at the HUB. I could not ping the tunnel endpoints from the SPOKE or the HUB. This suggested a routing issue.

When the SPOKE was connected to a modem set as a router, the HUB gave me this message:

%CRYPTO-5-IKEV2_SESSION_STATUS: Crypto tunnel v2 is UP. Peer [SPOKE DHCP PUBLIC IP]:4500 Id: 192.168.1.10

EIGRP connected without any issues. This tells me that the tunnel source affects routing from the SPOKE to the HUB.

I made the following change to the Tunnel at the SPOKE:

Changed from tunnel source GigabitEthernet8 to tunnel source Loopback0
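
For reference, the change amounts to something like the following. Only the tunnel source line comes from the actual change; the Loopback0 address shown is a hypothetical placeholder, since the real one was not posted, and how the loopback address reaches the internet (for example through the NAT on GigabitEthernet8) is an assumption that depends on config not shown here.

```
! Hypothetical loopback address, for illustration only
interface Loopback0
 ip address 192.0.2.12 255.255.255.255
!
interface Tunnel0
 ! Replaces "tunnel source GigabitEthernet8"
 tunnel source Loopback0
```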

This has fixed the issue where I can connect to a bridged or non-bridged modem. That said, I have to admit I'm not sure I understand how the tunnel source helped.

I am still getting this message, but it does not seem to be affecting any traffic:

%ADJ-5-PARENT: Midchain parent maintenance for IP midchain out of Tunnel0, addr 10.0.0.11 - looped chain attempting to stack

xthuijs · ‎02-18-2017

Hi Simon, thanks for that update! The leftover message may be a recursive loop whereby the tunnel's physical address is advertised through EIGRP and also learnt that way. Check the router eigrp config and see if something needs to be updated.

The looped address is precisely 10.0.0.11, which seems to be a remote peer. I think we are advertising that via EIGRP while it is also a directly connected adjacency (because of our local subnet).

I also see why the tunnel source fixed up the situation: the tunnel is using the mapping between private and public addresses, so the tunnel source needs to be of the private kind that is mapped/found via the public one. So that resolution indeed makes sense.
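
A few show commands that may help confirm the loop (a sketch; [PEER PUBLIC IP] is a hypothetical placeholder for the public/NBMA address of the 10.0.0.11 peer):

```
! How is the peer's tunnel address being reached?
show ip route 10.0.0.11
! Which public (NBMA) address does NHRP map it to?
show ip nhrp 10.0.0.11
! The route to that public address should not point back into Tunnel0
show ip route [PEER PUBLIC IP]
```

If the last lookup resolves through Tunnel0, the tunnel is trying to reach its own endpoint through itself, which would fit the "looped chain" wording in the log message.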

cheers!

xander

simon.patenaude · ‎02-18-2017

Thank you so much Xander for all your help!

I will read up on the recursive loop but I do consider the original issue of this chain to be resolved.

Thank you,

Simon

DMVPN - ISP Stopping EIGRP Adjacency