04-10-2018 09:21 AM - edited 02-21-2020 07:37 AM
I have a DMVPN setup as well as another SD-WAN solution. The tunnels for both solutions flow through an ASA for Internet access. My problem is that when one end of a tunnel flaps or briefly goes down, the tunnel will not come back up.
I have narrowed it down to flows through the ASA. If I clear the connections on the ASA the tunnels will reform. It seems that the tunnels try to use an existing connection on the ASA when coming back up and that fails.
Any suggestions?
04-10-2018 10:11 AM
Hi,
If one of the spoke routers' tunnels flaps, you might want to configure Dead Peer Detection. If configured, the router periodically checks whether it can still communicate with the peer router; if it loses connectivity, it deletes the SA.
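As a minimal sketch of what that could look like on an IOS router, assuming IKEv1 (ISAKMP) is in use; the timer values are illustrative and not taken from this thread:

```
! IOS global configuration (IKEv1 assumed).
! Send a DPD probe every 10 seconds; if one goes unanswered, retry every 3 seconds.
! "periodic" sends probes on a fixed schedule rather than only on demand.
crypto isakmp keepalive 10 3 periodic
```

With IKEv2 the DPD setting lives in the IKEv2 profile instead, so check which IKE version your DMVPN uses first.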
Reference:
HTH
04-10-2018 01:50 PM
Thanks, but that is not really my problem. The tunnels are trying to reform after a link flaps. The issue seems to be that, since these UDP flows go through the ASA, they try to reuse the ASA connection/flow they had before the link flapped.
I can reset tunnels/nhrp etc. all day, but the only thing that brings the tunnels back up is clearing the connection on the ASA.
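For anyone following along, this is roughly what I run on the ASA. The address 198.51.100.10 is a placeholder for the remote tunnel peer, not a real address from my setup:

```
! ASA privileged EXEC. 198.51.100.10 is a placeholder peer address.
! List the connections the ASA still holds for that peer.
show conn address 198.51.100.10
! Clear only that peer's connections so the tunnel can build a fresh flow.
clear conn address 198.51.100.10
```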
04-10-2018 09:20 PM
Does the ASA keep the udp 500 or the udp 4500 flows? If it is udp 500, you can try disabling DPD on the peers. Without DPD, the peers stop sending the keepalives that hold the udp 500 connection open on the ASA. When the interface flaps on one end, the udp 500 packet it sends then does not match an existing ASA connection and goes through.
UDP 4500 is a little trickier. Since both negotiation and encrypted flows use the same udp 4500 ports, the connection will always match, because one side keeps sending encrypted traffic.
Now a possible solution here is to use EEM scripts on the ASA. On the ASA, you can create SLA monitoring to check the reachability to both peers. If one of them goes down, you can issue a clear connection command (or clear local-host) for that particular host alone. This will clear the connection so that the peer can establish a new session. The timers may have to be tweaked to get this right, but I feel that this should work in theory. An example of the EEM script (used for another purpose) is here:
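The linked example is not reproduced here. As a rough, untested sketch of the idea itself, assuming an ASA release that supports EEM and assuming the applet is triggered by syslog 622001 (logged when a tracked route is added or removed), it might look like the following. The peer address 198.51.100.10, the gateway 203.0.113.1, the interface name outside, the object numbers, and the applet name are all placeholders:

```
! ASA configuration sketch. All addresses, IDs, and names are placeholders.
! Probe the remote peer with ICMP every 10 seconds.
sla monitor 10
 type echo protocol ipIcmpEcho 198.51.100.10 interface outside
 frequency 10
sla monitor schedule 10 life forever start-time now
track 10 rtr 10 reachability
! Tie the track object to a host route so a state change produces syslog 622001.
route outside 198.51.100.10 255.255.255.255 203.0.113.1 1 track 10
! On that syslog, clear only the connections for this peer.
event manager applet CLEAR-PEER-CONN
 event syslog id 622001
 action 1 cli command "clear conn address 198.51.100.10"
 output none
```

Note that 622001 is logged both when the tracked route is removed and when it is re-added, so the applet would fire on both transitions. Verify the trigger and the timers on your software version before relying on it.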
04-22-2018 08:54 AM
Try changing the floating-conn timeout; this should fix the problem. By default, this timeout is set to 0, which disables it. Change it to a low value such as 1 minute. This should help with the failover.
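A minimal example of that change on the ASA, using the 1 minute value suggested above:

```
! ASA global configuration. The default is 0:00:00 (disabled).
timeout floating-conn 0:01:00
! Confirm the configured timeouts.
show running-config timeout
```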
04-23-2018 01:56 PM