We have several hundred branches with internet-based IPSEC VPN connections over ADSL (with 3G backup) terminating on the ASA firewalls in our datacentres. This has been well established and running for several years.
However, on Friday we saw a number of connections (around 20) drop, and our 3G failover did not kick in. (We have two IP SLAs set up monitoring two internet locations; if both fail, the 3G should kick in.) We got 3G to come up by manually disconnecting the ADSL cable. When we reconnected the ADSL cable, VPN connectivity dropped again. However, a short time later all became good again and we thought "that was weird" but just figured it was a "glitch" of some sort.
Of course, today we saw the same sort of thing. This time a dozen sites dropped and our 3G did not kick in. Further investigation showed that 3G did not kick in because the IP SLAs were still "up" on the device, i.e. the ADSL still had internet connectivity but did NOT have IPSEC VPN connectivity. (Again, manually disconnecting the ADSL cable could "force" the router to fail over to 3G.) This time the problem hasn't disappeared by itself, and we have had to leave the sites with their ADSL cables disconnected and running on 3G for the time being.
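To make the failover logic concrete, here is a rough sketch in Python rather than router config. It only models the decision described above (both internet probes must fail before 3G takes over). The function and parameter names are made up for illustration, and the tunnel-probe variant is purely hypothetical: it is not something we have configured, just a way of showing what the current condition does not look at.

```python
from typing import Optional


def should_fail_over(sla_a_up: bool, sla_b_up: bool,
                     tunnel_probe_up: Optional[bool] = None) -> bool:
    """Return True when the router should switch from ADSL to 3G.

    sla_a_up / sla_b_up: state of the two IP SLA probes to internet hosts.
    tunnel_probe_up: hypothetical extra probe to a host reachable only
    through the VPN; None means no such probe exists (the current setup).
    """
    internet_down = not sla_a_up and not sla_b_up
    if tunnel_probe_up is None:
        # Current behaviour: only loss of BOTH internet probes triggers 3G.
        return internet_down
    # Hypothetical variant: a dead tunnel also triggers 3G.
    return internet_down or not tunnel_probe_up


# Today's fault: internet probes answer, VPN is down, no tunnel probe.
print(should_fail_over(True, True))          # False
# Pulling the ADSL cable: both probes fail.
print(should_fail_over(False, False))        # True
# Same fault with the hypothetical tunnel probe in place.
print(should_fail_over(True, True, False))   # True
```

With both probes still answering, the function returns False, which matches what we saw: no failover until the cable was pulled.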
But what I can't understand is what could cause sites that had been working fine for a long time to suddenly stop working, even though it appears they have internet connectivity. It's as if the VPN traffic is somehow being blocked. I am convinced it has to be something at the ISP, but they deny any problems and suggest it's ACLs on our routers! (But we haven't changed any, and we have hundreds of other sites still connecting without issue.)
Can anyone suggest what the problem might be? If it is an ISP issue, what could it possibly be, and how do I prove to them it's down to them? What troubleshooting can I carry out to pinpoint where the problem lies?
Any suggestions are welcome because this one has me puzzled!
Is it possible you don't have static IP addresses from the provider at some of those locations? That's something I would double check. If your remote office IP addresses changed, it would affect the VPN connection. The connections would still appear to be working fine because you'd still have an internet connection.
So is it just the VPN that's dropping? Can the sites ping each other when the VPN goes down? I'd focus on just one remote site and ensure connectivity before troubleshooting the VPN.
After that, maybe run debug crypto isakmp and see if anything jumps out.
Bizarrely, some of the lines (that we had left disconnected overnight) came back up no problem when we reconnected their ADSL cables this morning.
However, the others remain down as far as VPN connectivity is concerned, but are authenticated and working according to our ISP. (We're struggling to ascertain from the sites what they can and can't do. We are the victims of our own draconian security: we nail the sites down quite heavily on what they can do themselves, so when we can't actually connect to their machines it's difficult to find out what's possible and what's not!)
We got one site to switch off their router for 30 minutes and then switch it back on again, which made no difference. We then got them to switch it off again and, in the meantime, I reconfigured another router with their ADSL account details and connected it to one of the ADSL lines in the office. To my surprise, it worked fine and could establish VPN connectivity without issue.
I have to admit I'm baffled.
The main thing I'd try to determine is if it's a VPN issue or an internet connection issue. When the VPN goes down you should still be able to ping the public IP address of that remote office. If you can't, then it may be a provider issue. Do you have the same provider for all of the sites that are having an issue? Are they in the same general geographic location?
Well, we never really managed to get to the bottom of this, but all the sites eventually restored VPN connectivity.
For the majority of them, we had left their ADSL cables disconnected for a decent amount of time (some overnight) before reconnecting them and all came back up. I don't know if there was any significance in that at all or if it was just coincidence.
But there were a couple that just re-established VPN connectivity without any action at all on our part (not even a reboot of the router).