Hello, I have multiple locations connected by GRE over IPsec tunnels running on Cisco routers. There are two main hub sites and several remote sites. Most remote sites have two different ISPs and three tunnels total coming out of the remote location: one tunnel over ISP1 to Hub1, one tunnel over ISP1 to Hub2, and one tunnel over ISP2 to Hub1.
In front of the Cisco routers are Cisco ASA firewalls, so I have ACLs that pass the GRE traffic through; NAT is handled on the same firewalls.
Encryption is done on the routers, and either side can bring up the tunnels. EIGRP is used to advertise routes over the tunnel interfaces, and all of that works fine.
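For context, the routing side is nothing exotic; it is along these lines on each router (the AS number here is just a placeholder, and the networks are sanitized like the configs below):

```
router eigrp 100
 network x.x.x.x 0.0.0.3
 no auto-summary
```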
The issue is with one location that has 4 bonded T1s coming into a Cisco 1800 series router for the Internet circuits. Behind the Cisco ASA firewall is a Cisco 1841 router that has the tunnel interfaces configured, and it experiences MTU issues every time the site goes down. It does not matter whether it is a link failure, power failure, or scheduled outage. As soon as the site is off the network, the next time it comes back up users complain that the Telnet application they use is timing out.
When I log into the routers and look (specifically on the remote side), I see a huge number of output drops on the tunnel interfaces going to the hub sites. To fix it I end up lowering the MTU parameters (ip mtu 700 and ip tcp adjust-mss 650) on both sides, and the problem goes away until the next outage.
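This is roughly what I am looking at when I check (tunnel number as in the config below):

```
show interfaces Tunnel8 | include drops
show interfaces Tunnel8 | include MTU
```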
I also do a ping with the DF bit set to find the largest packet I can get across without fragmentation, and then I adjust accordingly. I have been doing this for a while now; I think I started out with ip mtu 1100 and adjust-mss at 1050, and the values above are where I am currently at. At this rate I am concerned I will run out of room soon. I obviously do not understand enough about MTU here, as I thought that with the appropriate commands the router would dynamically adjust the MTU, or the hosts would dynamically adjust their packet size and resend with the corrected value, and thus I should not have to keep doing this.
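The DF-bit test I mentioned is just a ping from the router; roughly this (the destination here is a placeholder for the far-side tunnel address):

```
ping 192.0.2.1 size 1400 df-bit
```

My understanding of the fixed overhead is roughly 20 bytes for the outer IP header, 4 bytes for GRE, and somewhere around 56-60 bytes for ESP in tunnel mode, so on a clean 1500-byte path something near 1400 should get through, which is why I do not understand ending up down at 700.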
This only happens at this one location. The hub sites have 2921 routers. A partial config is below. Any help would be greatly appreciated.
Remote Router
ip icmp rate-limit unreachable 50
ip icmp rate-limit unreachable DF 50
ip tcp path-mtu-discovery
interface FastEthernet0/0
ip tcp adjust-mss 1436
interface Tunnel8
bandwidth 6000
ip address x.x.x.x 255.255.255.252
ip mtu 700
ip flow ingress
ip flow egress
ip tcp adjust-mss 650
delay 5000
tunnel source x.x.x.x
tunnel destination x.x.x.x
tunnel path-mtu-discovery
tunnel protection ipsec profile internet
HUB1 Router
ip icmp rate-limit unreachable 50
ip icmp rate-limit unreachable DF 50
ip tcp path-mtu-discovery
ip tcp synwait-time 10
interface Tunnel8
bandwidth 6000
ip address x.x.x.x 255.255.255.252
ip mtu 700
ip flow ingress
ip flow egress
ip tcp adjust-mss 650
delay 5000
tunnel source x.x.x.x
tunnel destination x.x.x.x
tunnel path-mtu-discovery
tunnel protection ipsec profile internet