File Transfers Dropping over Site-to-Site VPN Tunnels (GRE over IPSec)

jwhite100
Level 1

We have an issue on our network: when we transfer files from site to site using Windows, the transfer drops and we get the error message "network not available". It doesn't happen every time, but about nine times out of ten it does. Some tunnels work better than others. We currently have 4 sites.

I have posted on another forum. We have tried many things, but so far no one has been able to resolve the issue. Please check my other post for an update.

http://www.networking-forum.com/viewtopic.php?f=35&t=19354&p=120181#p120181

Marcin Latosiewicz
Cisco Employee

James,

I had a quick look at your post on the other forums.


I would suggest starting by checking the accelerator statistics, specifically to see whether any "ppq full" was reported.

I didn't go over the captures. You mention an RST arriving: where is it coming from, and how long does it take for the RST to appear?

More importantly:

1) When has the problem started?

2) Are there protocols (http, ftp) which are not affected?

3) Are there any firewalls or TCP acceleration technology in the topology?

Marcin

Thank you for your quick response.

How do I check the accelerator statistics? I have not heard of this before.

For protocols we are mainly using SMB and FTP. I haven't seen as many errors with FTP, but I have kicked one off now just to test.

The problem has been there since day one, when we switched from Microsoft ISA Server to Cisco for our site-to-site VPNs.

We don't have any firewalls and, as far as I know, we don't have any TCP acceleration technology.

For the packet capture: the RST occurs right at the end, when the transfer drops. First you get a load of TCP dups.

joyride_us2
Level 1

Try decreasing the keepalive timer on the IPSec tunnel.

I don't have a keepalive timer set. Which setting should I start with?

Lei Tian
Cisco Employee

Hi James,

Put 'crypto ipsec df-bit clear' on both ends and see if that helps.

HTH,

Lei Tian

Hi Lei,

We did have a routing policy set before, which had no effect. The command used was:

ip policy route-map CLEAR-DF-BIT

Does the 'crypto ipsec df-bit clear' have the same effect?

Hi James,

I don't know how your route-map is configured or where the PBR is applied. 'crypto ipsec df-bit clear' clears the DF bit before packets are encrypted, whereas the PBR clears the DF bit as packets enter the interface it is applied to.

HTH,

Lei Tian
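
For anyone following along, here is a minimal sketch of why the DF bit matters when a packet is larger than the tunnel MTU. It is a simplified model of generic IPv4 forwarding, not of IOS internals. The function name `handle_packet` is made up for illustration, and it assumes a 20-byte IPv4 header with no options.

```python
IPV4_HEADER = 20  # bytes, assuming an IPv4 header without options


def handle_packet(packet_len: int, df_set: bool, mtu: int):
    """Return (action, sizes) for one IPv4 packet reaching a link with this MTU.

    Hypothetical helper for illustration only; not a model of IOS itself.
    """
    if packet_len <= mtu:
        return ("forward", [packet_len])
    if df_set:
        # DF forbids fragmentation: the packet is dropped and the sender
        # should be sent an ICMP "fragmentation needed" message.
        return ("drop_icmp_frag_needed", [])
    # DF clear: split the payload. Every fragment except the last must
    # carry a payload that is a multiple of 8 bytes.
    chunk = (mtu - IPV4_HEADER) // 8 * 8
    if chunk <= 0:
        raise ValueError("MTU too small to carry any fragment payload")
    payload = packet_len - IPV4_HEADER
    sizes = []
    while payload > 0:
        part = min(chunk, payload)
        sizes.append(part + IPV4_HEADER)
        payload -= part
    return ("fragment", sizes)


if __name__ == "__main__":
    # A 1500-byte packet against the 1400-byte tunnel MTU used in this thread.
    print(handle_packet(1500, df_set=True, mtu=1400))
    print(handle_packet(1500, df_set=False, mtu=1400))
```

With DF set, the oversized packet is dropped and the sender depends on the ICMP message getting back to it. With DF cleared, the packet is fragmented instead, which is the behaviour the suggested command is meant to allow.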

I just applied "crypto ipsec df-bit clear" across all router tunnels. After initial tests it doesn't seem to be working. Transfers to 2 sites dropped straight away. Transfers to one site always go through OK. This is the site with the slowest link; I don't know if that has anything to do with it. That link doesn't get used as much either.

Hi James,

I am a little confused. In the initial post you said it fails 9 times out of 10. Now you say transfers to one site always work. Can you post a simple diagram and the config for the router in question? That may make it easier for people to help you with the troubleshooting.

HTH,

Lei Tian

This is what my colleague posted on the other forum. The Tampa and Reading sites are not included. Tampa is the only tunnel that works fine at the moment. We only have one member of staff there, so it isn't heavily used, which is what makes me think the problem has something to do with bandwidth. The command you gave me is on the routers now. I don't have time to get all the configs today; on Monday, when I am back at work, I will repost the most up-to-date configs for you.


Scenario
Cisco 2901 with VPN hardware module (10 Mb leased line) to Cisco 7201 without a module, so encryption is done in software (1 Gb transit). Transfers fail pretty quickly :-( Both routers show low CPU history.

Cisco 2901: we have two tunnels to the same network via 2 different routers at two different entry points.
vol-gateway#sh run inter tunn 1
Building configuration...

Current configuration : 273 bytes
!
interface Tunnel1
 description Volume to Blue Square House
 ip address 10.0.0.5 255.255.255.252
 ip mtu 1400
 ip tcp adjust-mss 1300
 tunnel source 212.*.*.157
 tunnel destination 95.*.*.1
 tunnel path-mtu-discovery
 tunnel protection ipsec profile VolumeVPN
!
end

vol-gateway#sh run inter tunn 2
Building configuration...

Current configuration : 288 bytes
!
interface Tunnel2
 description Volume to Blue Square 3
 ip address 10.0.0.9 255.255.255.252
 ip mtu 1400
 ip tcp adjust-mss 1300
 ip ospf cost 2000
 tunnel source 212.*.*.157
 tunnel destination 95.*.*.2
 tunnel path-mtu-discovery
 tunnel protection ipsec profile VolumeVPN
!
end

----------------------------

Cisco 7201 - Blue Square House

bsh-r1#show run inter tunn 0
Building configuration...

Current configuration : 287 bytes
!
interface Tunnel0
 description Blue Square House to Volume
 bandwidth 10000
 ip address 10.0.0.6 255.255.255.252
 ip mtu 1400
 ip tcp adjust-mss 1300
 tunnel source 95.*.*.1
 tunnel destination 212.*.*.157
 tunnel path-mtu-discovery
 tunnel protection ipsec profile VolumeVPN
end


Cisco 7201 - Blue Square 3
bs3-r1#sh run inter tunn 0
Building configuration...

Current configuration : 303 bytes
!
interface Tunnel0
 description Blue Square 3 to Volume
 bandwidth 10000
 ip address 10.0.0.10 255.255.255.252
 ip mtu 1400
 ip tcp adjust-mss 1300
 ip ospf cost 2000
 tunnel source 95.*.*.2
 tunnel destination 212.*.*.157
 tunnel path-mtu-discovery
 tunnel protection ipsec profile VolumeVPN
end
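
As a side note on the `ip mtu 1400` and `ip tcp adjust-mss 1300` values in the configs above, the arithmetic can be sketched as follows. This is a rough calculation, assuming 20-byte IPv4 and TCP headers without options and a basic 4-byte GRE header with no key or sequence number. IPSec overhead varies with the transform set and is deliberately not modelled. The helper names are made up for illustration.

```python
IPV4_HEADER = 20   # bytes, assuming no IP options
TCP_HEADER = 20    # bytes, assuming no TCP options
GRE_OVERHEAD = 24  # outer IPv4 header (20) + basic GRE header (4)


def max_mss_for_ip_mtu(ip_mtu: int) -> int:
    """Largest TCP MSS whose segments still fit in one packet of ip_mtu bytes."""
    return ip_mtu - IPV4_HEADER - TCP_HEADER


def mss_headroom(ip_mtu: int, configured_mss: int) -> int:
    """Spare bytes between the configured MSS and the largest MSS that fits."""
    return max_mss_for_ip_mtu(ip_mtu) - configured_mss


def gre_packet_size(inner_packet_len: int) -> int:
    """Size after GRE encapsulation, before any IPSec overhead is added."""
    return inner_packet_len + GRE_OVERHEAD


if __name__ == "__main__":
    print("max MSS for ip mtu 1400:", max_mss_for_ip_mtu(1400))
    print("headroom with adjust-mss 1300:", mss_headroom(1400, 1300))
    print("full-size tunnel packet after GRE:", gre_packet_size(1400))
```

Under these assumptions the configured MSS of 1300 is below the 1360 that would fit in a 1400-byte tunnel MTU, so TCP segments by themselves should not exceed the tunnel MTU. Non-TCP traffic is not affected by `adjust-mss`.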

Hi James,

Bandwidth could be the issue; you can confirm that by sending a small amount of data to the sites and seeing whether it works.

Besides bandwidth, you might also want to make sure there is no asymmetric routing. I see that on Volume you are modifying the OSPF cost to prefer bs3-r1 as the primary path. Depending on your setup, if Square House uses bsh-r1 as the primary path to reach Volume, then that is asymmetric routing, which will cause problems with IPSec.

HTH,

Lei Tian

edit: ignore that, this shouldn't cause this problem.

We ended up shutting down the second route to the Blue Square data center. My colleague did a tracert to one of the servers at Blue Square and it was taking a longer route, but I am not sure exactly what was happening. The previous network manager had the bandwidth command set on the tunnels, which I understand doesn't actually set the bandwidth; it is only used by EIGRP to find the best route.

Is there some way to limit the bandwidth on tunnels? Would QoS help? We are thinking of implementing this.

The other day I sent 2 GB from Tampa to Volume (the hub site) and it went through fine. I don't think the size of the file matters, because sometimes the file transfers drop immediately whereas other times they get about half way. Using FTP is more reliable than using SMB in Windows, so we have advised staff to use it as a temporary measure.

This is the bandwidth of the sites:

Volume (the hub)           10 Mb
Vee                         7 Mb
Blue Square Data Center    10 Gb (I think)
Tampa                       1 Mb

Blue Square is the worst for dropping. Tampa works fine and Vee is intermittent.

I installed Cisco Configuration Professional on Friday and was monitoring the tunnels on the hub router. The Blue Square monitor seemed to be going up and down all the time, whereas Vee was running steady at about half way. The Tampa monitor shot to the max when I did the transfer and stayed steady until the transfer was done. The protocols monitor was showing IPSec as taking about 1/3 of the bitrate, I think. I hope this information helps.

Thanks
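
To put the link speeds listed above in perspective, here is a rough best-case estimate of how long a 2 GB transfer would take between sites. It assumes the figures are in megabits or gigabits per second, that 2 GB means 2,000,000,000 bytes, and that the slower of the two site links is the bottleneck. Protocol and tunnel overhead are ignored, so real transfers will be slower. The names are made up for illustration.

```python
# Site link speeds in bits per second (assumed units: Mbit/s and Gbit/s).
LINKS = {
    "Volume": 10_000_000,
    "Vee": 7_000_000,
    "Blue Square": 10_000_000_000,
    "Tampa": 1_000_000,
}


def bottleneck_bps(site_a: str, site_b: str) -> int:
    """The slower of the two site links limits the end-to-end rate."""
    return min(LINKS[site_a], LINKS[site_b])


def ideal_transfer_seconds(size_bytes: int, bits_per_second: int) -> float:
    """Best-case transfer time with no protocol or tunnel overhead."""
    return size_bytes * 8 / bits_per_second


if __name__ == "__main__":
    size = 2_000_000_000  # 2 GB, decimal
    for src, dst in [("Tampa", "Volume"), ("Volume", "Blue Square"), ("Volume", "Vee")]:
        secs = ideal_transfer_seconds(size, bottleneck_bps(src, dst))
        print(f"{src} -> {dst}: at least {secs / 3600:.2f} hours")
```

On these assumptions the Tampa transfer would have needed well over four hours at line rate, while a transfer to Blue Square is limited by the 10 Mb hub link rather than by the data center side.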

Hi James,

That information definitely helps. Now, my question is: what is the traffic direction? Is it from Blue Square to Volume? Does the application have the ability to adjust its transmit rate based on congestion of the link?

When you say "Blue square monitor seemed to be going up and down", do you mean the tunnel interface on Blue Square is going up and down?

Regards,

Lei Tian

Hello Lei,


Yes, I mean the tunnel interface monitor. At Blue Square we have our web servers and various other servers. The developers deliver a lot of files to this site, so the traffic is mainly from Volume to Blue Square. For my testing I was dragging and dropping some ISO files from my C: drive to all the sites. The 2 GB transfer I did the other day was from Tampa to Volume. What I meant by going up and down is that Blue Square seems to be in constant use, whereas when I wasn't transferring from Tampa the activity was minimal, and Vee was about half way. When I was transferring to Blue Square the monitor would shoot to the top for a few seconds then drop, and at the same point my transfer would fail.

What do you mean by "Does the application have the ability to adjust transmit rate based on congestion of the link"? Which application are you referring to? Cisco CDP or Windows itself? I don't know how to use CDP well yet; I have only just started using it.

Thanks for your help