07-04-2017 11:36 PM - edited 03-08-2019 11:12 AM
I am having an unusual problem. Cisco TAC said nothing looks wrong on the ASA 5516/5510 firewalls, but I am seeing significant packet drops that started over this past weekend. These tunnels have been running for years.
I have 6 site-to-site tunnels:
Site HQ -> OKC (AT&T to Cox Communications): zero packet loss
Site HQ -> Durant, OK (AT&T to AT&T): 60-80% packet loss
Site HQ -> Reno, NV (AT&T to AT&T): 20-40% packet loss
Site HQ -> Thacker, OK (AT&T to INFO ISP): zero packet loss
Site HQ -> Tulsa, OK (AT&T to Cox): zero packet loss
Site HQ -> Pocola, OK (AT&T to Cox): zero packet loss
Site HQ -> Azure Cloud in Dallas: zero packet loss (uses VTI with static routes, IKEv2)
These all use the exact same transform sets, encryption, and crypto map definitions (differing only in topology).
All of these were working with zero packet loss for years.
I can ping internally from any server to the inside gateway with zero packet loss.
Site Durant, OK -> OKC (AT&T to Cox): zero packet loss
Site Reno, NV -> OKC (AT&T to Cox): zero packet loss
Any thoughts on how to debug this problem with my site-to-site traffic?
It appears there are no drops when going to the public IP of the server from either the Reno site or the Durant site.
I have no idea what is causing this for two different sites.
07-04-2017 11:41 PM
Hello,
what is terminating the tunnels at the 'problem' sites? And by packet loss, do you mean you are losing pings, or is the actual traffic affected too?
07-04-2017 11:51 PM
Site HQ -> Reno, NV (ASA 5516-X -> ASA 5510), IKEv1 tunnel
Reno, NV has a U-verse-type modem terminating the connection before it hits our firewall interface port.
Site HQ -> Durant, OK (ASA 5516-X -> ASA 5516-X), IKEv1 tunnel
Durant, OK has a Cisco router from AT&T terminating their Fiber
They both drop pings. This has never happened before.
Both are running Remote Desktop (Terminal Services) RemoteApp programs extremely slowly.
So I assume actual traffic is dropping. If I have both sites, Durant and Reno, NV, point to OKC, there is no slowness or dropped traffic.
My most recent change to Site HQ was adding the VTI configuration and a static route for the Azure Cloud tunnel. But that was months ago, and this issue started this past weekend, less than a week ago.
Any thoughts on some debug tests?
Also, it may be worth noting that my client IPsec VPN connection, even over an LTE cellular data connection, doesn't drop packets, and the RemoteApps run great. That isn't a site-to-site connection, just the Cisco VPN client to Site HQ.
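One debug test I'm thinking of trying (a rough sketch only; the inside hosts and public peer addresses below are placeholders, not my real ones) is a packet capture on both ASAs to see whether the pings leave HQ and simply never arrive, or whether they die somewhere on one side:
! the inside hosts and the two public IPs below are placeholders
capture CAP_IN interface inside match icmp host 10.1.1.10 host 10.2.1.10
capture CAP_OUT interface outside match ip host 198.51.100.1 host 203.0.113.10
show capture CAP_IN
show capture CAP_OUT
no capture CAP_IN
no capture CAP_OUT
Running the same captures on the Durant ASA should show whether the loss happens before encryption, out in the provider path, or after decryption.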
07-05-2017 12:24 AM
Hello,
depending on the severity of the problem, and if you have downtime available, you could change the tunnel configuration from IKEv1 to IKEv2, since v2 allows you to change the MTU settings; often, MTU is the problem.
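If it does turn out to be MTU-related, a couple of commands worth testing (a minimal sketch; 1400 and 1350 are common starting values, not numbers verified for your links) are lowering the outside MTU and clamping the TCP MSS so the encrypted packets do not get fragmented:
! example values only - adjust for your links
mtu outside 1400
sysopt connection tcpmss 1350
crypto ipsec df-bit clear-df outside
The MSS clamp in particular often helps RDP/RemoteApp sessions over IPsec, since it keeps the TCP payload small enough to fit once the ESP overhead is added.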
I would start out with sending a traceroute to find out at which hop traffic is slow and being dropped.
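For example, from the ASA itself (the address below is just a placeholder for the remote peer's public IP):
! 203.0.113.10 is a placeholder for the remote peer
traceroute 203.0.113.10
ping outside 203.0.113.10 repeat 100 size 1400
Comparing the loss with small versus large ping sizes can also hint at an MTU/fragmentation problem along the path.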
On the ASA itself, do a 'show interface' on the outside interfaces to make sure there are no dropped packets due to congestion. If you see dropped packets, you could change the flowcontrol settings on the interfaces...
These are just a few suggestions. You could obviously also ask your ISP to perform an end-to-end test. PING traffic often has the lowest priority in provider networks, so if they have congestion in their network, ICMP will be the first traffic affected...
07-05-2017 11:32 AM
ASA Site HQ =
47529936 packets input, 50224260199 bytes, 0 no buffer
Received 16331 broadcasts, 0 runts, 0 giants
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
0 pause input, 0 resume input
0 L2 decode drops
29150389 packets output, 14471701861 bytes, 0 underruns
0 pause output, 0 resume output
0 output errors, 0 collisions, 0 interface resets
0 late collisions, 0 deferred
0 input reset drops, 0 output reset drops
input queue (blocks free curr/low): hardware (1968/1820)
output queue (blocks free curr/low): hardware (2047/1950)
Traffic Statistics for "outside":
47486618 packets input, 49358916032 bytes
29150389 packets output, 13919196531 bytes
121808 packets dropped
1 minute input rate 2190 pkts/sec, 2242279 bytes/sec
1 minute output rate 1529 pkts/sec, 1116012 bytes/sec
1 minute drop rate, 4 pkts/sec
5 minute input rate 2415 pkts/sec, 2735667 bytes/sec
5 minute output rate 1223 pkts/sec, 346348 bytes/sec
5 minute drop rate, 5 pkts/sec
ASA Durant, OK ->
8284971 packets input, 7349024806 bytes, 0 no buffer
Received 1118 broadcasts, 0 runts, 0 giants
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
0 pause input, 0 resume input
0 L2 decode drops
5830322 packets output, 2069423489 bytes, 0 underruns
0 pause output, 0 resume output
0 output errors, 0 collisions, 0 interface resets
0 late collisions, 0 deferred
0 input reset drops, 0 output reset drops
input queue (blocks free curr/low): hardware (2015/1839)
output queue (blocks free curr/low): hardware (2047/1853)
Traffic Statistics for "outside":
8283586 packets input, 7192981759 bytes
5830322 packets output, 1950841224 bytes
114932 packets dropped
1 minute input rate 1010 pkts/sec, 765459 bytes/sec
1 minute output rate 837 pkts/sec, 177509 bytes/sec
1 minute drop rate, 11 pkts/sec
5 minute input rate 902 pkts/sec, 771511 bytes/sec
5 minute output rate 686 pkts/sec, 141070 bytes/sec
5 minute drop rate, 9 pkts/sec
This is what I see for drops in 'show interface' on the outside interfaces.
07-05-2017 12:42 PM
Hello,
a certain number of drops are normal behavior for the ASA. You can use the command 'show asp drop' to get more specific information on the packet drops.
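For example, you could clear the counters and then see which drop reasons are incrementing while the problem is happening (the reasons listed will vary by platform and software version):
clear asp drop
show asp drop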
You could try basic traffic shaping on your outside interfaces. In the example below, you shape to 20 Mbps (change the value accordingly).
policy-map SHAPE_OUT
 class class-default
  shape average 20000000
!
service-policy SHAPE_OUT interface outside
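Once applied, 'show service-policy interface outside' should list the policy with its counters, so you can confirm whether the shaper is actually engaging.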
07-06-2017 11:20 PM
Thanks for your help!
We isolated the problem to a flapping BGP handoff between AT&T and Cogentco. There was a BGP route sending us through Dallas, TX, and a router there was creating the issue. Actually, all of AT&T's traffic was hitting this point before it came to our main DC. I ended up getting my ISP to block the Cogentco BGP peers and use Cox Communications instead. That ended up fixing everything. Cogentco's network is horrible.
07-07-2017 12:03 AM
Nathan,
good stuff, glad that you found the problem and got it fixed!
07-07-2017 12:09 AM
Hopefully they will fix it permanently. They said they can only drop Cogentco temporarily. My new goal is to configure all 9 of my sites over VTI and BGP on top of my tunneled mesh to find alternate routes. For example, my ISP route from OKC to Dallas is slower than my Tulsa route to Dallas. But if I add up OKC -> Tulsa site -> Dallas site, it's dramatically fewer milliseconds than going straight to Dallas from OKC. My goal is to proxy the best route across my mesh network. I have never done it, but it will be an exciting challenge. I also noticed that setting up a vASA in specific Azure or AWS regions has some great proxy benefits for connecting my sites. I know there isn't much I can do when it comes to the magic of BGP over the web, but I think I can at least put some spin on the packets.
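Per site, I'm picturing something roughly like this (just a sketch of the idea; the tunnel addressing, peer IP, AS numbers, and profile name are made up, not my running config), so each ASA learns the mesh prefixes over its VTIs and BGP can pick the best path:
! tunnel IPs, peer address, AS numbers, and profile name are examples only
interface Tunnel1
 nameif vti-tulsa
 ip address 10.255.0.1 255.255.255.252
 tunnel source interface outside
 tunnel destination 203.0.113.20
 tunnel mode ipsec ipv4
 tunnel protection ipsec profile VTI-PROFILE
!
router bgp 65001
 address-family ipv4 unicast
  neighbor 10.255.0.2 remote-as 65002
  neighbor 10.255.0.2 activate
  network 10.10.0.0 mask 255.255.0.0
With every site peering over its VTIs like that, the OKC -> Tulsa -> Dallas path could be preferred (for example via local-preference or AS-path prepending) whenever the direct OKC -> Dallas route is the slow one.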
07-07-2017 12:36 AM
Tricky indeed... unless you have multiple BGP entry points at your sites (and even then), you are at the mercy of your ISP. That said, the fact that you got your ISP to block the route is a good sign, because it means they are working WITH you...
07-05-2017 12:26 PM
1) Changed to VTI / IKEv2 = no luck
2) Changed the MTU on the outside (lowered it), but no luck
3) Waiting on AT&T to test after hours.