07-04-2017 11:36 PM - edited 03-08-2019 11:12 AM
I am having an unusual problem. Cisco TAC said nothing looks wrong on the ASA 5516/5510 firewalls, but I am seeing significant packet drops that started over this past weekend. These tunnels have been running for years.
I have 6 site-to-site tunnels:
Site HQ -> OKC (AT&T to Cox Communications): zero packet loss
Site HQ -> Durant, OK (AT&T to AT&T): 60-80% packet loss
Site HQ -> Reno, NV (AT&T to AT&T): 20-40% packet loss
Site HQ -> Thacker, OK (AT&T to INFO ISP): zero packet loss
Site HQ -> Tulsa, OK (AT&T to Cox): zero packet loss
Site HQ -> Pocola, OK (AT&T to Cox): zero packet loss
Site HQ -> Azure Cloud in Dallas: zero packet loss (uses VTI with static routes, IKEv2)
These all use the exact same transform sets, encryption, and crypto map definitions (differing only in topology).
All of these were working with zero packet loss for years.
I can ping internally from any server to the inside gateway with zero packet loss.
Site Durant, OK -> OKC (AT&T to Cox): zero packet loss
Site Reno, NV -> OKC (AT&T to Cox): zero packet loss
Any thoughts on how to debug this problem with my site-to-site traffic?
It appears there are no drops when going to the public IP of the server from either the Reno site or the Durant site.
I have no idea what is causing this for two different sites.
07-04-2017 11:41 PM
Hello,
what is terminating the tunnels at the 'problem' sites? And by packet loss, do you mean you are losing pings, or is the actual traffic affected too?
07-04-2017 11:51 PM
Site HQ -> Reno, NV (ASA 5516-X -> ASA 5510), IKEv1 tunnel
Reno, NV has a U-verse-type modem terminating the connection before it hits our firewall interface port.
Site HQ -> Durant, OK (ASA 5516-X -> ASA 5516-X), IKEv1 tunnel
Durant, OK has a Cisco router from AT&T terminating their Fiber
They both drop pings. This has never happened before.
Both are running Remote Desktop (Terminal Services) RemoteApp programs extremely slowly.
So I assume actual traffic is dropping. If I have both sites, Durant and Reno, NV, point to OKC, there is no slowness or dropped traffic.
My most recent change to Site HQ was adding the VTI configuration and a static route for the Azure Cloud tunnel. But that was months ago, and this issue started this past weekend, less than a week ago.
Any thoughts on some debug tests?
Also, it may be worth noting that my client IPsec VPN connection, even over an LTE cellular data connection, doesn't drop packets, and the RemoteApps run great. That isn't a site-to-site connection, just the Cisco VPN client to Site HQ.
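One debug test I'm thinking of trying (a rough sketch only; the inside hosts and public peer addresses below are placeholders, not my real ones) is a packet capture on both ASAs to see whether the pings leave HQ and simply never arrive, or whether they die somewhere on one side:
! the inside hosts and the two public IPs below are placeholders
capture CAP_IN interface inside match icmp host 10.1.1.10 host 10.2.1.10
capture CAP_OUT interface outside match ip host 198.51.100.1 host 203.0.113.10
show capture CAP_IN
show capture CAP_OUT
no capture CAP_IN
no capture CAP_OUT
Running the same captures on the Durant ASA should show whether the loss happens before encryption, out in the provider path, or after decryption.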
07-05-2017 12:24 AM
Hello,
depending on the severity of the problem, and if you have downtime available, you could change the tunnel configuration from IKEv1 to IKEv2, since v2 allows you to change the MTU settings; often, MTU is the problem.
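If it does turn out to be MTU-related, a couple of commands worth testing (a minimal sketch; 1400 and 1350 are common starting values, not numbers verified for your links) are lowering the outside MTU and clamping the TCP MSS so the encrypted packets do not get fragmented:
! example values only - adjust for your links
mtu outside 1400
sysopt connection tcpmss 1350
crypto ipsec df-bit clear-df outside
The MSS clamp in particular often helps RDP/RemoteApp sessions over IPsec, since it keeps the TCP payload small enough to fit once the ESP overhead is added.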
I would start out with sending a traceroute to find out at which hop traffic is slow and being dropped.
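For example, from the ASA itself (the address below is just a placeholder for the remote peer's public IP):
! 203.0.113.10 is a placeholder for the remote peer
traceroute 203.0.113.10
ping outside 203.0.113.10 repeat 100 size 1400
Comparing the loss with small versus large ping sizes can also hint at an MTU/fragmentation problem along the path.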
On the ASA itself, do a 'show interface' on the outside interfaces to make sure there are no dropped packets due to congestion. If you see dropped packets, you could change the flowcontrol settings on the interfaces...
These are just a few suggestions. You could obviously also ask your ISP to perform an end-to-end test. PING traffic often has the lowest priority in provider networks, so if they have congestion in their network, ICMP will be the first traffic affected...
07-05-2017 11:32 AM
ASA Site HQ =
47529936 packets input, 50224260199 bytes, 0 no buffer
Received 16331 broadcasts, 0 runts, 0 giants
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
0 pause input, 0 resume input
0 L2 decode drops
29150389 packets output, 14471701861 bytes, 0 underruns
0 pause output, 0 resume output
0 output errors, 0 collisions, 0 interface resets
0 late collisions, 0 deferred
0 input reset drops, 0 output reset drops
input queue (blocks free curr/low): hardware (1968/1820)
output queue (blocks free curr/low): hardware (2047/1950)
Traffic Statistics for "outside":
47486618 packets input, 49358916032 bytes
29150389 packets output, 13919196531 bytes
121808 packets dropped
1 minute input rate 2190 pkts/sec, 2242279 bytes/sec
1 minute output rate 1529 pkts/sec, 1116012 bytes/sec
1 minute drop rate, 4 pkts/sec
5 minute input rate 2415 pkts/sec, 2735667 bytes/sec
5 minute output rate 1223 pkts/sec, 346348 bytes/sec
5 minute drop rate, 5 pkts/sec
ASA Durant, OK ->
8284971 packets input, 7349024806 bytes, 0 no buffer
Received 1118 broadcasts, 0 runts, 0 giants
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
0 pause input, 0 resume input
0 L2 decode drops
5830322 packets output, 2069423489 bytes, 0 underruns
0 pause output, 0 resume output
0 output errors, 0 collisions, 0 interface resets
0 late collisions, 0 deferred
0 input reset drops, 0 output reset drops
input queue (blocks free curr/low): hardware (2015/1839)
output queue (blocks free curr/low): hardware (2047/1853)
Traffic Statistics for "outside":
8283586 packets input, 7192981759 bytes
5830322 packets output, 1950841224 bytes
114932 packets dropped
1 minute input rate 1010 pkts/sec, 765459 bytes/sec
1 minute output rate 837 pkts/sec, 177509 bytes/sec
1 minute drop rate, 11 pkts/sec
5 minute input rate 902 pkts/sec, 771511 bytes/sec
5 minute output rate 686 pkts/sec, 141070 bytes/sec
5 minute drop rate, 9 pkts/sec
This is what I see for drops in 'show interface' on the outside interfaces.
07-05-2017 12:42 PM
Hello,
a certain number of drops are normal behavior for the ASA. You can use the command 'show asp drop' to get more specific information on the packet drops.
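For example, you could clear the counters and then see which drop reasons are incrementing while the problem is happening (the reasons listed will vary by platform and software version):
clear asp drop
show asp drop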
You could try basic traffic shaping on your outside interfaces. In the example below, you shape to 20 Mbps (change the value accordingly).
policy-map SHAPE_OUT
 class class-default
  shape average 20000000
!
service-policy SHAPE_OUT interface outside
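Once applied, 'show service-policy interface outside' should list the policy with its counters, so you can confirm whether the shaper is actually engaging.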
07-06-2017 11:20 PM
Thanks for your help!
We isolated the problem to a flapping BGP handoff between AT&T and Cogentco. There was a BGP route sending us through Dallas, TX, and a router there was creating the issue. Actually, all of AT&T's traffic was hitting this point before it came to our main DC. I ended up getting my ISP to block the Cogentco BGP peers and use Cox Communications instead. That ended up fixing everything. Cogentco's network is horrible.
07-07-2017 12:03 AM
Nathan,
good stuff, glad that you found the problem and got it fixed!
07-07-2017 12:09 AM
Hopefully they will fix it permanently. They said they can only drop Cogentco temporarily. My new goal is to configure all 9 of my sites over VTI and BGP on top of my tunneled mesh to find alternate routes. For example, my ISP route from OKC to Dallas is slower than my Tulsa route to Dallas. But if I add up OKC -> Tulsa site -> Dallas site, it's dramatically fewer milliseconds than going straight to Dallas from OKC. My goal is to proxy the best route across my mesh network. I have never done it, but it will be an exciting challenge. I also noticed that setting up a vASA in specific Azure or AWS regions has some great proxy benefits for connecting my sites. I know there isn't much I can do when it comes to the magic of BGP over the web, but I think I can at least put some spin on the packets.
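Per site, I'm picturing something roughly like this (just a sketch of the idea; the tunnel addressing, peer IP, AS numbers, and profile name are made up, not my running config), so each ASA learns the mesh prefixes over its VTIs and BGP can pick the best path:
! tunnel IPs, peer address, AS numbers, and profile name are examples only
interface Tunnel1
 nameif vti-tulsa
 ip address 10.255.0.1 255.255.255.252
 tunnel source interface outside
 tunnel destination 203.0.113.20
 tunnel mode ipsec ipv4
 tunnel protection ipsec profile VTI-PROFILE
!
router bgp 65001
 address-family ipv4 unicast
  neighbor 10.255.0.2 remote-as 65002
  neighbor 10.255.0.2 activate
  network 10.10.0.0 mask 255.255.0.0
With every site peering over its VTIs like that, the OKC -> Tulsa -> Dallas path could be preferred (for example via local-preference or AS-path prepending) whenever the direct OKC -> Dallas route is the slow one.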
07-07-2017 12:36 AM
Tricky indeed... unless you have multiple BGP entry points at your sites (and even then), you are at the mercy of your ISP. That said, the fact that you got your ISP to block the route is a good sign, because it means they are working WITH you...
07-05-2017 12:26 PM
1) Changed to VTI / IKEv2 = no luck
2) Changed the MTU on the outside (lowered it), but no luck
3) Waiting on AT&T to test after hours.