
Packet loss and high latency at remote sites

stoopdude1982
Level 1

I'm dealing with an issue where all remote sites on the network experience brief packet drops and high latency at the same time every hour, every day. The network is currently half VeloCloud SD-WAN and half MPLS using DMVPN over EIGRP, and both the VeloCloud and the MPLS remote sites are affected. It happens like clockwork, usually at 13 minutes past the hour. We discovered it about 4 weeks ago, but it has most likely been going on longer. It took a while to put the pieces together; we noticed it because of the effect it has on our voice and video services. A stack of 3850s is the aggregation point for all routes and performs the routing between the MPLS network and the VeloCloud network. The MPLS network is all EIGRP, while the VeloCloud network uses static routes back to the 3850 stack. Below are notes on the network setup and the things I have tried so far. I am open to all ideas, as I am running out of things to try.

Notes

-VeloCloud uses dynamic VPN tunnels between the Hub and its remote sites

-Connection to the VeloCloud Sites relies on all static routes to and from the core switch

-Connection to our MPLS network uses EIGRP over a port-channel from our core router to our core switch

-MPLS network is DMVPN over EIGRP

-All traffic from both VeloCloud and MPLS sites needing to access the Internet routes to the core switch and then to the firewall

Problem

-Both VeloCloud and MPLS remote sites are experiencing brief packet loss or high latency at the same time every hour

Symptoms

-Remote sites experience brief packet loss or high latency at the same time every hour (example: 9:13, 10:13, 11:13)

-We have noticed that the busier the link, the more it is affected. A link at high utilization loses packets; a link that is not at high utilization experiences high latency instead

-The issue is very brief each time, but it affects our voice and video services when it occurs. We identified the issue 4 weeks ago, but it has most likely been present longer than that

-Every hour, random DMVPN tunnels terminate and re-establish. However, even the sites whose tunnels do not drop experience the issue
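
One way to confirm the "same minute every hour" pattern from ping or monitoring logs is to bucket the loss events by minute of the hour. The following is only a minimal sketch: the function names and the sample timestamps are made up for illustration, and it assumes the loss events have already been parsed into `datetime` objects.

```python
from collections import Counter
from datetime import datetime


def minute_of_hour_histogram(loss_timestamps):
    """Count loss events by the minute of the hour in which they occurred."""
    return Counter(ts.minute for ts in loss_timestamps)


def recurring_minutes(loss_timestamps, min_share=0.5):
    """Return the minutes of the hour that hold at least min_share of all loss events."""
    hist = minute_of_hour_histogram(loss_timestamps)
    total = sum(hist.values())
    if total == 0:
        return []
    return sorted(m for m, n in hist.items() if n / total >= min_share)


# Hypothetical sample data: three events at xx:13 and one unrelated event.
sample = [
    datetime(2024, 1, 8, 9, 13, 4),
    datetime(2024, 1, 8, 10, 13, 9),
    datetime(2024, 1, 8, 11, 13, 2),
    datetime(2024, 1, 8, 11, 47, 30),
]

print(recurring_minutes(sample))  # [13]
```

If one minute dominates across several days and across both the VeloCloud and MPLS sites, that points at something scheduled rather than at random congestion.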

Remedy Measures

-Upgraded core switch, core router, MPLS routers

-Created a SPAN session on the core switch to monitor the interface to the VeloCloud Hub. We have not seen anything that stands out

-Created IP SLAs on the core router to test reachability of both the remote tunnel interface and the BGP interface. We do see the brief packet loss, and the tunnel drops and re-establishes, but we do not lose the BGP neighborship

-Reached out to Service Providers. They do not see any loss or issues with any of the circuits on their side

-Verified that CPU and Memory levels were low and not overutilized

 

 

3 Replies

Joseph W. Doherty
Hall of Fame
Sounds like transient congestion, possibly due to some scheduled, recurring bulk data transmission. It might be in your network or in the SP's network. Ideally you want to identify the source of the problem. If you do, you can then determine the best way to mitigate its impact.
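
If flow data (NetFlow or similar) is available from the core, a rough way to look for such a bulk sender is to rank sources by bytes sent around the problem minute. This is just a sketch under assumptions: the record layout `(minute_of_hour, source, bytes)`, the function names, and the sample addresses are all hypothetical and not taken from any particular collector.

```python
from collections import defaultdict


def bytes_by_source(flows, minute_start, minute_end):
    """Sum bytes per source for flows whose minute of the hour is in [minute_start, minute_end]."""
    totals = defaultdict(int)
    for minute, src, nbytes in flows:
        if minute_start <= minute <= minute_end:
            totals[src] += nbytes
    return dict(totals)


def suspects(flows, window=(12, 14), top_n=3):
    """Rank sources by bytes sent inside the problem window, largest first."""
    in_window = bytes_by_source(flows, *window)
    ranked = sorted(in_window.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


# Hypothetical flow records using documentation addresses (RFC 5737).
sample_flows = [
    (12, "192.0.2.10", 100_000_000),
    (13, "192.0.2.10", 900_000_000),
    (13, "192.0.2.20", 5_000_000),
    (40, "192.0.2.20", 6_000_000),
]

for src, total in suspects(sample_flows):
    print(src, total)
```

A source that is large inside the xx:12 to xx:14 window but small the rest of the hour would be the first thing to look at.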

I suggest you retain a consultant to troubleshoot this.

Hello,

 

In addition to Joseph's suggestions, and in reference to the dropped tunnels, you might want to try disabling volume-based rekeying:

 

crypto ipsec security-association lifetime kilobytes disable
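
To get a feel for when a volume-based rekey would fire compared with the time-based one, here is a small back-of-the-envelope sketch. It assumes the commonly documented IOS defaults of 3600 seconds and 4,608,000 kilobytes and treats a kilobyte as 1024 bytes; both are assumptions to check on the device, and the function names are made up.

```python
def seconds_until_volume_rekey(lifetime_kilobytes, throughput_mbps, bytes_per_kb=1024):
    """Seconds of sustained traffic needed to reach the volume lifetime."""
    total_bits = lifetime_kilobytes * bytes_per_kb * 8
    return total_bits / (throughput_mbps * 1_000_000)


def effective_lifetime(lifetime_seconds, lifetime_kilobytes, throughput_mbps):
    """The SA is rekeyed when either limit is reached, whichever comes first."""
    volume_seconds = seconds_until_volume_rekey(lifetime_kilobytes, throughput_mbps)
    return min(lifetime_seconds, volume_seconds)


# Assumed defaults (verify on the router before relying on them).
DEFAULT_SECONDS = 3600
DEFAULT_KILOBYTES = 4_608_000

for mbps in (5, 10, 100):
    lifetime = effective_lifetime(DEFAULT_SECONDS, DEFAULT_KILOBYTES, mbps)
    print(f"{mbps} Mbps -> rekey after about {lifetime:.0f} s")
```

With these assumed values, a lightly loaded tunnel hits the one-hour time limit first, while a tunnel sustaining around 100 Mbps reaches the volume limit within a few minutes, so busy tunnels would rekey much more often.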

Hi

 

Was this issue resolved?

 

I was wondering whether one of the not-too-cloud-friendly components might be sending logs or something similar in-band on an hourly basis.

 

Just a thought

 

Russ