Flapping GRE Tunnel Interface

CarloCrz
Level 1

Hi all,

 

I have a flapping GRE tunnel between 2 C2801 (C2801-ADVIPSERVICESK9-M), Version 12.3(14)T4

Both routers have several other tunnels working fine and the ISP network doesn't seem to have problems (0.35% loss with ping size 1500). The tunnel goes down approximately every hour and turns up in approx 5 mins.

 

Following are the 2 configs:

 

R1

interface Tunnel4
 description Tunnel XXX
 bandwidth 8000
 ip address 192.168.104.1 255.255.255.0
 ip mtu 1400
 ip flow ingress
 ip flow egress
 ip route-cache flow
 ip tcp adjust-mss 1336
 ip ospf cost 15
 delay 2000
 keepalive 1 3
 tunnel source FastEthernet0/1
 tunnel destination 172.22.77.1

 

R2

interface Tunnel1
 description Tunnel XXX
 bandwidth 8000
 ip address 192.168.104.7 255.255.255.0
 ip mtu 1400
 ip route-cache flow
 ip tcp adjust-mss 1336
 delay 2000
 keepalive 1 3
 tunnel source FastEthernet0/1
 tunnel destination 172.22.8.7

 

I activated several debugs, but the only useful info that I get is related to the change of state:

 

008395: .Aug 25 2020 14:11:21.316 UTC: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel1, changed state to down

008396: .Aug 25 2020 14:11:26.316 UTC: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel1, changed state to up

 

While using a repeated ping I noticed that during the down time I am not able to reach the other device.

I wonder whether this issue is related to one of the two ISPs involved or whether something is wrong with the router config.

 

 

Thank you in advance to anybody wanting to help!

1 Accepted Solution

Accepted Solutions

Giuseppe Larosa
Hall of Fame

Hello @CarloCrz ,

As a start, I would use less aggressive timers for the tunnel keepalive:

>> keepalive 1 3

 

Hope to help

Giuseppe

 


balaji.bandi
Hall of Fame

A tunnel is an overlay, so it relies on the underlay infrastructure.

I would start by checking the physical port FastEthernet0/1 towards the ISP.

Also monitor a ping outside the tunnel, between the physical interface IPs, to see whether any packets are dropped.

What is the utilisation of this port, and what is the CPU level when the link goes down and up?

 

BB

***** Rate All Helpful Responses *****

How to Ask The Cisco Community for Help

Joseph W. Doherty
Hall of Fame
One possibility that comes to mind: what is the actual bandwidth supported/guaranteed for this tunnel? The reason I ask is that if your tunnel's transit bandwidth is less than the physical port bandwidths, burst congestion along the tunnel's path might cause loss of the tunnel's keepalives, and then the tunnel would go down.
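
A rough calculation shows why bursts are the more plausible explanation. This is only a sketch, and it assumes each keepalive is lost independently with the 0.35% loss rate reported for pings in the original post; real keepalive loss may differ.

```python
# Assumption: every keepalive is lost independently with probability p.
# 0.35% is the ping loss quoted in the original post; bursty loss breaks this model.
p = 0.0035
interval_s, retries = 1, 3  # from "keepalive 1 3"

# Chance that one particular run of `retries` consecutive keepalives is entirely lost.
p_run = p ** retries

# Roughly one new run starts with every keepalive sent.
runs_per_hour = 3600 / interval_s
expected_flaps_per_hour = p_run * runs_per_hour

print(f"P({retries} consecutive losses) = {p_run:.2e}")
print(f"expected flaps per hour = {expected_flaps_per_hour:.2e}")
```

Under that independence assumption, a flap would be expected far less often than once an hour, so an hourly flap suggests that losses arrive in bursts rather than at random.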

BTW, how did you come to choose the IP MTU and IP TCP adjust-mss values?

Also BTW, you might consider enabling IP MTU discovery.

Richard Burts
Hall of Fame

We have only minimal information here. You tell us that each router has several tunnels. Are all of the tunnels sourced from the same physical interface (Fast0/1)? 

 

When you get a tunnel interface state change message on one router, do you also get a similar message on the tunnel peer router at about the same time? Or does one router sometimes report the tunnel as down while the other router continues to believe that the tunnel is up?

 

I agree with the suggestion about relaxing the tunnel keep alive timers.

 

It may not be significant but I notice something in your post. You tell us that "turns up in approx 5 mins."  

It might be that normally it comes back up in 5 minutes, but the log messages you post do not show that

008395: .Aug 25 2020 14:11:21.316 UTC: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel1, changed state to down

008396: .Aug 25 2020 14:11:26.316 UTC: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel1, changed state to up

This shows recovery in 5 seconds.
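
As a quick check of that interval, the two timestamps can be subtracted directly. This is a minimal sketch; the helper name `parse_timestamp` is hypothetical, and it assumes the log format shown above (sequence number, optional "." or "*" marker, then the date and time followed by "UTC").

```python
from datetime import datetime

log_lines = [
    "008395: .Aug 25 2020 14:11:21.316 UTC: %LINEPROTO-5-UPDOWN: "
    "Line protocol on Interface Tunnel1, changed state to down",
    "008396: .Aug 25 2020 14:11:26.316 UTC: %LINEPROTO-5-UPDOWN: "
    "Line protocol on Interface Tunnel1, changed state to up",
]

def parse_timestamp(line: str) -> datetime:
    """Extract the timestamp that sits between the sequence number and ' UTC'."""
    stamp = line.split(": ", 1)[1].split(" UTC", 1)[0].lstrip(".*")
    return datetime.strptime(stamp, "%b %d %Y %H:%M:%S.%f")

down_at, up_at = (parse_timestamp(line) for line in log_lines)
outage_s = (up_at - down_at).total_seconds()
print(f"tunnel was down for {outage_s} seconds")
```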

HTH

Rick

Hi Richard,

 

You are right, the down period is about 5 seconds, not minutes. After the first answers I followed @Giuseppe Larosa's suggestion and increased the timers (now 5 4); since then I haven't had any flapping.

All of the tunnels use the same physical interface, and I can see the tunnel going up and down from both devices.

 

In answer to @Joseph W. Doherty, I checked the bandwidths and utilization and there is no problem with that. Regarding the IP MTU and IP TCP adjust-mss values, I inherited this infrastructure and unfortunately I don't know the design reasons.

 

 

At this point, I will probably adjust the timers (maybe 3 4) and keep using this tunnel. I suppose that both ISPs introduce some delay and some loss, so with the two combined I sometimes lose 2-3 ping in a row.
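
To compare the timer settings mentioned in this thread, here is a small sketch. It assumes the usual reading of `keepalive <seconds> <retries>`: the line protocol goes down after `<retries>` consecutive keepalives go unanswered, so the hold time is roughly seconds × retries. The helper name `keepalive_hold_time` is hypothetical, and the exact IOS timing may differ by up to one interval.

```python
def keepalive_hold_time(interval_s: int, retries: int) -> int:
    """Approximate seconds of continuous keepalive loss before the tunnel is declared down."""
    return interval_s * retries

# (seconds, retries) pairs taken from the thread.
settings = {
    "original (keepalive 1 3)": (1, 3),
    "current (keepalive 5 4)": (5, 4),
    "planned (keepalive 3 4)": (3, 4),
}

hold_times = {name: keepalive_hold_time(*pair) for name, pair in settings.items()}
for name, seconds in hold_times.items():
    print(f"{name}: tolerates about {seconds} s of consecutive loss")
```

With a 1-second probe interval, losing 2-3 packets in a row is already enough to hit the original 3-second limit, while the longer settings ride through the same burst.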

 

Thank you to everyone

"I checked the bandwidths and utilization and there is no problem with that."

Checked how? I ask for two reasons. First, something like microbursts can be difficult to capture/see with the "usual" network monitoring tools. Second, you were (apparently) having tunnel drops due to keepalives lost over several seconds, and now, with longer keepalive timers, you see ". . . lose 2-3 ping in a row." I.e., although the tunnel flaps are "fixed", you may still have an underlying problem (first revealed by the tunnel flaps).

"Regarding the IP MTU and IP TCP adjust-mss values, I inherited this infrastructure and unfortunately I don't know the design reasons."

Ah, in that case, at some point you may want to analyze them and perhaps revise them.  A good place to start for how to set up tunnels, optimally, is this Cisco TechNote: https://www.cisco.com/c/en/us/support/docs/ip/generic-routing-encapsulation-gre/25885-pmtud-ipfrag.html  (Although the TechNote discusses much about IPSec tunnels, much of it applies to GRE tunnels too.)
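
As a starting point for that analysis, the header arithmetic can be sketched as follows. It assumes a 1500-byte MTU on the physical path (not confirmed in this thread), plain GRE without key or sequence options, and IPv4 and TCP headers without options. The constant names are illustrative only.

```python
PHYSICAL_MTU = 1500        # assumption: standard Ethernet MTU end to end
GRE_OVERHEAD = 20 + 4      # outer IPv4 header + basic GRE header
IP_TCP_HEADERS = 20 + 20   # inner IPv4 header + TCP header, no options

configured_ip_mtu = 1400   # "ip mtu 1400" from the posted config
configured_mss = 1336      # "ip tcp adjust-mss 1336" from the posted config

# Largest inner packet that fits in one GRE packet without fragmentation.
max_tunnel_ip_mtu = PHYSICAL_MTU - GRE_OVERHEAD

# Largest TCP payload that fits inside the configured tunnel IP MTU.
max_mss_for_configured_mtu = configured_ip_mtu - IP_TCP_HEADERS

# How far the configured MSS sits below that limit.
mss_slack = max_mss_for_configured_mtu - configured_mss

print(f"max tunnel IP MTU over plain GRE: {max_tunnel_ip_mtu}")
print(f"max MSS for ip mtu {configured_ip_mtu}: {max_mss_for_configured_mtu}")
print(f"configured MSS is {mss_slack} bytes below that limit")
```

Under these assumptions, the configured values are more conservative than plain GRE requires, which costs a little efficiency but should not by itself cause flapping.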

The tunnel's actual use is almost nil: we have a mesh topology, and this tunnel never reaches 1 Mbps according to the show int Tu4 output. The physical WAN connection is 15/15 on both sides of the tunnel.

 

Unfortunately the link you sent me is 404 at the moment but yes, I will have to analyze those values and for sure the connection is not without issue. Do you think that by configuring IP MTU Discovery and maybe setting different IP MTU and IP TCP adjust-mss values we could obtain a more reliable connection?

"Unfortunately the link you sent me is 404 . . ."

That might be fixed now. (Cisco revised their site [?]. I had pasted the link directly into the note, and when I just tried it, I also got a 404. I revised it to use the "insert link" option, and now it works for me.)

"Do you think that by configuring IP MTU Discovery and maybe setting different IP MTU and IP TCP adjust-mss values we could obtain a more reliable connection?"

No, I don't think that would make the connection more reliable, although it should make more optimal use of it.

If your WAN bandwidths are 15/15, you would be better off shaping, but multipoint connections are difficult to shape for unless you use something like DMVPN with the later dynamic shaping feature.

It works now, I confirm. Thank you for all of your advice; I will apply it.

balaji.bandi
Hall of Fame

Thank you for the input. It looks like changing the timers fixed it; good to know.

BB

