Re: RIP and redundant paths

richmorrow624 · ‎09-18-2006

I have five sites configured as in the drawing, with GRE tunnels and RIP as the routing protocol.

The normal operation is for OSPF to run through the MPLS cloud (not shown) until there is a failover condition. Both GRE tunnels are always up and populating the RIP tables.

The transition is pretty fast when I have tested the failover, but I have seen a problem more than once.

The path and failover work fine for several minutes, then suddenly the routing dies and I cannot ping across the tunnel.

The routing tables on the remote end are still populated, but the RIP routes die out of the Main site switches.

Is it possible with the two paths that it could be causing a problem? A loop or something?

Isn't RIP supposed to load balance?

I am wondering if the HSRP also is causing something to happen that is undesirable.

Any thoughts?

Richard Burts · ‎09-18-2006

Richard

I doubt that HSRP is doing anything that is causing this problem. There is not quite enough information here to know the source of the problem for sure, but I do have a guess. I am guessing that in normal operation (when OSPF is running through MPLS) the tunnel destination is advertised through OSPF and the tunnel works. But when OSPF stops running through MPLS, I am guessing that either the tunnel destination is not advertised and is not reachable (which causes the tunnel to fail), or the tunnel destination is advertised through RIP over the tunnel (which causes the tunnel to fail with a recursive routing problem).

You could do some things to verify whether my guess is correct. While OSPF is running over MPLS, on the remote router with the GRE tunnel, do a show ip route for the tunnel destination address and a traceroute from the remote router to the tunnel destination address. I suspect that the traceroute will go through the MPLS cloud to get to the tunnel end point. Then, when OSPF is not running through MPLS, repeat the show ip route and the traceroute. I suspect that either the traceroute has no route to the tunnel destination or that it is trying to go THROUGH the tunnel to get to the tunnel end point.
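
As a sketch of that check, the commands on the remote router might look like the following. The address 192.0.2.1 is only a placeholder for the real tunnel destination, which has not been given at this point in the thread.

```
! Run once while OSPF over MPLS is up, and again after the failover.
! 192.0.2.1 is a placeholder (assumption) for the tunnel destination.
show ip route 192.0.2.1
traceroute 192.0.2.1
```

If the route shown after the failover is learned from RIP and points out the tunnel interface, that would indicate the recursive routing case.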

If that is not the cause of the problem then it might be helpful if you would post the configuration of the routers that have the tunnel.

HTH

Rick

tdrais · ‎09-18-2006

One thing to check is that your tunnel end points (tunnel source/tunnel destination) do not get advertised through the tunnels.

Otherwise you can get the problem of a tunnel trying to form itself through itself.

In your case, since OSPF has a better AD at the remote site (110 versus 120 for RIP by default), it would override any RIP routes learned. When OSPF no longer has the route, the RIP routes are left. With 2 tunnels you can get all kinds of nasty scenarios.
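
One possible way to keep the tunnel end points out of the RIP updates that cross the tunnels is a distribute-list on the tunnel interface. This is only a sketch: the access-list number, the interface name Tunnel0, and the address 192.0.2.1 are placeholders, and the deny entry has to match the route that actually covers the end point in your routing table.

```
! Assumption: 192.0.2.1 stands in for a tunnel end point and
! Tunnel0 for the GRE tunnel interface.
access-list 10 deny   192.0.2.1
access-list 10 permit any
!
router rip
 ! Do not learn or advertise the end point over the tunnel.
 distribute-list 10 in Tunnel0
 distribute-list 10 out Tunnel0
```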

richmorrow624 · ‎09-18-2006

Thanks for the reply guys,

Here is the switch 1 config, along with the remote VPN router config. Switch 2 is configured the same way as switch 1, except for the tunnel endpoint.

It looks to me like the tunnel is being advertised through itself, correct?

Richard Burts · ‎09-18-2006

Richard

Thanks for posting the config info. It answers some questions and raises other questions.

I note that the tunnel end points are:

tunnel destination 10.10.1.1

tunnel destination 10.10.1.5

and I note that these addresses are covered by static routes:

ip route 10.10.1.1 255.255.255.255 6.16.17.27

ip route 10.10.1.5 255.255.255.255 6.16.17.27

Static routes for the tunnel end points are one of the best ways to avoid the tunnel recursive route problem (the problem where the router attempts to get to the tunnel end point through the tunnel), which is the issue that Tim and I both raised.

So it looks like you are protected from the recursive issue. But I would like to know what these static routes are doing. I see that the next hop address in the static routes is 6.16.17.27 and that it is in the connected subnet of FastEthernet0/1. But I do not see anything in the drawing or in your explanation that tells me what/where that is and what the impact on it might be when MPLS fails. Can you clarify this?

Also, as I suggested before, it would be helpful to get the output of a traceroute from the router to the tunnel endpoint at a time when the tunnel is working and again at a time when the tunnel is not working.

Another note about tunnels: unless you have configured GRE tunnel keepalive (a recent feature in IOS) you cannot depend on the interface state to tell you anything useful about a GRE tunnel. The router will report the tunnel as up/up so long as it believes that it has a route to the tunnel destination. The tunnel can be up/up and pass no traffic at all.
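
For reference, a minimal sketch of enabling GRE keepalives looks like this. Tunnel0 and the timer values are assumptions, not taken from the posted config. With keepalives enabled, the tunnel line protocol goes down when the far end stops answering.

```
interface Tunnel0
 ! Send a keepalive every 10 seconds; declare the tunnel down
 ! after 3 unanswered keepalives. Values are illustrative only.
 keepalive 10 3
```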

HTH

Rick

richmorrow624 · ‎09-18-2006

Thanks for the reply Rick,

The configs were done before I got to this company. The 6.16.17.27 is not the real address; it is edited. It is a DSL connection to the Internet.

The VPN router has a secure tunnel to the Main site PIX firewall.

There is no impact on the connectivity to the Internet when MPLS fails.

I have done traceroutes at both times like you asked.

I have not saved any of them, but when the tunnel is working it will trace to the other end; when it is not working, it dies at the first hop on both sides.

This past weekend, I was testing the failover and the only thing that got the tunnel back up was a reload of the remote side VPN router.

One thing I have noticed, when the tunnel is not working:

If I source the ping from a workstation on the 10.10.151.0 subnet, I can ping the remote side tunnel endpoint and trace the connection.

But I cannot ping from the switch and the remote side will not ping anything on the main site side.

When the tunnel is working, the main site switch has the RIP routes populated and I can ping and trace from the main site switch to the remote site router and vice versa.

Is it possible there is a configuration issue with the encryption access-lists?

All remote sites are configured the same and right now the other sites are up and working.

This switch shows that it has been up for more than a year. Could it use a reboot?

Is there a possibility of a bug in the IOS code?

I am not sure if the other tunnels are going up and down. I have tested the failover at all sites and thought it was working until I discovered this problem.
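
On the encryption access-list question, one way to check it (a sketch, assuming the VPN router runs IOS with IPsec) is to watch the security associations while pinging from the switch:

```
! Run on the VPN router while a ping from the switch is in progress.
show crypto isakmp sa
show crypto ipsec sa
show access-lists
```

If the encaps/decaps packet counters in show crypto ipsec sa do not increase for traffic sourced from the switch, that traffic is probably not matching the crypto access-list.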

richmorrow624 · ‎09-18-2006

Just another note,

I shut down all of the tunnels in switch 2 and I still have the problem of the tunnel dying out.

I checked this morning and it was dead again.

Everything shows as up (I checked in the remote site router and it shows the tunnel up), and there are RIP routes in the remote site router for all of the tunnels in the switch.

I can see RIP working on the remote site router, but cannot ping the tunnel endpoints.

The routes have died in the switch at main site.

Also, ICMP debug shows the packets getting to the remote site but not getting to main site.

richmorrow624 · ‎09-19-2006

tdrais,

Did you see my replies to Rick?

Do you have any thoughts on this?

I reloaded the remote site router again and the tunnel is up and working again.