Paul Morgan · ‎01-21-2015

Hello all,

I am trying to work out if I am being a bit rubbish or if split horizon is my new worst enemy.

Below is a diagram of my (simplified) problem scenario using EIGRP.

The solution I am looking for is that Router R3 learns of the 10.0.0.0/8 network from both R1 and R2, then does not advertise it to either. Simple with split horizon enabled.

But when either R1 or R2 is rebooted, a decision somehow takes place that may result in R3 advertising 10/8 to the new (rebooted) neighbour, at which point split horizon prevents that neighbour from advertising it back again. This means the topology table on R3 doesn't contain this route via that neighbour, so R3 is slow to converge if the other neighbour is lost.

Is there a way to control in which direction routes are advertised first on a neighbour link? Then I can let split horizon do its thing.

Or is there something I am not thinking of...

many thanks,

Paul

Jon Marshall · ‎01-21-2015

Paul

Are you actually seeing this behaviour?

I ask because from your diagram I don't think you should.

If the summary address is being advertised by the top router and R2, for example, is rebooted, when it comes back up it will receive advertisements from both the top router and R3.

It doesn't matter which one it gets first because the metric of the route via the top router will be better so it is this one it will use.

Split horizon says you cannot advertise a route out of the interface used to get to that network.

So R2 gets both routes, doesn't matter in which order. It selects the best one ie. the one via the top router and then it can advertise it to R3.

What it can't do is advertise it back to the top router but that's not an issue.

Jon

Paul Morgan · ‎01-21-2015

Hi Jon,

Yes, this is a real-world problem, I'm afraid.

After R2 was restarted, it received route advertisements from both sides, but therein lies the problem. Since it has just learned that both interfaces can be used as next hops to reach the summary network, it no longer advertises this summary (ie advertises itself as a route to 10/8) back to R3. R3 now has only one route in its topology table (through R1), and if R1 is lost, must go active to converge. Not desirable since this is my data centre core switch!

The logic of what you say is sound, but it looks like a photo finish situation. If R3 learns of the summary network from both R1 and R2 before it advertises that summary to either, happy days, it puts both in the topology table and has a feasible successor ready. But that isn't what I'm seeing.

R3 has the route through R1; and R2 has a route direct and through R3, but R3 has no topology entry for R2. And sh ip eig n de shows no route passed from R2 to R3 (as you'd expect).
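
For reference, the checks described here could be run on R3 roughly as follows. This is only a sketch: 10.0.0.0/8 is the simplified summary from the diagram, and the long command names are an expansion of the abbreviated form used above.

```
! On R3: list every path in the topology table, including paths that are not feasible successors
show ip eigrp topology all-links
! Show the paths R3 holds for the summary, with feasible and reported distances
show ip eigrp topology 10.0.0.0 255.0.0.0
! Long form of "sh ip eig n de"
show ip eigrp neighbors detail
```

If only one path (via R1) is listed for the summary, R3 has no feasible successor and would have to go active when R1 is lost.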

I wasn't expecting to see this behaviour and I still think I have done something odd.

(The summary network is actually 30 different remote routers, all connected back to both hub routers.)

Jon Marshall · ‎01-21-2015

Paul

That's weird because I labbed it up and what I did was shut down both interfaces on R2.

Then I deliberately brought the interface up between R2 and R3 and left the other one down so R2 had the summary via R3.

When I brought up R2's other interface as soon as it got the summary route from the top router it used that one and advertised it to R3.

So it's definitely not a photo finish.

I checked the metrics and the one from the top was a better metric than the one received from R3 which is what you would expect.

So I'm not sure what is happening in your environment.

Perhaps if I get time I'll redo the lab, but I did test it quite a few times and it always worked.

A solution to your problem could simply be to use a distribute list under your EIGRP configuration on R3 which stops it advertising that summary route to either R1 or R2.

That would probably do it although I didn't test that as I didn't need to.
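
As an untested sketch of what that distribute list might look like on R3: the AS number 1 is the one Paul mentions later in the thread, 10.0.0.0/8 is the simplified summary from the diagram, and the prefix-list name and interface names are placeholders, not taken from the real network.

```
! On R3: block only the summary, permit everything else
ip prefix-list BLOCK-SUMMARY seq 5 deny 10.0.0.0/8
ip prefix-list BLOCK-SUMMARY seq 10 permit 0.0.0.0/0 le 32
!
router eigrp 1
 ! Apply outbound on the (hypothetical) interfaces facing R1 and R2
 distribute-list prefix BLOCK-SUMMARY out GigabitEthernet0/1
 distribute-list prefix BLOCK-SUMMARY out GigabitEthernet0/2
```

With this in place R3 never offers the summary to R1 or R2, so a rebooted neighbour cannot pick R3 as its path for it.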

Happy to test it out for you if you want me to.

Jon

Paul Morgan · ‎01-22-2015

Picking up on what you say here;

"I checked the metrics and the one from the top was a better metric" ---

I will investigate some metric changes Jon and see if it is something to do with the high delays that are in place to facilitate the route. Perhaps I have created a situation for myself.

Paul Morgan · ‎01-23-2015

Hmmm...

So testing with different metrics does show that you are correct. It will behave as expected when no metric alterations are used. But this is hardly ideal. I am using a basic addition of delay to the (tunnel) interfaces I don't want to give preference to. As soon as they are no longer the same, this problem re-surfaces.
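
For context, the kind of delay change being described might look like the sketch below. The tunnel number and the delay value are purely illustrative and are not the values used in this network.

```
! On R1 or R2: make one branch tunnel the less-preferred path
interface Tunnel1
 ! The delay command is entered in tens of microseconds
 delay 6000
```

Because EIGRP sums delay along the path, a change like this alters the metric R1 or R2 computes for the branch routes and therefore what each one reports to R3.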

I am thinking I am going to need to load balance using a different method such as summary route advertisements with different distances instead. Or perhaps offset lists... eurgh!

thanks for all your help with that Jon.

Jon Marshall · ‎01-23-2015

Paul

Maybe if we understand exactly what the issue is we could help more.

However if you are simply trying to make sure that when either R1 or R2 reboots that either of them do not see R3 as the best path for the summary then use a distribute list on R3 that stops R3 advertising that route to either R1 or R2.

If, however, you want R2, for example, to route via R3 using the summary if R2's interface to the top router fails, then this would not be a solution.

Difficult to say without knowing more about the topology etc.

Jon

Paul Morgan · ‎01-23-2015

Hi - sorry, a little clarification then might have helped you more. I deliberately over-simplified to get straight to the point.

R3 is an L3 core switch, R1 and R2 are network edge routers, and the summaries are coming from 30+ branch hosts. Leased lines and ADSL connect the branches with IPsec tunnels back to R1 and R2. Traffic was to be load balanced with a simple 50/50 split, using a small additional delay on the tunnels to give preference. The ADSL is highly unstable (but cheap!) at many of these sites, so I intended to ensure instant failover simply by ensuring there was a feasible successor for every branch on R3. But similarly, internet or VoIP traffic inbound on either R1 or R2 must be able to reach the branches through R3 if a connection (tunnel) has been lost.

My design was that if a tunnel (let's say R2 to branch) was lost and no FS existed, the time taken for QUERY (R2 to R3) and convergence was acceptable, but that R3 MUST have an FS for each route. So basically, I am looking for the best solution for load-balancing this design and achieving these objectives. Offset lists or summary routes with Admin Distances are my two simple answers at the moment.

Thanks again for your input

Jon Marshall · ‎01-25-2015

One other query.

Can you effectively summarise the summary routes without that summary address including any subnets that are elsewhere, i.e. not in your branches?

Jon

Paul Morgan · ‎01-25-2015

First question is a yes. If a link from R1 to a branch goes down, then route via R3 and R2 to the branch. Each branch has tunnels to both R1 and R2. And load balancing is really just from the core yes.

The summarisation is genius (if I do say so myself). The depots each have two concurrent subnets and are numbered 192.168.32 - 96. Centrally, Voice and data are between 192.168.160 - 192. This makes it easy to teach my 2nd line guys and leaves expansion room.

The main issue with the load-balancing is that VOIP traffic won't load-balance on two equal cost paths. VOIP needs one path OR the other. The official EIGRP documentation would suggest that CEF takes care of this by pairing up endpoints in the table, but I have yet to see this work in my environment and I have tried it. VOIP is a sensitive puppy.

So the idea of adding delay to tunnels was to stack the decision permanently in favour of one side or the other (R1 or R2) as a route. Except in the event of a failure.

Jon Marshall · ‎01-26-2015

"The summarisation is genius (if I do say so myself)."

Glad to hear it :-)

I think there is a way to achieve what you want without having to worry about adding metrics as you do now.

So two last points of clarification -

1) you are weighting the traffic to favour one of the routers, let's say R1 for argument's sake.

This is for return traffic from the core.

Are you influencing which path is taken at the other end in the depots ie. do all depots use their link to R1 unless it fails ?

What I'm trying to understand is if traffic from a depot comes in via R2 to the core do you want it to go back via R2.

I would guess not because of the way you have applied the metrics but can you clarify for me.

2) Sorry to ask again but the branch summaries. The actual summarisation for each branch is done on the branch router and not R1 or R2 ?

Jon

Paul Morgan · ‎01-26-2015

1) If traffic is to favour one route, it should do so in both directions. So no loops. My current thinking is offset lists on R1 and R2, R1 offset subnets 32-63, R2 offset subnets 64-95

2) summarisation occurs to reduce routing table sizes to a minimum. So branch routers are set as EIGRP STUB. It's not really worth them summarising downstream. R1 and R2 summarise outbound to branches, simply 192.168.0.0/18 and 192.168.64.0/19.
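
A rough sketch of the offset-list idea from point 1, shown for R1 only. The ACL number, offset value, direction and interface name are all assumptions made for illustration; R2 would mirror this for the 64-95 range.

```
! On R1: match the branch subnets 192.168.32.0 to 192.168.63.255
access-list 20 permit 192.168.32.0 0.0.31.255
!
router eigrp 1
 ! Add an illustrative offset of 1000 to matching routes advertised towards R3
 offset-list 20 out 1000 GigabitEthernet0/0
```

The offset is added to the metric of the matched routes only, so the other half of the branch ranges keeps its normal metric through R1.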

Jon Marshall · ‎01-26-2015

Paul

Really sorry but now I am even more confused.

From the sounds of 1) you want half the traffic to go via R1 and half via R2.

Is that correct ?

2) are R1 and R2 summarising the branch networks to R3 ?

3) how are you advertising 192.168.0.0/18 to the branches ie. where is the summarisation done and how are you doing it ?

Like I said before there may be a solution where you not only don't need to manipulate metrics but you also don't need worry about an FS because you already have the route in the routing tables on R1, R2 and R3.

But I don't want to suggest it if it is going to break your network for obvious reasons so I need to understand exactly what is doing what.

Jon

Paul Morgan · ‎01-26-2015

It's ok Jon, I know how difficult it is to visualise these things sometimes.

1) Yes, half the traffic on each line.

2) No. If they summarise backwards to R3, a single line outage will not take the route down on R3.

3) On R1 and R2, each Interface Tunnel statement includes ip summary-address eigrp 1 192.168.0.0 255.255.192.0 and 192.168.64.0 255.255.224.0

Each branch router only needs eigrp stub. No summaries.
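
Laid out as configuration, the above would look something like this. The tunnel number is a placeholder; the summary statements and AS number are as quoted.

```
! On R1 and R2, repeated under each branch-facing tunnel interface
interface Tunnel1
 ip summary-address eigrp 1 192.168.0.0 255.255.192.0
 ip summary-address eigrp 1 192.168.64.0 255.255.224.0
!
! On each branch router
router eigrp 1
 eigrp stub
```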

Hope this helps.

Jon Marshall · ‎01-26-2015

This is what I think would work.

Two assumptions I'm making -

1) R1 and R2 have full routes in terms of the remote branch subnets which from what we have talked about seems to be the case.

2) R1 will advertise the specific subnets it is primary for (see below) to R3, which then advertises them to R2, and R2 will do the same for its primary subnets.

R1 is primary for 32 - 63 summary address 192.168.32.0 255.255.224.0
R2 is primary for 64 - 95 summary address 192.168.64.0 255.255.224.0

Each router is secondary for the other router's primary subnets.

on R1 configure a summary address for R2's subnets on the interface connecting to R3 -

ip summary-address eigrp <AS no> 192.168.64.0 255.255.224.0

on R2 do the same for R1's subnets -

ip summary-address eigrp <AS no> 192.168.32.0 255.255.224.0

So now -

R1 points to R3 and R3 points to R2 for 192.168.32.0/19
R2 points to R3 and R3 points to R1 for 192.168.64.0/19

Because you have used a summary address this suppresses the advertisement of the more specific routes within that summary range.

R1 will therefore advertise its specific subnets, for which it is primary, to R3 and a summary address only for R2's subnets.

And R2 does the same, i.e. it advertises its specific subnets and a summary for R1's.

R3 then obviously passes these summaries via EIGRP to R1 and R2.

R3's routing table will have specific branch routes pointing to the respective primary router but only a summary route for the same subnets pointing to the secondary router.

Because a router will always pick the longest match it will use the more specific subnets unless there isn't a matching route.

Which means no need to use metrics to load balance traffic.

In addition, the summary route is already in the routing table, so there is no need for either R1 or R2 to send a query to R3 if one of their branch links fails.
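
Put together, the two summary statements might be applied like this. It is an untested sketch: the interface name is a placeholder for whichever interface faces R3, and AS 1 is assumed from Paul's earlier post.

```
! R1 (primary for 192.168.32.0/19): send only a summary of R2's range to R3
interface GigabitEthernet0/0
 ip summary-address eigrp 1 192.168.64.0 255.255.224.0
!
! R2 (primary for 192.168.64.0/19): send only a summary of R1's range to R3
interface GigabitEthernet0/0
 ip summary-address eigrp 1 192.168.32.0 255.255.224.0
```

One way to check the result on R3 would be `show ip route 192.168.32.0 255.255.224.0 longer-prefixes`, which should list the specific branch routes via R1 alongside the /19 summary via R2.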

I may well have overlooked something so let me know whether you think this will work for you or not.

Jon

Setting advertisement / split horizon direction in EIGRP routing