ECMP with BGP and GRE

dibbledobble
Level 1

Hi,

We have two Internet circuits provided by the same ISP, and also use GRE to tunnel to an external 3rd party.

We want to use ECMP to load balance traffic but have some questions.

  1. When ECMP is used with GRE, does it load balance using the tunnel interface IPs or the underlying source and destination IPs that were used to create the tunnel?
  2. Can ECMP detect utilisation on a link and make an intelligent decision not to send any more traffic on a link that is full?

Thanks


Joseph W. Doherty
Hall of Fame

#2 With ordinary routing protocols' ECMP, no.

Cisco's PfR would deal with that, but I'm unsure whether it's still available.

#1 Do you have one tunnel for each physical link, or just one tunnel?

If you have just one tunnel, it's seen as a single flow, so normally it will only use one physical link, unless per-packet distribution is being used, in which case the tunneled packets would be split across the two physical paths.  (This is NOT recommended, as it often leads to out-of-sequence packet delivery, which often has an adverse impact.)
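To illustrate the single-flow and out-of-sequence points, here is a toy sketch, not how any particular router hashes. The link names, the one-way delays (10 ms and 30 ms), the 5 ms send interval, and the SHA-256 based hash are all assumptions made up for the example.

```python
import hashlib

LINKS = ["A", "B"]                 # the two physical circuits (hypothetical names)
LATENCY_MS = {"A": 10, "B": 30}    # assumed one-way delays, chosen to differ
SEND_INTERVAL_MS = 5               # assumed gap between consecutive packets


def per_flow_link(src: str, dst: str) -> str:
    """Toy per-flow ECMP: hash the outer src/dst pair to pick one link."""
    digest = hashlib.sha256(f"{src}>{dst}".encode()).digest()
    return LINKS[digest[0] % len(LINKS)]


def arrival_order(link_for_packet):
    """Return packet sequence numbers in the order they would arrive."""
    arrivals = []
    for seq in range(6):
        link = link_for_packet(seq)
        arrivals.append((seq * SEND_INTERVAL_MS + LATENCY_MS[link], seq))
    return [seq for _, seq in sorted(arrivals)]


# One GRE tunnel: every packet carries the same outer src/dst, so one link.
tunnel_link = per_flow_link("1.1.1.1", "2.2.2.2")
per_flow_order = arrival_order(lambda seq: tunnel_link)

# Per-packet distribution: alternate links regardless of flow.
per_packet_order = arrival_order(lambda seq: LINKS[seq % len(LINKS)])

print(per_flow_order)    # [0, 1, 2, 3, 4, 5]
print(per_packet_order)  # [0, 2, 4, 1, 3, 5]
```

With per-flow selection the tunnel stays on one link and packets arrive in order; with per-packet alternation both links carry traffic, but the slower link delivers its packets late and the receiver sees them out of sequence.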

I recall (???) a later IOS feature had some (optional?) enhancement for multipath and tunnels, but don't recall the specifics.  I.e. it may have allowed a single tunnel's packets to be distributed across ECMP by flow.

If you have multiple tunnels, I believe they can be used in an ECMP fashion.

The feature I had in mind, I believe, is https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/ipswitch_cef/configuration/xe-16/isw-cef-xe-16-book/isw-cef-ecmp-loadbalance-with-tunnel-visibility.html

Hi @Joseph W. Doherty 

I cannot speak to how IOS/XE software-based routers might handle #1, but whether or not a hardware-based router can hash on the inner (payload) header will be platform dependent. In most use-cases, the user would want to hash on the inner, as the outer (GRE) header tends to have less variation than the inner (i.e., fewer tunnels than flows within the tunnels). In order to hash on the inner, the hardware forwarding engine would have to be designed to recognize outer encaps (GRE, MPLS, etc.) and know the proper offset to look deeper into the packet to find inner hash fields (e.g., the payload's IP addresses). "Modern" NPUs can do this easily, but not all gear in the field is modern.
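A rough sketch of why the choice of hash key matters, under made-up assumptions: 200 invented inner flows inside one GRE tunnel, two hypothetical circuit names, and a SHA-256 based stand-in for whatever hash a real forwarding engine uses.

```python
import hashlib
from collections import Counter

LINKS = ["circuit-A", "circuit-B"]   # hypothetical names


def pick_link(*fields: str) -> str:
    """Toy ECMP hash: map the chosen header fields onto one of the links."""
    digest = hashlib.sha256("|".join(fields).encode()).digest()
    return LINKS[int.from_bytes(digest[:4], "big") % len(LINKS)]


# 200 made-up inner flows, all carried inside one GRE tunnel 1.1.1.1 -> 2.2.2.2.
OUTER = ("1.1.1.1", "2.2.2.2")
inner_flows = [(f"10.10.10.{h}", f"192.0.2.{h % 50 + 1}") for h in range(1, 201)]

# Key = outer header only: identical for every packet in the tunnel.
outer_hash = Counter(pick_link(*OUTER) for _ in inner_flows)

# Key = inner header: varies per flow, so flows can land on different links.
inner_hash = Counter(pick_link(src, dst) for src, dst in inner_flows)

print("hash on outer header:", dict(outer_hash))  # all 200 flows on one link
print("hash on inner header:", dict(inner_hash))  # flows spread over both links
```

The exact split from the inner-header hash depends on the hash function and the flows; the point is only that the outer key has a single value per tunnel, while the inner key has one value per flow.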

Perhaps the OP can respond with their router model and someone familiar with it can verify whether the inner header can be used in hashing. However, a caveat here is that this only would answer the question for traffic egressing the customer network toward the ISP, as traffic load sharing in the reverse direction, toward the customer, is under the control of the ISP.

Disclaimers: I am long in CSCO. Bad answers are my own fault as they are not AI generated.

Hmm, I'm unsure whether there's any need to comment on Jim's info, as it's all correct, but reading what he wrote, I wonder if anyone might assume things which aren't stated.

Jim writes "In most use-cases, the user would want to hash on the inner . . ." (header).  He's, I believe, 100% correct.

However, similar questions might include "In most use-cases, does the user really need to hash on the inner headers?"

And/or, notwithstanding software/hardware/platform considerations, "Are inner headers actually available for inspection?".  Think encryption.

For OP, there's a desire to LB across both SP links (oh, and it's great that Jim mentioned we only control traffic toward the SP), and as we're just using GRE, at least encryption isn't an issue.

In any case, on software-based platforms, ignoring encryption, using an inner header would be relatively simple; it's a different story when using dedicated hardware (although Jim is also correct that the latest NPUs probably have the capability, ah, but they likely still need software to set it up).

So vendors such as Cisco, at least in the Enterprise market, probably don't get much demand for this capability and are probably in no rush to provide it, especially if you can configure a tunnel across each of the two paths; LB would then work just as for two physical interfaces, which may be just fine.

To recap, splitting a single tunnel across multiple physical paths is very dependent on the platform, but splitting traffic across multiple tunnels (each "bound" to a particular physical path) can likely be done as it would be across multiple physical interfaces.

Dynamic LB, outside of SD-WAN (?), is, as far as I know, only supported in the Cisco world by PfR.

BTW, in theory, with ECMP, if one link hits 100%, all links should be at 100%.  In practice, if flows are being LBed, that's not the case, and often even optimal flow-based LB will not be equal.

CEF packet-by-packet comes very close to really equal LBing, but because of flow sequencing issues, it isn't recommended.
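A small sketch of the flow-based versus packet-by-packet difference, using an invented traffic mix (one large flow of 900 packets and nine small flows of 10 packets each) and a SHA-256 based stand-in for the real hash. None of these numbers come from a real network.

```python
import hashlib

LINKS = 2
# Made-up traffic mix: packet counts per flow, one "elephant" and nine small flows.
flows = {"flow-0": 900, **{f"flow-{i}": 10 for i in range(1, 10)}}


def flow_link(flow_id: str) -> int:
    """Toy per-flow hash: every packet of a flow goes to the same link."""
    digest = hashlib.sha256(flow_id.encode()).digest()
    return digest[0] % LINKS


# Flow-based load sharing: whole flows are placed on a link.
per_flow = [0] * LINKS
for flow_id, packets in flows.items():
    per_flow[flow_link(flow_id)] += packets

# Packet-by-packet load sharing: packets alternate across links.
total = sum(flows.values())
per_packet = [0] * LINKS
for n in range(total):
    per_packet[n % LINKS] += 1

print("flow-based:", per_flow)      # one link carries at least the 900-packet flow
print("per-packet:", per_packet)    # [495, 495]
```

However the small flows happen to hash, the large flow cannot be split, so one link always carries at least 900 of the 990 packets, while packet-by-packet gives an even split at the cost of the sequencing issues mentioned above.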

Other approaches, such as ATM IMUX or MLPPP, also do nice equal LBing, but each has its own considerations.

Possibly, if you can achieve one tunnel "bound" to each link, you achieve "normal" ECMP, which will be "good enough".

Oh, just as with Etherchannel LB algorithm choices, I believe some CEF ECMP implementations might offer options to improve LB in specific cases; another platform dependency, though.

dibbledobble
Level 1

Thanks Joseph, will have a look into PfR.

We have two tunnels using the same source IP configured on a loopback; this is advertised across both ISP links for resiliency.

Let's say we advertise the 10.10.10.0 network to our 3rd party, they see equal cost routes via the tunnel IPs, and there are also equal cost routes back to our source IP. How does ECMP decide to distribute traffic?

Our tunnel config:

interface Tunnel0
 ip address 172.16.100.2 255.255.255.252 <--tunnel IP
 tunnel source 1.1.1.1 <--source IP
 tunnel destination 2.2.2.2

interface Tunnel1
 ip address 172.16.100.6 255.255.255.252 <--tunnel IP
 tunnel source 1.1.1.1 <--source IP
 tunnel destination 3.3.3.3

=============================================

3rd party routing table:

B    1.1.1.1 [20/0] via 192.168.100.2
             [20/0] via 192.168.100.6

B    10.10.10.0 [20/0] via 172.16.100.2
                [20/0] via 172.16.100.6

I believe ECMP will be across both tunnels.

However, in your example, what does routing show for the tunnel destinations 2.2.2.2 and 3.3.3.3?

I.e. it's possible both tunnels get their traffic directed to the same physical link.

For reachability to the tunnel destinations on our side we receive a default route from our ISP on each circuit and the router installs both as equal. 

One other question as well: what sort of split is typically seen for return / ingress traffic, and is there a way to influence it so it's evenly balanced? The 3rd party is also using ECMP. I understand there are elements such as transit networks and ISPs where we don't have control, but I'm just curious what other people's experiences are with this.


@dibbledobble wrote:

For reachability to the tunnel destinations on our side we receive a default route from our ISP on each circuit and the router installs both as equal. 

If the router "sees" both tunnels as being equally valid across either path, i.e. the tunnel destinations are reached via the ECMP default route, then either tunnel could use either path, and there's a 50/50 chance both tunnels are using the same physical path.
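The 50/50 figure can be checked by simple enumeration, assuming each tunnel destination is hashed onto one of the two paths independently and with equal likelihood (an idealisation; real hashes are deterministic for given addresses). The circuit names and the helper function are made up for the example.

```python
from itertools import product

PATHS = ["circuit-A", "circuit-B"]   # hypothetical names for the two circuits


def all_on_same_path_probability(tunnel_count: int) -> float:
    """Chance that every tunnel lands on one path, if each path choice is equally likely."""
    outcomes = list(product(PATHS, repeat=tunnel_count))
    same_path = [o for o in outcomes if len(set(o)) == 1]
    return len(same_path) / len(outcomes)


print(all_on_same_path_probability(2))  # 0.5
print(all_on_same_path_probability(4))  # 0.125
```

With two tunnels, two of the four possible assignments put both tunnels on one circuit; with four tunnels it is two of sixteen.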

That might be overcome by using a static route for each tunnel destination, more specific than the default (so it takes precedence), selecting a specific egress interface.  (If one path fails, either the tunnel fails, forcing all traffic to use the remaining tunnel, or the tunnel traffic shifts to the remaining path.  Redundancy in either case.)
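As a sketch of why a host route for each tunnel destination pins the tunnel to one circuit, here is a toy longest-prefix-match lookup. The circuit names, the table layout, and the assumption that a static route disappears when its egress circuit goes down are all illustrative, not taken from any actual configuration in this thread.

```python
import ipaddress


def build_routes(circuit_a_up: bool = True, circuit_b_up: bool = True):
    """Hypothetical routing table as (prefix, egress circuits) pairs."""
    up = [name for name, ok in (("circuit-A", circuit_a_up),
                                ("circuit-B", circuit_b_up)) if ok]
    routes = []
    if up:
        # ECMP default route learned from the ISP on each working circuit.
        routes.append((ipaddress.ip_network("0.0.0.0/0"), up))
    if circuit_a_up:
        # Static host route: tunnel destination 2.2.2.2 pinned to circuit A.
        routes.append((ipaddress.ip_network("2.2.2.2/32"), ["circuit-A"]))
    if circuit_b_up:
        # Static host route: tunnel destination 3.3.3.3 pinned to circuit B.
        routes.append((ipaddress.ip_network("3.3.3.3/32"), ["circuit-B"]))
    return routes


def lookup(routes, dst: str):
    """Return the egress circuits of the most specific matching route."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, hops) for net, hops in routes if addr in net]
    if not matches:
        return []
    return max(matches, key=lambda m: m[0].prefixlen)[1]


both_up = build_routes()
print(lookup(both_up, "2.2.2.2"))     # ['circuit-A']
print(lookup(both_up, "3.3.3.3"))     # ['circuit-B']
print(lookup(both_up, "192.0.2.99"))  # ['circuit-A', 'circuit-B']

a_down = build_routes(circuit_a_up=False)
print(lookup(a_down, "2.2.2.2"))      # ['circuit-B']
```

The last lookup models the "tunnel traffic shifts to the remaining path" case: once the pinned route is gone, the destination falls back to the default. If the static route stayed in the table while its circuit was down, the tunnel would fail instead.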


@dibbledobble wrote:

One other question as well: what sort of split is typically seen for return / ingress traffic, and is there a way to influence it so it's evenly balanced? The 3rd party is also using ECMP. I understand there are elements such as transit networks and ISPs where we don't have control, but I'm just curious what other people's experiences are with this.


Real-world ingress path selection is very variable.  What I've seen, with multiple Internet connections (almost always different ISPs; if it was the same ISP, a multi-port Ethernet connection would more likely use Etherchannel), is that traffic on the Internet often uses a single ingress path, even if ECMP is possible.

If your outbound isn't just ECMP defaults, the better path outbound is often also the better path inbound, for the same src/dst pair.  It's not too difficult to send traffic out along the path you want, but it can be difficult to get reply traffic to use the inbound path you want.  (You can, working with different ISPs, try to ensure that ingress traffic for certain src/dst pairs uses the path you want it to.)

However, working with just one ISP, they should be able to do pretty much the same for your ingress traffic as you can for your egress traffic to them.

I.e. general Internet traffic can do ECMP to you, but each of the two tunnels has to be "bound" to a particular path.  Again, the mirror of your configuration to them.


If the router "sees" both tunnels as being equally valid across either path, i.e. the tunnel destinations are reached via the ECMP default route, then either tunnel could use either path, and there's a 50/50 chance both tunnels are using the same physical path.

That might be overcome by using a static route for each tunnel destination, more specific than the default (so it takes precedence), selecting a specific egress interface.  (If one path fails, either the tunnel fails, forcing all traffic to use the remaining tunnel, or the tunnel traffic shifts to the remaining path.  Redundancy in either case.)

I've been trying to implement static routes in the lab to prevent all tunnels from using the same path.

So I have two circuits and four tunnels, with static routes to each of the tunnel destination endpoints. We use GRE keepalives in our environment, so as part of testing I've lowered these to 5 2 (5-second period, 2 retries).

The desired setup would be to have two tunnels use Circuit A and the other two use Circuit B. To test, I shut down Circuit A and expected two tunnels to fail; however, three tunnels fail, even with the static routes.

Are the static routes alone enough to force the tunnels to use a circuit / physical path or does the return traffic need to be engineered to mirror the outbound flow as well?

 

Are the static routes alone enough to force the tunnels to use a circuit / physical path or does the return traffic need to be engineered to mirror the outbound flow as well?

Cannot say without more information, like the lab's actual topology, device configurations, and exactly what you did to shut down the circuit.

I don't think it likely, but what were the actual lab platforms (and IOS) being used?  (I'm wondering if it's something I could replicate in my copy of CML.)

Sorry it's been a while. I had both of my tunnel source IPs advertised out of both circuits and I think this was causing issues; once I changed this to advertise one source IP per circuit, the tunnels appeared to behave as expected.

Thanks for all of your help on this!