03-30-2016 12:40 PM - edited 03-05-2019 03:41 AM
Hi,
I've implemented a simple per-session load-balancing setup between two Internet providers on a 2911 router.
ISP "A" is on GigabitEthernet0/0 and ISP "B" on GigabitEthernet0/1.
Gi0/2 is for the LAN side, 192.168.0.0/24.
One of the simplest approaches for balancing traffic, as explained in a lot of docs and forum posts, consists of declaring two default routes. And that's what I did.
For the sake of this post, let's assume 111.222.333.444 is ISP A's gateway and 555.666.777.888 ISP B's gw. So, I did:
ip route 0.0.0.0 0.0.0.0 111.222.333.444
ip route 0.0.0.0 0.0.0.0 555.666.777.888
Now onto the problem.
Load-balancing works. The problem is that, for some reason, the router picked ISP B's gateway as its "Gateway of last resort". This unfortunately breaks the existing IPSec VPN whose "route-map" is on Gi0/0. Telecommuters get into the office LAN through their Cisco VPN Client pointing to the public IP address on Gi0/0. This setup worked for a long time when I had another ISP on Gi0/1, not the new ISP B. Since I plugged the new ISP B into Gi0/1, the router has automatically chosen it as its "Gateway of last resort". I don't know what logic the router applies to select the "Gateway of last resort" between two default gateways with the same metric, but no matter what I do, it always prefers ISP B's gateway, which I don't want for the reason explained above.
This is a partial output of "sh ip route":
sh ip route
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
+ - replicated route, % - next hop override
Gateway of last resort is 555.666.777.888 to network 0.0.0.0
S*    0.0.0.0/0 [1/0] via 555.666.777.888
                [1/0] via 111.222.333.444
To summarise, I couldn't get the router to swap the order of the two default gateways so that it prefers 111.222.333.444 as its "Gateway of last resort" over 555.666.777.888. Even though ISP A's pipe to the Internet is slower than ISP B's, I'd rather keep it as my primary service because of its quality and stability, among other things.
As a workaround for the VPN, I could try to move the "route-map" from Gi0/0 to Gi0/1 and move on, but I don't want to do it for the moment because I don't know how long I'll keep ISP B. Besides, all VPN clients would need reconfiguration, which I want to avoid at all costs.
So, my question is: how do I override the router's logic for choosing its "Gateway of last resort" and force it to choose the one I want, e.g. 111.222.333.444?
There are no dynamic routing protocols configured, only static routing.
Thank you.
Fernando
03-30-2016 09:30 PM
Hi Fernando,
It is advisable to use provider-independent address space whenever you want to use two different ISPs for load balancing out to the Internet. Do you have provider-independent public address space? From your post, it seems that the VPN specifically uses ISP A's address space. That was never a stable setup: with provider-assigned address space, chances are the providers will not route each other's address space out to the Internet. If that is the case and a link goes down, the gateway of last resort will change and point to ISP B, which will break all the flows using ISP A's address block.
Which device is the VPN termination/aggregation point? Is it the router?
If you're concerned about the locally originated VPN traffic, consider implementing local PBR on the router. It lets you match traffic originated by the router itself and send it to the next hop you prefer. It affects only locally originated traffic, i.e. traffic from the router to a specific destination (for example, a ping to a public IP address), not transit traffic.
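As a rough illustration of the local PBR idea, a minimal sketch might look like the following. The ACL and route-map names are hypothetical, `<Gi0/0-public-IP>` is a placeholder for the router's own address on Gi0/0, and 111.222.333.444 is the stand-in for ISP A's gateway used earlier in this thread. Whether this covers all VPN-related traffic on a given IOS release would need to be verified in a lab.

```
! Hypothetical names: FROM-ISP-A-ADDR (ACL), LOCAL-VIA-ISP-A (route-map)
! Match packets the router itself sources from its ISP A address
ip access-list extended FROM-ISP-A-ADDR
 permit ip host <Gi0/0-public-IP> any
!
! Send those packets to ISP A's gateway regardless of the default route chosen
route-map LOCAL-VIA-ISP-A permit 10
 match ip address FROM-ISP-A-ADDR
 set ip next-hop 111.222.333.444
!
! Apply the policy to router-generated traffic only (not transit traffic)
ip local policy route-map LOCAL-VIA-ISP-A
```

The intent is that replies sourced from the Gi0/0 address (ping replies, IKE) leave through ISP A, while the two equal-cost default routes keep load-balancing transit traffic.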
In my opinion, you cannot change the way the RIB is programmed, and which next hop is displayed as the gateway of last resort is not deterministic. When it comes to load balancing with equal-cost routes, however, the FIB is programmed with hash buckets that reference both next hops, and a bucket is picked based on a hash of the flow. This logic applies to transit traffic.
Try a shut / no shut on the link to ISP B (555.666.777.888) and see if that makes a difference, although chances are slim.
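To see how CEF has actually programmed the two default routes, the following show commands may help. This is only a sketch: `<source-ip>` and `<destination-ip>` are placeholders for a real flow, and the output format varies by IOS release.

```
! Show both next hops installed in the FIB for the default route
show ip cef 0.0.0.0 0.0.0.0 detail
!
! Show which next hop CEF selects for one specific source/destination pair
show ip cef exact-route <source-ip> <destination-ip>
```

Running the second command for several source/destination pairs should show flows being distributed across both gateways, independently of which one is printed as the gateway of last resort.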
Thanks,
Shaunak
03-31-2016 03:09 AM
Not only provider-independent address space, but also a BGP AS number.
I think it is now impossible to purchase provider-independent IPv4 space; only IPv6 is still available.
03-31-2016 03:26 AM
Running BGP is not mandatory, although it is good practice in a multi-homed environment. You can always set up an SLA with your providers to advertise your purchased address space and have them tweak the BGP parameters when they advertise it.
Thanks,
Shaunak
03-31-2016 06:52 AM
Thank you Shaunak,
The two Internet feeds are from different providers. So no IP address clash of any sort.
No dynamic routing protocols are configured on the 2911. All routing is static.
The VPN termination is interface Gi0/0 on the router, which has a public IP address from ISP A. The remote VPN clients (they use Cisco VPN Client v.5) get IP addresses from a pool of private addresses. Then, when the VPN is connected, the remote users have immediate access to the LAN behind the router (which also has private IP addresses from another class), and also to the public Internet. This setup, which has been working like a charm for over 4 years, has been built over the Internet service from only one ISP, which I call ISP A.
Having said this, I think it's worth emphasizing that I'm not having any problems at all with the VPN settings on the router. Nor do I have any problems with traffic to and from the VPN clients. My only concern is the side effects I noticed after I plugged a second Internet service (let's call it ISP B) into int Gi0/1 to do simple per-session load-balancing with the existing ISP A, which, as I said, has been on int Gi0/0 for a long time.
To implement load-balancing I declared an additional default route to ISP B's gateway with the same AD as the existing (and up to now only one) default route from ISP A. So, now the router has two default routes with the same AD. The problem is that as soon as I added the second default route, that is, the gateway from ISP B, the router automatically picked it as its "Gateway of last resort" to network 0.0.0.0, overriding the existing default gateway from ISP A that had been in place for years. Bear in mind that the default gateway from ISP A is *still* in the routing table as seen in a "sh ip route" output, but it was overridden by the new default route from ISP B, which has become the "candidate route" or "Gateway of last resort". This has created some problems.
First and foremost, int Gi0/0 doesn't respond to pings anymore. If you go to any computer on the Internet and ping the public IP address on int Gi0/0, the router won't reply to the probe packets. The immediate consequence is that VPN users can't reach the VPN endpoint anymore (not a VPN problem as such, but a routing one). On the other hand, ping works if you ping the public IP address on int Gi0/1 from any computer on the Internet, which seems rather obvious. So it's clear that the router is now honoring ISP B's default gateway as its "Gateway of last resort".
So, under this context, the question is: how am I supposed to successfully implement per session load-balancing with two default routes with the same AD without breaking anything?
A secondary question would be: how can I swap the order of the default routes so that the router picks the one from ISP A as its "candidate route" or "Gateway of last resort" instead of the one from ISP B, while at the same time maintaining reachability to both interfaces (Gi0/0 and Gi0/1) from the public Internet?
I tried a lot of things. For example, shut/no shut on int Gi0/1 made no difference. Nor did route deletion/recreation. In all cases, as soon as I configure ISP B's default gateway again (ip route 0.0.0.0 0.0.0.0 555.666.777.888) the router *always* picks it as its candidate default route, rendering int Gi0/0 (ISP A) unreachable from the public internet. I cannot prove it, but it occurs to me that the router must be somehow calculating the bandwidth available on both feeds, ISP A and B, for the selection of its "Gateway of last resort". In fact, the new internet service from ISP B is faster than the older service from ISP A.
Just for the record, if I issue "ping 8.8.8.8 so Gi0/0" (ISP A) from the router itself, it just works even though the "candidate route" or "Gateway of last resort" belongs to ISP B. This means that for locally originated traffic the non-default route from ISP A works (the router knows which next hop it must forward packets to), but not for transit traffic.
I hope this explains the points that were not clear in my original post.
Thanks in advance for your time and support.
Fernando
04-06-2016 10:14 PM
Hi Fernando,
What is the result when you ping an address on the Internet and source it from ISP A's interface? Do the pings drop, or do you get responses back?
Also, try to do a trace for a public address and source it from the interface connected to ISP-A.
Read up on local policy routing on IOS devices and check whether it is a fit here. The router does not health-check static routes for bandwidth or link characteristics. The selection of the gateway is non-deterministic and is not influenced by these things.
Local policy routing is used whenever you have multiple gateways and want to forward router generated traffic out a specific gateway.
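Once a local policy is in place, something like the following could be used to check that it is being hit. This is a sketch that assumes a route-map with the hypothetical name LOCAL-VIA-ISP-A has been applied with `ip local policy route-map`; 8.8.8.8 is just the test destination already mentioned in this thread.

```
! Confirm which route-map is applied to router-generated traffic
show ip local policy
!
! Check the match counters on the route-map (hypothetical name)
show route-map LOCAL-VIA-ISP-A
!
! Generate locally originated traffic sourced from the ISP A interface
ping 8.8.8.8 source GigabitEthernet0/0
```

If the policy-routed packet counters in the `show route-map` output increase after the ping, the router-generated traffic is matching the policy.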
Thanks,
Shaunak
04-18-2016 06:49 PM
Thank you for your time and support, Shaunak.
Pings work, with no drops observed, regardless of the interface I source them from. Traceroutes also work, but with several timeouts when sourced from int Gi0/0 (ISP A) and duplicate hops when sourced from int Gi0/1 (ISP B). This happens only when there are two default routes with the same AD. When there is only one default route (e.g. ISP A), the traceroute output is clean, without timeouts or duplicate hops, and it runs much faster too.
Anyway, it seems that no matter what I try, I can't get load-balancing and the VPN to coexist. The router always prefers ISP B's gateway as its gateway of last resort. This breaks the VPN for remote workers (some users can connect while others cannot). Following this document to the letter didn't make any difference either ( http://www.cisco.com/c/en/us/support/docs/ip/network-address-translation-nat/100658-ios-nat-load-balancing-2isp.html ).
To me, route selection seems deterministic. As soon as I enter "ip route 0.0.0.0 0.0.0.0 <ISP B gateway>", it will override the existing gateway of last resort, which, up to that point, was ISP A's gateway.
Thanks again,
Fernando