Load balancing/sharing on WAN

Islam Nadim · ‎05-31-2013

Hello All,

I have this topology where I want to load balance between WAN1 and WAN2.

The switch Sw1 is a Layer 2 switch, not a Layer 3 switch. Is it possible to enable load balancing on the two WAN connections from the routers directly, or should I upgrade the switch to Layer 3, activate EIGRP on it, and then load balance?

Best Regards,

Islam M. Nadim

Bilal Nawaz · ‎06-03-2013

So! I tried my lab with GLBP and it worked out that my theory is correct! I had to clear my ARP table for it to work properly, so it's rubbish! It just doesn't work for this scenario.

I have tried MHSRP and it works.

I have two PCs, one in VLAN 10 and one in VLAN 20.

I want my PC in VLAN 10 to go out CE1 to WAN1.

I want my PC in VLAN 20 to go out CE2 to WAN2.

However, if either were to fail, traffic should switch to the other WAN provider/circuit.
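
For readers without the attachments, here is a minimal sketch of how this per-VLAN setup could look on CE1. The interface name, subinterface numbering, IP addresses and priority values are assumptions for illustration only; they are not taken from the attached configs.

```
! CE1: router-on-a-stick toward the Layer 2 switch (names/addresses are hypothetical)
interface FastEthernet0/0
 no shutdown
!
interface FastEthernet0/0.10
 encapsulation dot1Q 10
 ip address 10.0.10.2 255.255.255.0
 ! Group 1 serves VLAN 10; the higher priority makes CE1 the active gateway
 standby 1 ip 10.0.10.1
 standby 1 priority 110
 standby 1 preempt
!
interface FastEthernet0/0.20
 encapsulation dot1Q 20
 ip address 10.0.20.2 255.255.255.0
 ! Group 2 serves VLAN 20; CE1 keeps the default priority (100) and stays standby
 standby 2 ip 10.0.20.1
 standby 2 preempt
```

CE2 would mirror this, with priority 110 on group 2 and the default on group 1, so each router is active for one VLAN and standby for the other.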

Config of both will be attached named HSRP_

CE1

CE2

Just to verify that it is working as expected:

PC in VLAN 10 going via WAN 1

PC in vlan 20 going via WAN 2

So we have load balancing here. We will test resilience now (note that in the config I have the HSRP tracking feature tied to the IP SLA). I will shut WAN2 down so everything should go via WAN 1.
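
The tracking mentioned here could be sketched roughly as follows on CE2. The probe target, source interface, timer and decrement values are assumptions, and the exact IP SLA and track syntax varies between IOS releases. The sketch also assumes CE2 runs HSRP group 2 at priority 110 with CE1 at the default of 100.

```
! CE2: probe a next hop beyond the WAN 2 link (target address and interface are hypothetical)
ip sla 1
 icmp-echo 2.2.2.1 source-interface Serial0/0
 frequency 5
ip sla schedule 1 life forever start-time now
!
! Track object 1 follows the reachability result of probe 1
track 1 ip sla 1 reachability
!
interface FastEthernet0/0.20
 ! Drop CE2's priority for group 2 by 30 while the probe fails,
 ! so that CE1 (with preempt configured) can take over the group
 standby 2 track 1 decrement 30
```

With these numbers a failed probe takes CE2 from 110 to 80, below CE1's 100. Because the probe tests reachability rather than link state, it also catches the case where the WAN interface stays up/up but the provider path is broken.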

As soon as I shut this interface down I got this on CE2

CE1 shows it went to Active state for Group 2

CE1 shows that it is active for both groups. Now the traceroutes to test the path:

So both PCs are now going via WAN 1 and using CE1 for transit. The same thing happens when I fail WAN 1 and have WAN 2 providing internet.

Next will be Mohammad's suggestion of a Layer 3 device. I will use OSPF and try to demonstrate the resilience and load balancing again.

Please rate useful posts & remember to mark any solved questions as answered. Thank you.

blau grana · ‎06-03-2013

Hello Bilal,

Perfect work 5+, HSRP is one way to go!

But I think you are wrong about GLBP. You can configure GLBP with tracking and threshold values. If an uplink fails, the GLBP weighting value is lowered below the threshold and the AVF will not forward traffic anymore. Another AVF will take over the virtual MAC address, so clearing the ARP table will not be necessary.

http://www.cisco.com/en/US/docs/ios/12_2s/feature/guide/fs_glbp2.html#wp1027837
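
As a rough illustration of the weighting and threshold idea, a minimal sketch might look like this. The group number, interface names, addresses and weighting values are all assumptions chosen for the example.

```
! Hypothetical names and values; apply the mirror image on the other CE router
track 1 interface Serial0/0 line-protocol
!
interface FastEthernet0/0.10
 encapsulation dot1Q 10
 ip address 10.0.0.2 255.255.255.0
 glbp 10 ip 10.0.0.1
 glbp 10 load-balancing round-robin
 ! Start at weighting 100; stop forwarding below 80, resume at 95 or above
 glbp 10 weighting 100 lower 80 upper 95
 ! Losing the uplink subtracts 30, which takes the weighting to 70 (below the lower threshold)
 glbp 10 weighting track 1 decrement 30
```

The decrement has to be large enough to cross the lower threshold; otherwise the router keeps forwarding even though its uplink is down.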

Best Regards

Please rate all helpful posts and close solved questions

Bilal Nawaz · ‎06-03-2013

Hello Blau, thank you for the kind rating. I used weighting in my GLBP lab (with the same tracking and threshold values), and I agree it works fine! However, it's not so great when my host has an ARP entry for the AVF as its default gateway. This AVF failed and the host lost its internet connection; the AVF was 'out of service', which is fine.

But I still had an ARP entry for my gateway pointing to the failed AVF's MAC address, and I only realised 5 minutes later why I couldn't ping outbound. When I did show arp and checked the MAC addresses being used by the AVFs, I had the non-working one in the ARP table.

It was only when I did a 'clear arp-cache' that it worked fine: my host ARP'd for its gateway and, sure enough, the working AVF sent a response. And every time I failed over I had to clear the ARP again.

So then my question is, is GLBP meant to 'pass on' the virtual MAC address of an AVF that has failed, to one that works??

If so then great, what did I configure wrong and how to get this working?

(I guess I'll leave the OSPF one till tomorrow)

ALIAOF_ · ‎06-03-2013

Actually, Bilal, you are correct: GLBP is not exactly like HSRP, i.e. there will be multiple MACs. Check the output below. So basically the AVF MAC right now is "c802", but if it changes I will get "c801".

Vlan200 - Group 200
  State is Active
    1 state change, last state change 7w2d
  Virtual IP address is 10.1.200.1
  Hello time 3 sec, hold time 10 sec
    Next hello sent in 2.944 secs
  Redirect time 600 sec, forwarder timeout 14400 sec
  Authentication MD5, key-string
  Preemption enabled, min delay 0 sec
  Active is local
  Standby is 10.1.200.250, priority 100 (expires in 8.896 sec)
  Priority 110 (configured)
  Weighting 5 (configured 5), thresholds: lower 1, upper 5
    Track object 200 state Up decrement 20
  Load balancing: weighted
  Group members:
    a493.4cda.587f (10.1.200.249) local
    a493.4cdb.10ff (10.1.200.250) authenticated
  There are 2 forwarders (1 active)
  Forwarder 1
    State is Active
      1 state change, last state change 7w2d
    MAC address is 0007.b400.c801 (default)
    Owner ID is a493.4cda.587f
    Redirection enabled
    Preemption enabled, min delay 30 sec
    Active is local, weighting 5
    Client selection count: 50763
  Forwarder 2
    State is Listen
    MAC address is 0007.b400.c802 (learnt)
    Owner ID is a493.4cdb.10ff
    Redirection enabled, 598.912 sec remaining (maximum 600 sec)
    Time to live: 14398.912 sec (maximum 14400 sec)
    Preemption enabled, min delay 30 sec
    Active is 10.1.200.250 (primary), weighting 6 (expires in 9.536 sec)
    Client selection count: 60914

blau grana · ‎06-03-2013

Hello Bilal,

So then my question is, is GLBP meant to 'pass on' the virtual MAC address of an AVF that has failed, to one that works??

If so then great, what did I configure wrong and how to get this working?

Virtual Forwarder (VF) Redundancy is similar to VG Redundancy in that one of the gateways will takeover traffic forwarding for the virtual MAC address if the active VF fails. However, there are a number of important differences between VF Redundancy and VG Redundancy.

Each gateway in a GLBP group is assigned a virtual MAC address by the AVG (i.e. it becomes a VF for that MAC address). A VF that is assigned a virtual MAC address will become the Active VF for that forwarder instance, and is known as the Primary Virtual Forwarder.

Other gateways in the GLBP group will learn of this VF instance via Hello messages, and will create their own forwarder instances for this forwarder number. These are known as Secondary Virtual Forwarders.

for more details, read this:

http://www.cisco.com/en/US/prod/collateral/iosswrel/ps6537/ps6550/prod_presentation0900aecd801790a3_ps6600_Products_Presentation.html

section: Virtual Forwarder Redundancy

The text above basically says that each AVF is monitored by the other gateways, and if the primary VF fails, the secondary VF with the highest priority will take over forwarding for that particular virtual MAC, so clearing the ARP cache is not necessary.

AVF preemption is enabled by default. There are some timers which have to expire before preemption happens:

glbp NUMBER timers redirect 600 14400

- the default values, in seconds, are shown

- 600 s = 10 min -> maybe the problem was long convergence
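
A minimal sketch of what tuned timers could look like in a lab follows. The group number, interface and every value are assumptions for illustration, not recommendations, and both routers in the group should use matching settings.

```
! Hypothetical group number and values; lab-style timers, not a recommendation
interface FastEthernet0/0.10
 ! Hello every 1 s, declare a peer down after 4 s (defaults are 3 s and 10 s)
 glbp 10 timers 1 4
 ! Redirect timer 60 s, forwarder timeout 3600 s (defaults are 600 s and 14400 s)
 glbp 10 timers redirect 60 3600
 ! Let a recovered forwarder reclaim its virtual MAC without the default 30 s delay
 glbp 10 forwarder preempt delay minimum 0
```

Shorter hello and hold times speed up failure detection at the cost of more control traffic, so production values are usually less aggressive than these.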

Last thing: after the new AVF takes over, it sends a multicast hello to 224.0.0.102 with the virtual MAC of the failed AVF as the source MAC, so intermediate devices learn the new location of this MAC.

If something is unclear I can provide more info later... now it is too late and I have to get up for work tomorrow, actually today.

So to sum up this topic:

HSRP

+ easy configuration and easy troubleshooting

+ fast convergence with default timers

- not equal loadbalancing

GLBP

- config and tshoot require more knowledge and skills

- longer convergence with default timers but convergence time can be same with HSRP with adjusted timers

+ more control over loadbalancing, better distributed traffic over uplinks

Best Regards

ALIAOF_ · ‎06-03-2013

Nice work, Bilal. So basically you made CE1 active for VLAN 10 and CE2 active for VLAN 20. Now when you shut down the interface pointing to the ISP, it fails over.

One thing that comes to mind is: what if there is an issue with the ISP but the link is up/up? In that case there won't be any change of the active/standby router. But I see you have IP SLA set up with ICMP echo and are tracking that to overcome the issue.

But as the poster above mentioned, GLBP should be able to achieve the same results. It may require a bit more fine tuning of the values, though.

kumar4282015 · ‎08-10-2017

Dear Bilal, great work. I prefer HSRP, but the drawback is that for each VLAN or network you add, you need to create another group. If the switch were L3, PBR would be the efficient option, wouldn't it?

Let me know what config you used on the switch. I assume you added the two VLANs to its VLAN database, right?
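
The switch configuration was not posted in the thread. A plausible minimal sketch for a Layer 2 Sw1 carrying both VLANs to the routers is shown here; the port numbers and VLAN names are hypothetical, and this is not Bilal's actual config.

```
! Sw1 (Layer 2): port numbers and names are hypothetical
vlan 10
 name USERS-A
vlan 20
 name USERS-B
!
interface FastEthernet0/1
 description PC in VLAN 10
 switchport mode access
 switchport access vlan 10
!
interface FastEthernet0/2
 description PC in VLAN 20
 switchport mode access
 switchport access vlan 20
!
interface range FastEthernet0/23 - 24
 description Trunks to CE1 and CE2
 ! The encapsulation line is only needed (and only accepted) on platforms that also support ISL
 switchport trunk encapsulation dot1q
 switchport mode trunk
 switchport trunk allowed vlan 10,20
```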

ravikantt · ‎06-03-2013

It does update ARP on the hosts: it uses gratuitous ARP, and not only GLBP, HSRP does the same.

This is the only mechanism to update the hosts' ARP entries and it is essential for both GLBP and HSRP; otherwise it would defeat the whole purpose of deploying HSRP/GLBP.

ravikantt · ‎06-02-2013

Yes, that's part of the routing protocol; but as you can see from the topology, these are two gateway devices working redundantly, which is the very reason for implementing an FHRP. Moreover, how would you ever load balance WAN traffic that is not even reaching you (LAN outgoing traffic, in this case) if you don't implement HSRP/GLBP here?

Secondly, CE-1 and CE-2 are the L3 devices on which interface tracking will be configured, not the middle switch.

ravikantt · ‎06-02-2013

Hi Nadim,

I am also a little concerned about the direction in which you want to achieve load balancing. For outgoing traffic it's simple: an FHRP will take care of things. But for incoming traffic, it will depend on the routing protocol you are using out there.

Just for the sake of understanding: is this site hosting something like a server, or is it for redundant internet connectivity? I mean, are you concerned about outgoing traffic, incoming traffic, or both?

Cheers

Ashok

Bilal Nawaz · ‎06-04-2013

Hello Blau, 5. Thank you for explaining this to me. You are absolutely right! I thought I was missing something here. I tried it again and it so happens I wasn't waiting long enough! The configuration was correct before; it just needed the fine tuning that Blau and Mohammad mentioned.

Ashok, I knew this about HSRP; it's actually different in GLBP.

So I've tweaked the timers as Blau said, and it works perfectly.

Here's GLBP

Here I have one host in VLAN 10. Its gateway is the GLBP virtual address 10.0.0.1.

Configuration is attached - again

Traceroutes from my host, note that when I clear ARP then I get the other MAC for my gateway:

Via WAN 2 then WAN 1, so it's load balancing.

On the switch also, I created an SVI (to act as another host) And we get load-balancing here...

Some GLBP output here - CE1 is active with two group members (AVF's) and mac addresses ending 101 and 102 will be used as the GLBP MAC addresses:

CE2 is the standby at the moment:

They are both active for their respective vMAC Addresses.

Now for failover tests. I'll go ahead and shut WAN 1 down. All traffic should route through WAN 2 (via 2.2.2.x network)

As soon as WAN 1 is shut down, I get this output:

CE1:

CE2 - transitions to Active state:

Now on my host if I do my traceroutes again, I should only be going one way - out of WAN 2.
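
To repeat this kind of failover check, the following exec commands are the usual places to look. This is only a suggested checklist and it assumes the lab host is an IOS device, as the earlier 'clear arp-cache' suggests; the IP SLA command name differs on older IOS releases.

```
! On CE1 and CE2: who is AVG, which forwarders are active, and for which virtual MACs
show glbp brief
! State of the tracked objects that drive the weighting
show track brief
! Probe results, if IP SLA is used for tracking
show ip sla statistics
!
! On the host: which virtual MAC the gateway address currently resolves to
show arp
```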

Same thing happens when I failover the other side... Just want to say thanks once again Blau

So GLBP does work, and it works quite smoothly.

101% agree with you here:

HSRP

+ easy configuration and easy troubleshooting

+ fast convergence with default timers

- not equal loadbalancing

GLBP

- config and tshoot require more knowledge and skills

- longer convergence with default timers but convergence time can be same with HSRP with adjusted timers

+ more control over loadbalancing, better distributed traffic over uplinks

ALIAOF_ · ‎06-05-2013

Nice work. Now ya wanna help me out with my issue on another post lol.

Joseph W. Doherty · ‎06-05-2013

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

If your WAN routers support it, optimal load balancing might be provided by OER/PfR. Although OER/PfR, alone, will dynamically load balance your two WAN circuits, when possible, having something like GLBP to statically balance too works well. (GLBP often provides a somewhat decent load balance, OER/PfR will then fine tune it.)

PS:

BTW, if GLBP isn't supported, you can also do static load balancing by using mHSRP and having half your hosts use one gateway IP and half the other.
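
A minimal sketch of that mHSRP arrangement on a single subnet follows, with hypothetical interface names and addresses. It assumes CE2 carries the mirror-image priorities.

```
! CE1: two HSRP groups on the same subnet (addresses are hypothetical)
interface FastEthernet0/0
 ip address 10.0.0.2 255.255.255.0
 ! Group 1: CE1 is preferred; half of the hosts use 10.0.0.1 as gateway
 standby 1 ip 10.0.0.1
 standby 1 priority 110
 standby 1 preempt
 ! Group 2: CE2 is preferred (priority 110 there); the other hosts use 10.0.0.254
 standby 2 ip 10.0.0.254
 standby 2 preempt
```

The split between the two gateway addresses has to be arranged on the hosts themselves, for example with two DHCP scopes or static settings, which is why the balance is static.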

PPS:

If your sw1 were an L3 switch, it could statically load balance to both your WAN routers; again, OER/PfR would "fine tune".
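
One simple way an L3 switch could do this is with two equal-cost static default routes, sketched here with hypothetical next-hop addresses for CE1 and CE2. A dynamic routing protocol would be an alternative.

```
! Sw1 as a Layer 3 switch (next-hop addresses are hypothetical)
ip routing
!
! Two equal-cost default routes, one per WAN router
ip route 0.0.0.0 0.0.0.0 10.0.0.2
ip route 0.0.0.0 0.0.0.0 10.0.0.3
```

With CEF, traffic is shared per destination by default, so the split depends on the mix of flows. Plain static routes also do not react to a WAN failure behind a router that is still reachable, so some form of tracking or a routing protocol would still be needed for resilience.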

PPPS:

Also BTW, OER/PfR can actually tune for end-to-end performance. I.e. if CE1 has less RTT to some destination vs. CE2, OER/PfR can route for that too.