08-02-2013 07:31 AM
Hi,
I have a question about how Load Balancers work when a Probe fails. I know that there are other discussions about the same issue but they are not very helpful in my scenario. Hopefully you can help me.
According to Cisco documentation "The default behaviour of the ACE is to do nothing with existing connections if a real server fails." So I have assumed that if a Probe configured for Server1 fails in a serverfarm, connections already established would be maintained and new sessions would go to Server2 that is Operational in the same serverfarm.
I have tested this scenario by making a Probe fail for Server1, and the established sessions were torn down. New sessions, however, went to Server2.
We don't want to tear down user sessions already established because we don't want any impact for end users. I don't know why our load balancers are not following the default behaviour. These are the details:
- 2 ACE 4710 A4(2.0) in Active/Standby configuration.
- 2 ACE20-MOD-K9 A2(1.6a) in Active/Standby configuration.
I have done the testing for ACE 4710 and ACE20-MOD-K9 and I got the same results: Established connections were torn down from Server1 after the Probe failed.
I know that these versions are old... so could that be the reason they are not working as expected? Do I need to include some special configuration/commands that are not enabled by default in these versions?
Thank you in advance.
Joana.
08-02-2013 11:56 AM
Hi Joana,
How did you confirm that existing connections are torn down? Do you have simultaneous captures on both sides of the ACE showing this behavior?
How did you cause the probe failure?
To confirm the behavior you describe, we need to gather simultaneous captures on both sides (client side and server side).
---------------------
Cesar R
ANS Team
08-02-2013 02:24 PM
Hello. First, I would recommend running version A2(3.6a), as it will give you more stability.
You do not mention whether your load balancers use FT or just manual redundancy. With FT, make sure that the command (show ft group summary) always shows the peer of the active load balancer as STANDBY_HOT.
I would enable persistence on the load balancer, although for HTTPS traffic this means you would need the service certificates so that the ACE can open the communication channel and insert the persistence cookie.
It would also help to have (replicate sticky) in the configuration.
At least this way communication is preserved at the load balancer level, because the cookies are replicated to the peer when one of the servers falls.
But to reach the point you want, with no impact on end customers, the application level must also ensure that the remaining live server maintains the sessions of its partner; this is not entirely the load balancer's job.
08-05-2013 02:15 AM
Hi,
We have the load balancers configured for High Availability: one is ACTIVE and the other is STANDBY. We have configured FT groups in the Admin context, so in case of failure the failover happens automatically; we don’t need to do it manually.
Testing Environment:
ACE-Module-/context# show serverfarm SERVERFARM
 serverfarm     : SERVER-farm, type: HOST
 total rservers : 2
 ---------------------------------
                                                ----------connections-----------
       real                  weight state        current    total      failures
   ---+---------------------+------+------------+----------+----------+---------
   rserver: SERVER1
       192.168.1.1:0         8      PROBE-FAILED 0          953550     4324
   rserver: SERVER2
       192.168.1.2:0         8      OPERATIONAL  3          941003     4421
The number of connections on SERVER1 was 0, and we got new connections on SERVER2 after logging in again.
Context Configuration:
probe tcp tcp7_probe1
  port 7
  interval 20
  passdetect interval 10
  passdetect count 2
  open 1

rserver host SERVER1
  description UAT server 1
  ip address 192.168.1.1
  inservice
rserver host SERVER2
  description UAT server 2
  ip address 192.168.1.2
  inservice

serverfarm host SERVER-Farm
  predictor leastconns
  probe probe1
  rserver SERVER1
    inservice
  rserver SERVER2
    inservice

sticky ip-netmask 255.255.255.255 address source SERVERFARM-Sticky
  timeout 720
  timeout activeconns
  replicate sticky
  serverfarm SERVER-Farm

class-map match-all L4VIPSERVERFARM
  2 match virtual-address 192.168.2.10 tcp eq www

policy-map multi-match SERVERFARM-VIPs
  class L4VIPSERVERFARM
    loadbalance vip inservice
    loadbalance policy SERVERFARM-Web-policy
    loadbalance vip icmp-reply active
    loadbalance vip advertise active

interface vlan 15
  description Servers
  ip address 192.168.1.250 255.255.255.0
  alias 192.168.1.251 255.255.255.0
  peer ip address 192.168.1.249 255.255.255.0
  no normalization
  access-group input any
  nat-pool 240 192.168.1.240 192.168.1.240 netmask 255.255.255.255 pat
  service-policy input SERVERFARM-VIPs
  no shutdown
interface vlan 16
  description ACE Public Client Side
  ip address 192.168.2.9 255.255.255.0
  alias 192.168.2.7 255.255.255.0
  peer ip address 192.168.2.8 255.255.255.0
  no normalization
  access-group input any
  service-policy input SERVERFARM-VIPs
  no shutdown

ip route 0.0.0.0 0.0.0.0 192.168.2.1
Thank you very much for your help.
Joana.
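For a sense of how long the probe in the configuration above takes to change a server's state, here is a rough sketch in Python. It assumes the probe is marked failed after `faildetect` consecutive failed attempts sent every `interval` seconds, and marked healthy again after `passdetect count` consecutive successes sent every `passdetect interval` seconds. The configuration does not set `faildetect`, so the value 3 used below is an assumption, and the function name `probe_timings` is hypothetical. The estimate ignores the `open` timeout and scheduling jitter.

```python
def probe_timings(interval, passdetect_interval, passdetect_count, faildetect):
    """Return rough (seconds_to_mark_failed, seconds_to_mark_healthy).

    Simplified model: ignores probe timeouts and where in the
    interval the failure actually starts.
    """
    seconds_to_fail = interval * faildetect
    seconds_to_recover = passdetect_interval * passdetect_count
    return seconds_to_fail, seconds_to_recover


# Values taken from the posted probe; faildetect=3 is an assumed value.
to_fail, to_recover = probe_timings(
    interval=20, passdetect_interval=10, passdetect_count=2, faildetect=3
)
print(f"mark failed after roughly {to_fail}s, healthy again after roughly {to_recover}s")
```

Under these assumptions, a test that closes TCP port 7 would need to wait on the order of a minute before the rserver shows PROBE-FAILED.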
08-05-2013 10:51 AM
Hi Joana,
Try with this command:
serverfarm host SERVER-Farm
  failaction reassign   <== add this line
  predictor leastconns
  probe probe1
  rserver SERVER1
    inservice
  rserver SERVER2
    inservice
---------------------
Cesar R
ANS Team
08-07-2013 08:23 AM
Hi,
I will try the "failaction reassign" command in the next few days and I will let you know if it makes any difference.
Thanks,
Joana.
08-09-2013 03:29 PM
Hi Joana,
Actually, you need to test with "failaction purge"; reassign works only when there is a backup-rserver configured.
---------------------
Cesar R
ANS Team
05-31-2016 12:53 AM
Hi People,
I need help fast.
I have 2 ACE-4710-K9 in active/standby mode and one of them failed.
I opened an RMA and I received a new ACE-4710-K9.
Now I need to configure it and connect it so that the pair is in active/standby again.
Does anyone know what I need to do to configure HA again (some documents)?
Will it receive the config from the currently active unit, and will it affect production?
KR
VZ
08-12-2013 12:12 PM
Joana-
What you are looking for is actually failaction reassign. However, you cannot use it unless you meet strict criteria, and most people choose not to use it. Usually, you would use reassign only with firewall load balancing.
Your basic issue is this ->
By default, when a probe fails, ACE leaves all "active/established" connections on the failed server. All new connections (whether or not they have a sticky entry that matches the failed server) go to the remaining servers in the serverfarm, based on the configured load-balancing predictor. Sticky entries are updated with the new server IP. For your users, if you shut down the port, the server is either not going to respond to the next packet the client sends, or it will trigger a reset. Either way, that is not graceful for the client.
With failaction purge, ACE sends a reset to both the client and the server IP for the connections on the failed rserver, within the serverfarm it failed in. As with the default behavior, all new connections are load balanced to the remaining servers.
With failaction reassign, ACE sends any packets that would have gone to the failed server on to whatever servers are left in the farm. The moment the probe fails, ACE takes the existing connections for that server and rewrites the flow information to point to the remaining servers. This is not graceful for your client either.
It sounds to me like you are looking for reassign in order for the users to not see a reset and gracefully handle a failed server. Given that, you will need to check the guidelines under reassign located here:
Regards,
Chris Higgins
Technical Leadership
ANS Loadbalancing Technologies
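The three behaviors described in the post above (default, failaction purge, failaction reassign) can be sketched as a toy model in Python. This is only an illustration of the description, not ACE's implementation: the class `Farm` and its methods are hypothetical names, connections are plain string ids, the round-robin spread used for reassign is an assumption, and the backup-rserver requirement mentioned earlier in the thread is not modeled. New connections pick the server with the fewest connections, mirroring `predictor leastconns` from the posted configuration.

```python
from dataclasses import dataclass, field


@dataclass
class Farm:
    """Toy connection table: server name -> set of connection ids."""
    servers: dict
    failed: set = field(default_factory=set)
    resets_sent: list = field(default_factory=list)

    def _alive(self):
        return [name for name in self.servers if name not in self.failed]

    def probe_failed(self, name, failaction=None):
        """Mark a server as failed and apply the chosen failaction."""
        self.failed.add(name)
        conns = self.servers[name]
        if failaction is None:
            # Default: established connections stay on the failed server.
            return
        if failaction == "purge":
            # Purge: connections are removed and a reset is recorded for each.
            self.resets_sent.extend(sorted(conns))
            conns.clear()
        elif failaction == "reassign":
            # Reassign: existing flows are moved to the remaining servers
            # (round-robin here is an assumption made for the sketch).
            alive = self._alive()
            if not alive:
                return
            for i, conn in enumerate(sorted(conns)):
                self.servers[alive[i % len(alive)]].add(conn)
            conns.clear()
        else:
            raise ValueError(f"unknown failaction: {failaction}")

    def new_connection(self, conn_id):
        """Send a new connection to the least-loaded server that is not failed."""
        alive = self._alive()
        if not alive:
            return None
        target = min(alive, key=lambda name: len(self.servers[name]))
        self.servers[target].add(conn_id)
        return target


farm = Farm({"SERVER1": {"c1", "c2"}, "SERVER2": {"c3"}})
farm.probe_failed("SERVER1")       # default behavior
print(farm.servers["SERVER1"])     # c1 and c2 are still listed on SERVER1
print(farm.new_connection("c4"))   # new traffic goes to SERVER2
```

In the default case the load balancer still lists the old connections, which matches the observation that the ACE "seemed to maintain" sessions; whether the application behind them still answers is a separate question.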
08-15-2013 03:05 AM
Hi,
First of all, thanks for your help!
After some more testing, I found I was partly wrong about the Load Balancer behaviour. I have done more extensive testing with more users using the Web Applications behind the Load Balancers. These are the results:
Now I am confused… Why does the server decide to close user sessions when the Probe fails if the ACEs still maintain the sessions established through them? The script listening on TCP port 7 doesn’t have any impact on the service running on TCP port 80 on the same server. They are completely independent. Could it be something in the web application itself? Or maybe some configuration on the serverfarm specifically tells the server to end the sessions when a Probe fails? Sticky sessions? Why does it work fine in the first scenario and not in the second one? (Well, I have to say that the web applications are completely different.)
I also tried the “failaction reassign” command, but it didn’t make any difference. I think you need a backup server configured in the serverfarm to get it working.
I really appreciate your help.
Cheers,
Joana.
08-15-2013 10:17 AM
Joana-
"But my colleague, who works with the web application servers behind, could see that all user sessions were torn down after the Probe failed although Load Balancers seemed to maintain the session already established."
Who tore down the session? Was it a reset or a FIN, and did it come from the client or did the server initiate it?
Regards,
Chris
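One way to answer the question above from a packet capture is to find the first FIN or RST in the flow and see which side sent it. The Python sketch below works on packets that have already been reduced to (source, destination, TCP flags) records; the `Packet` type, the `first_teardown` function, and the client address 10.0.0.5 are all hypothetical, and parsing the capture file itself is left out.

```python
from typing import NamedTuple, Optional, Tuple


class Packet(NamedTuple):
    src: str
    dst: str
    flags: str  # TCP flags as letters, e.g. "S", "SA", "A", "PA", "FA", "R"


def first_teardown(packets, client_ip, server_ip) -> Optional[Tuple[str, str]]:
    """Return (sender, kind) for the first FIN or RST in one flow, else None."""
    for pkt in packets:
        if "R" in pkt.flags:
            kind = "RST"
        elif "F" in pkt.flags:
            kind = "FIN"
        else:
            continue  # not a teardown packet
        if pkt.src == client_ip:
            sender = "client"
        elif pkt.src == server_ip:
            sender = "server"
        else:
            sender = f"other ({pkt.src})"
        return sender, kind
    return None


# Hypothetical flow: handshake, some data, then the server closes with a FIN.
flow = [
    Packet("10.0.0.5", "192.168.1.1", "S"),
    Packet("192.168.1.1", "10.0.0.5", "SA"),
    Packet("10.0.0.5", "192.168.1.1", "A"),
    Packet("10.0.0.5", "192.168.1.1", "PA"),
    Packet("192.168.1.1", "10.0.0.5", "FA"),
    Packet("10.0.0.5", "192.168.1.1", "A"),
]
print(first_teardown(flow, client_ip="10.0.0.5", server_ip="192.168.1.1"))
```

Running the same check on the client-side and server-side captures, and comparing the results, may help tell whether a teardown passed through the ACE or appeared on only one side of it.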