Datacenter troubleshooting guide - day 4

“Datacenter troubleshooting guide” – a blog by  Gilles Dufour.

Day 4 - Looking at stickiness

When the same user must always be sent to the same server, you need some form of stickiness.

There are many different ways to achieve this.

Some predictor algorithms will by definition always select the same server for a given client.  This is the case with the source ip hashing predictor.

But very often you will need to configure a sticky method in combination with your predictor algorithm.

Why is the source ip hash predictor a sticky method ?

Actually, this is not a sticky method.  But since the hash algorithm always gives the same result for a given source ip address, it guarantees that a client using the same ip address will always be sent to the same server.

The advantage is that you do not need to configure a specific sticky method.  It also works without a sticky table, so it preserves resources.

But the hash function will give different results when you add or remove a server.  Therefore, when your rserver list is modified, your clients might be sent to different servers, breaking stickiness.
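The effect of changing the rserver list can be sketched in a few lines of Python. This is only an illustration: the hash used here (SHA-256 of the address, modulo the number of servers) and the function name pick_server are assumptions made for the example, not the hash ACE actually implements.

```python
import hashlib
import ipaddress

def pick_server(client_ip, rservers):
    """Pick a server by hashing the client source ip modulo the farm size.

    Illustrative hash only; the real ACE hash function is not shown here.
    """
    packed = ipaddress.ip_address(client_ip).packed
    digest = hashlib.sha256(packed).digest()
    index = int.from_bytes(digest[:8], "big") % len(rservers)
    return rservers[index]

farm = ["S1", "S2", "S3"]
clients = ["10.0.0.%d" % i for i in range(1, 101)]

# Same input, same output: no sticky table is needed.
before = {ip: pick_server(ip, farm) for ip in clients}
again = {ip: pick_server(ip, farm) for ip in clients}

# Remove S2 from the rserver list and hash again.
smaller_farm = ["S1", "S3"]
after = {ip: pick_server(ip, smaller_farm) for ip in clients}

# Clients whose server stayed up but who were moved anyway.
moved = [ip for ip in clients
         if before[ip] != "S2" and before[ip] != after[ip]]
print("%d of %d clients changed server although theirs stayed up"
      % (len(moved), len(clients)))
```

With a modulo-style hash, removing one server moves not only the clients of that server but also part of the clients that were on servers still in service.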

Is sticky source ip a good solution ?

Because of the changing hash results mentioned above, most people prefer to use a standard predictor (roundrobin, leastconn, ...) and add a sticky source ip option.

The idea is to also use the source ip address to identify the client and select the corresponding server.

Unlike the hash method, the sticky source ip solution needs sticky resources to save the information ACE uses to remember which client uses which server.

The advantage of the sticky option is that the sticky table is not affected when the rserver list is modified.
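As a rough model of the idea, the sketch below combines a round-robin predictor with a table keyed on the source ip. The class StickySourceIp and its methods are hypothetical names for this example and do not describe ACE internals; timeouts and table size limits are left out.

```python
class StickySourceIp:
    """Round-robin predictor plus a sticky table keyed on client source ip."""

    def __init__(self, rservers):
        self.rservers = list(rservers)
        self._next = 0
        self.table = {}  # source ip -> rserver; this is the resource cost

    def pick(self, client_ip):
        server = self.table.get(client_ip)
        if server in self.rservers:
            return server  # sticky hit: the predictor is bypassed
        # No usable entry: ask the predictor and remember the answer.
        server = self.rservers[self._next % len(self.rservers)]
        self._next += 1
        self.table[client_ip] = server
        return server

    def add_rserver(self, name):
        self.rservers.append(name)  # existing sticky entries are untouched


lb = StickySourceIp(["S1", "S2", "S3"])
first = {ip: lb.pick(ip) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")}

lb.add_rserver("S4")
second = {ip: lb.pick(ip) for ip in first}

print(first == second)      # True: known clients keep their server
print(lb.pick("10.0.0.4"))  # S4: only a new client reaches the new server
```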

Why not use sticky source ip ?

Very often this solution is enough to guarantee stickiness.

But many clients do not have a static ip address, and for those clients this method does not work.

There is also the problem of proxy servers hiding many clients behind a single ip address resulting in rserver overload when using sticky source ip.

For HTTP the solution is to use information contained in the client HTTP request and server HTTP response.

An HTTP Cookie is an object used by a server to identify HTTP clients.  A loadbalancer can therefore also use this information to map a client to a server.

Impact of sticky Cookie

For this solution to work, the loadbalancer needs to "spoof" the connection.  In other words, the loadbalancer needs to terminate the TCP connection and respond on behalf of the server.  This is required since the cookie is only sent in the first data packet after the TCP connection has been established.

Therefore, one of the disadvantages of this solution is the impact on CPU and performance.

The loadbalancer needs to act like a proxy and inspect client/server data to create the sticky table and to select the appropriate server based on the cookie sent by the client.  This method is therefore CPU intensive and slows down the loadbalancing decision.

This solution works great, but it may sometimes fail.

One of the reasons we see most often is the use of invalid cookies in the HTTP header.

When the HTTP header is not RFC compliant, ACE stops parsing the data and falls back to the default class-map.

A new function has been introduced with ddts "CSCtj42969: Add ability in ACE to skip malformed cookie in HTTP parser" to allow the parsing of those illegal headers.  The same ddts exists for the module: CSCtj05814 and has been introduced in version A2(3.3).

This new option is configurable inside an HTTP parameter-map as follows:

switch/Admin(config)# parameter-map type http abc
switch/Admin(config-parammap-http)# ?
Configure http parameters:
  case-insensitive       Enable case insensitive http matching
  cookie-error-ignore   Ignore malformed cookie in request    

Another common problem is caused by proxy servers using persistent connections.

As mentioned before, a proxy can hide many clients behind a single ip address.

But it can also send requests from different clients inside the same TCP connection.

Therefore different cookies can be observed in a single TCP connection.

By default ACE only looks into the first request and assumes all subsequent requests belong to the same client.

To accommodate proxy servers, you need to configure persistence rebalance with an HTTP parameter-map.

switch/Admin(config)# parameter-map type http Pers
switch/Admin(config-parammap-http)# persistence-rebalance ?
  strict  Enable HTTP connection strict persistence rebalance
  <cr>    Carriage return.

There are 2 modes for this option - normal and strict.

In normal mode we only rebalance the connection when needed.  So, unless we see a different cookie we stay connected with the current server.

The strict option forces ACE to rebalance the connection if there is no need to stay with the current server (for example, if there is no cookie in the client request).
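The difference between the default behaviour and the two modes can be modelled as follows. This is only one reading of the description above, not the ACE implementation; the function route, its parameters and the mode names "off", "normal" and "strict" are made up for the sketch.

```python
from itertools import cycle

def route(requests, sticky, predictor, mode="off"):
    """Return the server chosen for each request of one TCP connection.

    requests : cookie values (or None) in the order they arrive
    sticky   : dict mapping cookie value -> server
    predictor: callable returning the next server for a non-sticky request
    mode     : "off" (default behaviour), "normal" or "strict"
    """
    chosen = []
    current = None
    for cookie in requests:
        known = sticky.get(cookie)
        if current is None:
            current = known or predictor()  # the first request always decides
        elif mode == "normal" and known and known != current:
            current = known                 # different cookie: rebalance
        elif mode == "strict":
            current = known or predictor()  # no reason to stay: rebalance
        chosen.append(current)
    return chosen


sticky = {"A": "S1", "B": "S2"}
requests = ["A", "B", None]  # a proxy multiplexing three client requests

for mode in ("off", "normal", "strict"):
    rr = cycle(["S3", "S1", "S2"])
    print(mode, route(requests, sticky, lambda: next(rr), mode))
# off    ['S1', 'S1', 'S1']
# normal ['S1', 'S2', 'S2']
# strict ['S1', 'S2', 'S3']
```

In this model the request without a cookie stays on the current server in normal mode and goes back through the predictor in strict mode.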

What if my server does not use Cookie ?

If your server is not currently using cookies or if you do not know which cookie is being used, you can use the static cookie functionality provided by ACE.

ACE will generate its own cookie and insert it in every server response before forwarding it to the client.

This is transparent to the client browsers.  The cookie looks as if it came from the server itself.

Use the "cookie insert" option when creating your sticky cookie config.

If you use static cookies, be aware that the sticky info is saved in a different table than the dynamic cookies.

Use the 'show sticky database static' command when looking for static cookies.

Another option to look at the static cookie is the command

switch/Admin# show sticky cookie-insert group MyCookie
     Cookie   |        HashKey       |           rserver-instance
  ------------+----------------------+----------------------------------------+
  R557253738 | 3315853035665853393  | linux1-8081/linux1:8081
  R3901416957 | 12864086502718037453 | linux1-8081/linux1-24:8081
switch/Admin#

The cookie value on the left is obtained by hashing a string corresponding to "serverfarm name" + "rserver name" + "port".

Can we use sticky cookie with SSL ?

HTTPS is just HTTP encrypted.

So most people believe they can use sticky cookie with HTTPS.

The problem is that the whole HTTP packet is encrypted, including the HTTP header.

Therefore a loadbalancer is not able to read the HTTP header and look at the cookie.

The only way to use the sticky cookie method with SSL traffic is to first decrypt the data.

What alternative do we have for SSL traffic ?

If you do not wish to use cookies with your SSL traffic or do not want to terminate SSL traffic on your ACE, you can use sticky based on the SSL session id.

This method only works with SSLv3 and TLS.

The idea is to look at the SSL session id exchanged in clear text between client and server.

This session id is used by the server to identify the client, therefore ACE can also use it for the same purpose.

There is no specific sticky ssl method inside ACE.

Instead we need to use the generic layer4-payload analysis feature.  It is a bit more complex since you need to know what an SSL client/server hello looks like and where to find the session id in the packet.

There is a sample config in the document CSM2ACE Conversion

However, it is a bit incomplete as it does not take into account packets without session id.

The config below is the right one to use.

sticky layer4-payload SSL-SESSION_STKY
  timeout 600
  serverfarm SSL-SESSION
  response sticky
  layer4-payload offset 43 length 32 begin-pattern "(\x20|\x00\xST)"

class-map type generic match-any SSLID-32_REGEX
  2 match layer4-payload regex "\x16\x03[\x00\x01]..[\x01\x02].*"
  3 match layer4-payload regex "\x80\x4c.*"

The use of the \xST metacharacter is important for SSL connections without a session id.

This is explained in this document.

Basically, what the config above does is check if the packet starts with the bytes 0x16 0x03 ...

If yes, we jump to position 43 where the session id length is located.

If the length is 0x20 we read the next 32 bytes as the session id.

If the length is 0x00 we skip and ignore the session id.
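The same steps can be written out in Python to show why offset 43 is the right place to look. In an SSLv3/TLS hello the 5-byte record header, the 4-byte handshake header, the 2-byte version and the 32-byte random add up to 43 bytes before the session id length. The function names below are hypothetical, and fake_server_hello builds a truncated record (nothing after the session id) only to exercise the parser.

```python
def ssl_session_id(payload):
    """Return the 32-byte session id of an SSLv3/TLS hello, or None."""
    # Record header: type 0x16 (handshake), version 0x03 0x00 or 0x03 0x01.
    if len(payload) < 44 or payload[0] != 0x16 or payload[1] != 0x03:
        return None
    if payload[2] not in (0x00, 0x01):
        return None
    # Handshake type: 0x01 client hello, 0x02 server hello.
    if payload[5] not in (0x01, 0x02):
        return None
    # 5 (record) + 4 (handshake) + 2 (version) + 32 (random) = offset 43
    length = payload[43]
    if length != 0x20:
        return None  # 0x00 means no session id: skip it
    session_id = payload[44:44 + length]
    return session_id if len(session_id) == length else None


def fake_server_hello(session_id):
    """Build a minimal, truncated server hello for testing only."""
    body = b"\x03\x01" + bytes(32) + bytes([len(session_id)]) + session_id
    handshake = b"\x02" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + len(handshake).to_bytes(2, "big") + handshake


sid = bytes(range(32))
print(ssl_session_id(fake_server_hello(sid)) == sid)  # True
print(ssl_session_id(fake_server_hello(b"")))         # None
```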

Why is ACE not loadbalancing evenly when we configure stickiness ?

This is one of the most frequent questions that we get.

You configure a roundrobin or leastconn predictor and see even distribution of the traffic load across your servers.

Then you add stickiness and suddenly one server is getting most of the traffic.

Is this a bug ?

Actually this is normal.

Let's look at a simple example.

Assume you have 3 clients C1, C2 and C3 and 3 servers S1, S2 and S3.

Each client opens one connection.  This is their first connection, so there is no sticky entry for them.  ACE performs a basic loadbalancing decision (roundrobin, leastconn, ...) and each server receives one client.

At this time we see in our 'show serverfarm' that each server has 1 active connection.  So perfect loadbalancing.

Now, this is where things change with stickiness.

Suppose C1 suddenly decides to open 3 more connections.

ACE will look into its sticky database and find the mapping C1 - S1.

The 3 connections are sent to S1 since this is what we want (stickiness is there to guarantee the same client always goes to the same server).

Now, if we take a 'show serverfarm' we can see that S1 has 4 connections while S2 and S3 each have 1 connection.

Uneven loadbalancing.
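The C1/C2/C3 example can be replayed with a small simulation. This is a sketch assuming a roundrobin predictor and a sticky table keyed on the client name; the names connect, sticky and active are made up for the example.

```python
from collections import Counter
from itertools import cycle

servers = ["S1", "S2", "S3"]
predictor = cycle(servers)  # roundrobin
sticky = {}                 # client -> server
active = Counter()          # server -> open connections

def connect(client):
    """Open one connection, honouring the sticky table first."""
    if client not in sticky:
        sticky[client] = next(predictor)
    active[sticky[client]] += 1
    return sticky[client]

# First connection of each client: pure roundrobin.
for client in ("C1", "C2", "C3"):
    connect(client)
print(dict(active))  # {'S1': 1, 'S2': 1, 'S3': 1}

# C1 opens 3 more connections: all of them follow the sticky entry.
for _ in range(3):
    connect("C1")
print(dict(active))  # {'S1': 4, 'S2': 1, 'S3': 1}
```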

Another reason for seeing uneven loadbalancing is when you perform maintenance on one server.

In order to swap memory or do a software upgrade, you bring the server down.

During the maintenance, traffic is redirected to the other servers.

Let's continue from the example above and assume each server has 5 connections.

We take server S2 down.  Client C2 detects that it gets no response from S2 and reopens its 5 connections.

Since S2 is down, the traffic is sent to another server, i.e. S3.

We now have S1: 5 connections, S2: 0 and S3: 10.

Again uneven loadbalancing.

You bring S2 up again and after a while you see S1: 10, S2: 0 and S3: 20.

Why is S2 not receiving new connections ?

This is because all sticky entries point to S1 and S3, so S2 only picks up requests from new clients unknown to ACE.

Again, uneven loadbalancing and perfectly normal.

If you do maintenance on one or more servers, you should probably clear the sticky table after you bring them back up.
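The maintenance scenario, and the effect of clearing the sticky table, can be sketched in the same way. The Farm class below is a simplified, hypothetical model (roundrobin predictor, no timeouts, connections never close); it does not represent ACE internals.

```python
from collections import Counter

class Farm:
    def __init__(self, servers):
        self.servers = list(servers)
        self.down = set()
        self.sticky = {}         # client -> server
        self.active = Counter()  # server -> open connections
        self._rr = 0

    def up(self):
        return [s for s in self.servers if s not in self.down]

    def connect(self, client, count=1):
        server = self.sticky.get(client)
        if server not in self.up():
            # No entry, or the entry points to a server that is down.
            candidates = self.up()
            server = candidates[self._rr % len(candidates)]
            self._rr += 1
            self.sticky[client] = server
        self.active[server] += count

    def take_down(self, server):
        self.down.add(server)
        self.active[server] = 0  # its connections are lost

    def bring_up(self, server):
        self.down.discard(server)


farm = Farm(["S1", "S2", "S3"])
for client in ("C1", "C2", "C3"):
    farm.connect(client, 5)

farm.take_down("S2")
farm.connect("C2", 5)     # C2 reopens its 5 connections
print(dict(farm.active))  # {'S1': 5, 'S2': 0, 'S3': 10}

farm.bring_up("S2")
for client in ("C1", "C2", "C3"):
    farm.connect(client, 5)
print(dict(farm.active))  # {'S1': 10, 'S2': 0, 'S3': 20}

farm.sticky.clear()       # clear the sticky table after maintenance
farm.connect("C1")
print(dict(farm.active))  # {'S1': 10, 'S2': 1, 'S3': 20}
```

Once the table is cleared, known clients go through the predictor again and S2 starts receiving connections.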

Next week I will go through some troubleshooting steps related to stickiness.

Gilles

2 Comments
Cisco Employee

Hi Gilles,

A great read for sure.

Regards,

Kanwal

Beginner

Hi, our ACE 4710 Version A4(2.3) does not have the cookie-error-ignore option under HTTP parameter-map.
