
ACE-20 http-cookie sticky performance issues?

MARTIN CHONG
Level 1

Has anyone noticed significant performance issues when enabling sticky via http cookie?

I notice failures to the rservers when this is enabled. However, if I bypass the ACE and go directly to a server, or hit the VIP with a plain serverfarm instead of a sticky-serverfarm, I see no connection drops on the service-policy and the client doesn't get a RST.

I have tried a number of things, including parameter-maps with wan-optimization and modifying timeouts.

If I configure failaction reassign, the client still gets resets.

I have checked the resource limits and I'm nowhere near them. The CPU is unloaded. My probes have polled thousands of times with no failures.

One thing I do see in my logs is

%ACE-6-302023: Teardown TCP connection 0x1b0e57 for xxx to yyyyyy duration 0:00:00 bytes 2600 Exception

But that just tells me there was a problem setting up the connection.

We're only talking about 10 concurrent connections. Is the ACE slow in reading the cookie in the HTTP header? Is there a bug I haven't found?

ACE-20 running A2(3.4)

Here's a sample of my config:

rserver host myhost1
  ip address 10.10.10.10
  inservice
rserver host myhost2
  ip address 10.10.10.11
  inservice

serverfarm host my-vip.8080
  failaction reassign
  predictor leastconns
  rserver myhost1 8080
    inservice
  rserver myhost2 8080
    inservice

sticky http-cookie uniqservID my-vip.com.8080
  cookie insert browser-expire
  timeout 60
  serverfarm my-vip.8080

policy-map type loadbalance first-match my-vip.com_POLICY
  class class-default
    sticky-serverfarm my-vip.com.8080

policy-map multi-match POLICY
  class my-vip.8080
    loadbalance vip inservice
    loadbalance policy my-vip.com_POLICY
    loadbalance vip icmp-reply
    loadbalance vip advertise active
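
For reference, the sticky and serverfarm state can be inspected with the standard show commands while reproducing this (names taken from the config above):

show sticky database
show serverfarm my-vip.8080
show service-policy POLICY detail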

10 Replies

pablo.nxh
Level 3

Hi Martin,

A couple of questions:

- Have you tried IP-based stickiness, just for testing (a minimal sketch follows below)? If yes, did you get the same errors?

- Can you gather the output of show stats http, then reproduce the issue, take a new output, and paste it here?
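
For the first test, a minimal IP-based sticky group might look like this (the group name is illustrative; it reuses the serverfarm and policy names from your config):

sticky ip-netmask 255.255.255.255 address source my-vip-ip-sticky
  timeout 60
  serverfarm my-vip.8080

policy-map type loadbalance first-match my-vip.com_POLICY
  class class-default
    sticky-serverfarm my-vip-ip-sticky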


Pablo

Hi Pablo,

I tried ip-netmask based stickiness. The exception rate was much higher, but otherwise the same behaviour, yes.

Here is the output:

+------------------------------------------+
+-------------- HTTP statistics -----------+
+------------------------------------------+
LB parse result msgs sent      : 47925  , TCP data msgs sent       : 118808
Inspect parse result msgs sent : 0      , SSL data msgs sent       : 0
TCP fin msgs sent              : 24284  , TCP rst msgs sent        : 2
Bounced fin msgs sent          : 0      , Bounced rst msgs sent    : 0
SSL fin msgs sent              : 0      , SSL rst msgs sent        : 0
Drain msgs sent                : 0      , Particles read           : 135935
Reuse msgs sent                : 0      , HTTP requests            : 47876
Reproxied requests             : 0      , Headers removed          : 0
Headers inserted               : 37792  , HTTP redirects           : 0
HTTP chunks                    : 0      , Pipelined requests       : 0
HTTP unproxy conns             : 35164  , Pipeline flushes         : 0
Whitespace appends             : 0      , Second pass parsing      : 0
Response entries recycled      : 0      , Analysis errors          : 0
Header insert errors           : 0      , Max parselen errors      : 0
Static parse errors            : 0      , Resource errors          : 0
Invalid path errors            : 0      , Bad HTTP version errors  : 0
Headers rewritten              : 0      , Header rewrite errors    : 0
SSL headers inserted           : 0      , SSL header insert errors : 0
SSL spoof headers deleted      : 0
HTTP passthrough stat          : 0
Unproxy msgs sent              : 50526

+------------------------------------------+
+------- Connection statistics ------------+
+------------------------------------------+
Total Connections Created  : 1652327
Total Connections Current  : 546
Total Connections Destroyed: 558967
Total Connections Timed-out: 421728
Total Connections Failed   : 671500

Thanks

Martin,

Try with a parameter-map like this:

parameter-map type connection test
  set tcp wan-optimization rtt 0
  set tcp ack-delay 0

policy-map multi-match POLICY
  class my-vip.8080
    loadbalance vip inservice
    loadbalance policy my-vip.com_POLICY
    loadbalance vip icmp-reply
    loadbalance vip advertise active
    connection advanced-options test


Cesar R.
--------------------- Cesar R ANS Team

Hi Cesar,

These did not make a difference. However, I retried ip-netmask stickiness and did not see these connection drops with RSTs back to the client.

This makes me believe it is specific to cookie stickiness. The interesting part is that we do not see it initially, but once the concurrent connections build up to around 20 we start seeing drops.

Is this a performance issue with writing and parsing the L7 HTTP header?

Thanks

-M-

Hi Martin,

Are you testing with real traffic or with a traffic generator? Would it be possible for you to gather captures on the ACE tengig interface showing the issue?

Cesar R

--------------------- Cesar R ANS Team

Hi Cesar,

I do have that information, but for security reasons I can't easily share the captures. I can say this, though: I have captures from the LB, the client, and the server. Yes, this is JMeter traffic doing HTTP POSTs.

I can say that there is either a problem closing the connection correctly or a problem with the handshake. What I see is this:

The initial connection, with a src port of say 32000, works fine. The connection appears to close normally.

A minute or so later, the client uses port 32000 again as its src port. For some reason the server gets confused and ACKs a long-lost segment after the client sends a SYN. The LB does not like this and sends RSTs to both client and server.

I do not see this without cookie sticky.

Thanks

MARTIN CHONG
Level 1

I have a very strong suspicion that it's a performance issue with parsing the L7 header or performing the L7 handshake. I see the same problem with http-header or http-content inspection, but not with IP sticky or regular balancing. I've tried pretty much everything Cisco has recommended so far with respect to tuning TCP options, but I am at a loss.

I've seen a lot of bugs in the Bug Toolkit that would clearly explain what I'm running into, but apparently they have been fixed in my version of code.

Gilles, are you still out there?

Hello Martin-

You wouldn't happen to have normalization disabled on the interface where the client traffic ingresses, do you?

Regards,

Chris

Yep, it's disabled on the ingress and egress interfaces, as we've found that it often interferes with traffic flow (i.e. some of our web response times go above 3 seconds on a regular basis).

I'm surprised they've implemented this as a default, as the ACE is supposed to be primarily a load balancer, not an IPS.

Hi Martin-

One of my peers asked for help on a case and I couldn't believe the name on the capture file. I am going to forward my response directly to you through the TAC case you currently have open. The long and short of it is that, with normalization disabled, the ACE does not send a reset when it purges a flow from its memory. Because of that, the server's TCP state table may contain connections that the ACE has already timed out. When the same client comes back and uses the same TCP source port as a connection that still exists on the server, a reset ensues from either the ACE or the server and the connection is purged.

The way to correct this is either to re-enable normalization and clear the TCP connections on the server so its TCP state table is in sync with the ACE's view, or to make your server time out TCP connections in less than the ACE default idle timeout of 1 hour.
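
If it helps, a minimal sketch of the first option on the ACE side (the VLAN number is illustrative; normalization is the default, so re-enabling it simply reverses the earlier "no normalization" line):

interface vlan 100
  normalization

The server-side alternative is a change to the server OS's TCP idle timeout settings rather than anything on the ACE.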
