I have a customer who needs to maintain stickiness between TCP/80 and TCP/443, so we selected an L3 content rule:
content L3_www.ourcustomer.com
  vip address 1.2.3.4
  sticky-mask 255.255.128.0
  add service web004
  add service web005
  add service web006
  sticky-inact-timeout 10
  advanced-balance sticky-srcip
  active
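For anyone who hasn't used L3 sticky: my understanding is that because the content rule matches on the VIP only (no protocol or port), the sticky key is just the client source IP ANDed with the sticky-mask, so port 80 and port 443 flows from the same client (or the same /17, with that mask) land on the same entry and the same server. The little Python sketch below is only an illustration of that keying, not CSS internals, and the function names and the modulo "balance" choice are made up:

import ipaddress

STICKY_MASK = int(ipaddress.ip_address("255.255.128.0"))  # /17 mask from the rule
SERVICES = ["web004", "web005", "web006"]

def sticky_key(client_ip: str) -> int:
    """Sticky key for an L3 sticky-srcip rule: source IP ANDed with the mask.
    The destination port never enters the key, which is why 80 and 443
    end up on the same entry."""
    return int(ipaddress.ip_address(client_ip)) & STICKY_MASK

# Hypothetical in-memory sticky table, just to show the lookup/insert behaviour.
sticky_table: dict[int, str] = {}

def pick_service(client_ip: str) -> str:
    key = sticky_key(client_ip)
    if key not in sticky_table:
        # New key: pick a service (stand-in for the real balance method)
        # and remember the choice.
        sticky_table[key] = SERVICES[key % len(SERVICES)]
    return sticky_table[key]

# Port 80 and port 443 requests from the same client stick together.
print(pick_service("10.20.30.40"))   # first hit creates the entry
print(pick_service("10.20.90.40"))   # same /17 after masking -> same service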
The problem is that without a low inactivity timeout the CPU goes through the roof to 100%. Even with a 10-minute inactivity timer it still runs at 60-80% CPU, but at least it isn't pegged at 100% (which causes latency and other problems).
The customer is pushing 12 Mbps with, on average, 5-10K connections. There are no sticky rejects or collisions, and plenty of FCBs are available. At any given time the sticky table holds about 8-10K used entries.
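As a rough sanity check on those numbers (my assumption: each new masked source creates one entry that then lives for roughly the inactivity timeout, Little's-law style), the implied sticky churn is tiny, which is why the CPU looks so far out of proportion to the sticky load:

# Rough sanity check, assuming one entry per new masked source that
# lives for roughly the inactivity timeout.
inact_timeout_s = 10 * 60               # sticky-inact-timeout 10 (minutes)
steady_state_entries = (8_000, 10_000)  # observed used sticky entries

for entries in steady_state_entries:
    new_entries_per_sec = entries / inact_timeout_s
    print(f"{entries} entries / {inact_timeout_s}s timeout "
          f"~= {new_entries_per_sec:.0f} new sticky entries/sec")
# ~13-17 new entries per second is trivial churn for a CSS.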
I can make the CPU drop immediately to 1-10% simply by purging the sticky table or by removing sticky from the content rule, so I am confident the issue is sticky-related. I haven't found any bugs opened on this, though. We run sticky for quite a few customers and have rarely, if ever, seen anything like this before.
I did some testing in production and hit this high-CPU problem on 4.01.44.s, 5.01.69s, and 6.10.405. I first saw it on a CSS11150, then moved the customer to a CSS11800, where we see exactly the same issue.
I have dozens of CSS11150s, CSS11800s, and a few CSS11500 series boxes in production and have never experienced this on any version of code before.
Has anyone seen this before? Due to "shopping cart" issues we need to maintain persistence across 80->443 transitions, so I think I am stuck with source-IP sticky and an L3 rule.
Any thoughts would be most appreciated.
Thanks!
Mike