
CSM ret-code time-frame

A company I work for has a number of CSM modules (WS-X6066-SLB-APC) installed in 6513 chassis switches. The CSM modules are running version 4.2(14).

These CSM modules are configured to load-balance a number of vservers via serverfarms, each serverfarm containing multiple real servers.

Here is some example configuration:

vserver SITE
  virtual 10.1.2.3 tcp www
  serverfarm SERVERFARM
  persistent rebalance
  inservice
!
serverfarm SERVERFARM
  nat server
  no nat client
  predictor leastconns
  failaction reassign
  retcode-map RETCODE-MAP
  real 10.2.3.4
   inservice
  real 10.2.3.5
   inservice
!
map RETCODE-MAP retcode
  match protocol http retcode 503 503 action remove 5 reset 300

The company is seeing a problem that appears to be related to return code checking. Every once in a while a server suddenly stops receiving traffic for 5 minutes. This always occurs right after the server has sent an HTTP 503 return code. However, the CSM logs do not show that the CSM module has actually disabled the real server. For other serverfarms, which run regular HTTP and/or ICMP health checks against their real servers, the CSM logs clearly show when a real server has been temporarily disabled due to health check failures.

Return code checking is configured to disable a real server for 300 seconds after the CSM has received five HTTP 503 responses from it. However, when we check the real server's log, we find only a single 503 return code right before the server stops seeing incoming traffic; to find any more we have to go back hours in the log.

I have tried to figure out the time frame within which those five return codes must be received to count toward the maximum allowed number, but I cannot find any information about this in the documentation.

For all I know, the CSM could keep track of every incoming 503 indefinitely until the maximum of five 503s is reached, and then disable the server for 300 seconds.
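
To make that guess concrete, here is a small Python sketch of the behaviour I am speculating about. It is purely a hypothetical model: the `RetcodeTracker` class, its field names and the timestamps are made up for illustration, and nothing here is taken from the CSM or its documentation. It only assumes that matching return codes are counted with no expiry window, using the threshold of 5 and the reset time of 300 seconds from the map above.

```python
from dataclasses import dataclass


@dataclass
class RetcodeTracker:
    """Hypothetical model: counts matching return codes with NO expiry window."""
    threshold: int = 5            # "remove 5" in the retcode map
    reset_seconds: float = 300.0  # "reset 300" in the retcode map
    count: int = 0
    removed_until: float = -1.0

    def in_service(self, now: float) -> bool:
        # The real server is usable again once the reset period has elapsed.
        return now >= self.removed_until

    def observe(self, status: int, now: float) -> bool:
        """Record one response; return True if it triggers removal."""
        if status != 503 or not self.in_service(now):
            return False
        self.count += 1
        if self.count >= self.threshold:
            # Assumption: the counter starts over after the action fires.
            self.count = 0
            self.removed_until = now + self.reset_seconds
            return True
        return False


tracker = RetcodeTracker()
# Four 503s spread over several hours, then a fifth one much later.
times = [0, 3600, 7200, 10800, 36000]
results = [tracker.observe(503, t) for t in times]
print(results)
print(tracker.in_service(36000 + 299), tracker.in_service(36000 + 300))
```

If the counter really never expired, the fifth 503 would take the server out of service even though the previous four were hours apart, which would match what we see in the real server log: a single 503 immediately before five minutes without traffic.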

Does anyone have any information about the time frame within which those return codes must be received by the CSM to count toward the maximum configured number of return codes before the configured action is taken?