Service status down & recovery to alive

paul.matthews
Level 5

I have a strange one. I have a pair of 11503s in redundant interface/vip mode.

Occasionally a subset of servers will go down on BOTH CSSs. I would normally suspect that the servers themselves had all failed (though all at the same time would be odd).

Suspending and activating the servers brings them back alive.

The config is:

service server1-80
  ip address <Address>
  port 80
  keepalive type http
  keepalive method get
  active

What actually causes a down state? I would expect a 400-series response, a 500-series response, or no response at all to take it down, but the server responds as soon as I kick the CSS.

What method is used to recover a service? I would expect the CSS to have been retrying all along, without waiting for me to suspend/activate the service to force the keepalive.

Paul.

2 Accepted Solutions


Gilles Dufour
Cisco Employee

Paul,

Since you are using method get under the keepalive, the CSS verifies that the web page always stays the same by computing a hash of it.

So I suspect that when everything goes down, it is because somebody modified the content of the web page.

Doing a suspend/active forces the CSS to recompute the hash, so everything is OK until the page changes again.

If you don't want the CSS to verify the content of the page, use method head.

Regards,

Gilles.


d.parks
Level 1

I suspect that the content of your HTML page has been changing.

With the keepalive method set to "get", the CSS does a hash on the page at the time of the first keepalive and stores that value. On the subsequent checks, the current content of the page is compared against that first hash. If they are different, the service is placed in a down state. Keepalive attempts continue, but the service will not be seen as alive unless the original content is restored.

Suspending and activating the service refreshes the hash value to the current contents of the page, which is why it goes "alive" when you do this.

If this is not the behavior you want, try switching to a keepalive method of "head", which just checks for a "200 OK" response and does not hash the content.
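
A minimal sketch of that change, assuming the service definition from the original question with only the keepalive method line altered (`<Address>` is the placeholder from the original post, not a real value):

```
service server1-80
  ip address <Address>
  port 80
  keepalive type http
  keepalive method head
  active
```

With method head the CSS only looks at the response status, so routine edits to the page should no longer take the service down. The trade-off is that the keepalive will no longer notice if the page content itself is wrong.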


5 Replies

Oops, sorry for the "twin" response. You must be quicker on the keys than I this morning.

No problem - I'm giving you a 5 anyway for the good answer :-)

Gilles.

Two of you giving me the same answer is not a problem - thanks for the help guys.

Paul.
