Probe skipping

prowler130
Level 4

Hello,

I am running into a rather interesting issue and I was curious whether anyone has seen it before or has any insight into what the problem could be.  On one of my ACE 4710s (running software A5(1.2)), I am running a fairly large number of Layer 7 probes (71) across both port 80 and port 443.  At seemingly random points in the day, the system reports that probes are being skipped due to an internal error.  I have seen this before when the system runs out of sockets for the probes, but I am not seeing any indication that this is the case here.

Here is an example probe config:

probe https CHECK-SOME-SITE
  port 443
  interval 10
  faildetect 2
  passdetect interval 30
  receive 5
  ssl version all
  request method get url /some/url
  header Host header-value "www.somesite.com"
  expect regex "SOMEREGEX"

Here is the relevant output from 'show probe detail':

     real      : some-rserver
                          x.x.x.x  443 PROBE   3093610 1749563 1344047 SUCCESS
   Socket state        : CLOSED
   No. Passed states   : 49         No. Failed states : 49
   No. Probes skipped  : 479         Last status code  : 200
   No. Out of Sockets  : 0         No. Internal error: 0
   Last disconnect err :  -
   Last probe time     : Tue Mar  4 16:45:03 2014
   Last fail time      : Fri Feb 28 13:30:37 2014
   Last active time    : Mon Mar  3 22:08:53 2014

Here are the log messages that are popping up:

Mar  4 2014 14:36:41 : %ACE-3-251014: Could not probe server x.x.x.x on port 443 for 4 consecutive tries - Internal error

The log messages appear for all rservers being probed for about 30 seconds, then they go away until the next event.  Since the probes are skipped rather than failed, I do not believe this is actually causing failures at the moment.  I have read that the ACE platform can only run 200 concurrent scripted probes, but I am at a loss as to how to check whether that is what I am running into here.  The really confusing part is that neither the "Internal error" counter nor the "Out of Sockets" counter is incrementing.
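One way to pin down when these events happen is to group the %ACE-3-251014 messages into bursts and look at how long each burst lasts and how many rservers it touches. The following is a minimal Python sketch of that idea, assuming the syslog lines have exactly the format shown above. The function name `find_bursts`, the 60-second gap threshold, and the sample addresses (taken from the 192.0.2.0/24 documentation range) are all made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Matches lines in the format shown in the post, for message 251014 only.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2}\s+\d{4}\s+\d{2}:\d{2}:\d{2})\s*:\s*%ACE-3-251014:"
    r" Could not probe server (?P<server>\S+) on port (?P<port>\d+)"
)

def find_bursts(lines, max_gap_s=60):
    """Group 251014 messages into bursts separated by more than max_gap_s."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore unrelated log lines
        # Collapse the double space that pads single-digit days.
        stamp = " ".join(m["ts"].split())
        ts = datetime.strptime(stamp, "%b %d %Y %H:%M:%S")
        events.append((ts, m["server"], m["port"]))
    events.sort(key=lambda e: e[0])

    bursts = []
    for ts, server, port in events:
        if bursts and ts - bursts[-1]["end"] <= timedelta(seconds=max_gap_s):
            burst = bursts[-1]
        else:
            burst = {"start": ts, "end": ts, "targets": set()}
            bursts.append(burst)
        burst["end"] = ts
        burst["targets"].add((server, port))
    return bursts

# Hypothetical sample data: the addresses and the later timestamps are invented.
sample = [
    "Mar  4 2014 14:36:41 : %ACE-3-251014: Could not probe server 192.0.2.10 on port 443 for 4 consecutive tries - Internal error",
    "Mar  4 2014 14:36:55 : %ACE-3-251014: Could not probe server 192.0.2.11 on port 80 for 4 consecutive tries - Internal error",
    "Mar  4 2014 16:02:10 : %ACE-3-251014: Could not probe server 192.0.2.10 on port 443 for 4 consecutive tries - Internal error",
]

for b in find_bursts(sample):
    duration = (b["end"] - b["start"]).total_seconds()
    print(b["start"], f"{duration:.0f}s", len(b["targets"]), "targets")
```

If every burst hits all rservers at once and lasts roughly the same time, that points at something global on the ACE rather than at any one server.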

Any help or insight would be very appreciated.  Thanks in advance.

-Ed

2 Replies

Kanwaljeet Singh
Cisco Employee

Hi Ed,

Two things:

Number of skipped probes. A skipped probe occurs when the ACE does not send out a probe because the scheduled interval to send a probe is shorter than it takes to complete the execution of the probe; the send interval is shorter than the open timeout or receive timeout interval.

In your case the interval is 10, which is a little aggressive, although it is still longer than the receive timeout of 5. But if the probe takes longer than 10 seconds to execute, you may see probes getting skipped. Increasing the interval by another 10 seconds (to 20) is a useful test to see whether it mitigates the issue.
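The timer comparison above can be sketched in a few lines of Python. This only illustrates the rule and is not something that runs on the ACE. The function name `skip_risks`, the 1-second open timeout (the posted config does not set one), and the 12-second execution time are assumptions made up for the example.

```python
def skip_risks(interval_s, open_timeout_s, receive_timeout_s, observed_exec_s=None):
    """Return the reasons a probe with these timers could be skipped."""
    reasons = []
    if interval_s < open_timeout_s:
        reasons.append("interval is shorter than the open timeout")
    if interval_s < receive_timeout_s:
        reasons.append("interval is shorter than the receive timeout")
    # The configured timers can look fine while the real execution time
    # (connect + SSL handshake + request + regex match) still overruns.
    if observed_exec_s is not None and observed_exec_s > interval_s:
        reasons.append("observed execution time exceeds the interval")
    return reasons

# interval 10 and receive 5 come from the posted config; open timeout is assumed.
print(skip_risks(10, 1, 5))                      # []
print(skip_risks(10, 1, 5, observed_exec_s=12))  # one reason: execution overrun
print(skip_risks(20, 1, 5, observed_exec_s=12))  # [] after raising the interval
```

The second call shows the case described here: the configured timers pass the check, but a slow probe run still overlaps the next scheduled one.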

If you have UDP probes then you need to check this as well:

For UDP probes or UDP-based probes, we recommend a time interval value of 30 seconds. The reason for this recommendation is that the ACE data plane has a management connection limit of 100,000. Management connections are used by all probes as well as Telnet, SSH, SNMP, and other management applications. In addition, the ACE has a default timeout for UDP connections of 120 (ACE module) or 15 (ACE appliance) seconds. This means that the ACE does not remove the UDP connections even though the UDP probe has been closed for two minutes. Using a time interval less than 30 seconds may limit the number of UDP probes that can be configured to run without exceeding the management connection limit, which may result in skipped probes.
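As a rough back-of-the-envelope check of that recommendation, the arithmetic can be sketched in Python. This is a simplified model, not the ACE's actual accounting: it assumes each probe run opens one UDP connection that lingers for the full UDP timeout, and it ignores the Telnet, SSH, SNMP, and other management connections that share the same limit. The function names and the probe counts are hypothetical.

```python
import math

MGMT_CONN_LIMIT = 100_000  # management connection limit quoted above

def lingering_udp_connections(probe_instances, interval_s, udp_timeout_s):
    """Estimate connections held open at once by UDP probes (simplified model)."""
    # A new run starts every interval_s and its connection lingers for
    # udp_timeout_s, so about ceil(timeout / interval) of them overlap.
    per_instance = max(1, math.ceil(udp_timeout_s / interval_s))
    return probe_instances * per_instance

def max_probe_instances(interval_s, udp_timeout_s):
    """Estimate how many probe instances fit under the limit in this model."""
    per_instance = max(1, math.ceil(udp_timeout_s / interval_s))
    return MGMT_CONN_LIMIT // per_instance

# 120 s timeout (ACE module) with the recommended 30 s interval
print(lingering_udp_connections(1000, 30, 120))  # 4000
# Same timeout with a 10 s interval
print(lingering_udp_connections(1000, 10, 120))  # 12000
print(max_probe_instances(30, 120))              # 25000
print(max_probe_instances(10, 120))              # 8333
```

In this model, dropping the interval from 30 to 10 seconds triples the number of lingering connections per probe instance, which is why shorter intervals reduce how many UDP probes can run before the limit is reached.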

Are you running any scripted probes?

It could be a bug as well, but I would suggest increasing the probe interval and seeing how it goes.

You can also try debug hm errors/events/all etc. and see if you get any detailed output there that can be sent to TAC for further investigation.

Regards,

Kanwal



Hello Kanwal,

Thanks very much for the response.  I did not take into account the time the probe actually takes to execute.  I will scale the probes back a bit and see if that alleviates the issue.

-Ed
