DFM polling algorithm

Tobias Strandberg · ‎12-30-2011

Hi!

I've searched the community for an answer to my question without finding one, so here goes.

You can set different polling parameters for groups such as:

New interval (The number of seconds between each successive poll)
New timeout (The number of milliseconds allowed for a poll request before it times out)
New retry (The number of times to retry a failed poll request)

Now to my question: if I set the parameters so that the total polling time (timeout × number of retries) is greater than the interval, how does LMS handle that situation?

As I see it, there are two possibilities. Either LMS starts a new poll at every interval, regardless of whether previous polls are still running, or it starts a poll, waits until that poll either succeeds or fails (after timeout × number of retries), and only then starts the next interval period.
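The two possibilities can be compared with a small simulation. This is only a sketch of the two hypothetical scheduling models, not a description of how LMS actually behaves. The function names and the example values (60 s timeout, 5 retries) are made up for illustration, and retries are assumed to run back to back.

```python
def fixed_rate_starts(interval_s, n_polls):
    """Model A: a new poll starts every interval, even if the previous one is still running."""
    return [i * interval_s for i in range(n_polls)]


def fixed_delay_starts(interval_s, poll_duration_s, n_polls):
    """Model B: the next poll starts one interval after the previous poll has finished."""
    starts = []
    t = 0
    for _ in range(n_polls):
        starts.append(t)
        t += poll_duration_s + interval_s
    return starts


# Hypothetical worst case: every attempt times out.
timeout_s = 60
retries = 5
worst_case_s = timeout_s * retries  # 300 s, longer than the 240 s interval

print(fixed_rate_starts(240, 4))                 # [0, 240, 480, 720]
print(fixed_delay_starts(240, worst_case_s, 4))  # [0, 540, 1080, 1620]
```

Under model A, polls against an unreachable device would overlap; under model B, the effective polling period stretches to the interval plus the worst-case poll time.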

Thank you in advance for taking the time to answer this question.

If the answer depends on the version of LMS, then the version in question is Cisco Prime version 4.1.

Sincerely

Tobias Strandberg

Marvin Rhoads · ‎12-30-2011

I can't find a source to cite that answers your exact question.

However, I am wondering why you would want to set your parameters that way, since such a setup would be so far from the defaults (which work for most cases).

Defaults (listed here for LMS 4.1) are generally a 240-second interval, a 700 ms timeout and 3 retries. So that's 240 seconds vs. 2.1 seconds (for the three retries to time out).
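The arithmetic behind that comparison, as a quick sketch. It assumes, as this thread does, that the worst case is simply the timeout multiplied by the number of retries, with no backoff between attempts; the helper name is made up.

```python
def worst_case_poll_ms(timeout_ms, retries):
    """Longest a single poll cycle can take if every attempt times out (no backoff assumed)."""
    return timeout_ms * retries


interval_ms = 240 * 1000
default_ms = worst_case_poll_ms(700, 3)

print(default_ms)                 # 2100 (ms), i.e. 2.1 seconds
print(interval_ms // default_ms)  # 114, so the interval is over 100x longer
```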

You'd have to change the defaults by over two orders of magnitude net (>100x) to get into an edge case such as the one you asked about. Do you have a situation in which you believe that's necessary?

Tobias Strandberg · ‎01-02-2012

Well the situation is like this... I am monitoring several wireless network devices. Some are connected by 3G and others via some other technology.

As you probably already know, wireless connections are situational: sometimes they work and sometimes they don't. What I want to achieve is to reduce the number of false positive alarms from these devices.

The current plan is to increase the timeout to the maximum (60 seconds) and the number of retries to a number high enough to make false positives statistically improbable.
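To put a rough number on "high enough", here is a sketch that assumes each attempt fails independently with the same probability p, so that all n attempts fail with probability p^n. That independence assumption is optimistic for wireless links, where failures tend to come in bursts, and the 30% failure rate and one-in-a-million target below are made-up figures.

```python
def retries_needed(p_fail, target):
    """Smallest n with p_fail ** n <= target, assuming independent attempts."""
    if not 0 <= p_fail < 1:
        raise ValueError("p_fail must be in [0, 1)")
    if target <= 0:
        raise ValueError("target must be positive")
    n = 1
    prob = p_fail  # probability that the first n attempts all fail
    while prob > target:
        prob *= p_fail
        n += 1
    return n


n = retries_needed(0.3, 1e-6)
print(n)       # 12
print(n * 60)  # 720 seconds worst case with a 60 s timeout
```

With these example figures the worst-case poll time is three times the 240-second interval, which is exactly the situation the original question asks about.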

I would still like the devices to get polled quite often for a quick response, for example every 4 minutes (240 seconds).

The optimal situation would be the following:

Poll repeatedly every 4 minutes...

After each successful poll, the timer resets and another poll happens 4 minutes later. When a poll is not successful, no new polling instances are started; instead, the polling instance that failed continues retrying with a timeout of 60 seconds, up to X times.

As soon as one of the retries succeeds, the timer resets and a new polling instance is created 4 minutes later. If, however, the device still fails to respond after all of the retries, an alarm should be raised, and 4 minutes later a new poll checks whether connectivity has been restored.
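The desired behaviour described above can be sketched roughly as follows. It models only the wished-for logic, not what LMS does. The function name is made up, each cycle is described by how many attempts time out before a reply arrives, a reply is assumed to take no time, and max_attempts (3 here, standing in for X) counts every attempt in a cycle.

```python
def run_cycles(failures_per_cycle, interval_s=240, timeout_s=60, max_attempts=3):
    """Return (time_s, event) tuples for a series of poll cycles.

    failures_per_cycle: for each cycle, the number of attempts that time out
    before the device replies; max_attempts or more means it never replies.
    """
    t = 0
    log = []
    for failures in failures_per_cycle:
        log.append((t, "poll"))
        failed = min(failures, max_attempts)
        t += failed * timeout_s  # each failed attempt costs one full timeout
        if failed == max_attempts:
            log.append((t, "alarm"))
        t += interval_s  # next poll starts one interval after this cycle ends
    return log


# One clean poll, one poll that succeeds on the third attempt, one that fails completely.
print(run_cycles([0, 2, 3]))
# [(0, 'poll'), (240, 'poll'), (600, 'poll'), (780, 'alarm')]
```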

Hopefully the picture is a little clearer now... This is why I'm interested in understanding whether new polling instances get started independently or not.

Tobias Strandberg · ‎01-09-2012

I still would like to know the answer to my question. Any help is appreciated!

Tobias Strandberg · ‎01-17-2012

Now this thread has been viewed 172 times, which is nice, but still no answer to my question.

Thank you for taking the time to read through it.

Regards,

Tobias