Action callback timeout - killing the daemon

neetimit
Cisco Employee

Hi NSO experts,

Any ideas on this problem:

We have an action callback whose timeout is set with the dp.action_set_timeout function. The action fetches SSH keys. We have noticed that if one instance of this action times out, all other running instances time out as well. How do we avoid this? We don't want the other instances to time out.

Thanks,

Neetika

2 Replies

tohagber
Cisco Employee

I'm afraid there is no way that I'm aware of to keep other running instances of the action callback from being affected by a socket timeout. Underneath, there is one action daemon registered to handle the action callbacks. If there is a timeout on a socket, whether it is the control socket (action init() callback) or the worker socket (action() callback), NSO closes its side of the socket towards this action daemon, which affects all running instances. The action daemon then needs to re-register its callbacks with NSO.

Basically, you need to avoid hitting the timeout in cases where you know the execution can take longer. E.g., if you know you are interacting with a slow device (or whatever), you can extend the timeout for the current callback invocation using actionSetTimeout() (Java) or action_set_timeout() (Python).
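To illustrate the idea of extending the timeout before a known slow step, here is a minimal sketch. The helper name call_with_extended_timeout and the extend_timeout callable are hypothetical and only there to keep the example self-contained; inside a real NSO Python action one would presumably pass something like `lambda secs: _ncs.dp.action_set_timeout(uinfo, secs)`, which is an assumption about how you wire it up.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_extended_timeout(
    extend_timeout: Callable[[int], None],
    slow_work: Callable[[], T],
    timeout_secs: int,
) -> T:
    """Ask for a longer timeout for this invocation, then run the slow work.

    extend_timeout is a hypothetical stand-in for the NSO call that extends
    the timeout of the current action invocation.
    """
    if timeout_secs <= 0:
        raise ValueError("timeout_secs must be positive")
    # Request more time before the slow part starts, not after it is late.
    extend_timeout(timeout_secs)
    return slow_work()
```

The extension only covers the current invocation, so each invocation that talks to a slow device has to request it for itself.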

I'm aware this is an old question but thought to update it anyway as this was recently discussed in a case.

What one can do to prevent the global ncs.conf query-timeout from expiring is to implement your own timer in the action code. This requires a dedicated thread in the action: on each action invocation you create a thread whose job is to run a timer loop (with an interval shorter than the query-timeout). Once that timer expires, the timer loop has the option to either:
1. Extend the timeout using action_set_timeout() (Note: you clearly do not want to keep doing this indefinitely).
2. Make sure the action cleans up its running thread or process, returns an ncs.error, and uses action_seterr() to provide an error string back to the caller.
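The timer-thread approach above could be sketched roughly as follows. This is not NSO code: run_with_watchdog, ActionTimedOut, and the extend_timeout callable are hypothetical names used to keep the example self-contained. In a real action, extend_timeout would presumably wrap `_ncs.dp.action_set_timeout(uinfo, secs)`, and the caller would catch ActionTimedOut, call action_seterr() with an error string, and return an error. The work function is assumed to be cooperative, i.e. it checks the should_stop event and cleans up after itself.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")


class ActionTimedOut(Exception):
    """Raised by the work function when the watchdog tells it to stop."""


def run_with_watchdog(
    work: Callable[[threading.Event], T],
    extend_timeout: Callable[[int], None],
    interval_secs: float,
    extend_secs: int,
    max_extensions: int,
) -> T:
    """Run work() while a timer thread extends the timeout a bounded number of times.

    interval_secs should be shorter than the configured query-timeout.
    """
    done = threading.Event()         # set when work() has returned or raised
    should_stop = threading.Event()  # set when the watchdog gives up

    def watchdog() -> None:
        extensions = 0
        # Event.wait() returns False when the interval elapsed without 'done'.
        while not done.wait(interval_secs):
            if extensions >= max_extensions:
                # Option 2: stop extending and ask the work to abort.
                should_stop.set()
                return
            # Option 1: extend the timeout, but only a bounded number of times.
            extend_timeout(extend_secs)
            extensions += 1

    timer = threading.Thread(target=watchdog, daemon=True)
    timer.start()
    try:
        return work(should_stop)
    finally:
        # Always stop the timer thread, whether work() succeeded or raised.
        done.set()
        timer.join()
```

Bounding the number of extensions is what keeps option 1 from running forever; once the budget is used up, the work is asked to stop so the action can return its own error before NSO closes the socket.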