
Can actions from different EEM applets conflict?

HDBank Network
Level 1

I've configured many EEM applets on my router, and each applet has a lot of CLI actions. I'm afraid that when events occur at almost the same time, their actions could conflict with each other.

Thanks

39 Replies

Joe Clarke
Cisco Employee

Sure, actions can conflict.  By default, up to 32 EEM applets can run at the same time.  This means that you can clog VTY lines preventing some instances from running.  This also means that if different applets configure conflicting commands, you can run into race conditions.  You can work around this by reducing the number of applet threads, or by converting to Tcl, and using mutexes to prevent concurrent execution.

To reduce the number of EEM threads, use the "event manager scheduler applet thread class default number X" configuration command, where X is the number of threads to allow.
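As an illustration only, the fragment below substitutes the value 1 for X, which would make applets in the default class run one at a time. The value is an assumption chosen for the example, not a recommendation for any particular router.

```text
! Example value only: allow a single applet thread in the default class
event manager scheduler applet thread class default number 1
```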

Thanks Joseph

I think converting to Tcl and using mutexes suits my case. Each applet monitors a leased line; when the line goes down, its actions have to reconfigure the next hop and the QoS profile for the traffic, so I can't reduce the number of applet threads.

So I hope you can give me some more detail about this solution (I don't know Tcl at all).

Thank you very much

I'm not sure exactly what you're doing with EEM, but you can use a file on flash as a mutex.  For example, before opening the CLI session in your Tcl policy, create a file on flash:

set fd [open "flash:/lock" w]
close $fd

But before that, check for the existence of this file, then loop while it exists.  Just be sure to delete the file before exiting.

# Wait until no other policy holds the lock.
while { [file exists "flash:/lock"] } {
    after 500
}

# Claim the lock by creating the file.
if { ! [file exists "flash:/lock"] } {
    set fd [open "flash:/lock" w]
    close $fd
} else {
    puts "ERROR: Failed to obtain exclusive lock."
}

# If the CLI session cannot be opened, release the lock before failing.
if { [catch {cli_open} result] } {
    set ei $errorInfo
    file delete -force -- "flash:/lock"
    error $result $ei
}

array set cli $result

...

# Always release the lock before exiting.
catch {cli_close $cli(fd) $cli(tty_id)}
file delete -force -- "flash:/lock"
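
The locking steps above could also be wrapped in small helper procs so that every policy takes and releases the lock the same way. This is only a sketch: acquire_lock, release_lock and with_lock are hypothetical names, not EEM library calls. It uses plain Tcl commands only (file exists, open, after, file delete), so it can be tried in an ordinary tclsh, and it assumes flash: behaves like a normal file system for those commands. The check-then-create sequence is not atomic, so two policies could in principle both pass the check at the same moment.

```tcl
# Hypothetical helpers; the names are illustrative, not part of EEM.

proc acquire_lock { path } {
    # Spin until no other policy holds the lock file.
    while { [file exists $path] } {
        after 500
    }
    # Create an empty file to claim the lock.
    # Not atomic: another policy could create the file between
    # the check above and this open.
    set fd [open $path w]
    close $fd
}

proc release_lock { path } {
    # Safe to call even if the lock file is already gone.
    file delete -force -- $path
}

proc with_lock { path script } {
    acquire_lock $path
    # Run the caller's script and remember whether it failed,
    # so the lock is released on both the success and error paths.
    set status [catch { uplevel 1 $script } result]
    release_lock $path
    if { $status == 1 } {
        error $result
    }
    return $result
}
```

In a policy, the CLI work would go inside the script argument, for example with_lock "flash:/lock" { ... }, so the lock file is removed even when one of the commands raises an error.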

while { [file exists "flash:/lock"] } {
    after 500
}

or

while { ! [file exists "flash:/lock"] } {
    after 500
}

Which of these two loops is correct?

I hope you can write me an example, though; it would be easier for me to read, understand and modify than to start studying Tcl and write it myself.

I attached my EEM configuration, which monitors one leased line based on packet loss rate and packet success rate.

Thank you very much

I've converted your first policy to Tcl to illustrate the mutex locking.  The same lock code would be applicable to the other policies.

::cisco::eem::event_register_timer watchdog time 50

namespace import ::cisco::eem::*
namespace import ::cisco::lib::*

proc cleanup { status result } {
    file delete -force -- "flash:/lock"
    if { $status } {
        error $result
    }

    exit 0
}

while { [file exists "flash:/lock"] } {
    after 500
}

if { ! [file exists "flash:/lock"] } {
    set fd [open "flash:/lock" w]
    close $fd
} else {
    puts "ERROR: Failed to obtain exclusive lock!"
    exit 1
}

if {[catch {array set _ca [register_counter name pk_loss_LINE1]} result]} {
    cleanup 1 $result
}
if {[catch {counter_modify event_id $_ca(event_id) val 0 op set} result]} {
    cleanup 1 $result
} else {
    set _counter_value_remain $result
}
if {[catch {unregister_counter event_id $_ca(event_id) event_spec_id $_ca(event_spec_id)} result]} {
    cleanup 1 $result
}

if {[catch {array set _ca [register_counter name pk_success_LINE1]} result]} {
    cleanup 1 $result
}
if {[catch {counter_modify event_id $_ca(event_id) val 0 op set} result]} {
    cleanup 1 $result
} else {
    set _counter_value_remain $result
}
if {[catch {unregister_counter event_id $_ca(event_id) event_spec_id $_ca(event_spec_id)} result]} {
    cleanup 1 $result
}

cleanup 0 {}

I tried converting my applet LINE1_DOWN to Tcl. I hope you can glance over it.

Here's the applet:

event manager applet LINE1_DOWN 
event counter name pk_loss_LINE1 entry-val 3 entry-op ge exit-val 3 exit-op lt
action 01 syslog priority critical msg "LINE1 is DOWN"
action 02 cli command "enable"
action 03 cli command "conf t"
action 04 cli command "int Tunnel100006"
action 05 cli command "no service-policy output QoS_Metro-Parent"
action 06 cli command "service-policy output QoS_1_line-Parent"
action 07 cli command "exit"
action 08 cli command "route-map ROUTE_PER_PROTOCOL permit 20"
action 09 cli command "no set ip next-hop 10.0.173.25"
action 10 cli command "set ip next-hop 10.0.172.25"
action 11 cli command "exit"
action 12 cli command "event manager applet LINE1_DOWN"
action 13 cli command "no event counter"
action 14 cli command "event manager applet PK_LOSS_LINE1"
action 15 cli command "no event snmp"
action 16 cli command "event manager applet PK_SUCCESS_LINE1"
action 17 cli command "event snmp oid 1.3.6.1.4.1.9.9.42.1.2.9.1.6.2 get-type exact entry-op eq entry-val 2 poll-interval 5"
action 18 cli command "event manager applet LINE1_UP"
action 19 cli command "event counter name pk_success_LINE1 entry-val 9 entry-op ge exit-val 9 exit-op lt"
action 20 counter name pk_loss_LINE1 op set value 0

And here's the Tcl policy, LINE1_Down.tcl:

::cisco::eem::event_register_counter name pk_loss_LINE1 entry-val 3 entry-op ge exit-val 3 exit-op lt

namespace import ::cisco::eem::*
namespace import ::cisco::lib::*

proc cleanup { status result } {
    file delete -force -- "flash:/lock"
    if { $status } {
        error $result
    }

    exit 0
}

while { [file exists "flash:/lock"] } {
    after 500
}

if { ! [file exists "flash:/lock"] } {
    set fd [open "flash:/lock" w]
    close $fd
} else {
    puts "ERROR: Failed to obtain exclusive lock!"
    exit 1
}

action_syslog priority critical msg "LINE1 is DOWN"

if [catch {cli_open} result] {
    cleanup 1 $result
} else {
    array set cli1 $result
}
if [catch {cli_exec $cli1(fd) "enable"} result] {
    cleanup 1 $result
}
if [catch {cli_exec $cli1(fd) "conf t"} result] {
    cleanup 1 $result
}
if [catch {cli_exec $cli1(fd) "int Tunnel100006"} result] {
    cleanup 1 $result
}
if [catch {cli_exec $cli1(fd) "no service-policy output QoS_Metro-Parent"} result] {
    cleanup 1 $result
}
if [catch {cli_exec $cli1(fd) "service-policy output QoS_1_line-Parent"} result] {
    cleanup 1 $result
}
if [catch {cli_exec $cli1(fd) "exit"} result] {
    cleanup 1 $result
}
if [catch {cli_exec $cli1(fd) "route-map ROUTE_PER_PROTOCOL permit 20"} result] {
    cleanup 1 $result
}
if [catch {cli_exec $cli1(fd) "no set ip next-hop 10.0.173.25"} result] {
    cleanup 1 $result
}
if [catch {cli_exec $cli1(fd) "set ip next-hop 10.0.172.25"} result] {
    cleanup 1 $result
}
if {[catch {array set _ca [register_counter name pk_loss_LINE1]} result]} {
    cleanup 1 $result
}
if {[catch {counter_modify event_id $_ca(event_id) val 0 op set} result]} {
    cleanup 1 $result
} else {
    set _counter_value_remain $result
}
if {[catch {unregister_counter event_id $_ca(event_id) event_spec_id $_ca(event_spec_id)} result]} {
    cleanup 1 $result
}
if [catch {cli_exec $cli1(fd) "event manager policy pk_success_LINE1.tcl"} result] {
    cleanup 1 $result
}
if [catch {cli_exec $cli1(fd) "event manager policy LINE1_Up.tcl"} result] {
    cleanup 1 $result
}
if [catch {cli_exec $cli1(fd) "no event manager policy pk_loss_LINE1.tcl"} result] {
    cleanup 1 $result
}
if [catch {cli_exec $cli1(fd) "no event manager policy LINE1_Down.tcl"} result] {
    cleanup 1 $result
}
catch {cli_close $cli1(fd) $cli1(tty_id)}
cleanup 0 {}

I'm afraid the command "no event manager policy LINE1_Down.tcl" will cause an error because it is issued from within LINE1_Down.tcl itself.

Thank you very much

Your ED registration line needs to be:

::cisco::eem::event_register_counter name pk_loss_LINE1 entry_val 3 entry_op ge exit_val 3 exit_op lt
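
As a side note, the repeated catch/cli_exec blocks in the policy could be collapsed into a loop over a command list. The sketch below is one possible shape, not a tested EEM policy: run_cli_commands is a hypothetical helper name, and the stand-in cli_exec is defined only when no real one exists, so the code also runs in a plain tclsh. On the router, the cli_exec already imported by the policy would be used.

```tcl
# Stand-in so the sketch runs outside EEM; skipped when a real
# cli_exec is already available in the policy.
if { [llength [info commands cli_exec]] == 0 } {
    proc cli_exec { fd cmd } {
        return "stub: $cmd"
    }
}

# Hypothetical helper: run each command in order and stop at the
# first failure. Returns 0 on success; on failure returns 1 and
# stores a message in the caller's variable named by errVar.
proc run_cli_commands { fd commands errVar } {
    upvar 1 $errVar err
    foreach cmd $commands {
        if { [catch { cli_exec $fd $cmd } result] } {
            set err "\"$cmd\" failed: $result"
            return 1
        }
    }
    return 0
}

# The CLI commands from the LINE1_Down policy, in the same order.
set commands [list \
    "enable" \
    "conf t" \
    "int Tunnel100006" \
    "no service-policy output QoS_Metro-Parent" \
    "service-policy output QoS_1_line-Parent" \
    "exit" \
    "route-map ROUTE_PER_PROTOCOL permit 20" \
    "no set ip next-hop 10.0.173.25" \
    "set ip next-hop 10.0.172.25" \
]

# In the policy, after cli_open has filled the cli1 array:
#   if { [run_cli_commands $cli1(fd) $commands msg] } {
#       cleanup 1 $msg
#   }
```

Keeping the commands in one list makes it easier to reuse the same loop in the other policies, which differ mainly in the commands they send.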

I tested it on my router, but nothing happened when the leased line went down (the Tcl policy didn't run any CLI actions). I checked flash: and the lock file was always there.

I saw this in the debug output:

*Dec 30 06:29:55.999: fh_tcl_esi_open: fd=1
*Dec 30 06:29:55.999: fh_tcl_esi_open: fd=2
*Dec 30 06:29:55.999: fh_tcl_get_mode: mode = 1, StartupScript = system:/lib/tcl/base.tcl, RealScript = system:/lib/tcl/eem_scripts_registered/pk_loss_FPT_CongHoa.tcl
*Dec 30 06:29:56.027: fh_register_evreg_cmds: tctx=63D02A38, dummy=1
*Dec 30 06:29:56.035: fh_tcl_compile_policy: evaluating policy: startup_scriptname=system:/lib/tcl/base.tcl, real_scriptname=system:/lib/tcl/eem_scripts_registered/pk_loss_FPT_CongHoa.tcl
*Dec 30 06:29:56.039: fh_tcl_slave_interp_init: interp=64188598, tctx=63D02A38, fh_mode=1, real=system:/lib/tcl/eem_scripts_registered/pk_loss_FPT_CongHoa.tcl, curr=
*Dec 30 06:29:56.051: fh_register_evreg_cmds: tctx=63D02A38, dummy=1
*Dec 30 06:29:56.187: fh_tcl_esi_close: fd=4
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl: Process Forced Exit
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl:     while executing
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl: "after 500"
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl:     invoked from within
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl: "$slave eval $Contents"
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl:     (procedure "eval_script" line 7)
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl:     invoked from within
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl: "eval_script slave $scriptname"
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl:     invoked from within
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl: "if {$security_level == 1} {       #untrusted script
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl:      interp create -safe slave
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl:      interp share {} stdout slave
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl:      interp share {} stderr slave..."
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl:     (file "system:/lib/tcl/base.tcl" line 50)
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl: Tcl policy execute failed: Process Forced Exit
*Dec 30 06:29:56.187: %HA_EM-6-LOG: timer50s.tcl: Tcl policy execute failed: Process Forced Exit
*Dec 30 06:29:56.187: fh_tcl_esi_close: fd=5
*Dec 30 06:29:56.191: fh_tcl_assoc_data_delproc: freeing tctx=6603BA68
*Dec 30 06:29:56.451: [fh_dummy_cmd]

Help me please

You have to make sure that when nothing is running, the lock file doesn't exist.  If you had this file generated while converting your policies to Tcl, and there was an error such that it never got removed, then all policies will spin until you manually remove the lock file.  The two policies I've seen should be good in terms of deleting the lock file, but make sure all policies that create the lock also delete it.
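
To keep a leftover lock file from stalling every policy, the wait loop could be bounded so that a policy gives up and reports the problem instead of spinning until it is forced to exit (as in the "Process Forced Exit" while executing "after 500" in the debug output above). This is a minimal sketch under assumptions: try_lock is a hypothetical helper name, the 500 ms poll interval is taken from the loop above, and the retry limit is an arbitrary example value.

```tcl
# Hypothetical helper: poll for the lock at most max_tries times.
# Returns 1 if the lock file was created, 0 if the lock stayed held.
proc try_lock { path max_tries } {
    set tries 0
    while { [file exists $path] } {
        if { $tries >= $max_tries } {
            # Lock never became free; let the caller decide what to do.
            return 0
        }
        incr tries
        after 500
    }
    # Lock is free: claim it by creating the file.
    set fd [open $path w]
    close $fd
    return 1
}

# In a policy (example retry limit of 20, roughly 10 seconds):
#   if { ![try_lock "flash:/lock" 20] } {
#       puts "ERROR: Failed to obtain exclusive lock!"
#       exit 1
#   }
```

A policy that gives up this way leaves the stale file in place, so the file still has to be removed manually, but the failure becomes visible in the log rather than silent.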

Sorry, my mistake. My action_syslog command was wrong; it needs to be:

action_syslog priority crit msg "LINE1 is DOWN"


Thank you very much, Joseph. I wish you a happy new year.

And one last question: is it OK to run more than 200 Tcl files like this on one 3845 router?

Yes, that's fine, but be aware that unlike applets, only one Tcl policy runs at a time by default.  If you need to have multiple policies run in parallel (like with applets), you will need to increase the thread count with the command:

event manager scheduler script thread class default number NUM

Here, NUM is the number of threads.  For a 3845, you can probably increase that number to around 25.

If I let multiple policies run in parallel with that command, how can the policies execute concurrently, given the lock file?

Thanks

The lock will prevent them from truly executing concurrently.  While the EEM server will spawn them concurrently, they will spin on the lock file until they can move forward (i.e. when they can create the lock themselves).