09-20-2013 10:50 AM
Hi Experts,
I need help with an EEM TCL script for the CRS platform that generates a SYSLOG message once the CPU reaches a threshold value and then stays over that threshold for 15 minutes. I've already tried several things. The last TCL script I tested generates the SYSLOG message as soon as the CPU reaches the threshold, but I can't find a way to make it wait until the CPU has been over the threshold for 15 minutes before generating the message.
My current script looks like this:
::cisco::eem::event_register_wdsysmon timewin 900 sub1 cpu_tot op ge val 70
namespace import ::cisco::eem::*
namespace import ::cisco::lib::*
array set event_details [event_reqinfo]
action_syslog msg "sub1 is $event_details(sub1)"
action_syslog msg "High CPU threshold value over 70%"
puts ok
I've tried using the 'period' option for the 'cpu_tot' variable, but the TCL script wasn't recognized and couldn't be registered. I'm using the 'timewin' option here, but it seems to be wrong: according to the documentation, it is the time window within which multiple sub-events have to occur in order for the script to execute.
timewin | (Optional) Time window within which all of the subevents have to occur in order for an event to be generated and is specified in SSSSSSSSSS[.MMM] format. SSSSSSSSSS format must be an integer representing seconds between 0 and 4294967295 inclusive. MMM format must be an integer representing milliseconds between 0 and 999). |
Also, I believe the 'period' option wouldn't have worked, because I understand that it refers to the time period over which the script monitors the CPU:
op | (Optional) Comparison operator that is used to compare the collected total system CPU usage sample percentage with the specified percentage value. If true, an event is raised. |
val | (Optional) Percentage value in which the average CPU usage during the sample period is compared. |
period | (Optional) Time period for averaging the collection of samples and is specified in SSSSSSSSSS[.MMM] format. SSSSSSSSSS format must be an integer representing seconds between 0 and 4294967295, inclusive. MMM format must be an integer representing milliseconds between 0 and 999. If this argument is not specified, the most recent sample is used. |
As I said, I couldn't try this because the script returned an error when I tried to register it using the following line:
::cisco::eem::event_register_wdsysmon sub1 cpu_tot op ge val 70 period 900
This is the error message that appeared:
RP/0/RP0/CPU0:CRS(config)#event manager policy test.tcl username cisco
RP/0/RP0/CPU0:CRS(config)#commit
Thu Aug 29 12:35:43.569 CDT
% Failed to commit one or more configuration items during a pseudo-atomic operation. All changes made have been reverted. Please issue 'show configuration failed' from this session to view the errors
RP/0/RP0/CPU0:CRS(config)#sh conf fail
Thu Aug 29 12:35:52.427 CDT
!! SEMANTIC ERRORS: This configuration was rejected by
!! the system due to semantic errors. The individual
!! errors with each failed configuration command can be
!! found below.
event manager policy test.tcl username cisco persist-time 3600
!!% Embedded Event Manager configuration: failed to retrieve intermediate registration result for policy test.tcl
end
Anyway, to make this work I understand that I need nested TCL scripts that do the following:
I don't know how I can accomplish this, so if anyone can help me with this or show me another way to do it, I would really appreciate it.
Thanks in advance for all your help!
09-21-2013 04:06 PM
Neither option is likely to do what you want. The timewin is for correlating multiple events, and period is the interval over which CPU samples are averaged. What you want is to create a timer when the CPU is first detected as being high, count down 15 minutes, then alert you. You can do this with a nested EEM policy. For example, you can add the following to your existing policy:
proc get_pol_dir { fd } {
    set res {}
    set output [cli_exec $fd "show event manager directory user policy"]
    set output [string trim $output]
    regsub -all "\r\n" $output "\n" result
    set lines [split $result "\n"]
    foreach line $lines {
        if { $line == "" } {
            continue
        }
        if { ! [regexp {\s} $line] && ! [regexp {#$} $line] } {
            set res $line
            break
        }
    }
    if { $res == {} } {
        return -code error "The user policy directory has not been configured"
    }
    return $res
}
if { [catch {cli_open} result] } {
    error $result $errorInfo
}
array set cli $result
set output [cli_exec $cli(fd) "show event manager policy registered | inc tm_alert_high_cpu.tcl"]
if { [regexp {tm_alert_high_cpu.tcl} $output] } {
    exit 0
}
set poldir [get_pol_dir $cli(fd)]
set polname "${poldir}/tm_alert_high_cpu.tcl"
set fd [open $polname "w"]
puts $fd "::cisco::eem::event_register_timer countdown time 900"
puts $fd "namespace import ::cisco::eem::*"
puts $fd "namespace import ::cisco::lib::*"
puts $fd "action_syslog msg \"CPU has been over 70% for 15 minutes\""
close $fd
cli_exec $cli(fd) "config t"
cli_exec $cli(fd) "event manager policy tm_alert_high_cpu.tcl username eem"
cli_exec $cli(fd) "commit"
cli_exec $cli(fd) "end"
catch {cli_close $cli(fd) $cli(tty_id)}
###
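To check exactly what ends up in the generated policy file, the same puts calls can be run in a plain tclsh session against a scratch file. This is only a sketch for inspection: demo_path is a hypothetical location chosen here, not the real user policy directory, and nothing in it talks to EEM.

```tcl
# Plain tclsh demo (no EEM needed): write the policy text the same way the
# snippet above does, then read it back to inspect it.
# demo_path is a hypothetical scratch file, not the router's policy directory.
set demo_path [file join [pwd] tm_alert_high_cpu_demo.tcl]

set fd [open $demo_path "w"]
puts $fd "::cisco::eem::event_register_timer countdown time 900"
puts $fd "namespace import ::cisco::eem::*"
puts $fd "namespace import ::cisco::lib::*"
puts $fd "action_syslog msg \"CPU has been over 70% for 15 minutes\""
close $fd

# Read the file back and split it into lines.
set fd [open $demo_path "r"]
set policy_lines [split [string trim [read $fd]] "\n"]
close $fd
file delete $demo_path

# Print the four lines of the generated countdown policy.
foreach line $policy_lines {
    puts $line
}
```

The last line read back is the action_syslog call with its inner quotes intact, which is what the escaped \" in the puts string is there to produce.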
Additionally, you'll want another permanently configured policy that checks for a low CPU threshold. Something like:
::cisco::eem::event_register_wdsysmon sub1 cpu_tot op le val 10
namespace import ::cisco::eem::*
namespace import ::cisco::lib::*
if { [catch {cli_open} result] } {
    error $result $errorInfo
}
array set cli $result
cli_exec $cli(fd) "config t"
cli_exec $cli(fd) "no event manager policy tm_alert_high_cpu.tcl"
cli_exec $cli(fd) "commit"
cli_exec $cli(fd) "end"
catch {cli_close $cli(fd) $cli(tty_id)}
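To see how the three pieces interact over time (the high-CPU policy, the generated countdown policy, and the low-CPU policy), here is a rough stand-alone simulation that runs in plain tclsh. It is only a sketch of the logic, not EEM code: the proc name simulate, its parameters, and the sample data are all hypothetical, and it assumes CPU samples arrive as {time cpu} pairs.

```tcl
# Stand-alone simulation (plain tclsh, no EEM). All names are hypothetical.
# samples: list of {time_seconds cpu_percent} pairs in increasing time order.
# Returns the times at which the "CPU has been over 70%" syslog would fire.
proc simulate {samples {high 70} {low 10} {hold 900}} {
    set registered 0   ;# is the countdown policy currently registered?
    set expires -1     ;# expiry time of the pending countdown, -1 if none
    set alerts {}
    foreach sample $samples {
        foreach {t cpu} $sample break
        # A countdown that expired at or before this sample fires first.
        if {$expires >= 0 && $expires <= $t} {
            lappend alerts $expires
            set expires -1
        }
        if {$cpu >= $high && !$registered} {
            # High-CPU policy: register the countdown policy only once.
            set registered 1
            set expires [expr {$t + $hold}]
        } elseif {$cpu <= $low && $registered} {
            # Low-CPU policy: unregister the countdown policy, cancelling it.
            set registered 0
            set expires -1
        }
    }
    return $alerts
}

# CPU stays high for the whole 15 minutes: one alert at t=900.
puts [simulate {{0 80} {300 85} {600 90} {900 75}}]
# CPU falls to 5% at t=600: the countdown is cancelled, so no alert.
puts [simulate {{0 80} {300 85} {600 5} {900 8}}]
```

One thing the simulation makes visible: with the values as posted, a drop to, say, 50% does not cancel the countdown, because only a sample at or below 10% unregisters the timer policy. If that matters, the low threshold can be raised closer to 70.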