EEM Average Rate of Change BGP Route Oscillation

Daniel Mckibbin
Level 1

We have a service provider that has had multiple route oscillation issues in the past (withdrawing routes and then re-injecting them), which caused the total BGP route count to fluctuate significantly and affected internet access. I'm working on an EEM applet that will detect significant variations in the number of routes received from a peer and take action on it. For my test, I'm just trying to get it to detect an average rate of change of 2 or more routes over a 4-minute period and generate a syslog message.

I'm trying to test the entry-type rate feature, and it doesn't seem to be working. From my understanding, the configuration below should say:

 

Poll the value of the OID every 60 seconds over a 4-minute period (poll interval of 60 seconds x average factor of 4 = 240 seconds), take the difference between the current poll and the previous one, record it as an absolute value, and average the four differences at the end of the period. The average is then compared to the entry value to determine whether the average rate of change is 2 or greater.

event manager applet EEM-MONITOR-CORP-ROUTE-OSCILLATION

event snmp oid enterprises.9.9.187.1.2.4.1.1.72.43.225.101.1.1 get-type exact entry-op ge entry-val "2" entry-type rate average-factor 4 poll-interval 60

action 1.0 syslog msg "BGP Route Change"

 

Here is the data I'm feeding into it. I have 1 route for the first poll interval, add 2 routes on the next poll interval, remove 2 on the one after that, and so on.

 

 

 

Difference between value at start of EEM “1” and value at 60 seconds “3” = 2

Difference between value at 120 seconds “1” and value at 60 seconds “3” = 2

Difference between value at 180 seconds “3” and value at 120 seconds “1” = 2

Difference between value at 240 seconds “1” and value at 180 seconds “3” = 2

 

 

(2 + 2 + 2 + 2) / 4 = 8 / 4 = 2, an average of 2, which meets the requirement of 2 or more.
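
The expected arithmetic can be checked with a short Python sketch. It only reproduces the calculation described above (the mean of the absolute differences between consecutive polls). It makes no claim about how EEM computes a rate internally, and the names `average_abs_change`, `samples`, and `threshold` are illustrative, not part of any Cisco tooling.

```python
def average_abs_change(samples):
    """Mean of the absolute differences between consecutive samples."""
    deltas = [abs(b - a) for a, b in zip(samples, samples[1:])]
    return sum(deltas) / len(deltas)

# Route counts at 0, 60, 120, 180 and 240 seconds, as in the test above.
samples = [1, 3, 1, 3, 1]
threshold = 2  # mirrors entry-val "2" in the applet

avg = average_abs_change(samples)
print(avg, avg >= threshold)  # 2.0 True
```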

 

In the debugs here is what I interpret the values to be:

 

Curr = Number of the current run? (If so, it ran through 7 times, including the initial storing of the OID value, before resetting to 0. I flipped between 1 and 3 routes in case this was how it worked.)

Max = Number of runs

OPN1 = Current value of the OID

DIFF = Difference between the current and last OID value

Op2 = I believe this is the comparison value from the configuration (entry-val 2)

Op1 = I would expect this to be the average of all the runs, but it always shows 0

 

I think it's failing because op1 is compared against op2, and op1 is never greater than or equal to op2. I'm not sure why this is happening.

 

R3(config)#

*Apr 25 18:44:15.086: EEM: server creates event group: api_conn=1, grpid= 25 corr_id=5890, timewin=1

*Apr 25 18:44:15.090: EEM: server registers event 0 with esid=32

*Apr 25 18:44:15.090: fh_reg_send_msg_to_fd server sending FH_MSG_EVENT_REGISTER message to FD:snmp xos_ipc_sync_send to fdc->eph = 1179665)

*Apr 25 18:44:15.094: fh_fd_snmp_oid_val_fetch: storing OID value

*Apr 25 18:44:15.098: fh_fd_snmp_start_poll_timer: start_t=60000

*Apr 25 18:44:15.098: fh_fd_snmp_event_register: re=0x6ADAD428, sid=32

*Apr 25 18:44:15.102: fh_reg_send_msg_to_fd server sent FH_MSG_EVENT_REGISTER message to FD:snmp event passed to fdc->eph = 1179665 status 0

*Apr 25 18:44:15.106: EEM: server registers multi event with eid=25

*Apr 25 18:44:15.110: fh_set_applet_process_exit_submode: Called event register: rc = 0, eid = 25

 

 

*Apr 25 18:45:15.098: fh_fd_snmp_process_async

*Apr 25 18:45:15.098: fh_fd_snmp_process_poll_timer: re=0x6ADAD428, timer_type=POLL

*Apr 25 18:45:15.102: fh_fd_snmp_oid_val_fetch: storing OID value

*Apr 25 18:45:15.102: snmp_value_uint_store:can_wrap=0, curr=1 max=4, opn1=3 diff=2

*Apr 25 18:45:15.106: snmp_value_compare:inuse=1 poll_secs=60, uopn1=0

*Apr 25 18:45:15.106: snmp_value_uint_compare:op1=0 op2=2 ret=FALSE

*Apr 25 18:45:15.106: snmp_entry_value_check:Returning FALSE

*Apr 25 18:45:15.110: fh_fd_snmp_start_poll_timer: start_t=60000

R3(config)#

*Apr 25 18:45:32.162: ND Update CDP Notification Event for R4 on Fa0/0

 

 

 

*Apr 25 18:46:15.110: fh_fd_snmp_process_poll_timer: re=0x6ADAD428, timer_type=POLL

*Apr 25 18:46:15.114: fh_fd_snmp_oid_val_fetch: storing OID value

*Apr 25 18:46:15.114: snmp_value_uint_store:can_wrap=1, curr=1 max=4, opn1=1 diff=2

*Apr 25 18:46:15.118: snmp_value_compare:inuse=1 poll_secs=60, uopn1=0

*Apr 25 18:46:15.118: snmp_value_uint_compare:op1=0 op2=2 ret=FALSE

*Apr 25 18:46:15.118: snmp_entry_value_check:Returning FALSE

*Apr 25 18:46:15.122: fh_fd_snmp_start_poll_timer: start_t=60000

 

 

*Apr 25 18:47:15.122: fh_fd_snmp_process_poll_timer: re=0x6ADAD428, timer_type=POLL

*Apr 25 18:47:15.126: fh_fd_snmp_oid_val_fetch: storing OID value

*Apr 25 18:47:15.126: snmp_value_uint_store:can_wrap=0, curr=2 max=4, opn1=3 diff=2

*Apr 25 18:47:15.130: snmp_value_compare:inuse=2 poll_secs=60, uopn1=0

*Apr 25 18:47:15.130: snmp_value_uint_compare:op1=0 op2=2 ret=FALSE

*Apr 25 18:47:15.130: snmp_entry_value_check:Returning FALSE

*Apr 25 18:47:15.134: fh_fd_snmp_start_poll_timer: start_t=60000

 

 

 

*Apr 25 18:48:15.134: fh_fd_snmp_process_poll_timer: re=0x6ADAD428, timer_type=POLL

*Apr 25 18:48:15.138: fh_fd_snmp_oid_val_fetch: storing OID value

*Apr 25 18:48:15.138: snmp_value_uint_store:can_wrap=1, curr=2 max=4, opn1=1 diff=2

*Apr 25 18:48:15.142: snmp_value_compare:inuse=2 poll_secs=60, uopn1=0

*Apr 25 18:48:15.142: snmp_value_uint_compare:op1=0 op2=2 ret=FALSE

*Apr 25 18:48:15.142: snmp_entry_value_check:Returning FALSE

*Apr 25 18:48:15.146: fh_fd_snmp_start_poll_timer: start_t=60000

 

 

 

*Apr 25 18:49:15.146: fh_fd_snmp_process_async

*Apr 25 18:49:15.146: fh_fd_snmp_process_poll_timer: re=0x6ADAD428, timer_type=POLL

*Apr 25 18:49:15.150: fh_fd_snmp_oid_val_fetch: storing OID value

*Apr 25 18:49:15.150: snmp_value_uint_store:can_wrap=0, curr=3 max=4, opn1=3 diff=2

*Apr 25 18:49:15.154: snmp_value_compare:inuse=3 poll_secs=60, uopn1=0

*Apr 25 18:49:15.154: snmp_value_uint_compare:op1=0 op2=2 ret=FALSE

*Apr 25 18:49:15.154: snmp_entry_value_check:Returning FALSE

*Apr 25 18:49:15.158: fh_fd_snmp_start_poll_timer: start_t=60000

 

 

 

, timer_type=POLL

*Apr 25 18:50:15.162: fh_fd_snmp_oid_val_fetch: storing OID value

*Apr 25 18:50:15.162: snmp_value_uint_store:can_wrap=1, curr=3 max=4, opn1=1 diff=2

*Apr 25 18:50:15.166: snmp_value_compare:inuse=3 poll_secs=60, uopn1=0

*Apr 25 18:50:15.166: snmp_value_uint_compare:op1=0 op2=2 ret=FALSE

*Apr 25 18:50:15.166: snmp_entry_value_check:Returning FALSE

*Apr 25 18:50:15.170: fh_fd_snmp_start_poll_timer: start_t=60000

R3(config)#

*Apr 25 18:50:55.978: ND Update CDP Notification Event for R4 on Fa0/0

*Apr 25 18:50:55.982: fh_fd_nd_event_match: num_matches = 0
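
To compare fields such as `curr`, `opn1`, `diff`, `op1`, and `op2` across polls, the debug lines above can be broken into key/value pairs. The following is a minimal Python sketch, assuming the debug output has been saved as text. The function name `parse_debug_fields` is hypothetical, and the two sample lines are copied from the 18:45:15 poll.

```python
import re

def parse_debug_fields(line):
    """Return the key=value pairs found in one EEM debug line as a dict."""
    return dict(re.findall(r"(\w+)=(\w+)", line))

# Two lines taken from the 18:45:15 poll in the debug output.
store = "snmp_value_uint_store:can_wrap=0, curr=1 max=4, opn1=3 diff=2"
compare = "snmp_value_uint_compare:op1=0 op2=2 ret=FALSE"

print(parse_debug_fields(store))
print(parse_debug_fields(compare))
```

All values come back as strings, so they would need converting with `int()` before any arithmetic.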

 

Does anyone have any idea why this isn't working or have a better way to go about getting the same information? Am I incorrectly understanding how the configuration is supposed to work?


Joe Clarke
Cisco Employee

Using a rate type won't work as expected here. This object is a gauge, not a counter. Rate is designed to look at the rate of change of a counter (which increases monotonically). In your case, the change can be either positive or negative.

What would work better for you is to use a watchdog timer policy that periodically polls this OID and caches the value using a context (i.e., action context save/retrieve).  Then you can compare an absolute value by checking both positive and negative change.
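
The logic of that approach can be sketched in Python. This is not EEM applet syntax and not a tested policy. It only illustrates the idea of caching the previous reading and flagging a change in either direction. In an actual policy, the watchdog timer would drive each poll and the cached value would be held with the context save/retrieve actions mentioned above. The class name `GaugeChangeDetector`, the threshold of 2, and the sample route counts are assumptions for illustration.

```python
class GaugeChangeDetector:
    """Caches the previous gauge reading and flags large moves in either direction."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.previous = None  # no cached value before the first poll

    def poll(self, value):
        """Feed one reading; return True when |change| >= threshold."""
        last, self.previous = self.previous, value
        if last is None:
            return False  # the first poll only seeds the cache
        return abs(value - last) >= self.threshold

detector = GaugeChangeDetector(threshold=2)
for count in [1, 3, 1, 2]:  # hypothetical route counts, one per timer tick
    if detector.poll(count):
        print("BGP Route Change")
```

Because the comparison uses the absolute difference, a drop from 3 routes to 1 is flagged the same way as a rise from 1 to 3, which is the behavior a gauge needs.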

Ok, thank you for the feedback. I'll do some research on what you mentioned and post back with further questions or when I have a working solution. Does Cisco have an example somewhere of this type of configuration that you know of?

Honestly, this is the first I've heard of this use case.  It's a good one, but I don't know of any existing examples.

Ok, I'll do some testing in the lab. Thanks for sending me in the right direction.
