Beginner

EEM / IP SLA to shutdown lossy high RTT BGP neighbor

Hi,

I'm relatively new to IP SLA and very new to EEM. I'm looking for the most efficient way to monitor the availability (packet loss and latency) of a BGP neighbor from a router, so that the router can shut down the neighbor relationship and fail over to a backup L2L VPN I have configured on an ASA. It's important that I can keep monitoring the BGP neighbor so that, when it becomes stable again, I can re-enable the neighbor relationship. I've put something quick together (below) but am not sure it will do what I want. I'd appreciate any suggestions and feedback.

Thank you!

-Mike

 

 

ip sla 90
 icmp-echo <neighbor_ip> source-ip <source_ip>
 threshold 250
 timeout 500
 frequency 3
ip sla schedule 90 life forever start-time now
ip sla enable reaction-alerts
!
track 90 ip sla 90 reachability
  delay down 3 up 180
!
!
!
event manager applet BGP_NEIGHBOR_DIRTY
 description SHUT DOWN BGP NEIGHBOR IF RTT OVER 250 FOR 3 SECONDS
 event syslog pattern "90 ip sla 90 reachability Up->Down"
 action 1.0  cli command "enable"
 action 1.1  cli command "configure term"
 action 1.2  cli command "router bgp 63320"
 action 1.3  cli command "neighbor <neighbor_ip> shutdown"
 action 1.4  cli command "end"
 
event manager applet BGP_NEIGHBOR_CLEAN
 description ENABLE BGP NEIGHBOR IF RTT UNDER 250 FOR 3 MINUTES
 event syslog pattern "90 ip sla 90 reachability Down->Up"
 action 1.0  cli command "enable"
 action 1.1  cli command "configure term"
 action 1.2  cli command "router bgp 63320"
 action 1.3  cli command "no neighbor <neighbor_ip> shutdown"
 action 1.4  cli command "end"
!

9 REPLIES
Hall of Fame Cisco Employee

This should do what you want.

Beginner

Thanks for the confirmation, Joseph. I have a maintenance window for testing scheduled for Friday night. I'll post the results once testing is complete.

Thanks again,

Mike

Enthusiast

Did this work? I'd like to implement the same solution, but wanted to see whether it worked for you as posted or whether you needed to make any changes.

 

Erik

Beginner

Hi Erik,

 

It did work just as expected. The problem we found after testing is that it only works if the latency or outage is between this device and its next-hop neighbor. If the problem is between this device and some other device beyond the neighbor, the configured IP SLA is never triggered and the backup VPN route never comes into play. If you find a workaround, please post it.

 

Thanks,

-Mike

Beginner

I meant to open this up to Joseph Clarke as well. Looking through this forum, it seems he's the go-to guy for the more complex EEM scripting questions!

-Mike

Enthusiast

Understood on the limitations of this. Even with that limitation, it will solve 95% of my MPLS BGP outages and bring the reroute times down to a level where the end users won't even notice.

Beginner

Great. Hope it helps you out.

Hall of Fame Cisco Employee

All you would need to do is change the target IP in the IP SLA collector to some remote device instead of the neighbor IP. You could pick something like 8.8.8.8 (Google's public DNS), for example.
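
As a rough sketch of that change, reusing operation 90 from the original post so the track object and both applets stay as they are: only the icmp-echo target differs. The `no ip sla 90` line is there because a scheduled operation generally has to be removed and re-created before it can be changed. The static host route is an assumption that is not in the original config. Without something like it, the probe may follow the backup path once the neighbor is shut down and no longer test the primary link. 8.8.8.8 is only the example target named above.

```
! Sketch only: re-create operation 90 with a remote target
no ip sla 90
ip sla 90
 icmp-echo 8.8.8.8 source-ip <source_ip>
 threshold 250
 timeout 500
 frequency 3
ip sla schedule 90 life forever start-time now
!
! Assumption (not in the original post): pin the probe target to the
! primary path so it is still tested while the BGP neighbor is shut down
ip route 8.8.8.8 255.255.255.255 <neighbor_ip>
```

Whether a host route like this suits your design depends on what else needs to reach that target, so treat it as a starting point rather than a drop-in.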

Beginner

By choosing a target that is along your desired path, you can certainly make the script more robust. I would use loopback-to-loopback communication as well. This forces the traffic through the router and also catches cases where the peer is alive and sending BGP but not actually passing traffic. You will definitely need some "fudge" factors to account for routers having to process the ICMP packets (any CoPP will badly skew the results you get). I have had experiences where testing to/from a Nexus device gives wildly different results vs. testing through the boxes.
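
A loopback-sourced probe with some slack built in might look roughly like this. `Loopback0`, `<remote_loopback_ip>`, and every timer value are assumptions chosen for illustration; they do not come from the thread and would need tuning against your own baseline RTT.

```
! Sketch only: loopback-to-loopback probe with looser ("fudge") timers
no ip sla 90
ip sla 90
 icmp-echo <remote_loopback_ip> source-interface Loopback0
 ! assumed values: allow for ICMP being rate-limited or slow-pathed
 threshold 400
 timeout 1000
 frequency 5
ip sla schedule 90 life forever start-time now
!
track 90 ip sla 90 reachability
 ! assumed: require roughly three missed probes before declaring down
 delay down 15 up 180
```

One thing worth checking against the object tracking documentation for your release: as far as I know, `reachability` still reports Up when a probe returns over the threshold and only goes Down on a timeout or error, while `track 90 ip sla 90 state` goes Down on over-threshold as well. If you switch to `state`, the syslog patterns in the applets would have to change to match.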

 

HTH
