10-06-2016 05:39 AM - edited 03-08-2019 07:42 AM
Hi All,
I have a location with two ISP circuits, two ASAs, and a layer 3 core switch:
ISP1-----5505PRI----3650----5505SEC-----ISP2
Each ASA has two VPN tunnels to two data centers (one in the UK, one in the US). I am using IP SLA with boolean route tracking and delay down/up timers on the layer 3 switch to control when the fail-over from the primary ASA to the secondary ASA occurs. When all is normal, the primary ASA reaches both data centers over its VPN tunnels, and the secondary ASA is dormant. I only want the SLA to cause an ASA fail-over if both tracked objects drop.
ip sla 1
 icmp-echo 109.xxx.xxx.171
 threshold 1000
 timeout 1000
 frequency 3
ip sla schedule 1 life forever start-time now
ip sla 2
 icmp-echo 50.xxx.xx.190
 threshold 1000
 timeout 1000
 frequency 3
ip sla schedule 2 life forever start-time now
track 1 ip sla 1 reachability
track 2 ip sla 2 reachability
track 100 list boolean or
 object 1
 object 2
 delay down 45 up 90
ip route 0.0.0.0 0.0.0.0 192.168.168.2 track 100
ip route 0.0.0.0 0.0.0.0 192.168.168.3 251
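As a rough model of what this configuration is meant to do, the Python sketch below treats track 100 as a boolean OR of tracks 1 and 2 and picks the default route accordingly. The function names are hypothetical and the model ignores the delay timers; the next-hop addresses and the distance of 251 are taken from the configuration above.

```python
# Simplified model of the tracked default route. It only illustrates the
# intended logic; it is not a description of how IOS implements tracking.
TRACKED_NEXT_HOP = "192.168.168.2"   # installed only while track 100 is up
FLOATING_NEXT_HOP = "192.168.168.3"  # floating static route, distance 251


def track_list_or(object_states):
    """A boolean 'or' list is up while at least one member object is up."""
    return any(object_states)


def default_next_hop(track1_up, track2_up):
    """Return the next hop used for 0.0.0.0/0 in this model."""
    if track_list_or([track1_up, track2_up]):
        return TRACKED_NEXT_HOP
    return FLOATING_NEXT_HOP


if __name__ == "__main__":
    for t1 in (True, False):
        for t2 in (True, False):
            print(f"track1={t1} track2={t2} -> {default_next_hop(t1, t2)}")
```

In this model the floating route is used only when both tracked objects are down, which matches the stated goal.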
So basically, neither SLA should go Up->Down unless the responder still can't be reached after 45 seconds (the delay down timer). However, my logs are full of Up->Down events around the clock (albeit mostly for SLA 2). As a test, I ran "term mon" and started a 5,000-packet ping to the IP address used in SLA 2 (50.xxx.xx.190). The results suggest that the ip sla 2 Up->Down transition is being triggered intermittently, ignoring the delay timers.
SW168AsiaPacific-FLC3650A#ping 50.xxx.xx.190 repeat 5000
Type escape sequence to abort.
Sending 5000, 100-byte ICMP Echos to 50.xxx.xx.190, timeout is 2 seconds:
!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!.!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Oct 6 2016 07:48:47.382: %TRACKING-5-STATE: 2 ip sla 2 reachability Up->Down
Oct 6 2016 07:48:52.382: %TRACKING-5-STATE: 2 ip sla 2 reachability Down->Up
!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Success rate is 99 percent (2059/2073), round-trip min/avg/max = 260/278/310 ms
Granted, there is some packet loss, but I expect that: this site is 10,000 miles away from the responder address defined in SLA 2 and is not in a country with superior ISP services. That is why I have the delay down timer set to 45 seconds, the delay up timer set to 90 seconds, and the icmp-echo frequency set to 3 seconds.
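To put numbers on the loss, a small Python helper can summarize a Cisco-style ping string ('!' for a reply, '.' for a timeout) and report the longest run of consecutive losses, which is the figure that matters for a delay down timer. This is only a sketch: the function names are hypothetical, and the short string in the usage section is an illustrative sample, not the capture above. The totals 2073 and 2059 are the ones reported in the ping summary line.

```python
def summarize_ping(marks):
    """Summarize a ping result string: '!' = reply, '.' = timeout."""
    marks = "".join(ch for ch in marks if ch in "!.")  # drop newlines etc.
    lost = marks.count(".")
    longest = run = 0
    for ch in marks:
        run = run + 1 if ch == "." else 0  # length of the current loss streak
        longest = max(longest, run)
    return {"sent": len(marks), "lost": lost, "longest_loss_run": longest}


def loss_from_totals(sent, received):
    """Return (packets lost, loss percentage) from the ping summary totals."""
    lost = sent - received
    return lost, 100.0 * lost / sent


if __name__ == "__main__":
    print(summarize_ping("!!!!.!!!!!!!!.!!!"))  # illustrative sample only
    lost, pct = loss_from_totals(2073, 2059)    # totals from the summary line
    print(f"{lost} lost, {pct:.2f}% loss")
```

With the reported totals this gives 14 lost packets, well under 1 percent loss.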
Are there any known bugs where the delay timers do not work if configured under a boolean track list? Should I remove the delay down timers from the boolean list and put them each under the individual sla 1 and sla 2 track commands, like so and retest?
track 100 list boolean or
no delay down 45 up 90
track 1 ip sla 1 reachability
delay down 45 up 90
track 2 ip sla 2 reachability
delay down 45 up 90
10-06-2016 06:35 AM
Hi,
Although you are not losing 3 pings in a row, the up and down may be happening because you are losing quite a few pings. Can you change the icmp-echo frequency to 4 or 5, set the delay down to 60, and test again?
HTH
10-06-2016 08:01 AM
Hi Reza,
Thanks for replying. I'm not sure that is the problem, however. It is my understanding that once the SLA misses an icmp-echo reply, the delay down timer starts, and the SLA is not declared down until it goes X seconds without a successful reply from the responder. In my case, the check runs every 3 seconds and the delay down timer is 45 seconds. So if the SLA misses a reply from the responder, the 45-second timer starts, and during that period the SLA check continues to run every 3 seconds. If any of those checks receives a reply from the responder, the delay timer is stopped; if we get 45 seconds' worth of failed checks, the SLA is declared Up->Down. That means the SLA needs to miss the reply 15 consecutive times (45 / 3) before its status should change, unless my understanding of the delay timers is incorrect.
I think I am going to try putting the delay down command on the actual track statements and not under the boolean list and retest. I will report back with my findings.
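The timer behavior described in this post can be sketched as a small Python simulation. It is a model of that understanding only, under the assumption that a state change is reported once the raw probe result has stayed different from the reported state for the full delay; the function and parameter names are hypothetical, and real IOS timing will differ in detail.

```python
def simulate_track(probe_results, frequency_s=3, delay_down_s=45, delay_up_s=90):
    """Return (time_s, transition) events for a track object with delay timers.

    probe_results: one boolean per probe, True when a reply was received.
    The object starts in the Up state.
    """
    reported_up = True
    pending_since = None  # time at which the raw state first disagreed
    events = []
    for i, ok in enumerate(probe_results):
        now = i * frequency_s
        if ok == reported_up:
            pending_since = None  # raw state agrees again: cancel the timer
            continue
        if pending_since is None:
            pending_since = now   # first disagreeing probe starts the timer
        delay = delay_up_s if ok else delay_down_s
        if now - pending_since >= delay:
            reported_up = ok
            pending_since = None
            events.append((now, "Down->Up" if ok else "Up->Down"))
    return events


if __name__ == "__main__":
    isolated_loss = [True] * 5 + [False] + [True] * 5
    print(simulate_track(isolated_loss))                  # with the 45/90 delays
    print(simulate_track(isolated_loss, delay_down_s=0,
                         delay_up_s=0))                   # with no delays
    print(simulate_track([True] * 2 + [False] * 20))      # sustained outage
```

In this model an isolated lost probe produces no transition when the 45-second delay applies, but produces an immediate Up->Down followed by Down->Up on an object with no delay configured. A sustained outage is reported only after the failures have spanned the full 45 seconds, that is, 15 probe intervals at a frequency of 3 seconds.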
10-06-2016 10:16 AM
Looks like putting the delay timers on the individual tracked objects themselves and removing them from the boolean track list did the trick.
track 100 list boolean or
no delay down 45 up 90
track 1 ip sla 1 reachability
delay down 45 up 90
track 2 ip sla 2 reachability
delay down 45 up 90