IP SLA Troubleshooting

gmacdonald11
Frequent Visitor

Hi folks.  We have had IP SLA turned on for ISP failover for a few years now.  Lately we have been having issues where it fails over to the secondary ISP for several minutes or hours and then switches back to the primary.  I have connected a workstation directly to the primary ISP and do not see any problems with ping.  I have also had the primary ISP's vendor in, and they can't find any problems with their link or equipment.  The firewall is an ASA 5515, and the SLA configuration is below.  I have turned on message 622001 so I can see when the tracked route goes up or down.  My question is: what is the best way to troubleshoot what is going on (i.e., syslog messages I should track, other parameters to turn on, etc.)?

 

sla monitor 10
type echo protocol ipIcmpEcho 8.8.8.8 interface Eastlink
frequency 5


sla monitor schedule 10 life forever start-time now

track 1 rtr 10 reachability

route Eastlink 0.0.0.0 0.0.0.0 x.x.x.x 1 track 1
route Aliant 0.0.0.0 0.0.0.0 x.x.x.x 2

 

In the down state the 'show sla monitor operational-state' shows this:

 

Result of the command: "show sla monitor operational-state"

Entry number: 10
Modification time: 09:12:00.857 AST Mon Jan 21 2019
Number of Octets Used by this Entry: 2056
Number of operations attempted: 500
Number of operations skipped: 500
Current seconds left in Life: Forever
Operational state of entry: Active
Last time this entry was reset: Never
Connection loss occurred: FALSE
Timeout occurred: TRUE
Over thresholds occurred: FALSE
Latest RTT (milliseconds): NoConnection/Busy/Timeout
Latest operation start time: 10:35:10.857 AST Mon Jan 21 2019
Latest operation return code: Timeout
RTT Values:
RTTAvg: 0 RTTMin: 0 RTTMax: 0
NumOfRTT: 0 RTTSum: 0 RTTSum2: 0
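
Alongside this output, a few other commands may help narrow things down. This is a sketch rather than a definitive procedure: it assumes SLA ID 10 and track ID 1 from the configuration above, and the debug output can be noisy on a busy firewall.

```
! Check whether the tracked object is currently up or down
show track 1
! Confirm the timeout, threshold and num-packets values the probe is actually using
show sla monitor configuration 10
! See which default route is installed right now
show route
! Trace each probe as it runs (turn off again with "no debug sla monitor trace 10")
debug sla monitor trace 10
debug sla monitor error 10
```

Comparing the time of the last track state change with the syslog timestamps may show whether the probe or something else pulled the route.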

 

 

Thanks. Grant. 

18 Replies

Add the following lines:

 

logging monitor debugging
terminal monitor
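
If nobody is watching the terminal when the failover happens, buffered logging may be more practical. A minimal sketch, assuming the default buffer size is large enough and that message 622001 (the tracked-route message mentioned in the original post) is the one of interest:

```
logging enable
logging timestamp
! Keep informational (level 6) and more severe messages in the internal buffer
logging buffered informational
! Afterwards, list only the tracked-route messages
show logging | include 622001
```

Message 622001 is logged at level 6, so a buffer level of informational or lower in severity is needed to catch it.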

Here is a little more syslog output (newest entries first) showing the switchover at 08:11.  Nothing really interesting!  ICMP timeout is 15 seconds and num-packets is 3.  A Cisco support tech will join via WebEx this afternoon.

 

7|Jan 23 2019|08:11:04|609001|8.8.8.8||||Built local-host Aliant:8.8.8.8
7|Jan 23 2019|08:11:04|609002|8.8.8.8||||Teardown local-host Aliant:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:11:04|609001|8.8.8.8||||Built local-host Aliant:8.8.8.8
7|Jan 23 2019|08:11:04|609002|8.8.8.8||||Teardown local-host Aliant:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:11:04|609001|8.8.8.8||||Built local-host Aliant:8.8.8.8
7|Jan 23 2019|08:11:04|609002|8.8.8.8||||Teardown local-host Aliant:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:11:03|609001|8.8.8.8||||Built local-host Aliant:8.8.8.8
7|Jan 23 2019|08:11:02|609002|8.8.8.8||||Teardown local-host Aliant:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:11:01|609001|8.8.8.8||||Built local-host Aliant:8.8.8.8
7|Jan 23 2019|08:11:01|609002|8.8.8.8||||Teardown local-host Aliant:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:11:01|609001|8.8.8.8||||Built local-host Aliant:8.8.8.8
7|Jan 23 2019|08:11:01|609002|8.8.8.8||||Teardown local-host Aliant:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:11:00|609001|8.8.8.8||||Built local-host Aliant:8.8.8.8
7|Jan 23 2019|08:10:52|609001|8.8.8.8||||Built local-host Eastlink:8.8.8.8
7|Jan 23 2019|08:10:51|609002|8.8.8.8||||Teardown local-host Eastlink:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:10:51|609001|8.8.8.8||||Built local-host Eastlink:8.8.8.8
7|Jan 23 2019|08:10:51|609002|8.8.8.8||||Teardown local-host Eastlink:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:10:51|609001|8.8.8.8||||Built local-host Eastlink:8.8.8.8
7|Jan 23 2019|08:10:51|609002|8.8.8.8||||Teardown local-host Eastlink:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:10:51|609001|8.8.8.8||||Built local-host Eastlink:8.8.8.8
7|Jan 23 2019|08:10:50|609002|8.8.8.8||||Teardown local-host Eastlink:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:10:50|609001|8.8.8.8||||Built local-host Eastlink:8.8.8.8
7|Jan 23 2019|08:10:50|609002|8.8.8.8||||Teardown local-host Eastlink:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:10:50|609001|8.8.8.8||||Built local-host Eastlink:8.8.8.8

An update on this: right before the %ASA-6-622001 (Removing tracked route) message, the following error is logged:

 

%ASA-3-317012: Interface IP route counter negative - GigabitEthernet0/1

 

The time it takes to come back varies from a few minutes to a few hours.  

 

It looks like a bug in the ASA.  I am looking for a solution, probably a software update (we are currently on 9.9(2)).  I found one other case of this, which suggested rebooting the ASA as a fix (we did that).

 

Has anybody else seen this?

 

Thanks.

Hi folks.  We figured out what the problem was.  We mirrored the ASA's port on the switch and ran Wireshark on the mirrored port.  We have a block of 5 static IPs.  We noticed that when the failover occurred, the primary IP would not respond to the ping, whereas the standby ping did get a response.  Both IPs were still pingable from outside our office, though.  We suspect that our ISP was somehow reusing our primary static IP.
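
For anyone without a spare switch port to mirror, the ASA's built-in packet capture can show roughly the same thing. This is a sketch only, assuming the interface name Eastlink and the probe target 8.8.8.8 from the original configuration. The capture name CAP-SLA is made up for illustration.

```
! Capture ICMP in both directions between any address and the probe target
capture CAP-SLA interface Eastlink match icmp any host 8.8.8.8
! Echo requests with no matching replies during a failover point upstream of the ASA
show capture CAP-SLA
! Remove the capture when finished
no capture CAP-SLA
```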

 

Since we discovered this, the problem has not recurred.  We suspect the ISP recognized the problem as well and fixed it.

 

Thanks. Grant.