01-21-2019 06:13 AM - edited 01-21-2019 06:39 AM
Hi folks. We have had IP SLA turned on for ISP failover for a few years now. Lately we have been having issues where the ASA fails over to the secondary ISP for anywhere from several minutes to several hours and then switches back to the primary. I connected a workstation directly to the primary ISP and did not see any problems with ping. I have also had the primary ISP's vendor in, and they can't find any problems with their link or equipment. The firewall is an ASA 5515, and the SLA configuration is below. I have turned on message 622001 so I can see when the tracked route goes up or down. My question: what is the best way to troubleshoot what is going on (i.e., syslog messages I should track, other parameters to turn on, etc.)?
sla monitor 10
 type echo protocol ipIcmpEcho 8.8.8.8 interface Eastlink
 frequency 5
sla monitor schedule 10 life forever start-time now
track 1 rtr 10 reachability
route Eastlink 0.0.0.0 0.0.0.0 x.x.x.x 1 track 1
route Aliant 0.0.0.0 0.0.0.0 x.x.x.x 2
In the down state, 'show sla monitor operational-state' shows this:
Result of the command: "show sla monitor operational-state"
Entry number: 10
Modification time: 09:12:00.857 AST Mon Jan 21 2019
Number of Octets Used by this Entry: 2056
Number of operations attempted: 500
Number of operations skipped: 500
Current seconds left in Life: Forever
Operational state of entry: Active
Last time this entry was reset: Never
Connection loss occurred: FALSE
Timeout occurred: TRUE
Over thresholds occurred: FALSE
Latest RTT (milliseconds): NoConnection/Busy/Timeout
Latest operation start time: 10:35:10.857 AST Mon Jan 21 2019
Latest operation return code: Timeout
RTT Values:
RTTAvg: 0 RTTMin: 0 RTTMax: 0
NumOfRTT: 0 RTTSum: 0 RTTSum2: 0
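If you want to poll this output over time rather than eyeball it, a small script can pull out the fields that matter. The following is only a rough sketch: it assumes the output has been saved as text exactly as pasted above, and the function names `parse_sla_state` and `probe_timed_out` are made up for illustration.

```python
def parse_sla_state(text):
    """Collect 'Key: value' lines from saved 'show sla monitor operational-state' output."""
    state = {}
    for line in text.splitlines():
        # Split on the first ': ' only, so timestamps such as 09:12:00.857 stay intact.
        key, sep, value = line.partition(": ")
        if sep:
            state[key.strip()] = value.strip()
    return state


def probe_timed_out(state):
    """True when the most recent probe ended with a Timeout return code."""
    return state.get("Latest operation return code") == "Timeout"
```

Note that multi-value lines such as the RTTAvg/RTTMin/RTTMax row are stored under their first key only, which is enough for checking the return code.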
Thanks. Grant.
01-22-2019 11:33 AM
Add the following lines:
logging monitor debugging
terminal monitor
01-23-2019 04:33 AM
A little more syslog info below, showing when it switched over at 8:11. Nothing really interesting! The ICMP timeout is 15 seconds and num-packets is 3. A Cisco support tech will be WebEx'd in this afternoon.
7|Jan 23 2019|08:11:04|609001|8.8.8.8||||Built local-host Aliant:8.8.8.8
7|Jan 23 2019|08:11:04|609002|8.8.8.8||||Teardown local-host Aliant:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:11:04|609001|8.8.8.8||||Built local-host Aliant:8.8.8.8
7|Jan 23 2019|08:11:04|609002|8.8.8.8||||Teardown local-host Aliant:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:11:04|609001|8.8.8.8||||Built local-host Aliant:8.8.8.8
7|Jan 23 2019|08:11:04|609002|8.8.8.8||||Teardown local-host Aliant:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:11:03|609001|8.8.8.8||||Built local-host Aliant:8.8.8.8
7|Jan 23 2019|08:11:02|609002|8.8.8.8||||Teardown local-host Aliant:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:11:01|609001|8.8.8.8||||Built local-host Aliant:8.8.8.8
7|Jan 23 2019|08:11:01|609002|8.8.8.8||||Teardown local-host Aliant:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:11:01|609001|8.8.8.8||||Built local-host Aliant:8.8.8.8
7|Jan 23 2019|08:11:01|609002|8.8.8.8||||Teardown local-host Aliant:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:11:00|609001|8.8.8.8||||Built local-host Aliant:8.8.8.8
7|Jan 23 2019|08:10:52|609001|8.8.8.8||||Built local-host Eastlink:8.8.8.8
7|Jan 23 2019|08:10:51|609002|8.8.8.8||||Teardown local-host Eastlink:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:10:51|609001|8.8.8.8||||Built local-host Eastlink:8.8.8.8
7|Jan 23 2019|08:10:51|609002|8.8.8.8||||Teardown local-host Eastlink:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:10:51|609001|8.8.8.8||||Built local-host Eastlink:8.8.8.8
7|Jan 23 2019|08:10:51|609002|8.8.8.8||||Teardown local-host Eastlink:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:10:51|609001|8.8.8.8||||Built local-host Eastlink:8.8.8.8
7|Jan 23 2019|08:10:50|609002|8.8.8.8||||Teardown local-host Eastlink:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:10:50|609001|8.8.8.8||||Built local-host Eastlink:8.8.8.8
7|Jan 23 2019|08:10:50|609002|8.8.8.8||||Teardown local-host Eastlink:8.8.8.8 duration 0:00:00
7|Jan 23 2019|08:10:50|609001|8.8.8.8||||Built local-host Eastlink:8.8.8.8
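To pin down the switchover moment from an export like the one above, the lines can be reduced to the first and last time a local-host entry for 8.8.8.8 was built on each interface. This is a minimal sketch, assuming the pipe-delimited layout shown above (severity, date, time, message ID, then the description as the ninth field) and an excerpt covering a single failover; `host_windows` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Description text of message 609001 as it appears in the export above.
BUILT = re.compile(r"Built local-host ([^:\s]+):(\S+)")


def host_windows(lines, host="8.8.8.8"):
    """Return {interface: (first_seen, last_seen)} for 609001 entries matching host."""
    seen = {}
    for line in lines:
        fields = line.strip().split("|", 8)
        if len(fields) < 9 or fields[3] != "609001":
            continue  # not a 'Built local-host' line
        match = BUILT.match(fields[8])
        if not match or match.group(2) != host:
            continue
        when = datetime.strptime(fields[1] + " " + fields[2], "%b %d %Y %H:%M:%S")
        iface = match.group(1)
        first, last = seen.get(iface, (when, when))
        seen[iface] = (min(first, when), max(last, when))
    return seen
```

On the excerpt above, the gap between the last Eastlink entry (08:10:52) and the first Aliant entry (08:11:00) brackets the failover.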
01-29-2019 10:53 AM
An update on this: right before the %ASA-6-622001 (Removing tracked route) message appears, the following error is logged:
%ASA-3-317012: Interface IP route counter negative - GigabitEthernet0/1
The time it takes to come back varies from a few minutes to a few hours.
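To check whether the 317012 error really precedes every route removal, rather than just the ones I happened to look at, the syslog can be scanned mechanically. A rough sketch, assuming the lines are in chronological order and contain the usual %ASA-severity-ID prefix; the function name and the `lookback` parameter are hypothetical.

```python
import re

MSG_ID = re.compile(r"%ASA-\d-(\d{6})")


def removals_after_counter_error(lines, lookback=5):
    """Count 622001 'Removing' events that have a 317012 within the previous `lookback` ASA messages.

    Returns (preceded, total).
    """
    events = []
    for line in lines:
        match = MSG_ID.search(line)
        if match:
            events.append((match.group(1), line))

    preceded = total = 0
    for i, (msg_id, line) in enumerate(events):
        # 622001 is also logged when the route is added back, so filter on the text.
        if msg_id != "622001" or "Removing" not in line:
            continue
        total += 1
        recent_ids = [e[0] for e in events[max(0, i - lookback):i]]
        if "317012" in recent_ids:
            preceded += 1
    return preceded, total
```

If `preceded` equals `total` over a few days of logs, the two messages are at least consistently correlated; that alone does not show which one causes the other.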
It looks like a bug in the ASA. I am looking for a solution (probably a software update; we are currently on 9.9(2)). I found one other case of this, which suggested rebooting the ASA to fix it (we did that).
Has anybody else seen this?
Thanks.
02-15-2019 05:03 AM
Hi folks. We figured out what the problem was. We ended up mirroring the ASA's port on the switch and running Wireshark on the mirrored port. We have a block of 5 static IPs. We noticed that when the failover occurred, the primary IP would not respond to the ping, although the standby ping did respond. Both IPs were still pingable from outside our office, however. We suspect that our ISP was somehow reusing our primary static IP.
Since we discovered this, the problem has not recurred. We suspect the ISP recognized the problem as well and fixed it.
Thanks. Grant.