IP SLA TRACK issue

matejbernat
Level 1

Hello,

I am facing a problem with the IP SLA tracking mechanism.

I have two ISPs connected to my C881 router.

  • ISP1 = primary (connected to FastEthernet4)
  • ISP2 = backup (connected to FastEthernet3/Vlan20)

I am using ISP1 as primary ISP and tracking reachability of IP address 8.8.4.4 through ip sla track 200:

!
ip sla 200
 icmp-echo 8.8.4.4
 request-data-size 200
 timeout 3000
 threshold 1000
 owner SYSADMIN
 frequency 5
 history hours-of-statistics-kept 25
 history distributions-of-statistics-kept 20
 history lives-kept 2
 history buckets-kept 60
 history filter all
!
ip sla schedule 200 life forever start-time now
ip sla enable reaction-alerts
!
track 200 ip sla 200 reachability
 delay down 30 up 180
!
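
Read as timers, the configuration above works out roughly as follows. The lines below are comments only (nothing new to configure), and the probe counts are approximate, assuming every probe in the window fails or succeeds:

!
! frequency 5     : one ICMP echo is started every 5 seconds
! timeout 3000    : a probe is declared failed after 3000 ms without a reply
!                   (3 s timeout + 2 s wait = 5 s cycle, which matches the
!                   "start wakeup timer, delay = 2000" lines in the debug output)
! threshold 1000  : replies slower than 1000 ms are flagged as over threshold;
!                   "reachability" tracking still treats them as reachable
! delay down 30   : track 200 goes Down only after the operation has been failing
!                   for 30 s without interruption, about 6 probes in a row
! delay up 180    : track 200 goes Up only after the operation has been succeeding
!                   for 180 s without interruption, about 36 probes in a row
!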

 

The default route to ISP1 is tracked, and a second default route via ISP2 is configured with a higher administrative distance (250).
This is what my static routing looks like:
 

!
ip route 0.0.0.0 0.0.0.0 FastEthernet4 1.1.1.1 name ISP1 track 200
ip route 0.0.0.0 0.0.0.0 Vlan20 2.2.2.2 250 name ISP2
ip route 8.8.4.4 255.255.255.255 FastEthernet4 1.1.1.1 name force-ISP1
ip route 8.8.4.4 255.255.255.255 Null0 250 name deny-via-ISP2
!
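
To see which of the two default routes is currently installed, and what state the tracked route is in, the following show commands can be used. This is only a sketch: output is omitted because it depends on the current track state, and "show ip route track-table" is assumed to be available in the running IOS release:

ROUTER-MD#show ip route track-table
ROUTER-MD#show ip route 0.0.0.0
ROUTER-MD#show track 200

While track 200 is Up, the route via 1.1.1.1 should be the installed default; while it is Down, only the route via 2.2.2.2 with distance 250 should remain.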

 

It works almost as expected:

- when ISP1 goes down (I mean when 8.8.4.4 becomes unreachable via ISP1), the default route points to ISP2 after 30 seconds
- when ISP1 comes back up (8.8.4.4 becomes reachable again via ISP1), the default route points back to ISP1 after 180 seconds

*Mar 14 14:09:52.034: %TRACKING-5-STATE: 200 ip sla 200 reachability Up->Down
*Mar 14 14:12:57.039: %TRACKING-5-STATE: 200 ip sla 200 reachability Down->Up

...but

In some cases (I believe it may happen when ISP1 has been down for a longer time), ip sla/track is unable to detect that ISP1 is up again, and the default route keeps pointing to ISP2 forever (at least until FastEthernet4 is disconnected and reconnected, or shut/no shut is applied).

*Mar 17 14:18:13.019: %TRACKING-5-STATE: 200 ip sla 200 reachability Up->Down

 

This is what some show command outputs look like:

ROUTER-MD#show ip route static
     8.0.0.0/32 is subnetted, 2 subnets
S       8.8.4.4 [1/0] via 1.1.1.1, FastEthernet4
S*   0.0.0.0/0 [250/0] via 2.2.2.2, Vlan20

ROUTER-MD#show ip sla statistics 200 details
IPSLAs Latest Operation Statistics

IPSLA operation id: 200
        Latest RTT: NoConnection/Busy/Timeout
Latest operation start time: *12:17:51.494 MET Wed Mar 18 2015
Latest operation return code: Timeout
Over thresholds occurred: FALSE
Number of successes: 0
Number of failures: 31
Operation time to live: Forever
Operational state of entry: Active
Last time this entry was reset: Never

ROUTER-MD#show track 200
Track 200
  IP SLA 200 reachability
  Reachability is Down
    42 changes, last change 22:00:06
  Delay up 180 secs, down 30 secs
  Latest operation return code: Timeout
  Tracked by:
    STATIC-IP-ROUTING 0

But as you can see here, 8.8.4.4 is reachable from the router:

ROUTER-MD#show ip route 8.8.4.4
Routing entry for 8.8.4.4/32
  Known via "static", distance 1, metric 0
  Routing Descriptor Blocks:
  * 1.1.1.1, via FastEthernet4
      Route metric is 0, traffic share count is 1

ROUTER-MD#ping 8.8.4.4
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 8.8.4.4, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 40/41/44 ms

 

While this is happening, I see no ICMP traffic destined to 8.8.4.4 with "debug ip icmp" enabled.

The IP SLA and track debug output is here:

ROUTER-MD#show debug
Track debugging is on
IP SLAs:
  TRACE debugging is on for entries:
    200
  ERROR debugging is on for entries:
    200

*Mar 18 12:40:16.530: IP SLAs(200) Scheduler: saaSchedulerEventWakeup
*Mar 18 12:40:16.530: IP SLAs(200) Scheduler: Starting an operation
*Mar 18 12:40:16.530: IP SLAs(200) echo operation: Sending an echo operation - destAddr=8.8.4.4, sAddr=1.1.1.2
*Mar 18 12:40:16.530: IP SLAs(200) echo operation: Sending ID: 27
*Mar 18 12:40:19.530: IP SLAs(200) echo operation: Timeout - destAddr=8.8.4.4, sAddr=1.1.1.2
*Mar 18 12:40:19.530: IP SLAs(200) Scheduler: Updating result
*Mar 18 12:40:19.530: IP SLAs(200) Scheduler: start wakeup timer, delay = 2000

*Mar 18 12:40:21.530: IP SLAs(200) Scheduler: saaSchedulerEventWakeup
*Mar 18 12:40:21.530: IP SLAs(200) Scheduler: Starting an operation
*Mar 18 12:40:21.530: IP SLAs(200) echo operation: Sending an echo operation - destAddr=8.8.4.4, sAddr=1.1.1.2
*Mar 18 12:40:21.530: IP SLAs(200) echo operation: Sending ID: 27
*Mar 18 12:40:24.530: IP SLAs(200) echo operation: Timeout - destAddr=8.8.4.4, sAddr=1.1.1.2
*Mar 18 12:40:24.530: IP SLAs(200) Scheduler: Updating result
*Mar 18 12:40:24.530: IP SLAs(200) Scheduler: start wakeup timer, delay = 2000

...etc

I would appreciate any help.

 

Thank you,

 

MB

23 Replies

That should be working. I cannot see why it would not work with your configuration. As long as the source IP in SLA 200 is a valid source IP, it should work.

Are you able to show us any debugging information on the routing table and IP SLA? The debugs should show how the router is routing the ICMP SLA packets.

Also check that the software version you are running has no known bug.

 

Mario

I also think that it should be working...

I can provide debug outputs later, when the situation repeats (right now ISP1 is marked as alive).

IOS is universal v. 12.4(22)T (c880data-universalk9-mz.124-22.T.bin) with advsecurity license activated.

I will try to lab this up later to test the config, and I will see if I can do a bug search on that IOS.

Can I have the model number of the router that you are running, please?

Mario

Right,

I believe you are hitting bug CSCso46681 "timeout issue on ip sla". It does not list your IOS version as a known affected version, but it does not list it as a known fixed version either.

12.4(22)T1 is listed as a known fixed version. If you can upgrade to that and retest, let us know how you get on.

Mario

Whoops... Sorry, I clicked "Correct Answer" instead of "Reply" :-(

But I hope this will be the correct one :-)

 

I have 12.4(22)T5 downloaded.
I can try upgrading to this version...

 

 

Yes, that version is listed as a known working release for this bug.

Let us know how you get on.

 

Mario

Better late than never...

I changed IOS to 15.1(3)T2.

I had to reconfigure ip sla, because the source address carried over from the previous IOS version had been generated incorrectly.
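
For anyone hitting the same thing: an IP SLA operation that is already scheduled cannot be edited in place, so the entry has to be removed and entered again. A minimal sketch of what that can look like, assuming the source is pinned to the FastEthernet4 address 1.1.1.2 seen in the debug output above (the explicit source-ip is an assumption for illustration, not necessarily the exact lines that were entered):

!
! Removing the operation also removes its schedule; track 200 stays configured
no ip sla 200
ip sla 200
 icmp-echo 8.8.4.4 source-ip 1.1.1.2
 request-data-size 200
 ! threshold is entered before timeout on purpose: lowering the timeout
 ! below the current threshold may be rejected by the parser
 threshold 1000
 timeout 3000
 frequency 5
!
ip sla schedule 200 life forever start-time now
!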

After that... It works as expected...

Thanks for your help.

Router PN is CISCO881W-GN-E-K9

Hallo matejbernat,

 

I have no idea why your original setup did not work. As I was interested myself, I labbed it up using real 7200 routers and it worked exactly as you would have expected, so you may be hitting some kind of software bug here.

 

However, your design is not that appealing to me for reasons already mentioned by other users, so I'd offer another way that may work around a possible bug or logical flaw that we have not yet realized.

Be aware that configuring a source interface or source IP does not define which interface is used as the actual exit interface. It only defines the source IP address of the packet. The only way to influence the exit interface, and therefore the actual path the packet takes through the network, is to manipulate the routing table or to set a policy that bypasses the routing table.

I suggest using a local policy to force the IP SLA ICMP packet to always use the next hop of ISP1, no matter the routing table or interface state:

 

ip route 0.0.0.0 0.0.0.0 FastEthernet4 1.1.1.1 name ISP1 track 200
ip route 0.0.0.0 0.0.0.0 Vlan20 2.2.2.2 250 name ISP2
ip sla 200
 icmp-echo 8.8.4.4 source-ip 1.1.1.2 
ip local policy route-map IPSLA-VIA-ISP1

route-map IPSLA-VIA-ISP1 permit 10
 match ip address IPSLA-ISP1
 set ip next-hop 1.1.1.1

ip access-list extended IPSLA-ISP1
 permit icmp host 1.1.1.2 host 8.8.4.4
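
To confirm that the router-generated probes are really being policy-routed, the following commands can help. This is only a sketch: output is omitted because it varies by release, the ROUTER-MD prompt is simply taken from the outputs earlier in the thread, and the debug should be used with care on a busy router:

ROUTER-MD#show ip local policy
ROUTER-MD#show route-map IPSLA-VIA-ISP1
ROUTER-MD#debug ip policy

If the policy is matching, the policy routing match counters shown by "show route-map" should increase with every probe.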

 

Using a local policy offers the great advantage that it does not match production traffic the way an ACL or a Null0 route would.

 

Regards

pille
