IP SLA "failed operation" question

r.perera
Level 1

We have a 3825 router (version 124-5a) which has 2 connections to our ISP.

The primary link has a static route to y.y.y.y (the ISP's default gateway), and IP SLA monitors this route with the commands below.

Please see the related configs below.

ip sla monitor 1

type echo protocol ipIcmpEcho y.y.y.y

timeout 2000

threshold 1000

frequency 3

ip sla monitor schedule 1 life forever start-time now

track 10 rtr 1 reachability

ip route x.x.x.x 255.255.255.0 y.y.y.y track 10

Originally the primary route was flapping very frequently. We found that we had set "timeout" and "threshold" too low, so we reconfigured them as above.

Now, we do not see any route flaps.

However, we still see a non-zero "Number of failed operations due to a Timeout" in the output below.

My question is: what exactly is a failed operation? We still see timeouts, yet no route flaps occur. My understanding was that every time an operation fails, the router deletes the primary route and installs the secondary route, i.e., the secondary link takes over. I am assuming that in this scenario a failed operation means a ping response took longer than "timeout 2000". Why do we see timeouts but no route flaps? What is the relationship between "failed operation", "timeout", and route flaps?

router#show ip sla monitor collection-statistics 1

Entry number: 1

Start Time Index: 15:40:00.699 AESST Wed Mar 8 2006

Number of successful operations: 501

Number of operations over threshold: 0

Number of failed operations due to a Disconnect: 0

Number of failed operations due to a Timeout: 1

Number of failed operations due to a Busy: 0

Number of failed operations due to a No Connection: 0

Number of failed operations due to an Internal Error: 0

Number of failed operations due to a Sequence Error: 0

Number of failed operations due to a Verify Error: 0

RTT Values:

RTTAvg: 12 RTTMin: 7 RTTMax: 260

NumOfRTT: 502 RTTSum: 6214 RTTSum2: 269470

Start Time Index: 14:40:00.699 AESST Wed Mar 8 2006

Number of successful operations: 1195

Number of operations over threshold: 0

Number of failed operations due to a Disconnect: 0

Number of failed operations due to a Timeout: 5

Number of failed operations due to a Busy: 0

Number of failed operations due to a No Connection: 0

Number of failed operations due to an Internal Error: 0

Number of failed operations due to a Sequence Error: 0

Number of failed operations due to a Verify Error: 0

RTT Values:

RTTAvg: 15 RTTMin: 7 RTTMax: 680

NumOfRTT: 1195 RTTSum: 18655 RTTSum2: 2700453
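As a side note on reading these counters, here is a minimal Python sketch that recomputes the average RTT and estimates the spread for each hourly bucket. It assumes RTTSum2 is the sum of the squared RTT values; the function name rtt_summary is only illustrative.

```python
import math

def rtt_summary(num_rtt, rtt_sum, rtt_sum2):
    """Return (mean, standard deviation) in ms from the IP SLA counters.

    Assumption: rtt_sum2 is the sum of squared RTT samples.
    """
    mean = rtt_sum / num_rtt
    variance = rtt_sum2 / num_rtt - mean ** 2
    return mean, math.sqrt(max(variance, 0.0))

# (NumOfRTT, RTTSum, RTTSum2) copied from the two buckets above
buckets = {
    "15:40": (502, 6214, 269470),
    "14:40": (1195, 18655, 2700453),
}

for label, counters in buckets.items():
    mean, std = rtt_summary(*counters)
    print(f"{label}: mean={mean:.1f} ms, std={std:.1f} ms")
```

The truncated means match the RTTAvg values shown (12 and 15). The spread is well below the 1000 ms threshold, which fits "Number of operations over threshold: 0"; the few timeouts are counted separately as failed operations and do not contribute to these RTT sums.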

5 Replies

attrgautam
Level 5

What does "sh track" show when the timeouts occur?

Hi attrgautam,

Thank you for your response. Your question is a very good one.

We do not know when the timeouts happen at all (that is the problem).

We enabled "debug ip sla monitor error" and "debug ip sla monitor trace", but all we see is this:

Apr 6 13:31:20 AEST: IP SLA Monitor(1) Scheduler: Starting an operation

.Apr 6 13:31:20 AEST: IP SLA Monitor(1) echo operation: Sending an echo operation

.Apr 6 13:31:20 AEST: IP SLA Monitor(1) echo operation: RTT=8

.Apr 6 13:31:20 AEST: IP SLA Monitor(1) Scheduler: Updating result

.Apr 6 13:31:23 AEST: IP SLA Monitor(1) Scheduler: Starting an operation

.Apr 6 13:31:23 AEST: IP SLA Monitor(1) echo operation: Sending an echo operation

.Apr 6 13:31:23 AEST: IP SLA Monitor(1) echo operation: RTT=16

.Apr 6 13:31:23 AEST: IP SLA Monitor(1) Scheduler: Updating result

We have never been able to catch a timeout as it happens.

I also enabled "debug track", but it produced no output.

router#sh track

Track 10

Response Time Reporter 1 reachability

Reachability is Up

2420 changes, last change 01:33:00

Latest operation return code: OK

Latest RTT (millisecs) 8

Tracked by:

STATIC-IP-ROUTING 0

Looking at the number of timeouts you have, I dare say you probably did not even notice the route flap, as the link would have been restored within 3 seconds. If you want to avoid this, you can change the down delay and up delay times.

track 10 rtr 1 reachability

delay down 10 up 2

Let me know if this helps
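To illustrate why a down delay can hide isolated timeouts, here is a rough Python sketch of a tracker with hold timers. It is a simplified model for reasoning only, not how IOS implements tracking: it assumes a state change is reported only if the probe result stays changed for the whole delay, and the name track_changes and its parameters are hypothetical.

```python
def track_changes(probe_ok, frequency_s=3, delay_down_s=0, delay_up_s=0):
    """Return the (time_s, state) changes a simplified tracker would report.

    probe_ok is a list of booleans, one per probe, sent every frequency_s seconds.
    """
    reported = True        # state currently reported to the routing table
    pending_since = None   # when the probe result started to disagree
    changes = []

    def fire_if_due(now, inclusive):
        nonlocal reported, pending_since
        if pending_since is None:
            return
        delay = delay_down_s if reported else delay_up_s
        fire_at = pending_since + delay
        due = fire_at <= now if inclusive else fire_at < now
        if due:
            reported = not reported
            pending_since = None
            changes.append((fire_at, "Up" if reported else "Down"))

    for i, ok in enumerate(probe_ok):
        now = i * frequency_s
        fire_if_due(now, inclusive=False)  # timer expired between probes
        if ok == reported:
            pending_since = None           # result agrees again: cancel timer
        elif pending_since is None:
            pending_since = now            # result changed: start timer
        fire_if_due(now, inclusive=True)   # zero delay fires immediately
    return changes


one_timeout = [True, True, True, False, True, True, True]
print(track_changes(one_timeout))                                  # no delays
print(track_changes(one_timeout, delay_down_s=10, delay_up_s=2))   # with delays
```

In this model a single timeout with no delays produces a Down followed by an Up one probe interval later, while with "delay down 10" and a 3 second frequency the probe has to fail four times in a row before Down is reported.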

Hi attrgautam,

Thank you for your advice. We will implement it.

However, can I ask you one more thing?

While we see "timeout" counts but no route flaps, do you think the route does not flap because the ping timeouts last only a very short time? I can assure you we do not see route flaps; if there were any, we would see them in the log even if they were brief (debug routing is on).

That is what surprises me. Can you also run "debug track" on the router and see whether you can relate route flaps to the track state changes? That may give some more information.