Re: OSPF neighbor often going down

duke100 · ‎03-22-2005

Hi

sh log

Mar 22 17:27:47.153: %OSPF-5-ADJCHG: Process 200, Nbr x.x.x.x on Serial1.1 from LOADING to FULL, Loading Done

Mar 22 17:28:52.338: %OSPF-5-ADJCHG: Process 200, Nbr x.x.x.x on Serial1.1 from LOADING to FULL, Loading Done

Mar 22 17:30:52.350: %OSPF-5-ADJCHG: Process 200, Nbr x.x.x.x on Serial1.1 from LOADING to FULL, Loading Done

Mar 22 17:32:02.450: %OSPF-5-ADJCHG: Process 200, Nbr x.x.x.x on Serial1.1 from LOADING to FULL, Loading Done

The link is "clear": no errors.

Software: mc3810-a2jk9sv5-mz.122-19.bin

What can it be?

Georg Pauwen · ‎03-22-2005

Hello,

Reasons for this can vary. Can you post the configuration of your router and, if possible, the relevant configuration of the other side, that is, of the device Serial1.1 is connected to?

Regards,

GP

duke100 · ‎03-22-2005

Router1 connects to Router2 via Frame Relay.

Kevin Dorrell · ‎03-23-2005

Unfortunately your debug does not go far enough. It shows the two routers synchronising, and going into FULL state, but it does not show what follows, i.e. what causes it to keep going from LOADING to FULL. If you keep the debug going for just a few minutes longer until you get those console messages you showed in your first posting, then I am sure the answer is there.
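One way to capture the missing part of the debug is sketched below. These are standard IOS exec commands, but the sequence is only a suggestion, and debugging on a loaded router should be done with care:

```
! Exec mode on the 3810. Lines starting with "!" are comments.
! Show debug output on a Telnet session (not needed on the console).
terminal monitor
debug ip ospf adj
debug ip ospf hello
! Wait until the next %OSPF-5-ADJCHG message appears, then:
show ip ospf neighbor
undebug all
```

Leaving the debug running across at least one full LOADING to FULL cycle should show which event tears the adjacency down before it re-forms.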

However, I do have a couple of comments. You have configured your area 1027 as an NSSA, which implies that this is part of a larger picture. In particular, I would expect to see an ABR somewhere to connect the area to the backbone. The no-summary should only appear on the ABR - unless these are both ABRs? Since you are NSSA, I would also expect to see some ASBR and another routing protocol somewhere, with redistribution, but I do not see that.

I think that either there is a lot going on that you have not told us about, or your TS-NSSA (Totally-Stubby Not-So-Stubby-Area) is incorrect.

Kevin Dorrell

Luxembourg

duke100 · ‎03-23-2005

Look at the scheme.

I use OSPF stub no-summary on all routers connected to the 7206.

And there are no problems.

But when I tried to connect Router3 to area 0 via Router2, the problems I already wrote about appeared.

Router3 uses only static routes, and it is not my router.

Kevin Dorrell · ‎03-23-2005

I think the most telling line is at 11:24:32.998. We are seeing a hello from the partner, but we are not seeing our own RID in it. We treat this as a 1-WayReceived event, so we go into Init state, and the whole process starts again. Not surprisingly, this is cycling each DeadInterval, i.e. 4 times our Hello time, or 2 minutes.

In fact, we go into the 2-way state not because we see our own RID in his Hello, but because we see a DBD from him. But if he is sending us a DBD, he must have gone into 2-way first, so he must be seeing our Hello with his RID in it.

So, once the routing is established, why does he stop seeing our Hello? In fact, he misses at least three of them.

He is sending hellos every 10 seconds like he was on a broadcast network. Wouldn't this normally be every 30 seconds on a nonbroadcast link like this? Stranger, I don't see any evidence in the debug of us sending out any Hellos. We think we should be sending them every 30 seconds, as evidenced by the 2-minute DeadInterval.

Sorry, I'm a bit confused by this puzzle. I'll post again if I work it out.

Are you saying this used to work before you added Router3? If so, what is it about the routes injected by Router3 that would prevent our hellos going out to Router1?

Kevin Dorrell

Luxembourg

Kevin Dorrell · ‎03-25-2005

I think I got that wrong on two counts. Firstly, I didn't realize that debug ip ospf hello only shows the received hello processing, not transmitted hellos. (I wonder why not? It could be useful.) Secondly, of course, this is a point-to-point subinterface link, so the OSPF interface type is point-to-point and the hello time is 10 seconds.

So the conclusion is different. We are seeing all of the hellos from the 7206, but he is missing some of ours, and sometimes enough to exceed the DeadTimer. That needs him to miss 3 (or maybe even 4) hellos in a row. In fact, if I had looked at the debug more closely, I should have seen that the cycle time was not always 2 minutes, but in any case it was always a multiple of 10 seconds.
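To confirm the timers discussed above, the OSPF interface parameters can be checked on both routers. This is a minimal sketch; Serial1.1 is the subinterface name from the log on the 3810, and the interface name on the other router is not given in the thread, so substitute it there:

```
! Exec mode, on each end of the PVC
show ip ospf interface Serial1.1
show ip ospf neighbor detail
```

The first command reports the network type and the Hello and Dead intervals; on a point-to-point subinterface the defaults are 10 and 40 seconds, and the values must match on both ends. The second shows the time left on the dead timer for each neighbor.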

So the conclusion is that as soon as the OSPF comes up, the payload traffic from R2 to R1 is so great that hellos get dropped in that direction. So, as Georg points out, traffic engineering (in the direction from the 3810 to the 7206) should solve the problem. (You might also want to look at why the traffic is excessive.)

Please let us know how you get on.

Kevin Dorrell

Luxembourg

Georg Pauwen · ‎03-23-2005

Hello,

What is the actual CIR that your provider is giving you? I think your OSPF traffic gets starved out by the voice and other traffic. I guess the easiest way to find out whether the classes are the root of the problem is to take them off the interface and see if that solves the issue. If it does, you might want to change the classes and add a service-policy that prioritizes OSPF traffic. It would look something like this (depending on your IOS version):

class-map OSPF
 match protocol ospf
!
policy-map SET-DE
 class OSPF
 class class-default
  set fr-de

Then add the service policy under your map-classes. This will make sure that, in case of congestion, OSPF traffic always gets through...

HTH,

GP

duke100 · ‎03-23-2005

Well, I shall try, but when I used this here earlier:

class-map match-all OSPF
 match access-group 104
!
policy-map bandwidth
 class OSPF
  bandwidth 8
 class class-default
  fair-queue 1024

access-list 104 permit ospf any any
access-list 104 permit tcp any any eq telnet
access-list 104 permit tcp any eq telnet any

map-class frame-relay data4
 service-policy output bandwidth

it did not work.
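A possible next step, before changing the policy again, is to check whether OSPF packets actually matched the class. The commands below are a sketch that assumes the policy from the post above is still attached through map-class data4 on the PVC under Serial1.1:

```
! Exec mode on the 3810
show policy-map interface Serial1.1
show frame-relay pvc
```

If the packet counters for class OSPF stay at zero, the hellos are not being classified. If they increase while the neighbor still flaps, the drops are likely happening somewhere other than this output queue.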