
STRANGE PROBLEM WITH OSPF

qualxarnu
Level 1

Dear Community,

I would like to ask for a little help or a hint regarding a strange issue I have with OSPF.
Below I will try to explain what the network topology looks like.

I have a few locations, most of which have two routers; a few have only one.
In the main location we have two routers which are DMVPN hubs.
Each router in HQ is the hub for its own DMVPN cloud, and in the branches each router is connected to only one DMVPN hub.
The exception is locations with one router, which have DMVPN connections to both hubs.

Each location also has a multilayer switch which holds the SVI interfaces for that location.

OSPF is configured so that all router interfaces using DMVPN are in area 0 and the interface towards the multilayer switch is in a normal (non-backbone) area. Each location has its own area number.

The problem is that some multilayer switches don't install routes from other areas into their routing tables, and I'm not sure what the cause could be.

The things I have checked so far:
- every DMVPN branch router is in OSPF FULL state with its DMVPN hub (branch router 1 is in FULL state with hub 1, router 2 with hub 2)
- in a given branch location both routers are in FULL state with each other and with the multilayer switch
- the network type on the tunnel interfaces is set to point-to-multipoint
- all routers in area 0 have information about all subnets in the environment
- the databases on the multilayer switches, however, look kind of suspicious to me.
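
For reference, the checks listed above map roughly onto the following IOS show commands. This is only a sketch: Tunnel201 is borrowed from a spoke config posted later in this thread, and the exact output differs per platform and release.

```
! adjacency state towards the hubs, the other branch router and the switch
show ip ospf neighbor
! area, network type and cost per OSPF interface
show ip ospf interface brief
show ip ospf interface Tunnel201
! what OSPF actually installed into the routing table
show ip route ospf
! LSDB contents (compare the switch with the branch routers)
show ip ospf database
```

Running the same commands on a branch router and on its multilayer switch, then comparing the two outputs, is usually the quickest way to see where a route gets lost.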

I also tried, in two locations, to switch from EIGRP to OSPF, which is meant to replace it. In the first branch, which has one DMVPN router and a multilayer switch, it worked like a charm.
But the other two locations, with two DMVPN routers and multilayer switches, didn't go so well.
The problem was that OSPF did not install all routes into the routing table on the multilayer switches.

What is interesting is that both locations have the same type of switch, the same software version and the same license level (at first it was ipbase, then increased to ipservices, with the same result).
But for some reason one switch doesn't install all routes.
The configurations look the same, I think, but maybe I missed something and there is a reason why this is happening.

Can someone give a hint on a good/fast way to check or debug why a multilayer switch doesn't install all routes?
I would be really grateful.

41 Replies

Thanks, when I get more time, I'll comb through this additional configuration info, in depth.

Some things I'll mention now (w/o in-depth review):

When you've done your testing, a branch with dual VPN routers (each with connection to single hub), was, you believe, in error, while AT THE SAME TIME a branch with just one VPN router (with connection to both hubs) was active (and, apparently working correctly)?

As I mentioned earlier, and Peter also noted, you really, really want to ensure your area 0 is not partitioned.  Currently, your design seems to only do this "by accident" via a branch having just a single VPN router (connected to both hubs).

Peter suggests having both tunnels with connections to both hubs, or having a separate (network) area 0 connection between the hubs.  (I'm a fan of the latter; I may have mentioned that earlier, along with having area 0 connections between branches with dual VPN routers.)
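
One way to check for a partitioned area 0 from the CLI, as a rough sketch: run the following on a branch router that peers with only one hub. The router ID 10.100.0.2 appears in this thread and seems to be HUB2's; substitute the ID of whichever hub the router does not peer with directly.

```
! ABRs/ASBRs this router can currently reach
show ip ospf border-routers
! the other hub's router LSA in area 0; a warning that the
! advertising router is not reachable suggests a partition
show ip ospf database router 10.100.0.2
```

If the other hub is missing from the border-routers list while its LSAs are still in the database, the backbone is most likely only held together by some third router.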

Peter and I (and others?) have mentioned, for a phase II network, setting OSPF priority to zero on all but the hub routers.  (BTW, just setting the hub routers to a higher OSPF priority does not guarantee they will be DRs.  If DR goes down, and a branch router becomes DR, hub coming back on-line will not preempt a running DR [or so I recall].)
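
A minimal sketch of the priority suggestion on a spoke, assuming the tunnel runs as OSPF network type broadcast (or non-broadcast). Tunnel201 is taken from the spoke config in this thread. Note that no DR election takes place on point-to-multipoint interfaces, so the priority has no effect there.

```
interface Tunnel201
 ! never become DR or BDR on the DMVPN cloud
 ip ospf priority 0
```

Afterwards, "show ip ospf neighbor" on the spoke should list the hub as DR.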

For testing, consider removing non-essential parameters, such as BFD, manual OSPF link costing, tunnel keys (maybe, unless someone notes these are critical for DMVPN phase II), and maybe HSRP on branches with a single VPN router.

This is the HUB1 configuration:  should be HUB2, correct?

router ospf 10
router-id 10.100.0.2

===========================

Now the Spoke 2 from area with two spokes:

router ospf 10
router-id 10.100.0.23
auto-cost reference-bandwidth 100000
shutdown
nsf
area 0 authentication
area 10 authentication
redistribute bgp 65224 metric-type 1 subnets route-map BGP2OSPF
passive-interface default
no passive-interface GigabitEthernet0/0/0
no passive-interface Tunnel201
network 10.100.0.23 0.0.0.0 area 0
network 10.100.1.0 0.0.0.255 area 0  Should be 10.100.2.1, correct?
network 10.110.100.0 0.0.0.255 area 10

I've checked the HUB1 and HUB2 configs and their router-ids are correct.
However, the secondary spoke did indeed have this network statement wrong.
But I've checked the other location with the same setup. It was OK, yet when I was testing it there was a similar problem to the one on the spoke with the mistake in its config.

Nevertheless thanks for indicating this.

"I've checked HUB1 and HUB2 config and their router-ids are correct."

Yup, not the error (?) I was pointing out, though i.e. . . .

So I will now paste some configs with more elements for you.
This is the HUB1 configuration:

router ospf 10
router-id 10.100.0.1
.
.

This is the HUB1 configuration:

router ospf 10
router-id 10.100.0.2
.
.

"However indeed secondary spoke had this network statement wrong.
But I've checked the other location with the same setup. It was ok, but when I was testing it, there was similar problem like in case spoke with mistake in the config."

Okay, so we've verified one accidental misconfiguration.  There might be other configuration issues either due to other accidental misconfiguration, or due to possibly misunderstanding what the correct configuration should comprise (you also do have a lot "going on", such as explicit OSPF costing, FVRF/IVRF, IPSec, area authentication, etc.).

So, I suggest there are two approaches to use to help resolve your issues and they're NOT mutually exclusive.  You can provide all the information requested, so that we can further analyze what's going on.  You can pare down your configuration to the minimum, to eliminate additional variables (once working as expected, you can add them back in one by one).

Generally/usually, when everything is properly configured, Cisco works as it should.

Hi Guys,

OK, so I had a chance to dig into tests and configurations.
I can share some results with you.

So during the weekend I prepared a lab in GNS3 to reproduce, as closely as possible, the topology I have in production.
For the lab topology I used Cisco C3725 IOS routers. I created a main site with two DMVPN hubs and one router acting as a core switch. I also created a site with 2 spokes + core router and a site with 1 spoke + core router.
As a first step I configured EIGRP and then, in parallel, OSPF to simulate what I want to do right now.
With EIGRP of course everything was working, and once I started to turn off EIGRP on the core router at the site with 2 spokes, the OSPF routes were installed in the routing table and everything was working fine.
So in the GNS3 lab, everything works as it should.

Unfortunately, in production it still doesn't. :(((
I've noticed that there is a problem with some static routes which should be redistributed into OSPF, for example subnet 192.168.180.0/24: for some reason they are not installed in the OSPF RIB.
I used the same command on Hub1, Hub2 and the L3 switch in area 1 (connected to the hubs) to redistribute static routes with an E1 metric. If I remove redistribution from the L3 switch, it doesn't change anything.
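
The exact redistribution command is not shown in the post. A typical form, assuming OSPF process 10 as in the configs quoted in this thread, might look like the sketch below, followed by commands to trace the 192.168.180.0/24 prefix.

```
router ospf 10
 ! assumed form: redistribute statics as external type 1
 redistribute static subnets metric-type 1
!
! on the hub: does the static route exist, and is a type-5 LSA originated?
show ip route 192.168.180.0
show ip ospf database external 192.168.180.0
! on the spoke or L3 switch: is the LSA in the LSDB, and is the ASBR reachable?
show ip ospf database external 192.168.180.0
show ip ospf border-routers
```

An external LSA that is present in the database but whose advertising router cannot be reached will not produce a route, so the last two commands are worth reading together.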

What I tried this time:
In production, for testing, I turned on OSPF only on the main site (the L3 switch + 2 hubs), then on the site with one spoke and one L3 switch, and on the site with 2 spokes and an L3 switch.

On the site with 2 spokes there is still the same problem.
I tried adding another tunnel interface on both spokes, so that both spokes have a connection to both hubs, but the problem persists.
I also tried extending area 0 at the hub location to the L3 switch, but it didn't change anything.
The last thing I tried was to replace this command on the tunnel interfaces:

ip nhrp nhs aaa.aaa.aaa.aaa nbma bbb.bbb.bbb.bbb multicast

with these commands:

ip nhrp map aaa.aaa.aaa.aaa bbb.bbb.bbb.bbb
ip nhrp map multicast bbb.bbb.bbb.bbb
ip nhrp nhs aaa.aaa.aaa.aaa

But it still doesn't work as it should.
By the way, I also noticed that when I enter the

ip nhrp shortcut

command, it's not shown in the config, but for example the command

no ip nhrp shortcut

is shown, thus I guess "ip nhrp shortcut" is just the default and that's why it's not visible.
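
To confirm whether a command such as "ip nhrp shortcut" is active by default, the running config can be displayed with defaults included. A sketch, with Tunnel201 taken from the spoke config posted in this thread:

```
! "all" includes default settings that are normally hidden
show running-config all | section interface Tunnel201
```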

I also removed the cost configuration and BFD, but still no luck.

Thus I'm wondering if there is something which can make this work. Maybe it's a bug or maybe a problem with the license...
For the hub routers we use ISR 4431 routers with software version 16.09.08.
For the spokes, ISR 4321 with the same software.
The routers have the IPbase license, but I haven't found anything saying that on this model OSPF is limited to some kind of stub version or something like that.

I also discovered that the site with 1 spoke works only because there is a static default route on the L3 switch pointing to the spoke router, which diverts traffic when I turn off EIGRP. Without it, the situation would be the same as at the site with 2 spokes.


Regarding the show outputs from area 2 and the like, there will be a lot of it, so I will attach it as files.

But anyway, to me it now looks like the key to solving this issue is to figure out why the static routes, which should be redistributed by both HUBs, are not installed by the SPOKEs.
In the GNS3 lab it was working properly, and I configured redistribution in production in the same way, but now I see that the OSPF RIB doesn't have these routes installed. :(((


...

Adv Router is not-reachable in topology Base with MTID 0 <<- start from here check why the Adv is not reachable 
LS age: 3133
Options: (No TOS-capability, DC)
LS Type: Router Links
Link State ID: 10.101.100.4
Advertising Router: 10.101.100.4
LS Seq Number: 8000004C
Checksum: 0xD053
Length: 36
Area Border Router
AS Boundary Router
Number of Links: 1


I made a note of where you can start troubleshooting.
https://www.cisco.com/c/en/us/support/docs/ip/open-shortest-path-first-ospf/7112-26.html

Check for a mismatch between network types.
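
A sketch of how the network types could be compared on both ends of an adjacency. Tunnel201 is the spoke tunnel from the config posted in this thread; use the matching tunnel name on the hub.

```
! the network type must match on both ends
show ip ospf interface Tunnel201 | include Network Type
! hello/dead timers must match too; the defaults are 10/40 for
! broadcast and 30/120 for point-to-multipoint
show ip ospf interface Tunnel201 | include Hello
```

If timers were tuned by hand, neighbors with different network types can still reach FULL while SPF fails to build a usable path, which fits an "Adv Router is not-reachable" message like the one above.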

Sorry if I cannot reply, I am busy now.

good luck 
MHM

Hmm, still more likely, I think, something isn't configured correctly, than a bug (although Cisco software can have those too).

In this reply you note "I also discovered that the site with 1 spoke works only because there is a static default route on the L3 switch pointing the spoke router which diverts traffic in situation when I turn off EIGRP.", I presume (?) you believe the static route shouldn't be needed?  If so, another configuration error?  (This, is again, not to fault you, but as I wrote earlier, you have a lot going on in your config, i.e. easy to miss something.)

In the statement I quoted, you mention "when I turn off EIGRP", is EIGRP currently being used across these tunnels, and you're trying to convert to using OSPF?  If so, as EIGRP has a better AD than OSPF (for internal routes, and by default), I wonder if it's possible not all of EIGRP is "turned off" which might suppress some OSPF routes.
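
On Cisco IOS the default administrative distances are 90 for internal EIGRP, 110 for OSPF and 170 for external EIGRP, so an internal EIGRP route for the same prefix wins over OSPF. A quick way to see which protocol currently owns a prefix, using the 192.168.180.0/24 example from this thread:

```
! "Known via ..." shows the winning protocol and its distance
show ip route 192.168.180.0
! any EIGRP neighbors or routes still present?
show ip eigrp neighbors
show ip route eigrp
! routing processes running, with their distances and redistribution
show ip protocols
```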

In your GNS3 testing, you also had everything like in production, statics, EIGRP, VRF, etc.?  If not, again, some part of the configuration might not have been done exactly as needed.

Unfortunately, with the information we've been provided, we cannot truly double check whether your configurations are as expected.  Or, without the kind of information like @Peter Paluch requested, it's very difficult to focus on what's actually wrong and what that might imply for possible causes.

To fully work out why you're getting the results you've been getting, we might even need "telnet" priv 15 access to your devices; not something you would likely want to do, or might even be allowed to do (for many good reasons).

You'll have to see what the others on this thread think, but without much more concrete data, including possibly accessing your devices, if you cannot figure this out, you might need to resort to bringing in an external consulting network engineer (something I used to do [NB: that's not a hint to hire me - laugh]) or open a case with TAC (since you're here, and not there, no maintenance contract?).

Wish I could help more, but personally, for me to help further, believe we need to go beyond it's configured or acts (like) this.

Just confirm the following and I can build a lab for you:
1- DMVPN phase 2 or 3?
2- Which OSPF network type do you prefer, broadcast or P2MP?
3- How many tunnels did you configure, i.e. are you using multiple NHSs in the same tunnel, or multiple tunnels sharing the same or different interfaces?


hostname R1
!
interface Tunnel0
ip address 5.0.0.1 255.255.255.0
no ip redirects
ip nhrp map multicast dynamic
ip nhrp network-id 5
ip ospf network broadcast
ip ospf priority 100
tunnel source FastEthernet0/0
tunnel mode gre multipoint
tunnel key 5
!
!
interface FastEthernet0/0
ip address 100.0.0.1 255.255.255.0
duplex half
!
interface FastEthernet2/0
ip address 10.0.0.1 255.255.255.0
duplex auto
speed auto
standby 10 ip 10.0.0.10
!
router ospf 55
log-adjacency-changes
passive-interface FastEthernet2/0
network 5.0.0.0 0.0.0.255 area 0
network 10.0.0.0 0.0.0.255 area 1
!
ip route 0.0.0.0 0.0.0.0 100.0.0.6





hostname R2
!
interface Tunnel10
ip address 50.0.0.2 255.255.255.0
no ip redirects
ip nhrp map multicast dynamic
ip nhrp network-id 50
ip ospf network broadcast
ip ospf priority 100
tunnel source FastEthernet0/0
tunnel mode gre multipoint
tunnel key 50
!
interface FastEthernet0/0
ip address 110.0.0.2 255.255.255.0
duplex half
!
interface FastEthernet2/0
ip address 10.0.0.2 255.255.255.0
duplex auto
speed auto
standby 10 ip 10.0.0.10
!
router ospf 55
log-adjacency-changes
passive-interface FastEthernet2/0
network 10.0.0.0 0.0.0.255 area 1
network 50.0.0.0 0.0.0.255 area 0
!
ip route 0.0.0.0 0.0.0.0 110.0.0.6



hostname R4
!
interface Tunnel5
ip address 5.0.0.4 255.255.255.0
no ip redirects
ip nhrp map 5.0.0.1 100.0.0.1
ip nhrp map multicast 100.0.0.1
ip nhrp network-id 5
ip nhrp nhs 5.0.0.1
ip ospf network broadcast
ip ospf 55 area 0
tunnel source FastEthernet0/0
tunnel mode gre multipoint
tunnel key 5
!
interface FastEthernet0/0
ip address 120.0.0.4 255.255.255.0
duplex half
!
interface FastEthernet2/0
ip address 20.0.0.4 255.255.255.0
duplex auto
speed auto
standby 20 ip 20.0.0.20
!
router ospf 55
log-adjacency-changes
passive-interface FastEthernet2/0
network 5.0.0.0 0.0.0.255 area 0
network 20.0.0.0 0.0.0.255 area 2
!
ip forward-protocol nd
no ip http server
no ip http secure-server
!
ip route 0.0.0.0 0.0.0.0 120.0.0.6




hostname R5
!
interface Tunnel10
ip address 50.0.0.5 255.255.255.0
no ip redirects
ip nhrp map 50.0.0.2 110.0.0.2
ip nhrp map multicast 110.0.0.2
ip nhrp network-id 50
ip nhrp nhs 50.0.0.2
ip ospf network broadcast
ip ospf 55 area 0
tunnel source FastEthernet0/0
tunnel mode gre multipoint
tunnel key 50
!
interface FastEthernet0/0
ip address 130.0.0.5 255.255.255.0
duplex half
!
interface FastEthernet2/0
ip address 20.0.0.5 255.255.255.0
duplex auto
speed auto
standby 20 ip 20.0.0.20
!
router ospf 55
log-adjacency-changes
passive-interface FastEthernet2/0
network 20.0.0.0 0.0.0.255 area 2
network 50.0.0.0 0.0.0.255 area 0
!
ip forward-protocol nd
no ip http server
no ip http secure-server
!
!
ip route 0.0.0.0 0.0.0.0 130.0.0.6

 

Screenshot (288).png

I ran the lab and successfully got 20.0.0.0/24 advertised from both routers R4 + R5 in Site2 toward the dual hubs in Site1.

Please check my config. If you have any questions, please ask.

 

 

Points I noticed in my lab:
1- In "show ip ospf", check whether the backbone area is shown as (inactive) <<- this is solved by adding "ip ospf" directly under the DMVPN tunnel
2- I broke the routing loop with passive interfaces between the hubs and between the spokes in the same site
3- As I mentioned before, use dual cloud
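
A sketch of the first point, using process 55 and Tunnel5 from the lab configs above. The include filter is only a convenience and is an assumption about how the output is best narrowed down.

```
! is the backbone area listed as inactive?
show ip ospf 55 | include Area
!
interface Tunnel5
 ! enable OSPF directly on the tunnel rather than via a network statement
 ip ospf 55 area 0
```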

qualxarnu
Level 1

Dear Community,

It's been a long time since I created this topic, but I'm writing to finally share the solution to my problem.
It looks like the configuration I implemented was OK; it turned out that the reason OSPF was not working properly was probably the license level on our core switches, which did not support the full version of OSPF.
A second reason could be EIGRP running in parallel, which I could not turn off completely; because of that, a loop prevention mechanism could have prevented some of the routes from being announced.
Anyway, we have now finished replacing all our network devices, and I also had to do the transition from EIGRP to OSPF.
The process was successful, and the difference from my previous attempts is that this time I also configured redistribution between EIGRP and OSPF on both DMVPN hub routers.
Thus this, together with the devices having a proper license, solved the issue.
I just wanted to share this with you so as not to leave this topic unclosed.
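
A minimal sketch of what mutual redistribution on a hub could look like. The actual commands were not posted: OSPF process 10 comes from the configs earlier in the thread, while EIGRP AS 100 and the seed metric values are placeholders.

```
router ospf 10
 ! EIGRP AS 100 is a placeholder
 redistribute eigrp 100 subnets
!
router eigrp 100
 ! seed metric: bandwidth (kbit/s), delay (tens of microseconds), reliability, load, MTU
 redistribute ospf 10 metric 10000 100 255 1 1500
```

With two redistribution points, route tags and route-maps are commonly used to keep routes from being fed back into the protocol they came from.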

P.S.
Thank you once again for your previous effort in trying to find the reason for my problems.


