09-23-2011 02:09 AM - edited 03-04-2019 01:42 PM
Hi all,
Back with another question / issue on my OSPF routing.
I have 2 ASBR routers, AGFR01RTR03 and AGFR02RTR03, performing OSPF-to-OSPF redistribution in both directions between the same two ASs.
They also do summarization for our private addressing scheme. It is all working just fine for that part (neighbors, summarization, redistribution).
AGDC01RTR01 --- AGDC02RTR01 (OSPF 1000 ABRs)
     |               |
     |               |
AGFR01RTR03 --- AGFR02RTR03 (OSPF 1000 / 53 ASBRs)
Let's focus on AGDC01RTR01 with a specific entry here (IP subnet is fake) :
Routing entry for 1.1.1.0/25
Known via "ospf 1000", distance 110, metric 300, type inter area
Last update from 10.2.244.76 on GigabitEthernet5/1, 1d03h ago
Routing Descriptor Blocks:
* 10.2.244.76, from 10.2.1.249, 1d03h ago, via GigabitEthernet5/1
Route metric is 300, traffic share count is 1
Compare with AGDC02RTR01 :
Routing entry for 1.1.1.0/25
Known via "ospf 1000", distance 110, metric 400, type inter area
Last update from 10.2.244.121 on GigabitEthernet5/2, 1d03h ago
Routing Descriptor Blocks:
* 10.2.244.121, from 10.2.1.249, 1d03h ago, via GigabitEthernet5/2
Route metric is 400, traffic share count is 1
I would expect AGFR01RTR03 (in full state on its peering with AGDC01RTR01) to learn that network from OSPF 1000, but no :
Routing entry for 1.1.1.0/25
Known via "ospf 53", distance 110, metric 23000
Tag 1000, type extern 2, forward metric 45
Redistributing via ospf 1000
Last update from 10.5.1.9 on GigabitEthernet1/2, 3d00h ago
Routing Descriptor Blocks:
* 10.5.1.9, from 10.5.0.134, 3d00h ago, via GigabitEthernet1/2
Route metric is 23000, traffic share count is 1
Route tag 1000
The route comes from OSPF 1000 being redistributed into OSPF 53 on AGFR02RTR03 (which tagged the route + adjusted metric to 23K).
Here are the OSPF configuration and route maps on AGFR01RTR03. Note that the config is nearly identical on AGFR02RTR03 (only the metric is set to 23K instead of 18K).
router ospf 1000
 log-adjacency-changes
 nsf
 summary-address 10.4.0.0 255.254.0.0 tag 33
 summary-address 172.18.0.0 255.255.0.0 tag 33
 redistribute ospf 53 metric 18000 subnets route-map EU_US
 network 10.2.244.80 0.0.0.3 area 0
!
router ospf 53
 log-adjacency-changes
 auto-cost reference-bandwidth 10000
 nsf
 network 10.5.1.8 0.0.0.7 area 0
 summary-address 10.2.0.0 255.254.0.0 tag 1000
 summary-address 172.16.0.0 255.255.0.0 tag 1000
 redistribute ospf 1000 metric 18000 subnets route-map US_EU
!
route-map US_EU deny 10
 match tag 33
!
route-map US_EU permit 20
 set tag 1000
!
route-map EU_US deny 10
 match tag 1000
!
route-map EU_US permit 20
 set tag 33
Would you have any recommendation on how I could possibly debug this issue ? I'm a little confused on what to verify.
Tom
09-23-2011 08:29 AM
Opening TAC request.
09-23-2011 08:40 AM
Hello Tom,
Opening a TAC case is not necessary yet, I believe. What you are experiencing is normal.
Running two OSPF processes and redistributing between them at several points (multipoint bidirectional redistribution) is complicated. What you are seeing here is merely a race condition: FR01 simply learnt about the offending network via OSPF process 53 sooner than via OSPF process 1000. Both processes offer the route with the same administrative distance (110), so whichever installs it first keeps it.
I suggest you first read the following document very, very carefully - it explains the common caveats when running multiple OSPF processes and redistributing between them.
http://www.cisco.com/en/US/tech/tk365/technologies_tech_note09186a00801069aa.shtml
Please feel welcome to ask further after reading that document!
Best regards,
Peter
09-23-2011 09:35 AM
Hi Peter,
You're right once again. I modified the administrative distance in the "edge" AS to prevent external routes from being eligible when inter/intra area routes for the same destination exist.
I can close my TAC request and thank you once again for the help and knowledge
Tom
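For readers following along, a change of the kind Tom describes could look roughly like the sketch below. It is an assumption rather than his actual configuration: the process number is a guess at which process he means by the "edge" AS, and the value 115 is taken from the routing-table output that appears later in the thread. The idea is to leave intra-area and inter-area routes at the default distance of 110 and raise only external routes.

```
! Assumed process and value: external routes become less preferred than the default 110
router ospf 1000
 distance ospf external 115
```

The distance is local to each OSPF process, so a route from the other process is preferred only if that process still offers it at a lower distance.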
09-27-2011 03:07 AM
Hi all,
I have an update with an issue that doesn't make sense to me :
As shown in the original post, AGFR01RTR03 has two OSPF processes. One is 53 where the router is an ABR with several totally stubby and two NSSA areas. Focusing on one of my NSSAs, the next hop router inside that area has an NSSA E1 route.
agde04rtr03#show ip route 192.168.74.64
Routing entry for 192.168.74.64/26
Known via "ospf 2000", distance 110, metric 464, type NSSA extern 1
Last update from 10.5.129.3 on GigabitEthernet0/0, 2w0d ago
Routing Descriptor Blocks:
* 10.5.129.3, from 10.5.129.3, 2w0d ago, via GigabitEthernet0/0
Route metric is 464, traffic share count is 1
The peering with agfr01rtr03 is in full state. Since external routes received from the other OSPF process are now set at a higher administrative distance, that router should now prefer my NSSA E1 route as long as it exists, right? Well, this is not the case:
agfr01rtr03#show ip route 192.168.74.64
Routing entry for 192.168.74.64/26
Known via "ospf 1000", distance 115, metric 23000
Tag 33, type extern 2, forward metric 304
Redistributing via ospf 53
Last update from 10.2.244.81 on Serial3/0/0, 00:01:29 ago
Routing Descriptor Blocks:
* 10.2.244.81, from 10.2.244.86, 00:01:29 ago, via Serial3/0/0
Route metric is 23000, traffic share count is 1
Route tag 33
I don't understand why the admin distance is not used as tie breaker in this scenario. If someone can explain that...
Thanks,
Tom
09-27-2011 09:13 AM
Hello Tom,
I hope you do not find it annoying that I am still occupying your threads here...
In order to understand your situation better, I need to see the following output from FR01:
show ip ospf database external 192.168.74.64
show ip ospf database 192.168.74.64
These outputs will produce one or more LSA-5 and LSA-7. Please be sure to also check the Advertising Router ID in all of these LSAs and please include information who these Router IDs are (which routers they correspond to).
Thank you!
Best regards,
Peter
09-28-2011 12:18 AM
Hi Peter,
You're very welcome to camp in my threads as long as you like
#show ip ospf database external 192.168.74.64
OSPF Router with ID (10.5.5.101) (Process ID 2000)
OSPF Router with ID (10.5.0.133) (Process ID 53)
Type-5 AS External Link States
Routing Bit Set on this LSA in topology Base with MTID 0
LS age: 556
Options: (No TOS-capability, DC)
LS Type: AS External Link
Link State ID: 192.168.74.64 (External Network Number )
Advertising Router: agfr02rtr03.archon.net
LS Seq Number: 80000027
Checksum: 0x7058
Length: 36
Network Mask: /26
Metric Type: 1 (Comparable directly to link state metric)
MTID: 0
Metric: 5555
Forward Address: 10.177.98.21
External Route Tag: 0
OSPF Router with ID (10.2.244.82) (Process ID 1000)
Type-5 AS External Link States
Routing Bit Set on this LSA in topology Base with MTID 0
LS age: 911
Options: (No TOS-capability, DC)
LS Type: AS External Link
Link State ID: 192.168.74.64 (External Network Number )
Advertising Router: agfr02rtr03-vzb-s3-0-0.archon.net
LS Seq Number: 80000F55
Checksum: 0x36BB
Length: 36
Network Mask: /26
Metric Type: 2 (Larger than any link state path)
MTID: 0
Metric: 23000
Forward Address: 0.0.0.0
External Route Tag: 33
The command "show ip ospf database 192.168.74.64" is not accepted. Do you need "show ip ospf database network 192.168.74.64"?
Tom
09-28-2011 01:48 AM
Hello Tom,
I apologize - the command that did not work was supposed to say:
show ip ospf database nssa 192.168.74.64
Can you please post the output of that command now? Thank you!
Best regards,
Peter
09-28-2011 01:55 AM
No problem Peter.
Here it is :
# show ip ospf database nssa 192.168.74.64
OSPF Router with ID (10.5.5.101) (Process ID 2000)
OSPF Router with ID (10.5.0.133) (Process ID 53)
Type-7 AS External Link States (Area 4)
Routing Bit Set on this LSA in topology Base with MTID 0
LS age: 379
Options: (No TOS-capability, Type 7/5 translation, No DC)
LS Type: AS External Link
Link State ID: 192.168.74.64 (External Network Number )
Advertising Router: agde04b2bfw01.archon.net
LS Seq Number: 80002413
Checksum: 0x538A
Length: 36
Network Mask: /26
Metric Type: 1 (Comparable directly to link state metric)
MTID: 0
Metric: 444
Forward Address: 10.177.98.17
External Route Tag: 0
Type-7 AS External Link States (Area 6)
Routing Bit Set on this LSA in topology Base with MTID 0
LS age: 681
Options: (No TOS-capability, Type 7/5 translation, No DC)
LS Type: AS External Link
Link State ID: 192.168.74.64 (External Network Number )
Advertising Router: agde06fw01.archon.net
LS Seq Number: 80001479
Checksum: 0x1084
Length: 36
Network Mask: /26
Metric Type: 1 (Comparable directly to link state metric)
MTID: 0
Metric: 5555
Forward Address: 10.177.98.21
External Route Tag: 0
09-28-2011 02:31 AM
Tom,
Thank you. Which of these names displayed in the Advertising Router fields corresponds to the Router ID 10.2.244.86 please?
Best regards,
Peter
09-28-2011 02:57 AM
Peter,
That is the AGFR02RTR03 ID on the remote AS side (process 1000 where my local routes are redistributed)
AGDC01RTR01 --- AGDC02RTR01 (OSPF 1000 ABRs)
| |
| |
AGFR01RTR03 --- AGFR02RTR03 (OSPF 1000 / 53 ASBRs)
The AGFR01RTR03 router receives the redistributed route on the ospf 1000 process without redistributing it into 53 due to route-map blocking any tag 33 routes.
Hope this helps.
Tom
09-29-2011 02:24 AM
Thomas,
I am having a hard time wrapping my head around your topology, as what I am seeing still is just a couple of routers without really knowing where the individual areas are, what processes they are assigned to, where the routes are originated or redistributed, etc. A more illustrative exhibit of your topology including depiction of processes and areas would be helpful.
So far, what I am able to understand from the outputs:
1) FR01 has 4 LSAs regarding the same network:
In process 1000: one LSA:
LSA-5 generated by agfr02rtr03-vzb-s3-0-0.archon.net, tagged with tag 33 due to redistribution
In process 53: three LSAs:
LSA-5 generated by agfr02rtr03.archon.net, not tagged
LSA-7 generated by agde04b2bfw01.archon.net in area 4, not tagged
LSA-7 generated by agde06fw01.archon.net in area 6, not tagged
I assume that none of these router IDs corresponds to the FR01 itself, as if it was FR01 itself, it would in a way explain this issue.
2) For some reason, the process 1000 wins when installing the route although its AD is higher than default OSPF AD, namely, 115. If the OSPF process 53 selected its own version of the best candidate, it should have offered it to the routing table with the AD of 110 and it should have won. Depending on the version of NSSA implementation, FR01 would either prefer the N1 or E1 route (assuming it is not performing the LSA-7/LSA-5 translation itself).
I wonder whether this state is not simply a metastable remainder of a change in your network. I would personally suggest using clear ip route 192.168.74.64 255.255.255.192 to remove the route from the routing table on FR01 and let the OSPF processes compete again when installing the route. Even better, I would also suggest enabling debug ip routing before the clear ip route command - the debug should show us all changes made to the routing table, which may yield some additional information about what is happening.
Would you mind performing those tests? As they may result in short-lived connectivity issues with the corresponding network, I suggest using a quieter period when the intermittent connectivity won't be too harmful.
Best regards,
Peter
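The test sequence suggested above might be run as follows. This is only a sketch: the commands are standard IOS exec commands, but the order and the final cleanup step are assumptions about how one would carry it out, not something taken from the thread.

```
! Log every change to the routing table (can be very verbose on a busy router)
debug ip routing
! Remove the route so both OSPF processes compete to install it again
clear ip route 192.168.74.64 255.255.255.192
! Check which process won
show ip route 192.168.74.64
! Switch debugging off again when done
undebug all
```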
09-30-2011 08:28 AM
Hi Peter,
I just finished an updated diagram of my OSPF area layout. I hope it will clear things up a little.
I completely agree with all your remarks and wonder why AD is not becoming the tie breaker.
To answer your question, here is the output :
agfr01rtr03#clear ip route 192.168.74.64
agfr01rtr03#
033952: Sep 30 15:26:43 UTC: RT: del 192.168.74.64 via 10.2.244.81, ospf metric [115/23000]
033953: Sep 30 15:26:43 UTC: RT: delete subnet route to 192.168.74.64/26
033954: Sep 30 15:26:43 UTC: RT(multicast): delete subnet route to 192.168.74.64/26
033955: Sep 30 15:26:43 UTC: RT: updating ospf 192.168.74.64/26 (0x0) via 10.2.244.81 Se3/0/0
033956: Sep 30 15:26:43 UTC: RT: add 192.168.74.64/26 via 10.2.244.81, ospf metric [115/23000]
033957: Sep 30 15:26:43 UTC: RT: updating ospf 192.168.74.64/26 (0x0) via 10.5.4.26 Tu141
033958: Sep 30 15:26:43 UTC: RT: rib update return code: 17
033959: Sep 30 15:26:43 UTC: RT: updating ospf 192.168.74.64/26 (0x0) via 10.5.4.26 Tu141
033960: Sep 30 15:26:43 UTC: RT: rib update return code: 17
033961: Sep 30 15:26:43 UTC: RT: updating ospf 192.168.74.64/26 (0x0) via 10.5.4.46 Tu161
033962: Sep 30 15:26:43 UTC: RT: rib update return code: 17
033963: Sep 30 15:26:43 UTC: RT(multicast): network 192.168.74.0/24 is now subnetted
033964: Sep 30 15:26:43 UTC: RT(multicast): network 192.168.74.0 is now variably masked
033965: Sep 30 15:26:43 UTC: Replicated ndb 192.168.74.64/26 in table 0x8000 created
033966: Sep 30 15:26:43 UTC: Replicated ndb 457FDB24/4888E15C refcnt 1
agfr01rtr03#show ip route 192.168.74.64
Routing entry for 192.168.74.64/26
Known via "ospf 1000", distance 115, metric 23000
Tag 33, type extern 2, forward metric 304
Redistributing via ospf 53
Last update from 10.2.244.81 on Serial3/0/0, 00:00:20 ago
Routing Descriptor Blocks:
* 10.2.244.81, from 10.2.244.86, 00:00:20 ago, via Serial3/0/0
Route metric is 23000, traffic share count is 1
Route tag 33
Hope this helps,
Tom
09-30-2011 03:17 PM
Tom,
So far, I am confused by the results of our experiment. None of what we are seeing makes sense to me. Would you mind performing another test?
The point of that experiment is to prohibit the OSPF process 1000 from installing the route 192.168.74.64/26 into the routing table, thereby allowing the OSPF process 53 to offer its own candidate - if it has any.
The configuration would be performed on FR01 as follows:
ip prefix-list Experiment deny 192.168.74.64/26
ip prefix-list Experiment permit 0.0.0.0/0 le 32
!
router ospf 1000
 distribute-list prefix Experiment in
After a couple of seconds, have a look at the routing table entry for the network 192.168.74.64/26. I would be interested in seeing whether a replacement route is installed into the routing table and where it points. Having debug ip routing enabled should be illustrative as well.
Please note that this experiment may very well result in temporary unreachability of the network 192.168.74.64/26, and should therefore be performed only in times of low volume.
Best regards,
Peter
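Once the test is finished, the filter can be backed out again. A minimal sketch, assuming the prefix list and distribute-list were applied exactly as shown above:

```
router ospf 1000
 no distribute-list prefix Experiment in
!
no ip prefix-list Experiment
```

Because an inbound distribute-list in OSPF only keeps routes out of the routing table and leaves the link-state database untouched, removing it should restore the original state.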
10-04-2011 02:29 AM
Hi Peter,
I found out that the administrative distance for OSPF external routes was set to 115 on process 53 as well. With both processes offering the route at the same distance, there was no tie breaker and it was still "first installed wins". I removed the setting on ospf 53, and now:
agfr01rtr03#show ip route 192.168.74.64
Routing entry for 192.168.74.64/26
Known via "ospf 53", distance 110, metric 797, type extern 1
Redistributing via ospf 1000
Advertised by ospf 1000 metric 18000 subnets route-map EU_US
Last update from 10.5.4.26 on Tunnel141, 00:06:13 ago
Routing Descriptor Blocks:
* 10.5.4.26, from 10.5.0.134, 00:06:13 ago, via Tunnel141
Route metric is 797, traffic share count is 1
Not sure if I did that by mistake at some point. I'm very sorry to have misled you. Thanks very much again for the help. Each of your posts gives me some tools to help myself next time. Hope you don't mind.
PS: Before that, I tried applying your prefix-list. It did the job but produced huge debug output as hundreds of routes were deleted from the tables.
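The fix Tom describes amounts to returning external routes in process 53 to the default distance, while process 1000 keeps the raised value. A sketch under that assumption (the exact commands he entered are not shown in the thread, and the `no` form used here is a guess at how the setting was removed):

```
! Process 53: back to the default distance of 110 for external routes
router ospf 53
 no distance ospf external 115
!
! Process 1000: external routes remain at 115 (assumed unchanged)
router ospf 1000
 distance ospf external 115
```

With this in place, the external route from process 53 (distance 110) beats the one from process 1000 (distance 115), which matches the output above.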