01-07-2023 04:52 PM - last edited on 01-24-2023 09:48 PM by Translator
Disclaimer: at first I thought this was merely a question about the title, but I realised it's many questions. I'm sorry if this caused the post to be confusing. In an attempt to improve the situation, I include a short question list at the end.
I have a network setup like in the picture.
The BGP speakers are: R2, connected only to R6 (no iBGP between R2 and R3); R6, connected to R2 and R5; R5, connected to R4 (iBGP) and R6; R4, connected to R5 (iBGP) and R3; and R3, connected to R4.
Inside AS 110 I have OSPF configured, with OSPF redistribution to BGP and BGP redistribution to OSPF (command redistribute bgp 110 in both R2 and R3), metric type 2.
I'm doing a traceroute from R1 to 222.130.10.6
I get this output:
R1#traceroute 222.130.10.6
Type escape sequence to abort.
Tracing the route to 222.130.10.6
  1 222.110.20.3 24 msec
    222.110.10.2 16 msec
    222.110.20.3 16 msec
  2 10.110.130.6 40 msec
    10.110.120.4 24 msec
    10.110.130.6 28 msec
This is not exactly what I expected. I understand from inspection with Wireshark and https://community.cisco.com/t5/routing/traceroute-with-multiple-next-hops/td-p/3404248 why this output appears "out of phase", meaning, why the two paths are scrambled, but I was hoping that the complete path through R1 was shown.
In Wireshark I confirmed that no UDP traceroute packets are being sent with a TTL bigger than 2 in either interface of R1. It seems like since it gets a response from one path with just 2 hops, it doesn't try to find the path with more hops.
Is this expected behaviour?
I have also noticed that ping 222.130.10.6 only uses the path through R2, i.e., the ICMP packets only appear in Wireshark in the interface between R2 and R1. In the interface between R1 and R3 nothing appears no matter how much I ping, even though both routes are valid according to the IP forwarding table of R1.
This is the IP forwarding table of R1:
R1#show ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route
Gateway of last resort is not set
C 222.110.20.0/24 is directly connected, FastEthernet0/1
C 222.110.10.0/24 is directly connected, FastEthernet0/0
O E2 222.120.10.0/24 [110/1] via 222.110.20.3, 04:08:04, FastEthernet0/1
                     [110/1] via 222.110.10.2, 04:08:04, FastEthernet0/0
O E2 222.130.10.0/24 [110/1] via 222.110.20.3, 04:08:04, FastEthernet0/1
                     [110/1] via 222.110.10.2, 04:08:04, FastEthernet0/0
O    222.110.30.0/24 [110/20] via 222.110.20.3, 04:47:26, FastEthernet0/1
                     [110/20] via 222.110.10.2, 04:47:28, FastEthernet0/0
An analogous situation occurs with 222.120.10.4: the traceroute uses both paths but the longer one is not fully shown, and ping uses only the shortest path.
What I expected was that ping would choose a random path each time, and that traceroute would show both complete paths every time. Aren't both paths, from the point of view of R1, of equal cost (even though one goes through more ASes)?
These are the external LSAs for the network of 222.130.10.6
R1#show ip ospf database external
OSPF Router with ID (1.1.1.1) (Process ID 1)
Type-5 AS External Link States
Routing Bit Set on this LSA
LS age: 1208
Options: (No TOS-capability, DC)
LS Type: AS External Link
Link State ID: 222.130.10.0 (External Network Number )
Advertising Router: 2.2.2.2
LS Seq Number: 80000008
Checksum: 0x2E88
Length: 36
Network Mask: /24
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 1
Forward Address: 0.0.0.0
External Route Tag: 130
Routing Bit Set on this LSA
LS age: 1522
Options: (No TOS-capability, DC)
LS Type: AS External Link
Link State ID: 222.130.10.0 (External Network Number )
Advertising Router: 3.3.3.3
LS Seq Number: 80000009
Checksum: 0x5962
Length: 36
Network Mask: /24
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 1
Forward Address: 0.0.0.0
External Route Tag: 120
Question list:
1- Is this standard traceroute behaviour, i.e., if there are multiple routes of different hop count traceroute doesn't fully show the longer routes? Is there a way to make it show the full route to 222.130.10.6 through R3?
2- Can R1 distinguish that one of the paths to 222.130.10.6 is longer in terms of number of ASes? Why/why not? I think not, but my intuition says it should. I realise R1 is not a BGP speaker, so for it to be aware of the path length in ASes, when each of R2 and R3 redistribute their BGP route to 222.130.10.0/24 into OSPF they would have to somehow give some metric that corresponds to path length. Analysing the External LSAs, it doesn't seem to have any such metric. I also checked the Router and Network LSAs, which as per my prediction didn't have any information about external networks.
3- As everything seems to indicate that R1 can't distinguish which path is longer, because that unknown metric I referenced seems to not exist, the only solution would be to close the BGP overlay loop and connect R2 to R3 via iBGP?
01-07-2023 05:34 PM - edited 01-07-2023 05:49 PM
Hi @Pedro Matias ,
1- Yes, it is standard traceroute behavior. The traceroute stops as soon as the packets reach the destination via the shorter path.
2- With the current configuration, R1 has no way to know that one path has a shorter AS path than the other, because you redistribute the BGP route on both R2 and R3 with the same metric. The easiest way to let R1 know the shortest path would be to run iBGP between R2 and R3. This would cause R2 and R3 to agree on the shortest path to the destination subnet, and only the router with the eBGP path (by default) would redistribute it into OSPF.
3- Yes. There would be some other ways, but it is definitely easier to just run iBGP between R2 and R3.
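To make point 1 concrete, here is a minimal Python sketch of why the longer path is never fully listed. It is a simulation under stated assumptions, not a description of the IOS implementation: the probes are assumed to alternate strictly between the two next hops, the trace is assumed to stop once any probe at a given TTL reaches the destination, and the hop names on the longer path (`R3`, `R4`, `R5`) are placeholders inferred from the topology description.

```python
from itertools import cycle

DEST = "222.130.10.6"

# Hypothetical hop lists: the path via R2 reaches the destination at hop 2,
# while the path via R3 needs more hops. Hop names are placeholders.
PATHS = {
    "via_R2": ["R2", DEST],
    "via_R3": ["R3", "R4", "R5", DEST],
}


def probe(path, ttl):
    """Return the node that answers a probe sent with this TTL on this path."""
    hops = PATHS[path]
    # A probe with TTL n expires at hop n unless the destination comes first.
    return hops[min(ttl, len(hops)) - 1]


def traceroute(probes_per_ttl=3, max_ttl=30):
    """Simulate a traceroute whose probes alternate between the two paths."""
    next_path = cycle(["via_R3", "via_R2"])  # assumed strict alternation
    result = []
    for ttl in range(1, max_ttl + 1):
        replies = [probe(next(next_path), ttl) for _ in range(probes_per_ttl)]
        result.append((ttl, replies))
        # Simplification: stop once any probe at this TTL reached the destination.
        if DEST in replies:
            break
    return result


for ttl, replies in traceroute():
    print(ttl, replies)
```

In this model the trace ends at TTL 2, so the hops beyond R4 on the longer path are never probed, which is the same pattern as the output in the original post.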
Regards,
01-08-2023 08:29 AM
With regard to point number 2, "you redistribute the bgp route on both R2 and R3 with the same metric": am I correct in assuming that when BGP routes are redistributed into OSPF, the following facts are true?
1-for each external network BGP knows, it redistributes a single route into OSPF.
2-Every route it will redistribute will have the same metric inside OSPF, to be precise the metric will be 1. For example, R3 knows a route via BGP to 222.120.10.0/24, which is shorter than the route it knows to 222.130.10.0/24. However, both of them appear with a metric of 1 in the external LSA. I didn't find a web page for OSPFv2, but for OSPFv3 in https://www.cisco.com/c/en/us/support/docs/ip/ipv6-routing/200187-Understand-OSPFv3-AS-External-LSA-Route.html it says
"Note: If a metric is not specified, OSPFv3 puts a default value of 20 when it redistributes routes from all protocols except Border Gateway Protocol (BGP) routes, which receive a metric of 1. "
This is the reason why, as you say, "With current configuration, R1 has no way to know that one path is shorter AS path than the other, as you redistribute the bgp route on both R2 and R3 with the same metric." It is because every external LSA from BGP will have metric 1 in OSPF.
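The tie that R1 sees can be sketched in a few lines of Python. This is a simplified model, not the full OSPF route selection: it only covers two type 2 externals for the same prefix, compared first on the external metric and then on the internal cost to the advertising router. The LSA values are copied from the output above; the internal cost of 10 to each ASBR is an illustrative assumption, since the thread does not show it.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ExternalLsa:
    advertising_router: str
    metric_type: int  # 1 or 2
    metric: int       # external metric carried in the LSA
    tag: int          # carried in the LSA, not used in the comparison


# Values taken from the two Type-5 LSAs for 222.130.10.0/24 shown above.
LSAS = [
    ExternalLsa("2.2.2.2", 2, 1, 130),
    ExternalLsa("3.3.3.3", 2, 1, 120),
]

# Assumption: R1's internal OSPF cost to each ASBR is equal (value is illustrative).
COST_TO_ASBR = {"2.2.2.2": 10, "3.3.3.3": 10}


def preference_key(lsa):
    """Sort key for type 2 externals: external metric, then cost to the ASBR."""
    return (lsa.metric, COST_TO_ASBR[lsa.advertising_router])


def best_paths(lsas):
    """Return every LSA that ties for the best key (equal-cost paths)."""
    best = min(preference_key(lsa) for lsa in lsas)
    return [lsa for lsa in lsas if preference_key(lsa) == best]


winners = best_paths(LSAS)
print([lsa.advertising_router for lsa in winners])

# If R3 advertised an external metric of 2 instead, only R2 would remain.
changed = [LSAS[0], replace(LSAS[1], metric=2)]
print([lsa.advertising_router for lsa in best_paths(changed)])
```

Nothing in either LSA encodes the AS path length, so with identical metrics both routers stay in the result and R1 installs both next hops.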
01-08-2023 09:59 AM - edited 01-08-2023 10:02 AM
Hi @Pedro Matias ,
1- That is correct. Only the path selected as the best path for each destination is redistributed into OSPF.
2- That is correct. By default, all BGP routes are redistributed into OSPF as external type 2 with a metric of 1. You can change this through a route-map that sets the metric and metric type (E1 or E2) for each route.
You could actually configure a route-map that takes AS path length into consideration and assigns a metric accordingly. For example, if the AS path length is 1 then set the metric to 1, if the AS path length is 2 then set the metric to 2, etc.
Regards,
01-08-2023 10:47 AM
Thanks, I understood everything now. The part where you said "You could actually configure of a route-map that would take AS path length into consideration and assign a metric accordingly. For example, if the AS path length is 1 then set metric to 1, if AS path length is 2 then set the metric to 2, etc." is exactly the type of concept I was referring to when I said
" I realise R1 is not a BGP speaker, so for it to be aware of the path length in ASes, when each of R2 and R3 redistribute their BGP route to 222.130.10.0/24 into OSPF they would have to somehow give some metric that corresponds to path length."
I'm still a beginner so I don't know anything about route maps or any other implementation/practical details like that, but I'm glad to see that a real world concept matches my intuition.
01-08-2023 11:10 AM - last edited on 01-24-2023 09:50 PM by Translator
Hi @Pedro Matias ,
BTW, the logic on R2 and R3 that would reflect the AS path length in the E2 metric would look something like this.
router ospf xxx
 redistribute bgp xxx subnets route-map bgp2ospf
!
! Sequences are evaluated in order and the first match wins,
! so the shortest AS paths are tested first.
route-map bgp2ospf permit 10
 match as-path 1
 set metric 1
route-map bgp2ospf permit 20
 match as-path 2
 set metric 2
route-map bgp2ospf permit 30
 match as-path 3
 set metric 3
route-map bgp2ospf permit 40
 match as-path 4
 set metric 4
route-map bgp2ospf permit 50
 match as-path 5
 set metric 5
!
! One as-path access-list per AS path length (1 to 5 AS numbers).
ip as-path access-list 1 permit ^[0-9]*$
ip as-path access-list 2 permit ^[0-9]*_[0-9]*$
ip as-path access-list 3 permit ^[0-9]*_[0-9]*_[0-9]*$
ip as-path access-list 4 permit ^[0-9]*_[0-9]*_[0-9]*_[0-9]*$
ip as-path access-list 5 permit ^[0-9]*_[0-9]*_[0-9]*_[0-9]*_[0-9]*$
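
As a rough way to check how the expressions above classify a given AS path, the following Python sketch walks them in route-map order and returns the metric of the first match. It is only an approximation: Python's `re` module is not the IOS regex engine, and the IOS `_` metacharacter is emulated here as "start of string, end of string, or one delimiter character". The sample AS paths `130` and `120 130` are assumptions inferred from the topology described in the thread, and `metric_for` is a hypothetical helper name.

```python
import re

# Approximation of the IOS "_" metacharacter: start or end of the string,
# or one delimiter character (space, comma, braces, parentheses).
IOS_UNDERSCORE = r"(?:^|$|[ ,{}()])"

# The five expressions from the as-path access-lists, keyed by the metric
# that the corresponding route-map sequence sets.
IOS_PATTERNS = {
    1: "^[0-9]*$",
    2: "^[0-9]*_[0-9]*$",
    3: "^[0-9]*_[0-9]*_[0-9]*$",
    4: "^[0-9]*_[0-9]*_[0-9]*_[0-9]*$",
    5: "^[0-9]*_[0-9]*_[0-9]*_[0-9]*_[0-9]*$",
}

COMPILED = [
    (metric, re.compile(pattern.replace("_", IOS_UNDERSCORE)))
    for metric, pattern in IOS_PATTERNS.items()
]


def metric_for(as_path):
    """Return the metric of the first sequence whose expression matches."""
    for metric, pattern in COMPILED:
        if pattern.search(as_path):
            return metric
    return None  # nothing matched: the route-map's implicit deny applies


for path in ["130", "120 130", "1 2 3 4 5 6"]:
    print(repr(path), "->", metric_for(path))
```

Under this approximation the expression for length 2 also matches a single AS number, because `_` can match the end of the string, so the result depends on the sequences being evaluated in ascending order. With the assumed paths, R2 would set metric 1 and R3 metric 2, and R1 would then prefer the route via R2.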
Regards,
01-08-2023 11:21 AM
This is exactly what I wanted to clarify here.
thanks @Harold Ritter
01-07-2023 05:51 PM
Traceroute and ping, by default, select an egress interface at each L3 hop using the same "rules" as any other packet. I.e. "worse" (e.g. longer) paths would, generally, not be used.
I haven't studied all your details, but if you do a show route for the destination at any specific L3 hop, is one of the indicated egress interfaces for that destination not being used?
BTW, you can cause traceroute or ping (actually any IP packet) to follow the path you desire if source routing is used and if honored on your L3 devices.
01-08-2023 06:05 AM
I ran your lab and I don't see two paths, only one.
Can I see show ip route on R6?
On R1 there is only one path for each subnet; I think there is something wrong.
01-08-2023 07:31 AM - last edited on 01-24-2023 09:53 PM by Translator
There must be two paths. This is a training lab I'm doing and R1 should have two paths for each external subnet. The reason is that R2 and R3 don't have an iBGP connection. As such, each will announce into OSPF the path it learns via eBGP to each external network, for a total of two routes (one via R2 and one via R3) to each external network in R1.
To answer your question, this is the output:
R6#sh ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route
Gateway of last resort is not set
B 222.120.10.0/24 [20/0] via 10.120.130.5, 00:01:15
C 222.130.10.0/24 is directly connected, FastEthernet0/0
10.0.0.0/24 is subnetted, 2 subnets
C 10.110.130.0 is directly connected, FastEthernet1/0
C 10.120.130.0 is directly connected, FastEthernet0/1
B 222.110.0.0/16 [20/0] via 10.110.130.2, 00:01:18
01-08-2023 08:19 AM - last edited on 01-24-2023 09:53 PM by Translator
Friend, as I mentioned before, there is something wrong.
On R2 and R3, if you run
show ip ospf database external
you will see the prefix with the same originating router;
some work is needed to prevent that.
good luck