
Traceroute behaviour with multiple routes of different length

Pedro Matias
Level 1

Disclaimer: at first I thought this was merely a question about the title, but I realised it's many questions. I'm sorry if this caused the post to be confusing. In an attempt to improve the situation, I include a short question list at the end.

 

I have a network setup like in the picture.

The BGP speakers are: R2, connected only to R6 (no i-BGP between R2 and R3); R6, connected to R2 and R5; R5, connected to R4 (i-BGP) and R6; R4, connected to R5 (i-BGP) and R3; and R3, connected to R4.

Inside AS 110 I have OSPF configured, with OSPF redistribution to BGP and BGP redistribution to OSPF (command redistribute bgp 110 in both R2 and R3), metric type 2.

I'm doing a traceroute from R1 to 222.130.10.6

I get this output:

R1#traceroute 222.130.10.6

Type escape sequence to abort.
Tracing the route to 222.130.10.6

1 222.110.20.3 24 msec
222.110.10.2 16 msec
222.110.20.3 16 msec
2 10.110.130.6 40 msec
10.110.120.4 24 msec
10.110.130.6 28 msec

This is not exactly what I expected. I understand from inspection with Wireshark and https://community.cisco.com/t5/routing/traceroute-with-multiple-next-hops/td-p/3404248 why this output appears "out of phase", meaning, why the two paths are scrambled, but I was hoping that the complete path through R1 was shown.

In Wireshark I confirmed that no UDP traceroute packets are being sent with a TTL bigger than 2 in either interface of R1. It seems like since it gets a response from one path with just 2 hops, it doesn't try to find the path with more hops.

Is this expected behaviour?
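
The observation above can be reproduced with a small model. This is only a sketch under stated assumptions, not how IOS is implemented: it assumes three probes per TTL, strict per-probe alternation between the two next hops, and that the trace ends after the first TTL at which any probe reaches the destination. The hop labels for R5 and for R6 on the longer path are placeholders, since their addresses are not in the post.

```python
from itertools import cycle

# Hop lists as seen from R1. The first two entries of each come from the
# traceroute output in the post; the last two of the R3 path are placeholders.
PATH_VIA_R2 = ["222.110.10.2", "10.110.130.6"]  # the second hop is the destination
PATH_VIA_R3 = ["222.110.20.3", "10.110.120.4",
               "R5 (address not in the post)", "R6 (destination)"]


def simulate_traceroute(paths, probes_per_ttl=3, max_ttl=30):
    """Model a traceroute whose probes alternate between equal-cost paths.

    Assumption: the trace stops after the first TTL at which at least one
    probe reaches the destination (the last hop of its path).
    """
    chooser = cycle(paths)  # per-probe round robin over the paths
    rows = []
    for ttl in range(1, max_ttl + 1):
        replies = []
        reached = False
        for _ in range(probes_per_ttl):
            path = next(chooser)
            # The probe expires at hop `ttl`, or is answered by the destination.
            replies.append(path[min(ttl, len(path)) - 1])
            if ttl >= len(path):
                reached = True
        rows.append((ttl, replies))
        if reached:
            break
    return rows


rows = simulate_traceroute([PATH_VIA_R3, PATH_VIA_R2])
for ttl, replies in rows:
    print(ttl, *replies)
# 1 222.110.20.3 222.110.10.2 222.110.20.3
# 2 10.110.130.6 10.110.120.4 10.110.130.6
```

Under these assumptions the model stops at TTL 2, which matches the Wireshark observation that no probe with a larger TTL is ever sent.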

I have also noticed that ping 222.130.10.6 only uses the path through R2, i.e., the ICMP packets only appear in Wireshark in the interface between R2 and R1. In the interface between R1 and R3 nothing appears no matter how much I ping, even though both routes are valid according to the IP forwarding table of R1.

This is the IP forwarding table of R1:

R1#show ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
E1 - OSPF external type 1, E2 - OSPF external type 2
i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
ia - IS-IS inter area, * - candidate default, U - per-user static route
o - ODR, P - periodic downloaded static route

 

Gateway of last resort is not set

 

C 222.110.20.0/24 is directly connected, FastEthernet0/1
C 222.110.10.0/24 is directly connected, FastEthernet0/0
O E2 222.120.10.0/24 [110/1] via 222.110.20.3, 04:08:04, FastEthernet0/1
[110/1] via 222.110.10.2, 04:08:04, FastEthernet0/0
O E2 222.130.10.0/24 [110/1] via 222.110.20.3, 04:08:04, FastEthernet0/1
[110/1] via 222.110.10.2, 04:08:04, FastEthernet0/0
O 222.110.30.0/24 [110/20] via 222.110.20.3, 04:47:26, FastEthernet0/1
[110/20] via 222.110.10.2, 04:47:28, FastEthernet0/0

 

An analogous situation happens with 222.120.10.4: the traceroute uses both paths but the longer one is not fully shown, and ping chooses only the shortest path.

What I was expecting was that ping would choose a random path each time, and that traceroute would show both complete paths every time. Aren't both paths, from the point of view of R1, of equal cost (even though one goes through more ASes)?

These are the external LSAs for the network of 222.130.10.6

R1#show ip ospf database external

OSPF Router with ID (1.1.1.1) (Process ID 1)

Type-5 AS External Link States

Routing Bit Set on this LSA
LS age: 1208
Options: (No TOS-capability, DC)
LS Type: AS External Link
Link State ID: 222.130.10.0 (External Network Number )
Advertising Router: 2.2.2.2
LS Seq Number: 80000008
Checksum: 0x2E88
Length: 36
Network Mask: /24
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 1
Forward Address: 0.0.0.0
External Route Tag: 130

Routing Bit Set on this LSA
LS age: 1522
Options: (No TOS-capability, DC)
LS Type: AS External Link
Link State ID: 222.130.10.0 (External Network Number )
Advertising Router: 3.3.3.3
LS Seq Number: 80000009
Checksum: 0x5962
Length: 36
Network Mask: /24
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 1
Forward Address: 0.0.0.0
External Route Tag: 120


Question list:

1- Is this standard traceroute behaviour, i.e., if there are multiple routes of different hop count traceroute doesn't fully show the longer routes? Is there a way to make it show the full route to 222.130.10.6 through R3?

2- Can R1 distinguish that one of the paths to 222.130.10.6 is longer in terms of number of ASes? Why/why not? I think not, but my intuition says it should. I realise R1 is not a BGP speaker, so for it to be aware of the path length in ASes, when each of R2 and R3 redistribute their BGP route to 222.130.10.0/24 into OSPF they would have to somehow give some metric that corresponds to path length. Analysing the External LSAs, it doesn't seem to have any such metric. I also checked the Router and Network LSAs, which as per my prediction didn't have any information about external networks.

3- As everything seems to indicate that R1 can't distinguish which path is longer, because that unknown metric I referenced seems to not exist, the only solution would be to close the BGP overlay loop and connect R2 to R3 via iBGP?
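
To make question 2 concrete, here is a rough sketch of how an OSPF router compares two type 2 external routes for the same prefix: first by the external metric carried in the LSA, then by its own cost to reach the advertising ASBR (the forward address is 0.0.0.0 in both LSAs). Nothing in that comparison carries the BGP AS path. The `cost_to_asbr` value of 10 is an assumption, as the post does not show interface costs, and the class and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class E2Route:
    advertising_router: str
    external_metric: int  # "Metric" field of the Type-5 LSA
    cost_to_asbr: int     # R1's intra-AS cost to the ASBR (ASSUMED value)
    route_tag: int        # carried in the LSA but not used for path selection


def best_e2_routes(routes):
    """Return every route that ties for best on (external metric, cost to ASBR)."""
    def key(route):
        return (route.external_metric, route.cost_to_asbr)

    best = min(key(route) for route in routes)
    return [route for route in routes if key(route) == best]


# Values from the two LSAs in the post, except cost_to_asbr, which is assumed.
lsas = [
    E2Route("2.2.2.2", external_metric=1, cost_to_asbr=10, route_tag=130),
    E2Route("3.3.3.3", external_metric=1, cost_to_asbr=10, route_tag=120),
]

winners = best_e2_routes(lsas)
print([route.advertising_router for route in winners])
# ['2.2.2.2', '3.3.3.3']
```

With equal external metrics and equal cost to both ASBRs, both routes tie, which is consistent with R1 installing two next hops for 222.130.10.0/24.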

24 Replies

I know this is a lab. The issue I am focusing on is why R1 shows only one path when you redistribute on both R2 and R3.

I found a solution: use metric 1 when redistributing BGP into OSPF, then check R1.

Then check ping and traceroute again.

Hi @MHM Cisco World ,

This is not the issue. R1 receives both E2 routes and both have a metric of 1, which is the default behavior.

R1#show ip route
...

O E2 222.130.10.0/24 [110/1] via 222.110.20.3, 04:08:04, FastEthernet0/1 <+++++++ via R3
                                    [110/1] via 222.110.10.2, 04:08:04, FastEthernet0/0 <+++++++ via R2

Regards,
Harold Ritter, CCIE #4168 (EI, SP)

Yes, the output you shared is from your lab, and it is exactly what I got after modifying the metric.
Why is this the issue? In his original post he asks why one path appears and the other is missing, so we must first find where the second path disappears. I found that it disappears on R2 and R3: show ip route on R2 and R3 shows the path through the eBGP peer, which is OK, but show ip ospf database external shows that the origin is the same, which is not normal.
So the solution I found was to modify the metric so that R1 sees both paths toward R6.

I hope I am right and my solution is correct.

Hi @MHM Cisco World ,

> I check show ip route in R2 and R3 and I see the path through eBGP peer which is OK, I do show ip ospf data external and I see that the origin is same!! that not normal.

This might be a problem that you encountered in your lab, but the OP had both routes with metric 1, as per his original post:

 

R1#show ip ospf database external

OSPF Router with ID (1.1.1.1) (Process ID 1)

Type-5 AS External Link States

Routing Bit Set on this LSA
LS age: 1208
Options: (No TOS-capability, DC)
LS Type: AS External Link
Link State ID: 222.130.10.0 (External Network Number )
Advertising Router: 2.2.2.2
LS Seq Number: 80000008
Checksum: 0x2E88
Length: 36
Network Mask: /24
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 1
Forward Address: 0.0.0.0
External Route Tag: 130

Routing Bit Set on this LSA
LS age: 1522
Options: (No TOS-capability, DC)
LS Type: AS External Link
Link State ID: 222.130.10.0 (External Network Number )
Advertising Router: 3.3.3.3
LS Seq Number: 80000009
Checksum: 0x5962
Length: 36
Network Mask: /24
Metric Type: 2 (Larger than any link state path)
TOS: 0
Metric: 1
Forward Address: 0.0.0.0
External Route Tag: 120

Regards,
Harold Ritter, CCIE #4168 (EI, SP)

I will share my lab. @Harold Ritter, can you take a look?

Hi @MHM Cisco World ,

There are two:

C 222.110.20.0/24 is directly connected, FastEthernet0/1
C 222.110.10.0/24 is directly connected, FastEthernet0/0
O E2 222.120.10.0/24 [110/1] via 222.110.20.3, 04:08:04, FastEthernet0/1
[110/1] via 222.110.10.2, 04:08:04, FastEthernet0/0
O E2 222.130.10.0/24 [110/1] via 222.110.20.3, 04:08:04, FastEthernet0/1 <<- one path only. 
[110/1] via 222.110.10.2, 04:08:04, FastEthernet0/0
O 222.110.30.0/24 [110/20] via 222.110.20.3, 04:47:26, FastEthernet0/1
[110/20] via 222.110.10.2, 04:47:28, FastEthernet0/0

 

Regards,
Harold Ritter, CCIE #4168 (EI, SP)

This is my lab. When a router works and then suddenly stops working, I know something is wrong.
On R2 and R3 we should see 4.4.4.4 (the loopback of R4 in AS130) learned from the directly connected eBGP peer.
But R3 behaved strangely: it showed two entries for 4.4.4.4. This drove me crazy. I was running the lab on my laptop and its CPU reached 100%, so I waited until I could run the same lab on my PC.
Back to the issue: why are there two entries for 4.4.4.4 when it should be learned only from the direct eBGP peer? The next hop of the second entry gave me a hint about what was going on.
The second entry shows R2 as the next hop, which made me think there was a routing loop (not a data loop).
Where does this loop come from?
R4 advertises 4.4.4.4 to R2 via eBGP and R2 redistributes it into OSPF, but then R3 redistributes OSPF into BGP. This last redistribution includes external routes, and 4.4.4.4 is external in this case, so it is re-redistributed into BGP. That is where the routing loop comes from.
Since the path R4-R2-R3 is shorter than R4-R6-R5-R3, R3 selects the path via R2 rather than the direct eBGP path (you can see the best-path mark in show ip bgp on R3).

OMG, so I played cat and mouse trying to break this loop.

I went back to my config: under bgp 110 I have redistribute ospf 10 match internal external <<- @Pedro Matias wrote a comment in another post about why we need this.
I need external because I configured a loopback and need to redistribute it into BGP, so how can I solve this?
The solution was to use a route-map when redistributing OSPF into BGP. This route-map denies 4.4.4.4 from being redistributed into BGP and permits everything else.
Done: it works and the loop is gone.
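
The filtering logic described above can be modelled in a few lines. This is only an illustration of the "deny one prefix, permit the rest" idea with first-match-wins ordering and an implicit deny at the end, not actual IOS configuration. The /32 mask for 4.4.4.4 and the second prefix in the example are assumptions.

```python
from ipaddress import ip_network

# Ordered entries: (action, prefix to match). A prefix of None matches anything.
# The /32 mask for the loopback is an assumption; the post only says "4.4.4.4".
ROUTE_MAP = [
    ("deny", ip_network("4.4.4.4/32")),
    ("permit", None),
]


def redistribute(prefixes, route_map):
    """Return the prefixes that the route-map model lets through."""
    allowed = []
    for prefix in prefixes:
        network = ip_network(prefix)
        for action, match in route_map:
            if match is None or network == match:
                if action == "permit":
                    allowed.append(prefix)
                break  # the first matching entry decides
        # No entry matched: implicit deny, so nothing is appended.
    return allowed


ospf_routes = ["4.4.4.4/32", "222.110.30.0/24"]
print(redistribute(ospf_routes, ROUTE_MAP))
# ['222.110.30.0/24']
```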




Screenshot (187).png

Screenshot (188).png

Two screenshots: before and after the loop stopped.

Hi @MHM Cisco World ,

> since the path from R4-R2-R3 is less that R4-R6-R5-R3. the R3 will select path via R2 not direct eBGP

> (you can see prefer mark in show ip bgp in R3)

The path via R2 is preferred not because of its shorter AS path (the AS path was lost once the route was redistributed into OSPF), but rather because it was redistributed from OSPF into BGP, which causes it to be considered locally originated and therefore to have a weight of 32768. This is a timing issue. If both routes are learnt and preferred via BGP and redistributed into OSPF, you do not see this issue, but if the route is learnt via R2 first, propagates to R3 and is redistributed into BGP on R3 before it is received from R4, then you see this issue.

It is really important to filter when you do mutual redistribution. Otherwise you end up with some really bad issues most of the time.
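
The point about weight can be sketched as a comparison on the first few best-path criteria only (weight, then local preference, then AS path length); the real algorithm has more steps. The weight of 32768 for the redistributed route is from the explanation above. The weight of 0 for the eBGP-learned route and the local preference of 100 are the usual defaults and are assumed here, and the AS path (130,) is illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BgpPath:
    description: str
    weight: int
    local_pref: int
    as_path: tuple


def best_path(paths):
    """Pick the best path using only the first few criteria (simplified)."""
    return max(paths, key=lambda p: (p.weight, p.local_pref, -len(p.as_path)))


candidates = [
    # Redistributed from OSPF on R3: treated as locally originated.
    BgpPath("redistributed from OSPF", weight=32768, local_pref=100, as_path=()),
    # Learned from the eBGP peer: default weight assumed, AS path illustrative.
    BgpPath("learned via eBGP", weight=0, local_pref=100, as_path=(130,)),
]

print(best_path(candidates).description)
# redistributed from OSPF
```

Because weight is compared first, the redistributed copy wins before the AS path is ever considered.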

Regards,
Harold Ritter, CCIE #4168 (EI, SP)

Yes, I know this point, but it led me to change the weight of 4.4.4.4 learned via eBGP, and even if I make the weights equal the path still causes an issue here:
R4-R2-R3 is shorter than R4-R6-R5-R3.
So I decided to solve it with a route-map that denies 4.4.4.4 from being learned again.

Now that the loop is removed, I will clear up the ICMP/traceroute issue.
Screenshot (190).png

In R1 there are two paths, so why does traceroute show two paths while ping shows only one?
This is not related to the RIB; it is related to CEF.
CEF uses the source IP, the destination IP and the L4 port to hash and forward packets.
But as you can see below, there is no L4 port in ICMP. ICMP is L3, not L4, so it has no L4 port; instead, the ICMP header contains an ID. As you can see, this ID is the same for the whole ping sequence, so CEF always selects the same path.


Screenshot (191).png

Traceroute, on the other hand, uses UDP, and its L4 port is not the same for every probe, so CEF uses two different paths.
Screenshot (193).png

OK, can I change this behaviour? Yes, you can, by configuring ip load-sharing per-packet.

Screenshot (194).png
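
The difference between per-flow and per-packet sharing described in this reply can be sketched as follows. It is a toy model, not the actual CEF algorithm: the SHA-256 hash is a stand-in, the source address 222.110.10.1 for R1 is assumed, and it takes the reply's statement that the hash covers source, destination and L4 port at face value (which fields are hashed depends on the configured load-sharing algorithm).

```python
import hashlib
from itertools import cycle

PATHS = ["via R2 (222.110.10.2)", "via R3 (222.110.20.3)"]
SRC = "222.110.10.1"  # ASSUMED address of R1; not given in the post
DST = "222.130.10.6"


def per_flow_path(src, dst, l4_port):
    """Pick a path from a hash of the flow fields (stand-in for the CEF hash)."""
    key = f"{src}|{dst}|{l4_port}".encode()
    digest = hashlib.sha256(key).digest()
    return PATHS[digest[0] % len(PATHS)]


def per_packet_paths(count):
    """Round robin over the paths, ignoring the packet contents."""
    chooser = cycle(PATHS)
    return [next(chooser) for _ in range(count)]


# Ping: no L4 port and the same fields in every packet, so one path is used.
ping_paths = {per_flow_path(SRC, DST, None) for _ in range(5)}
print(len(ping_paths))
# 1

# Traceroute: the UDP destination port changes per probe, so the key changes
# and the probes may be spread across both paths.
probe_paths = [per_flow_path(SRC, DST, 33434 + i) for i in range(6)]

# Per-packet sharing alternates regardless of the flow.
print(per_packet_paths(4))
# ['via R2 (222.110.10.2)', 'via R3 (222.110.20.3)', 'via R2 (222.110.10.2)', 'via R3 (222.110.20.3)']
```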