05-17-2020 11:36 AM - edited 05-17-2020 12:03 PM
Regarding the post "Definition Of Feasible Distance In EIGRP Convergence":
Peter Paluch's (Hall of Fame Cisco Employee) description of the FD:
"Feasible Distance is the lowest distance to the destination experienced since the last time the route went from Active to Passive state. Put differently, the Feasible Distance is the historical record of how closest the router was to the destination since the last diffusing computation for the destination has finished."
My question: Is there a purpose as to why the FD is a historical record, rather than the FD changing higher when the CD (Calculated Distance) changes higher (and the destination prefix does not go from Active to Passive)?
Is this just an inadvertent quirk as to how Cisco implemented EIGRP or is this by design for some reason?
I started testing in GNS3 what Peter was describing and inadvertently I found another reason to ask this question.
I noticed that with my FastEthernet interfaces, the output matched exactly what Peter was describing: unless the destination prefix went from Active to Passive, the FD did not change when the CD went higher. However, I found that the FD does automatically go higher when using GigabitEthernet interfaces. I am not sure if this is unique to GNS3, whether there is a quirk that was fixed with EIGRP as it relates to Gig interfaces, or some other reason.
If I have all FE interfaces connected, raise the delay along the path, and then run show ip eigrp topology all-links, I see the CD change higher, but the FD does not move.
If I have all GE interfaces connected, raise the delay along the path, and then run show ip eigrp topology all-links, I see both the CD and the FD go higher (i.e., both metrics are higher and match each other).
-------------------------------------------------------------------------------------------------------
So actually I have two questions:
Is there a purpose for the FD to act as a historical record? That is, why does the FD not move higher (when the CD moves higher) without the destination prefix having to go from Active to Passive?
And why do I only see the FD act as a historical record with FastEthernet interfaces, but not with GigabitEthernet interfaces?
05-21-2020 08:59 AM
My friends,
@Richard Burts gave me a nudge to check this thread... Please allow me to share a few thoughts. This post may be long - please bear with me.
@dk3874, let me respond to your initial post first.
>> Is this just an inadvertent quirk as to how Cisco implemented EIGRP or is this by design for some reason?
Oh, not a quirk at all. It is for a very good reason. Routing protocols are distributed and weakly coordinated algorithms reacting to events - attaching or disconnecting a new network, connecting or disconnecting a link between routers, a router coming up or going down. The knowledge about a new event takes time to propagate, and it is impossible for routers in the network to learn about the event all at the same moment, instantly. There is a risk that a router acting on an event - say, a loss of a network - will use a neighbor that still advertises the network as reachable, only because that neighbor does not yet know that the network is gone. The result can be disastrous.
Think of this simple example with RIP, and forget about all extensions such as triggered updates or split horizon, just think of plain basic RIP sending out updates every 30 seconds (each router in its own time, of course).
There's a network X on the right, and R1 advertises it with a hop count of 1 to R2. Since we assume no split horizon, R2 will advertise this network further to any neighbors it has, including R1, with a hop count of 2. Now consider this sequence of events:
Of course, we can argue that this is a simplistic scenario, that we would have split horizon, and route poisoning, and triggered updates active and so on... yes, that is true, but those mechanisms are all just partial improvements and optimizations to the basic distance vector mechanism. None of them, even if all combined together, guarantees a 100% loop-free operation. Note the fundamental problem here: How should R1 validate if the information coming from R2 is up-to-date with the true current state of the network?
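To make the loop concrete, here is a minimal Python sketch of the plain-RIP scenario described above. It assumes only what the text states (two routers, no split horizon, no triggered updates) plus one labeled assumption: network X fails on R1, and R2's stale periodic update reaches R1 before R2 learns of the failure. The function name and the strict alternation of updates are illustrative and not taken from any RIP implementation.

```python
INFINITY = 16  # RIP treats a hop count of 16 as "unreachable"


def count_to_infinity():
    """Return the (R1, R2) hop counts to network X after each update round.

    Assumption: X has just failed on R1, while R2 still holds the stale
    route it learned from R1 earlier (hop count 2).
    """
    r1 = INFINITY  # R1 lost its directly learned route to X
    r2 = 2         # R2 does not know yet and keeps advertising X
    history = [(r1, r2)]
    while r1 < INFINITY or r2 < INFINITY:
        # R2's periodic update arrives at R1. R1 cannot tell that the
        # information is stale, so it installs X via R2.
        r1 = min(r2 + 1, INFINITY)
        # R1's next update arrives at R2. R1 is R2's next hop for X, so R2
        # must accept the new, worse hop count.
        r2 = min(r1 + 1, INFINITY)
        history.append((r1, r2))
    return history


if __name__ == "__main__":
    for r1, r2 in count_to_infinity():
        print(f"R1 -> X: {r1:2d} hops, R2 -> X: {r2:2d} hops")
```

During every round in which both counters are below 16, R1 forwards traffic for X to R2 and R2 forwards it back to R1. The loop only ends when both routers have counted up to infinity.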
This is the reason why we have transient routing loops during network reconvergence: Routers acting on outdated information from their neighbors because there is no way of validating if that information already reflects on the changed state of the network.
This is even more pronounced with event-based protocols such as EIGRP, which maintain a topology table remembering the last information received from neighbors, since there are no periodic updates. If a neighbor did not send us an update around the time we learned about a topology change, why is that? Is it because the neighbor has not been affected, and so has nothing to update? Or is it because it does not know yet? Those two alternatives are very different - in the first case, the information we have from the neighbor is up-to-date and we can use it; in the second case, using the neighbor means risking a routing loop.
Note that if there was a strong coordination between routers - before any change to a routing table, a router always asking all its neighbors for their most up-to-date information while piggybacking the information about the topology change to the request, and waiting for all responses before making a change itself - this uncertainty would go away. However, this kind of tight coordination could slow down the convergence, and if there was some issue in passing the requests and responses back and forth, the convergence could fail entirely. So this kind of tight synchronization between routers in a routing protocol has never been generally adopted. However, it may sound familiar to you: It is the diffusing computation concept that EIGRP falls back to if a router cannot make a safe routing choice locally.
If a router cannot guarantee that the information it has from neighbors is always up-to-date, then we can at least do a step sideways: Have some kind of a rule, or a condition, that would allow a router to tell if there is a chance that our neighbor's path could have depended on our own path to the destination before the topology change. If we can say with certainty it could not, we're free to use that neighbor right away. If we don't have that certainty, we need to handle the situation differently.
Here is where the Feasible Distance as the "minimal experienced distance to the destination" comes into play. A router cannot tell by looking at the last known distance reported by its neighbor whether that distance is already up-to-date with the current state of the network. What the router can try to tell, though, is whether there is a chance that the neighbor's path could be an extension of the router's own path before the topology event. And here is how we can know:
If we never sold the ride for less than $100 (doesn't matter that right now, we're selling it for $999), while the neighbor's last known offer for the same ride was $80, there's no chance the neighbor is riding down our own route, either through us or through our common next hop. For that, his price itself would need to be at least $100. So even if the offer from the neighbor is not guaranteed to be up-to-date, we at least can be sure that it could never have been derived from our own path and our own price. And so the neighbor cannot be looping the packets back to us if we send packets to him.
This is why Feasible Distance must be a historical minimal distance ("I never sold it for less than..."), rather than the momentary distance. Note that if FD was the momentary distance, EIGRP would degrade to RIP! The whole Feasibility Condition would collapse to "If you are closer to the destination than I am, then..." - but that was exactly the situation up above in the diagram where we constructed the routing loop so easily. Such a condition does not eliminate the possibility of acting on outdated information.
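The difference between a historical minimum and a momentary distance can be sketched in a few lines of Python. This is an illustration under stated assumptions, not EIGRP's actual code: the class and method names are hypothetical, and the numbers are the $100 / $999 / $80 figures from the analogy above plus one extra neighbor at 150 added for contrast.

```python
from dataclasses import dataclass

INFINITE = float("inf")


@dataclass
class Route:
    fd: float = INFINITE  # Feasible Distance: lowest distance since the last Active -> Passive transition
    cd: float = INFINITE  # current distance via the chosen next hop

    def update_while_passive(self, new_cd):
        """The route stays Passive: the CD tracks the network, the FD can only go down."""
        self.cd = new_cd
        self.fd = min(self.fd, new_cd)

    def finish_diffusing_computation(self, new_cd):
        """Active -> Passive: the history is reset and the FD starts over."""
        self.cd = new_cd
        self.fd = new_cd

    def meets_feasibility_condition(self, reported_distance):
        """FC as described in the thread: the neighbor's RD must be strictly below our FD."""
        return reported_distance < self.fd

    def meets_momentary_condition(self, reported_distance):
        """The RIP-like rule the post warns about: compare against the current distance only."""
        return reported_distance < self.cd


def demo():
    route = Route()
    route.update_while_passive(100)  # "we never sold the ride for less than $100"
    route.update_while_passive(999)  # "right now, we're selling it for $999"
    return route
```

With `route = demo()`, the FD is still 100 while the CD is 999. A neighbor reporting 80 passes `meets_feasibility_condition`, whereas a neighbor reporting 150 does not, even though it would pass `meets_momentary_condition`. That second neighbor is exactly the case where its distance could have been derived from one of our own earlier distances.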
>> however, I found that the FD does automatically go higher when using Gigabitethernet interfaces. I am not sure if this is unique to GNS3, whether there is a quirk that was fixed with EIGRP as it relates to Gig interfaces or some other reason.
I am sure this is not GNS3's doing. This must be a consequence of how the metrics turned out in your network. However, as @Giuseppe Larosa pointed out, we would need to see a diagram of your topology and the output of show ip eigrp topology all-links to understand better what is going on. What could help is to run debug eigrp fsm and watch the events as you modify metrics. The router evidently believes that it needs to go Active. If it does not go Active and yet changes the FD, this would amount to a bug that needs to be corrected.
@Martin L , please allow me to respond to your post, too:
>> When the CCIE R&S version 5 book by Narbik/Paluch came out in 2016 we were all confused regarding EIGRP. It created panic and confusion among CCIE candidates, as we thought we had EIGRP figured out. Just when you think you know the stuff and are ready to take the exam.....
I am sincerely sorry if I have caused a permanent inconvenience or harm. I am aware that the EIGRP chapter was quite an eye opener, and I was writing it with that very goal - to finally set some things about EIGRP straight, once and for all, knowing that many readers would be befuddled at first since what they read about was not the EIGRP they knew... but it was EIGRP as it has always truly been. All the examples, diagrams, and debugs I've used and shown are very simple, far from any spectacular topologies or corner cases. I understand how disconcerting it is to start reading about a previously familiar topic and find out that things are so different. But I was writing a book for experts, and while there may still have been some space for simplifications, there was no space for misinformation; there never is. My goal was to establish clarity about EIGRP's true mechanics and principles of operation at the same level we have clarity about RIP, OSPF, IS-IS, or BGP.
>> FD is loop prevention mechanism
Correct. That is its only purpose: to tell whether the neighbor is certainly not using a path through us, or whether there is a chance the neighbor could be using us after all. Nothing more, nothing less.
>> Paluch in the book shows FD historical record helped avoiding selecting a non-optimal route via R3 while better one was available (via R4).
This is a side-effect of how the FD works in combination with the shortest path algorithm. In that example, there was a conflict between the shortest path algorithm that would pick R4, and the FD condition that discredited R4 as being possibly looped. That is why the router needed to go Active.
>> I would add that keeping FD value as historic record will reduce number of DUAL calculations promoting stability and sustaining convergence.
I would humbly contest this. FD has no direct impact on stability or convergence. Its sole purpose is to serve as a loop-freedom check. In fact, in that same example where R3 was a feasible successor but provided a worse path than R4, with R4 not meeting the feasibility condition, the FD forced the router to go Active, so it in fact triggered a diffusing computation rather than preventing one.
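The conflict between the shortest-path choice and the feasibility check can be sketched as a small decision function. This is a hedged illustration only: the function name, the data layout, and all numbers are hypothetical and are not taken from the book's example or from EIGRP's code. It also assumes at least one neighbor advertises the destination.

```python
def local_computation(fd, neighbors):
    """Decide whether a router can stay Passive after a topology change.

    fd        -- the router's Feasible Distance for the destination
    neighbors -- dict: neighbor name -> (reported_distance, cost_of_link_to_neighbor)

    Returns ("passive", successor) if a neighbor offering the shortest total
    distance also has a reported distance below the FD; otherwise
    ("active", None), meaning a diffusing computation is needed.
    """
    totals = {name: rd + cost for name, (rd, cost) in neighbors.items()}
    shortest = min(totals.values())
    feasible_shortest = sorted(
        name
        for name, total in totals.items()
        if total == shortest and neighbors[name][0] < fd
    )
    if feasible_shortest:
        return ("passive", feasible_shortest[0])
    return ("active", None)


# Hypothetical numbers: R3 is a feasible successor (RD 90 < FD 100) but its
# total distance is 150; R4 offers 130 but its RD of 110 is not below the FD.
EXAMPLE = {"R3": (90, 60), "R4": (110, 20)}
```

Calling `local_computation(100, EXAMPLE)` returns `("active", None)`: the shortest path runs through R4, R4 fails the check, and so the router queries its neighbors instead of settling for R3. If R4 reported 95 instead, the same call would return `("passive", "R4")`.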
>> For some reason Load and reliability of link is included in the metric but not actually considered by default.
This requires clarification. With regards to Load and Reliability:
Load and Reliability were simply carried over from IGRP, but acting on them was not implemented. If you think about it, it could be quite dangerous: you could start "swinging" traffic between a couple of paths as soon as one of them became more loaded than the other; after switching to another path, the former path would become less loaded, so its metric would improve, and you would move the traffic back. This would bring permanent churn and instability into the network.
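The "swinging" effect is easy to reproduce in a toy model. The sketch below is not EIGRP behavior; it is a deliberately simplified simulation in which two otherwise equal paths A and B get a fixed metric penalty while they carry the traffic. The function name, the base metrics, and the penalty value are all assumptions chosen for illustration.

```python
def simulate_load_based_routing(steps=6, base_a=100, base_b=100, load_penalty=50):
    """Return the path chosen at each step when the metric reacts to load."""
    active = "A"
    chosen = []
    for _ in range(steps):
        chosen.append(active)
        # The path currently carrying the traffic looks loaded; the idle one does not.
        metric_a = base_a + (load_penalty if active == "A" else 0)
        metric_b = base_b + (load_penalty if active == "B" else 0)
        # Path selection is re-run on the load-sensitive metrics.
        if metric_a < metric_b:
            active = "A"
        elif metric_b < metric_a:
            active = "B"
    return chosen
```

With the default values, the traffic alternates between A and B on every step and never settles, which is the permanent churn described above.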
>> if those were calculated and FD would not be viewed as historical record, such settings could force Dual to performing calculations frequently if not continuously.
A humble correction:
The intuitive logic behind FD and, consequently, FC is relatively simple: "If you are closer to the destination than I have ever been, then you cannot be using the path through me since your distance would need to fully include mine, no matter which of my known distances in the past it would be based on." However, the formal proof that it works as expected is quite tedious - I am attaching a PDF file where Dr. Garcia-Luna-Aceves postulated and proved the DUAL's loop freedom using the FD/FC as we know it (he calls it the Source Node Condition, SNC). If you give it a look, you'll probably appreciate the fact that the FD behavior is in fact very well chosen, and if we wanted to define it in another way, we would need to repeat the whole proof to verify that it would still guarantee the loop freedom as the one we're using today.
Please feel welcome to ask further! ... and thank you for reading through this entire wall of text... :)
Best regards,
Peter
06-01-2020 01:13 AM
Hello,
You are welcome. Let's see about your rewrites.
>> If our path (FD) was never less than 100 (doesn't matter that right now, our CD (Calculated Distance) is 999), while the neighbor's currently (advertised to us) Reported Distance for the same path is 80, there's no chance the neighbor is or could have been using us as part of its own route or using our common next hop. If our neighbor was using us as part of its own route or using our common next hop, then our neighbor's FD (RD to us) would need to be at least 100. So even if the RD from the neighbor is not guaranteed to be up-to-date, we at least can be sure that it could never have been derived from our own path and our own FD. And so the neighbor cannot be looping the packets back to us if we send packets to him.
I agree with your reformulation except for the two parts that were highlighted (and struck through) in red: "FD (RD to us)" in the second sentence and "our own FD" in the third.
FD is not RD. It is not the same value, and in the EIGRP code, it is not the same variable.
With this in mind, the two corrected statements would be:
If our neighbor was using us as part of its own route or using our common next hop, then our neighbor's RD would need to be at least 100. So even if the RD from the neighbor is not guaranteed to be up-to-date, we at least can be sure that it could never have been derived from our own path and our own CD.
>> This is why Feasible Distance must be a historical minimal distance ("My FD was never less than..."), rather than the momentary distance (CD). Note that if FD was the momentary distance, EIGRP would degrade to RIP! The whole Feasibility Condition would collapse to "If you are closer to the destination than I am, then I'll use you even if this creates a loop..." - but that was exactly the situation up above in the diagram where we constructed the routing loop so easily. Such a condition does not eliminate the possibility of acting on outdated information.
Correct.
Please feel welcome to ask further!
Best regards,
Peter
05-18-2020 12:01 AM - edited 05-18-2020 12:05 AM
Hello @dk3874 ,
>> My question: Is there a purpose as to why the FD is a historical record, rather than the FD changing higher when the CD (Calculated Distance) changes higher (and the destination prefix does not go from Active to Passive)?
As you know, the Feasible Distance is used to check the Feasibility Condition (FC): to qualify as a feasible successor, a neighbor's reported distance (RD) must be less than the FD.
This is used for loop avoidance: an EIGRP router should not consider as a feasible successor for a prefix a neighbor whose reported distance (that is, the composite metric to the prefix from that neighbor's point of view) is greater than or equal to the local FD for the prefix.
Simply changing the FD to be the lowest current metric could lead to non-optimal calculations in the future if the current successor route is lost.
Because the FD is left unchanged, a prefix may go Active even when an alternate path exists that would have qualified had the FD been allowed to change. By going Active and running the DUAL diffusing computation, EIGRP gets a chance to calculate a new FD and to pick the best path in light of the new FD value.
There is an example about this in Peter Paluch's book CCIE R&S 5th edition volume I in the EIGRP chapter.
>> And why do I only see the FD act as a historical record with FastEthernet interfaces, but not with GigabitEthernet interfaces?
Feel free to post your configurations and the relevant show command output (show ip eigrp topology all-links); for the moment it is not possible to say anything meaningful.
Are you using modified K values so that only delay is used in the metric computation?
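For reference when comparing FE and GE results, the classic EIGRP composite metric can be sketched as below. This is a sketch under assumptions: the function name is hypothetical, the integer rounding follows the commonly documented classic (32-bit) formula, and the interface values in the usage note (100000 kbps / 100 µs for FastEthernet, 1000000 kbps / 10 µs for GigabitEthernet) are typical IOS defaults that may differ in a given lab. Named-mode wide metrics use a different scaling and are not covered.

```python
def classic_metric(min_bw_kbps, total_delay_us, load=1, reliability=255,
                   k1=1, k2=0, k3=1, k4=0, k5=0):
    """Classic EIGRP composite metric for a path.

    min_bw_kbps    -- lowest interface bandwidth along the path, in kbps
    total_delay_us -- sum of interface delays along the path, in microseconds
    load           -- 1..255, only used when k2 != 0
    reliability    -- 1..255, only used when k5 != 0
    """
    bw = 10**7 // min_bw_kbps      # inverse bandwidth term
    delay = total_delay_us // 10   # delay in tens of microseconds
    metric = k1 * bw + (k2 * bw) // (256 - load) + k3 * delay
    if k5 != 0:
        # The reliability term is applied only when K5 is non-zero.
        metric = metric * k5 // (reliability + k4)
    return 256 * metric
```

With the default K values, `classic_metric(100_000, 100)` gives 28160 and `classic_metric(1_000_000, 10)` gives 2816. Setting `k1=0` leaves only the delay term, which is the "only delay" case asked about above.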
Final note:
The normal state of an EIGRP prefix is P (Passive). When the state is Active, DUAL is triggered and queries for the prefix are sent to all EIGRP neighbors. In some sentences of your post it looks like you are swapping the two states.
Hope to help
Giuseppe
05-19-2020 08:32 AM
Perhaps Peter Paluch will join this discussion but here is my understanding. When there is a convergence event EIGRP puts the route into the active state, sends queries, receives responses, does the calculation to choose a path that is guaranteed to be loop free, and transitions the route to the passive state. The FD is determined as part of that calculation. As long as that path continues to be used without transition to active state the original FD should be used. There may be events that cause the calculated distance to change but the FD still represents the conditions that caused the loop free path to be chosen and should not change until there is a transition to active state.
05-19-2020 02:50 PM - edited 05-19-2020 02:58 PM
When the CCIE R&S version 5 book by Narbik/Paluch came out in 2016 we were all confused regarding EIGRP. It created panic and confusion among CCIE candidates, as we thought we had EIGRP figured out. Just when you think you know the stuff and are ready to take the exam.....
Anyway, as already mentioned above, FD is a loop prevention mechanism. Paluch in the book shows how the FD as a historical record helped avoid selecting a non-optimal route via R3 while a better one was available (via R4). The route via R3 would otherwise be selected because R3 was a feasible successor while R4 was not before the change. I would add that keeping the FD value as a historical record will reduce the number of DUAL calculations, promoting stability and sustaining convergence. For some reason, the load and reliability of the link are included in the metric but not actually considered by default. If those were calculated and the FD were not treated as a historical record, such settings could force DUAL to perform calculations frequently, if not continuously.
In your case of Gig versus Fast interfaces, my guess is that your change with the Gig interfaces forced the prefix to go Active, while the change with the Fast interfaces did not (it stayed Passive).
Regards, ML