How does a router know when a next hop (ethernet) is down?

michaelbs
Level 1

Hi,

I might have a dumb question. Suppose I have a router configured with two static routes to the same network over two different next hops and with two different metrics. How does the router know when, e.g., the next hop of the route with metric 1 is down and that it should now use the second route with metric 2?

In case point-to-point connections are used, the router of course knows when the interface is down, so it knows it should switch to the other route. But in case of multipoint connections - how does the router know that the next hop is down? In case of Ethernet, I guess as long as the router has the MAC address of the next hop in its ARP table it will continue to send frames to it?

Sorry, but I'm currently not getting it.

Thanks,
Michael


Peter Paluch
Cisco Employee

Hi Michael,

But in case of multipoint connections - how does the router know that the next hop is down? In case of Ethernet, I guess as long as the router has the MAC address of the next hop in its ARP table it will continue to send frames to it?

Well, there's a very simple answer to this: the router does not know :) Either the entire Ethernet interface goes down, in which case both next hops become unreachable anyway, or the Ethernet interface stays up and in this case, the router has no direct way of knowing that one of the next hops went down.

You would need to use an additional mechanism that verifies the liveness of the next hop. You could use a routing protocol to dynamically learn about networks from next hops (and forget them if the next hops go down), or in case of static routes, you could create an IP SLA probe to periodically ping the next hop and make the static route dependent on the state of the IP SLA probe (if it is successful, the route will be installed; if the probe fails, the route will be removed from the routing table automatically). Yet another way would be to use the Bidirectional Forwarding Detection protocol, or BFD, to verify the liveness of the particular next hop.
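
To make the probe-dependent static route a bit more concrete, here is a minimal Python sketch of the selection logic. It is only a model, not router code: the `StaticRoute` class, the `probe` callback and the `reachable` dictionary are hypothetical names invented for illustration, and the addresses are example values. A real probe would ping the next hop periodically instead of reading a dictionary.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StaticRoute:
    prefix: str      # destination network, e.g. "192.168.230.0/24"
    next_hop: str    # next-hop IP address
    metric: int      # lower value is preferred
    # Optional liveness check (hypothetical stand-in for an IP SLA probe).
    probe: Optional[Callable[[], bool]] = None

    def installed(self) -> bool:
        # An untracked route is always installed; a tracked route is
        # installed only while its probe reports success.
        return self.probe is None or self.probe()


def best_route(routes: List[StaticRoute], prefix: str) -> Optional[StaticRoute]:
    # Consider only installed routes for the prefix, then take the lowest metric.
    candidates = [r for r in routes if r.prefix == prefix and r.installed()]
    return min(candidates, key=lambda r: r.metric, default=None)


# Simulated probe result for the primary next hop (assumed example data).
reachable = {"172.16.10.1": True}

routes = [
    StaticRoute("192.168.230.0/24", "172.16.10.1", 1,
                probe=lambda: reachable["172.16.10.1"]),
    StaticRoute("192.168.230.0/24", "172.16.10.2", 2),  # untracked backup
]

primary = best_route(routes, "192.168.230.0/24").next_hop
reachable["172.16.10.1"] = False  # the probe starts failing
backup = best_route(routes, "192.168.230.0/24").next_hop
print(primary, backup)
```

The point of the sketch is that the backup route is never "chosen" by the forwarding logic itself; it only wins because the tracked primary route has been withdrawn.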

Best regards,
Peter

Hello Peter,

Thanks a lot for your reply, it helped to confirm my thoughts on this. One additional question: In case the physical address for route 1's next hop cannot be resolved - would the router automatically use the second route in this case (that is, try to resolve the physical address for route 2's hop and use it for delivering the frames)?

Best Regards,
Michael

Hi Michael,

You are heartily welcome.

In case the physical address for route 1's next hop cannot be resolved - would the router automatically use the second route in this case (that is, try to resolve the physical address for route 2's hop and use it for delivering the frames)?

Unfortunately, it wouldn't. It would always try to ARP for the first next hop's IP/MAC address mapping - and fail. As a result, the packets will be dropped due to encapsulation failure, as the router is unable to construct the frame.

You could theoretically let your two next hops run some sort of first-hop redundancy protocol such as HSRP or VRRP. In that case, you would simply point your single static route to the virtual IP address of the virtual router represented by the two next hops running HSRP/VRRP. As these routers would actively back up each other, making it always look like the virtual IP and the associated virtual MAC address are alive as long as at least one of these next hops is alive, you would be able to have redundancy with just a single static route on your router. However, using HSRP/VRRP to provide redundancy to routers themselves is not a by-the-book practice. If you already run a routing protocol, you would be better off using it instead.
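
As a rough illustration of why a single static route toward a virtual IP is enough, here is a small Python model of two gateways sharing one virtual address. This is a simplified sketch, not an HSRP or VRRP implementation: the `Gateway` class, the priority values and the router names are assumptions, and it assumes the higher-priority member always takes over when it is alive (real deployments make that preemption configurable).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gateway:
    name: str
    priority: int        # higher priority wins the active role
    alive: bool = True


def active_gateway(group: List[Gateway]) -> Optional[Gateway]:
    """Return the member that currently answers for the virtual IP/MAC."""
    alive = [g for g in group if g.alive]
    return max(alive, key=lambda g: g.priority, default=None)


group = [Gateway("R1", priority=110), Gateway("R2", priority=100)]

before = active_gateway(group).name
group[0].alive = False   # R1 fails; R2 takes over the virtual IP
after = active_gateway(group).name
print(before, after)
```

From the point of view of the router holding the static route, nothing changes during the failover: the next-hop IP and its MAC address stay resolvable as long as one member of the group is alive.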

Best regards,
Peter

Hello Peter,

Thank you very much for your further clarification. Simply put, I guess if running a routing protocol, a corresponding route on my router is considered dead in case the routing protocol updates from the next hop are not received on my router anymore. If they show up again, the routes are again chosen depending on administrative distance / metric.

So, real routing redundancy can only be achieved when running a routing protocol (or, in case of static routes, only with point-to-point links).

I have not learned much about routing protocols yet but guess doing that will shed some light on this topic.

Your help is much appreciated!

Best Regards
Michael

Hi Michael,

if running a Routing Protocol, a corresponding route on my router is considered dead in case the Routing Protocol updates from the next hop are not received on my router anymore.

Correct.

If they show up again, the routes are again chosen depending on administrative distance / metric

Correct.

So, real routing redundancy can only be achieved when running a routing protocol (or, in case of static routes, only with point-to-point links).

Quite correct. More generally speaking, routing redundancy can only be achieved with an additional mechanism that gives you some sort of feedback about the liveness of the next hop, like the IP SLA, or a mechanism that virtualizes multiple next hops into a single virtual next hop, like HSRP/VRRP/GLBP, or a dynamic routing protocol that is precisely designed to adapt to the current state of the network. Point-to-point interfaces would help you with the redundancy only if they went down upon the failure of the next hop. However, even with point-to-point interfaces, there are situations where the next hop stops working but the interface nonetheless remains up - and you're stuck again.

Best regards,
Peter

 

Hi Peter,

yes, thanks, when thinking about this it indeed is clear that there might be situations where a point-to-point interface is up but the next hop might have a problem anyway. Thus everything is still forwarded down the point-to-point link but not any further - so dead end even though the interface is up.

Peter, you actually made my day. Thank you very much.

Best Regards
Michael

 

 

Hello Peter,

>>In case the physical address for route 1's next hop cannot be resolved - would the
>>router automatically use the second route in this case (that is, try to resolve the
>>physical address for route 2's hop and use it for delivering the frames)?

>Unfortunately, it wouldn't. It would always try to ARP for the first next hop's IP/MAC
>address mapping - and fail. As a result, the packets will be dropped due to
>encapsulation failure, as the router is unable to construct the frame.

Might this behaviour be different from implementation to implementation? I just tested this with a Windows 7 PC the following way:

- added two routes to the same destination using different gateways and metrics, e.g.:

route add 192.168.230.1 mask 255.255.255.255 172.16.10.1 metric 2
route add 192.168.230.1 mask 255.255.255.255 172.16.10.2 metric 4

When sending data to 192.168.230.1 they are delivered to 172.16.10.1 first as it is the gateway in the route with the lowest metric. And here is what is interesting: If I take the gateway 172.16.10.1 from the network, Windows automatically switches to the second route and delivers data to 172.16.10.2!

This takes place once Windows wants to refresh the ARP entry for 172.16.10.1 and does not receive a corresponding ARP reply. In this case, Windows sends an ARP request for the gateway from the second route and once it knows its physical address it starts sending data to it.

Moreover, Windows continues to try to resolve the physical address of the first gateway. As soon as Windows receives a corresponding ARP reply, it switches back to the first route/gateway.

This also happens quite quickly, as changes have been made to the ARP caching behavior in Windows Vista and newer versions which adapt the reachability states of physical addresses from NDP (IPv6) to IPv4:

  http://support.microsoft.com/kb/949589/en-us

I have also tested the same example with Debian Linux 7 and in this case, the behaviour is like you described. Once the first gateway is down and its ARP entry cannot be refreshed, no switchover to the second route takes place - Linux does not even try to resolve the physical address of the second gateway.

The only thing that changes once Linux knows the gateway is down (due to the fact that the ARP entry cannot be refreshed anymore) is that I receive ICMP Destination Host Unreachable messages (whereas as long as the ARP entry is still valid, frames are just delivered and blackholed).
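
The difference between the two observed behaviours can be summarized in a few lines of Python. This is only a simplified model of what was seen in the tests above, not a description of either operating system's actual code: the function name `pick_gateway`, the `fall_back` flag and the `arp_resolves` dictionary are made up for illustration, and ARP cache timing is ignored entirely.

```python
from typing import Dict, List, Optional, Tuple

Route = Tuple[str, int]  # (gateway IP, metric)


def pick_gateway(routes: List[Route],
                 arp_resolves: Dict[str, bool],
                 fall_back: bool) -> Optional[str]:
    """Return the gateway used for delivery, or None if the packet is dropped."""
    ordered = sorted(routes, key=lambda r: r[1])  # lowest metric first
    if not fall_back:
        # Behaviour seen on Debian 7 (and described for Cisco routers):
        # only the best route is tried; failed resolution means a drop.
        gateway = ordered[0][0]
        return gateway if arp_resolves.get(gateway, False) else None
    # Behaviour seen on Windows 7: move on to the next route's gateway
    # when the preferred one cannot be resolved.
    for gateway, _metric in ordered:
        if arp_resolves.get(gateway, False):
            return gateway
    return None


routes = [("172.16.10.1", 2), ("172.16.10.2", 4)]
arp = {"172.16.10.1": False, "172.16.10.2": True}  # first gateway removed

print(pick_gateway(routes, arp, fall_back=True))   # second gateway is used
print(pick_gateway(routes, arp, fall_back=False))  # None, i.e. dropped
```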

Best regards
Michael


 

Hi Michael,

Your experiments - and also the document you have referenced - clearly show that an operating system can actually try to deduce the validity of a next hop by the (in)ability of resolving it via ARP.

To be quite honest, I did not know that Windows operated like that. As usual, Microsoft does things in its own way :) Nonetheless, I am sure that Cisco routers do not withdraw static routes as a result of being unable to ARP for the corresponding next hop.

Definitely, though, this can be an implementation-specific behavior. A static route is a tool, not a protocol element, so its behavior, including some "higher intelligence", is simply a matter of how far the implementors of the routing code in an operating system went. While I was surprised to learn that Windows tried to be smart here, it does not really come off as unheard of to me.

Thank you very much for sharing the knowledge!

Best regards,
Peter

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

Ok, for multipoint, you have static routes to two different next-hop IPs, with two different costs, metric 1 and 2.  The router would select only the better route, the "1", and use it.

If the IP cannot be reached (at L2), but the link is up, the router will be unable to deliver the frames but it will not remove the route.  Effectively, you've black-holed your traffic.  (NB: the router might send an ICMP message back to the sender, saying that it's unable to forward the traffic, but it doesn't "pull" the static route.)

BTW, this "feature" might be used to intentionally route selected traffic to a Null interface.

As Peter has described, you'll need a feature to get the better static route to be removed from the route table.  If such a feature is active, and the better static route is removed, then the router will start to use the next best static route, the "2".
