12-23-2015 08:16 AM
Whilst looking through info on ether-bundles on XR, I've seen an example of running BFD over a bundle. I've read the article by Aleksandar Vidakovic on BFD, where support for bundles is mentioned, but I don't see the purpose of it. I've found documentation that shows how to configure this, but nothing that explains the problem it's designed to solve.
I've used BFD with L3 routing protocols to detect failures and improve detection times over non-directly connected paths, so I get that use case. This works fine in IOS and XR.
However, a bundle uses directly connected physical links, and if all links fail the bundle goes down. So why run BFD over it? I don't get it.
So I've asked around and found that, in certain cases, BFD over a bundle seems to stop the bundle from re-establishing after all of its physical links have failed. The explanation I've been given is a race condition: BFD detects the complete L2 failure, and when the physical links come back up the bundle fails to establish because the BFD session cannot come up while its L2 path is still down.
Can someone explain the purpose of this?
12-23-2015 12:03 PM
I'm not sure what your 'in certain cases' is about. That sounds like a bug and should not be a normal operating state.
The main point of BFD is to allow a lot of the routing protocol keepalive behavior to be offloaded to the linecards. For ISIS or OSPF, those keepalives would otherwise be generated and processed by the main RP. If you instead push the routing protocol timers really high and run BFD on the LC CPUs with a very fast timer, you get the benefit of fast timers (fast detection of link-down events) without some of the hardship (excessive CPU usage from hello processing).
As to why BFD over bundles, a few things. One reason may be operational consistency: we can now configure BFD on all the links, no matter whether they're bundles or not. You do have the LACP timers running on the links, so you do get some low-level notification of outages, but even the 'short' timers are 1s, whereas BFD can go much, much lower.
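To put rough numbers on that timer comparison, here is a minimal Python sketch. It assumes the usual rule that BFD detection time is the negotiated interval times the detect multiplier, and that LACP with short timers declares a partner lost after three missed 1s LACPDUs. The 150 ms interval and multiplier of 3 are illustrative values I picked, not a recommendation or a platform default.

```python
# Rough failure-detection time comparison: LACP short timers vs. BFD.
# All concrete values below are illustrative assumptions, not platform defaults.

def bfd_detection_time_ms(interval_ms: int, multiplier: int) -> int:
    """BFD declares the session down after `multiplier` consecutive
    intervals without a received control packet."""
    return interval_ms * multiplier


def lacp_detection_time_ms(period_ms: int = 1000, missed_pdus: int = 3) -> int:
    """LACP with short timers sends a PDU every second and times out
    the partner after three missed PDUs (assumed here)."""
    return period_ms * missed_pdus


bfd_ms = bfd_detection_time_ms(interval_ms=150, multiplier=3)   # hypothetical BFD settings
lacp_ms = lacp_detection_time_ms()

print(f"BFD detection time:  {bfd_ms} ms")
print(f"LACP detection time: {lacp_ms} ms")
print(f"BFD is faster by:    {lacp_ms - bfd_ms} ms")
```

With those assumed values BFD notices the loss in well under a second, while LACP on its own needs a few seconds.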
12-24-2015 04:18 AM
Before BoB (BFD over Bundle) was implemented, a BFD session over a bundle interface would run over a single bundle member (the member was selected by the standard bundle load-balancing scheme). That kind of BFD session doesn't ensure that the paths over all members are good. The BFD session may be up because its packets flow over member #1, while there is an issue with member #2. Data traffic over member #2 could be dropped, but BFD would stay up.
BoB runs across all members and it's therefore capable of detecting a failure on any member.
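The difference between the two models can be sketched as a toy Python simulation. This is not how XR implements it; the `Member` class, the function names, and the interface names are all hypothetical, and "healthy" simply stands in for whether BFD packets make it across that link.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str             # hypothetical interface name
    healthy: bool = True  # True if BFD packets can cross this link


def single_session_up(members: list[Member], hashed_index: int) -> bool:
    """Pre-BoB model: one BFD session, pinned by the load-balancing
    hash to a single member. Only that member's health is observed."""
    return members[hashed_index].healthy


def bob_failed_members(members: list[Member]) -> list[str]:
    """BoB model: one session per member, so every member is observed.
    Returns the names of members whose session would go down."""
    return [m.name for m in members if not m.healthy]


bundle = [
    Member("Gi0/0/0/0"),                 # member #1, fine
    Member("Gi0/0/0/1", healthy=False),  # member #2, silently dropping traffic
]

# The single session hashes to member #1 and stays up despite the fault.
print("single session up:", single_session_up(bundle, hashed_index=0))
# Per-member sessions flag member #2.
print("BoB failed members:", bob_failed_members(bundle))
```

In the single-session model the fault on member #2 is only seen if the hash happens to land on it, whereas the per-member model reports it regardless.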
If you still see unexpected behaviour with the latest XR release, please let me know. You may want to try the latest 5.3.3 pre-release. You can find more info on it here: https://supportforums.cisco.com/discussion/12697761/do-you-want-test-xr-533-pre-release-asr9000
hth,
Aleksandar