
BFD Hardware offload on Catalyst 9500?

filopeter
Level 1

Hello,
Do you know whether this feature is on the roadmap, or at least whether the BFD process could be given higher priority than other processes?

The current state is:

cat9k#show bfd neighbors details 
...
Session state is UP and not using echo function.
Session Host: Software
MinTxInt: 250000, MinRxInt: 250000, Multiplier: 3
Received MinRxInt: 250000, Received Multiplier: 3
Registered protocols: OSPF CEF

I recently ran into an issue where an STP recalculation led to an IGP routing protocol outage because the switch was not able to process BFD packets on time.
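
For context, the timers in the output above leave very little headroom. Below is a minimal sketch of the asynchronous-mode detection time calculation as described in RFC 5880 (section 6.8.4): the multiplier received from the peer times the greater of the local MinRxInt and the peer's MinTxInt. The function name is hypothetical, and the peer's MinTxInt is not visible in the truncated output, so 250000 is an assumption (symmetric timers).

```python
def bfd_detection_time_us(local_min_rx_us, remote_min_tx_us, remote_multiplier):
    """Asynchronous-mode detection time in microseconds (RFC 5880, section 6.8.4)."""
    # The peer transmits at the slower of what it wants to send
    # and what the local side is willing to receive.
    negotiated_interval_us = max(local_min_rx_us, remote_min_tx_us)
    return remote_multiplier * negotiated_interval_us


# MinRxInt and Received Multiplier come from the show output above.
# The peer's MinTxInt is NOT shown there; 250000 is an assumption.
detection_us = bfd_detection_time_us(
    local_min_rx_us=250_000,
    remote_min_tx_us=250_000,
    remote_multiplier=3,
)
print(f"Detection time: {detection_us / 1000:.0f} ms")
```

Under these assumptions the session is declared down after 750 ms without a received BFD packet, so a CPU spike of under a second in a software-hosted session is enough to drop it.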

Best Regards,

Peter

10 Replies

shaikmohib
Level 1

Which code train are you using, Peter?

The platform is C9500-48Y4C, IOS XE Version [Fuji] 16.9.6

Reza Sharifi
Hall of Fame

Hi,

Is the link you are using BFD on routed/layer-3 ports (no SVI)?

Usually, BFD works well with BGP, since BGP converges more slowly than an IGP.

 

HTH

I am using SVIs; BFD is configured under the SVI.

The STP recalculation I mentioned earlier ran on different VLANs, not on the VLANs with the SVI/BFD configuration.

I am not sure how your environment is set up, but is it possible to use a routed/layer-3 link instead of an SVI and test with that?

HTH

shaikmohib
Level 1

Hi Peter, 

The issue seems odd. BFD is only for detection, and we can use it with OSPF, but an STP recalculation should not affect your OSPF neighborship. BFD is used for faster detection of a network failure. Failing to process BFD will not result in the OSPF neighborship going down.

If you had an STP recalculation, did it choke the CPU so much that it dropped everything else, including OSPF?
In my setup we have BFD with OSPF and BGP. We get many false alarms due to BFD on OSPF, but the neighborship doesn't go down.

Processing of BFD, STP, OSPF packets, etc. is a control plane task. Since BFD is intensive from a packet-processing point of view, some platforms support hardware offload (hw-offload) of BFD to avoid control plane overload.
During the outage I observed a high CPU spike, so I believe the control plane was overloaded by the STP recalculation and could not process BFD packets on time, which led to a BFD timeout.
Any routing protocol can use BFD for fast convergence. In my case it is OSPF, in your case BGP.

krinaldo
Level 1

I'm resurrecting this old thread -- did you ever get an answer on this?

I have asked this same question to TAC and to my SE who escalated with the BU.  Nobody ever had a clear answer.

This seems to be yet another case of a feature that nobody internally seems to care enough about to fully understand, document, or support. I got conflicting answers, ranging from "hardware support only works in echo mode" to "it's on the roadmap" to "it isn't ever going to work that way."

Even the BU could not offer a clear, concise answer.  I have older platforms that support hardware offload (4500-X), but with the same configuration, the Catalyst 9300, 9400, and 9500s we have all run the BFD sessions in software.

What little documentation there is that specifically covers this topic is inconsistent, at best.

Some consistency would be great, eh? 

It looks like Catalyst 9500 switches were not designed to perform BFD offload in hardware.
The IP Routing configuration guide (version 17.7.x) contains a chapter on BFD feature history, and BFD hw-offload is not mentioned there yet:


https://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst9500/software/release/17-7/configuration_guide/rtng/b_177_rtng_9500_cg/configuring_bidirectional_forwarding_detection.html#Cisco_Reference.dita_a2b01285-f2f8-4750-8602-5f68c51295fa

 

It's not mentioned in a lot of the config guides, both for platforms that support it and for those that do not. The documentation is inconsistent at best; that's nothing new.

I seem to remember seeing blurbs about it in slide decks somewhere, but I don't have the energy to invest in trawling through the Internet trying to find that obscure bit.

Does it really make sense that Cisco would build their flagship enterprise switch line without support for this very basic thing?

 
