
BFD Support on Cisco ASR9000


 

Introduction

In both Enterprise and Service Provider networks, the convergence of business-critical applications onto a common IP infrastructure is becoming more common. Given the criticality of the data, these networks are typically constructed with a high degree of redundancy. While such redundancy is desirable, its effectiveness depends on the ability of individual network devices to quickly detect failures and reroute traffic to an alternate path.

 

This detection is now typically accomplished via hardware detection mechanisms. However, the signals from these mechanisms are not always conveyed directly to the upper protocol layers. When the hardware mechanisms do not exist (e.g. Ethernet) or when the signaling does not reach the upper protocol layers, the protocols must rely on their own, much slower mechanisms to detect failures. The detection times in existing protocols are typically equal to or greater than one second, and sometimes much longer. For some applications, this is too long to be useful.

 

1. The Problem With Convergence

The process of network convergence can be broken up into a set of discrete events:

  • Failure detection: the speed with which a device on the network can detect and react to a failure of one of its own components, or the failure of a component in a routing protocol peer.
  • Information dissemination: the speed with which the failure in the previous stage can be communicated to other devices in the network.
  • Repair: the speed with which all devices on the network, having been notified of the failure, can calculate an alternate path through which data can flow.

 

An improvement in any one of these stages provides an improvement in overall convergence.

 

The first of these stages, failure detection, can be the most problematic and inconsistent:

  • Different routing protocols use varying methods and timers to detect the loss of a routing adjacency with a peer.
  • Link-layer failure detection times can vary widely depending on the physical media and the Layer 2 encapsulation used.
  • Intervening devices (e.g. an Ethernet switch) can hide link-layer failures from routing protocol peers.

 

Packet over SONET (POS) tends to have the best failure detection time amongst the different Layer 1/2 media choices. It can typically detect and react to media or protocol failures in ~50 milliseconds. This has become the benchmark against which other protocols are measured.

2. BFD For Fast Failure Detection

Bidirectional Forwarding Detection (BFD) provides rapid failure detection times between forwarding engines, while maintaining low overhead. It also provides a single, standardized method of link/device/protocol failure detection at any protocol layer and over any media. BFD is a simple, lightweight hello protocol and is targeted to achieve a detection time as short as a few tens of milliseconds.

 

A secondary benefit of BFD, in addition to fast failure detection, is that it provides network administrators with a consistent method of detecting failures. Thus, one availability methodology could be used, irrespective of the Interior Gateway Protocol (IGP) or the topology of the target network. This eases network profiling and planning, because re-convergence time should be consistent and predictable. BFD function is defined in RFC 5880.

 

The fundamental difference between BFD hellos and protocol hellos (OSPF, RSVP, etc.) is that BFD adjacencies do not go down on control-plane restarts (e.g. RSP failover), since the goal of BFD is to detect only forwarding-plane failures. BFD is essentially a control-plane protocol designed to detect forwarding-path failures.

 

3. BFD Protocol Overview

BFD is a simple Hello protocol that, in many respects, is similar to the detection components of well-known routing protocols. It is a UDP-based layer-3 protocol that provides very fast routing protocol independent detection of layer-3 next hop failures.

 

A pair of systems transmits BFD packets periodically over each path between the two systems, and if a system stops receiving BFD packets for long enough, some component in that particular bidirectional path to the neighboring system is assumed to have failed.

 

BFD runs on top of any data-link protocol that is in use between two adjacent systems.

 

BFD has two versions: Version 0 and Version 1. The ASR 9000 Router based on Cisco IOS XR software supports BFD Version 1.

 

BFD has two operating modes that may be selected, as well as an additional function that can be used in combination with the two modes.

 

The primary mode is known as Asynchronous mode.  In this mode, the systems periodically send BFD Control packets to one another, and if a number of those packets in a row are not received by the other system, the session is declared to be down.

 

The second mode is known as Demand mode.  In this mode, it is assumed that a system has an independent way of verifying that it has connectivity to the other system.  Once a BFD session is established, such a system may ask the other system to stop sending BFD Control packets, except when the system feels the need to verify connectivity explicitly, in which case a short sequence of BFD Control packets is exchanged, and then the far system quiesces.  Demand mode may operate independently in each direction, or simultaneously.

 

An adjunct to both modes is the Echo function.  When the Echo function is active, a stream of BFD Echo packets is transmitted in such a way as to have the other system loop them back through its forwarding path.  If a number of packets of the echoed data stream are not received, the session is declared to be down.  The Echo function may be used with either Asynchronous or Demand mode.  Since the Echo function is handling the task of detection, the rate of periodic transmission of Control packets may be reduced (in the case of Asynchronous mode) or eliminated completely (in the case of Demand mode).

 

Cisco ASR 9000 only supports asynchronous mode and has echo enabled by default.

 

BFD Async Mode

A node running BFD in Async mode periodically transmits BFD Control packets. BFD Control packets are encapsulated in UDP, with source port 49152 and destination port 3784. The source address of the IP packet is the local interface address and the destination address is the remote interface address.

bfd async.png

BFD control packets are unidirectional, i.e. they normally don’t require a response. Their purpose is to confirm that BFD control plane on the remote peer is operational.

 

If a predetermined number of intervals passes without receiving a BFD Control packet from the peer, the local node declares the BFD session down.

 

BFD Echo Mode

The Echo mode is designed to test only the forwarding path and not the host stack.

 

BFD Echo packets are transmitted over UDP with both source and destination port set to 3785. By default, the source address of the BFD Echo packet is the router ID and the destination address is the local interface address.

bfd echo.png

BFD Echo packets do not require a response. Choice of destination IP address is such that it forces the remote peer to route the original packet back via the same link, without punting to the BFD control plane. Thus, BFD control plane on peer B is not even aware of BFD Echo packets originating from peer A.

 

If a predetermined number of intervals passes without the local node receiving its own BFD Echo packet back, it declares the BFD session down.

 

Source address selection of the BFD echo packet:

  1. check for an echo source address configuration for the interface
  2. check for a global echo source address configuration
  3. use the RID as the echo source address
  4. use the interface address
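The precedence order above can be sketched as a simple first-match lookup. This is only an illustration of the selection logic as listed; the function and parameter names are hypothetical and are not IOS XR internals.

```python
from typing import Optional


def select_echo_source(interface_echo_source: Optional[str],
                       global_echo_source: Optional[str],
                       router_id: Optional[str],
                       interface_address: str) -> str:
    """Return the first available address, in the precedence order listed above.

    All parameter names are illustrative assumptions, not IOS XR identifiers.
    """
    # Steps 1-3: per-interface echo source, global echo source, router ID.
    for candidate in (interface_echo_source, global_echo_source, router_id):
        if candidate:
            return candidate
    # Step 4: fall back to the interface address.
    return interface_address


# No per-interface setting, but a global echo source is configured.
print(select_echo_source(None, "10.0.0.21", "10.0.0.52", "10.0.9.2"))
```

With nothing configured and no router ID available, the same call falls through to the interface address.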

 

Default source/destination IP address and UDP port for BFD Async and Echo packets:

 

                        BFD Async          BFD Echo
Source IP address       Local interface    Router ID
Destination IP address  Remote interface   Local interface
Source UDP port         49152              3785
Destination UDP port    3784               3784 is not used; 3785

 

 

BFD Session

The BFD session parameters are negotiated between the BFD peers in a three-way handshake. Each Peer communicates its session state in the State (Sta) field in the BFD Control packet, and that received state, in combination with the local session state, drives the state machine. When the local peer wants to bring the session up, it sends a control packet with Down state. Figure 3 shows the transition from Down to Up state. Circles represent the state of the local peer and rectangles the received state of remote peer.

 

bfd session.png

If this peer receives a BFD Control packet from the remote system with a Down state, it advances the session to Init  state. If it receives a BFD Control packet from the remote system with an Init state, it moves the session to Up state.
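The handshake described above can be sketched as a small state-transition function. This is a simplified illustration that follows the session state rules of RFC 5880; it ignores timers, diagnostics and discriminators, and the names `State` and `next_state` are chosen here for illustration only.

```python
from enum import IntEnum


class State(IntEnum):
    ADMIN_DOWN = 0
    DOWN = 1
    INIT = 2
    UP = 3


def next_state(local: State, received: State) -> State:
    """Return the new local state after receiving a Control packet
    carrying the remote peer's state (simplified, no timers)."""
    if local == State.ADMIN_DOWN:
        return local                      # session is administratively disabled
    if received == State.ADMIN_DOWN:
        return State.DOWN                 # neighbor signaled session down
    if local == State.DOWN:
        if received == State.DOWN:
            return State.INIT
        if received == State.INIT:
            return State.UP
        return State.DOWN
    if local == State.INIT:
        if received in (State.INIT, State.UP):
            return State.UP
        return State.INIT
    # local is UP: only a received Down tears the session down
    return State.DOWN if received == State.DOWN else State.UP


# Three-way handshake between peers A and B, both starting in Down.
a = b = State.DOWN
a = next_state(a, b)   # A hears Down from B
b = next_state(b, a)   # B hears Init from A
a = next_state(a, b)   # A hears Up from B
print(a.name, b.name)
```

Each call models one received Control packet, so three packets are enough to bring both peers to Up.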

 

The BFD peers continuously negotiate their desired transmit and receive rates. The peer that reports the slower rate determines the transmission rate. The periodic transmission of BFD Control packets is jittered on a per-packet basis by a random value of 0 to 25%, in order to avoid self-synchronization with other systems on the same subnet. Thus, the average interval between packets is roughly 12.5% less than that negotiated.

 

In a BFD session, each system calculates its session down detection time independently. In each direction, this calculation is based on the value of Detect Mult received from the remote system and the negotiated transmit interval.

 

Negotiated_Transmit_Interval =

     max (local required minimum receive interval, last received desired minimum transmit interval)

 

Detection Time = Detection_Multiplier x Negotiated_Transmit_Interval

 

If the Detection Time is passed without receiving a control packet, the session is declared to be down.
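As a worked example of the two formulas above, the following sketch computes the detection time on one side of a session. The function names and the sample values are illustrative; the intervals are in milliseconds.

```python
def negotiated_transmit_interval_ms(local_required_min_rx_ms: int,
                                    remote_desired_min_tx_ms: int) -> int:
    """The slower (larger) of the two intervals wins."""
    return max(local_required_min_rx_ms, remote_desired_min_tx_ms)


def detection_time_ms(remote_detect_mult: int,
                      local_required_min_rx_ms: int,
                      remote_desired_min_tx_ms: int) -> int:
    """Detection time = multiplier received from the remote system
    multiplied by the negotiated transmit interval."""
    interval = negotiated_transmit_interval_ms(local_required_min_rx_ms,
                                               remote_desired_min_tx_ms)
    return remote_detect_mult * interval


# The remote peer wants to send every 150 ms, but the local system
# only accepts one packet every 300 ms; the remote multiplier is 3.
print(detection_time_ms(3, 300, 150))  # 900
```

Because each side uses the multiplier it received from its peer, the two directions of a session can end up with different detection times.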

 

4. “Static” and “Dynamic” BFD

“Static BFD” refers to a BFD session that protects a single static route. “Dynamic BFD” refers to a BFD session between dynamic routing protocol peers, by which all of the routes having this peer as the next-hop are protected by BFD.

 

5. ASR 9000 Distributed BFD Architecture

 

On the Cisco ASR 9000, BFD functionality is distributed between the RSP and the line card (LC). A BFD server process runs on the RSP and a BFD agent process runs on the line card. The architecture is shown in this figure:

 

distributed bfd architecture.png

 

  • BFD process on RSP:
    • Interacts with BFD clients, passes BFD session creation/deletion requests to the LC BFD processes upon configuration request, and maintains an adjacency database to track all BFD sessions on the router.
    • Notifies BFD clients (OSPF, IS-IS or BGP) of interface configuration changes and session down events upon receiving a BFD event notification from the LC.
    • Services show and configuration commands.
  • BFD process on LC:
    • Creates and deletes BFD sessions based on commands from the BFD RSP process.
    • Maintains a database of all BFD sessions on the LC. There is an instance of BFD adjacency per session on the LC that owns the session.
    • Notifies the BFD RSP process of any transmit failure, receive failure or session down detection.
    • Maintains and updates BFD session transmit and receive counters.
    • Runs the BFD FSM for all BFD sessions on the LC, transmits and receives BFD control packets, and updates and checks BFD session detection timers.

 

BFD communication between RSP and LC uses reliable IPC messaging, which is designed for minimal delay and to survive an RSP switchover. In particular:

 

  • The LC sends BFD event messages to the RSP on a reserved IPC port.
  • On the RSP, the BFD event port is opened as a reliable IPC port, and BFD events sent from the LC are received in the RSP IPC receive interrupt context. The BFD event port is also opened on the standby RSP, so that it is ready to receive BFD events if a switchover makes it the active RSP.
  • During an RSP switchover, BFD events are not sent from the LC to the RSP. Instead, they are registered with switchover call-backs and sent after the RSP switchover.

 

BFD packets have to be punted to the line card CPU and follow the slow path. Injected and punted BFD packets receive preferential treatment on all line card architectures. In other words, the following BFD packets receive preferential treatment on all ASR 9000 line cards:

 

  • BFD control packets
  • BFD Echo packets originated on the LC
  • Response to BFD Echo packets originated on the LC

 

Egress QoS policy map needs to be considered to ensure preferential treatment of BFD Echo packets originated by the BFD peer.

 

Configuring BFD

 

NOTE: In IOS XR an application must “terminate” the BFD session.

Unlike IOS, direct peering without any application using the session is not allowed.

 

6. BFD on Static Routes

The simplest application is a static route:

 

router static
 address-family ipv4 unicast
  172.16.1.1/32 10.1.1.1 bfd fast-detect minimum-interval 500 multiplier 3
!
interface GigabitEthernet0/0/0/0
 ipv4 address 10.1.1.2 255.255.255.0

 

7. BFD On Directly Connected Static Host Routes

This is a special case in IOS-XR, supported after 4.2.3 (through CSCua18314). It is a static BFD session to a /32 prefix in a directly attached network.

 

interface GigabitEthernet0/1/0/22
 ipv4 address 10.52.61.1 255.255.255.0
!
router static
 address-family ipv4 unicast
  10.52.61.2/32 GigabitEthernet0/1/0/22 10.52.61.2 bfd fast-detect

 

Corresponding configuration on IOS device:

 

interface GigabitEthernet4/17
 ip address 10.52.61.2 255.255.255.0
 bfd interval 500 min_rx 500 multiplier 3
 bfd neighbor ipv4 10.52.61.1

 

8. BFD With IGP Peers

IS-IS application sample:

 

router isis lab
 net 49.0111.0111.0111.0111.00
 address-family ipv4 unicast
  redistribute connected
 !
 interface GigabitEthernet0/2/0/1
  bfd minimum-interval 500
  bfd multiplier 3
  bfd fast-detect ipv4

 

9. Global BFD Parameters

The top-level BFD configuration mode is used to configure global BFD parameters (dampening, global echo source address, etc.) and interface-specific BFD parameters (disable/enable echo mode per interface, set echo source address, etc.):

 

bfd
 interface GigabitEthernet0/2/0/1
  echo disable
 !
 echo ipv4 source 10.0.0.21
 dampening secondary-wait 7500
 dampening initial-wait 3000
 dampening maximum-wait 180000

 

In IOS-XR, BFD Control and Echo packet intervals are not configured independently. If Echo mode is enabled, the user configures the BFD Echo interval, and the BFD Control interval is set to 2 seconds. If BFD Echo is disabled, the user configures the BFD Control interval. The same multiplier is applied to both values to calculate the ‘dead’ timer.

 

If Echo mode is enabled, the locally configured multiplier is used by the remote peer to calculate the detection time for BFD Control packets sent by the local peer. For locally generated Echo packets, the locally configured multiplier is used locally to calculate the detection time.

bfd multiplier.png

BFD Multiplier

(the fat green arrow indicates where the multiplier is applied)
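The timer rules above can be illustrated with a short calculation. This sketch assumes that both peers are configured with the same multiplier, so that the multiplier received from the peer equals the local one; the function name and return structure are illustrative only.

```python
ASYNC_INTERVAL_WITH_ECHO_MS = 2000  # Control interval when Echo mode is enabled


def xr_detection_times_ms(minimum_interval_ms: int,
                          multiplier: int,
                          echo_enabled: bool = True) -> dict:
    """Return the echo and async detection times in milliseconds.

    Assumes the peer uses the same multiplier (an assumption made here
    to keep the example self-contained).
    """
    if echo_enabled:
        # The configured interval applies to Echo packets;
        # Control packets fall back to the fixed 2 s interval.
        return {
            "echo": minimum_interval_ms * multiplier,
            "async": ASYNC_INTERVAL_WITH_ECHO_MS * multiplier,
        }
    # Echo disabled: the configured interval applies to Control packets.
    return {"echo": None, "async": minimum_interval_ms * multiplier}


print(xr_detection_times_ms(500, 3))         # echo 1500 ms, async 6000 ms
print(xr_detection_times_ms(1000, 3, False)) # async 3000 ms only
```

The first call corresponds to a session configured with "bfd minimum-interval 500" and "bfd multiplier 3" with Echo left enabled.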

 

10. BFD Echo And QoS

Injected and punted BFD packets have a preferential treatment on all line card architectures.

 

Egress QoS policy map needs to be considered to ensure preferential treatment of BFD Echo packets originated by the BFD peer. Sample configuration:

 

ipv4 access-list BFD
 5 permit udp any any eq 3785
!
class-map match-any BFDCLASS
 match access-group ipv4 BFD
!
policy-map OUT
 class BFDCLASS
  priority level 1
  police rate 10 kbps
!
interface GigabitEthernet0/2/0/1
 service-policy output OUT

 

If the above still doesn't prevent BFD flaps during severe congestion, create a parent shaper over this policy, shape it to 100% and apply the parent policy to the interface.

 

bfd echo and qos.png

 

 

11. BFD And HSRP

 

HSRP has to learn the physical addresses of the Active and Standby routers (from the HSRP control packets). Once one of these routers changes state, it is possible that another (third) router takes over as Standby. HSRP cannot assume that the same router will come back (configured the same way) following the failure; it must wait for the Standby to be newly elected and discover its physical address before it can recreate the BFD session to that address. As a consequence, the BFD session may flap when the HSRP state changes.

 

12. BFD Scale

 

NOTE: BFD scale limits differ per line card architecture and may change from one IOS-XR release to another.

 

BFD scale is determined by the rate at which BFD packets are allowed to be punted to the LC CPU, and by the pps limit and maximum session limit per LC. The exec command “show bfd summary” can be used to observe the limits on the router:

 

RP/0/RSP0/CPU0:ASR9K#show bfd summary
Node       All PPS usage    MP  PPS usage    Session number
           %   Used  Max    %   Used  Max    Total MP     Max
---------- ---------------- ---------------- ------------------
0/2/CPU0   0   2     16000  0   0     16000  1     0      8000
0/4/CPU0   0   0     9600   0   0     9600   0     0      4000
0/6/CPU0   0   0     16000  0   0     16000  0     0      8000

 

“All PPS usage” indicates the PPS limit across all BFD sessions on a given slot, and “Session number Max” indicates the maximum number of BFD sessions on a given slot.

 

For example, to calculate how many BFD sessions can be hosted presuming a 150ms Echo interval on slots 0/2 or 0/6 from the above sample:

 

  • BFD pps rate = 16000 pps
  • Echo packet interval = 150ms = 0.15s
  • ==> 16000 * 0.15 = 2400 sessions
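The same arithmetic can be written as a small helper that also applies the per-slot session cap from the “Session number Max” column. This is a sketch of the calculation shown above, not a statement of how the platform enforces its limits; the function name is illustrative.

```python
def max_sessions(pps_limit: int, interval_ms: int, session_cap: int) -> int:
    """Number of sessions a slot can host at a given packet interval.

    Each session sends one packet per interval, so the pps budget
    supports pps_limit * interval sessions, capped by the slot maximum.
    """
    by_pps = pps_limit * interval_ms // 1000
    return min(by_pps, session_cap)


# Slots 0/2 and 0/6 from the sample output: 16000 pps, 8000 sessions max.
print(max_sessions(16000, 150, 8000))  # 2400
# Slot 0/4 from the sample output: 9600 pps, 4000 sessions max.
print(max_sessions(9600, 150, 4000))   # 1440
```

At longer intervals the session cap, rather than the pps budget, becomes the limiting factor.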

 

Received BFD packets are policed by the NP. To see the rate of received BFD packets and to check for any drops, run the “show controllers np counters <np|all> location <location>” command and look for BFD counters.

 

 

Monitoring BFD

13. Monitoring BFD Via CLI

 

The essential commands for BFD monitoring are:

 

show bfd session [detail]

show bfd counters packet

 

Sample outputs for given configuration:

 

interface GigabitEthernet0/2/0/1
 ipv4 address 10.0.9.2 255.255.255.0
!
router isis lab
 net 49.0111.0111.0111.0111.00
 address-family ipv4 unicast
  redistribute connected
 !
 interface GigabitEthernet0/2/0/1
  bfd minimum-interval 500
  bfd multiplier 3
  bfd fast-detect ipv4

 

 

RP/0/RSP0/CPU0:ASR9K-1#show bfd session

Interface       Dest Addr           Local det time(int*mult)      State    

                                Echo             Async          

--------------- --------------- ---------------- ---------------- ----------

Gi0/2/0/1       10.0.9.1        1500ms(500ms*3)  6s(2s*3)         UP       

 

RP/0/RSP0/CPU0:WEST-PE-ASR9K-1#show bfd counters packet                  

GigabitEthernet0/2/0/1 Recv       Xmit                 Recv       Xmit

      Async:          44048     44182       Echo:    175384     175384

 

 

RP/0/RSP0/CPU0:ASR9K-1#show bfd session detail

I/f: GigabitEthernet0/2/0/1, Location: 0/2/CPU0

Dest: 10.0.9.1

Src: 10.0.9.2

State: UP for 0d:21h:4m:54s, number of times UP: 1

Session type: PR/V4/SH

Received parameters:

Version: 1, desired tx interval: 2 s, required rx interval: 2 s

Required echo rx interval: 1 ms, multiplier: 3, diag: None

My discr: 2148335618, your discr: 2148073474, state UP, D/F/P/C/A: 0/0/0/1/0

Transmitted parameters:

Version: 1, desired tx interval: 2 s, required rx interval: 2 s

Required echo rx interval: 1 ms, multiplier: 3, diag: None

My discr: 2148073474, your discr: 2148335618, state UP, D/F/P/C/A: 0/0/0/1/0

Timer Values:

Local negotiated async tx interval: 2 s

Remote negotiated async tx interval: 2 s

Desired echo tx interval: 500 ms, local negotiated echo tx interval: 500 ms

Echo detection time: 1500 ms(500 ms*3), async detection time: 6 s(2 s*3)

Local Stats:

Intervals between async packets:

   Tx: Number of intervals=100, min=1663 ms, max=1993 ms, avg=1834 ms

       Last packet transmitted 876 ms ago

   Rx: Number of intervals=100, min=1665 ms, max=2 s, avg=1832 ms

       Last packet received 272 ms ago

Intervals between echo packets:

   Tx: Number of intervals=50, min=25 s, max=25 s, avg=25 s

       Last packet transmitted 142 ms ago

   Rx: Number of intervals=50, min=25 s, max=25 s, avg=25 s

       Last packet received 137 ms ago

Latency of echo packets (time between tx and rx):

   Number of packets: 100, min=1 ms, max=5 ms, avg=3 ms

Session owner information:

                            Desired               Adjusted

  Client               Interval   Multiplier Interval   Multiplier

  -------------------- --------------------- ---------------------

  isis-escalation      500 ms     3          2 s        3        

 

 

Explanation of the relevant fields in the show bfd session detail command output:

 

Field Description
dest: 10.0.9.1 The Destination IP address in the BFD async control packet
src: 10.0.9.2 The source IP address used in the BFD async control packet
State: UP The current state of the BFD session at the local router
for 0d:21h:4m:54s Indicates the time for which this BFD session has maintained this State
number of times UP: 1 Indicates the number of times this session has been UP since the last flap or establishment
Received parameters: Explanation of the received parameters is as below
Version: 1 BFD protocol version running on the Remote router
desired tx interval: 2 s The Async packet interval used by Remote router in sending
required rx interval: 2 s The Async packet interval expected by Remote router of this Router
Required echo rx interval: 1 ms The echo interval that can be supported by the Remote router
multiplier: 3 The multiplier that the Remote router chooses to use
diag: None The reason indicated by the remote router if the BFD session is not in UP state on its side
My discr: 2148335618 The Discriminator used by the Remote Router for this BFD session
your discr: 2148073474 The Discriminator belonging to this Router as understood by Remote router for this BFD session
state UP The state of BFD session at the Remote router
D/F/P/C/A: 0/0/0/1/0 Bit settings as seen in the Async control packet sent by Remote router:
D - Set if Demand Mode is used by Remote router
F - Final bit in the BFD packet sent by remote router. Remote router is ACKing change in BFD parameter sent by Local router
P - Poll bit in the BFD packet sent by remote router. Remote router has changed some BFD parameters and waits for ACK from Local router
C - Is BFD independent of the Control plane on Remote Router
A - Is Authentication used by Remote router for this session
Transmitted parameters: Explanation of the transmitted parameters
Version: 1 BFD protocol version running on the Local router
desired tx interval: 2 s The Async packet interval used by Local router in sending
required rx interval: 2 s The Async packet interval expected of the Remote router by this Router
Required echo rx interval: 1 ms The echo interval that can be supported by the Local router
multiplier: 3 The multiplier that the Local router chooses to use
diag: None The reason indicated by local router if at all the BFD session is not in UP state
My discr: 2148073474 The Discriminator used by the local Router for this BFD session
your discr: 2148335618 The Discriminator belonging to remote Router as understood by the local router for this BFD session
state UP The state of BFD session at the Local router
D/F/P/C/A: 0/0/0/1/0 Bit settings as seen in the Async control packet sent by the local router:
D - Set if Demand Mode is used by local router
F - Final bit in the BFD packet sent by local router. Local Router is ACKing a change in BFD parameter sent by Remote router
P - Poll bit in the BFD packet sent by local router. Local Router is indicating a change in BFD parameter to the Remote Router
C - Set if BFD is independent of the Control plane on the local Router. In the IOS XR implementation, this will be set.
A - Set if Authentication used by local router for this session
Timer Values: Details of the timer values exchanged between Local and Remote routers
Local negotiated async tx interval: 2 s The Async interval used by Local Router for transmit after negotiation
Remote negotiated async tx interval: 2 s The Async interval used by Local Router for reception after negotiation
Desired echo tx interval: 500 ms The configured (or default) echo interval used by Local router.
local negotiated echo tx interval: 500 ms The higher of the two values negotiated between the Local router (configured minimum-interval or default value) and the Remote router (published capability for Echo receive interval). In the case of IOS-XR, the published interval is 1 ms. Between (500 ms, 1 ms), 500 ms is chosen.
Echo detection time: 1500 ms(500 ms*3) The product of local negotiated echo tx interval and the multiplier configured locally
async detection time: 6 s(2 s*3) The product of Remote negotiated async tx interval and the multiplier sent by the remote router
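Most of the fields in this table map directly onto the mandatory section of the BFD Control packet defined in RFC 5880. As an illustration, the sketch below parses that 24-byte section and extracts the same values, including the D/F/P/C/A bits. The function name and the dictionary keys are chosen here for illustration; the sample packet is built by hand from the values in the output above.

```python
import struct


def parse_bfd_control(packet: bytes) -> dict:
    """Parse the 24-byte mandatory section of a BFD Control packet."""
    if len(packet) < 24:
        raise ValueError("BFD Control packet is shorter than 24 bytes")
    (vers_diag, flags, detect_mult, length, my_discr, your_discr,
     desired_min_tx_us, required_min_rx_us,
     required_min_echo_rx_us) = struct.unpack("!BBBBIIIII", packet[:24])
    return {
        "version": vers_diag >> 5,        # top 3 bits
        "diag": vers_diag & 0x1F,         # low 5 bits
        "state": flags >> 6,              # top 2 bits
        "P": (flags >> 5) & 1,            # Poll
        "F": (flags >> 4) & 1,            # Final
        "C": (flags >> 3) & 1,            # Control plane independent
        "A": (flags >> 2) & 1,            # Authentication present
        "D": (flags >> 1) & 1,            # Demand
        "multiplier": detect_mult,
        "length": length,
        "my_discr": my_discr,
        "your_discr": your_discr,
        "desired_min_tx_us": desired_min_tx_us,        # microseconds
        "required_min_rx_us": required_min_rx_us,
        "required_min_echo_rx_us": required_min_echo_rx_us,
    }


# Hand-built sample: version 1, diag 0, state UP (3), C bit set,
# multiplier 3, 2 s async intervals, 1 ms echo rx interval.
sample = struct.pack("!BBBBIIIII",
                     (1 << 5) | 0, (3 << 6) | (1 << 3), 3, 24,
                     2148335618, 2148073474,
                     2_000_000, 2_000_000, 1_000)
fields = parse_bfd_control(sample)
print("D/F/P/C/A:", "/".join(str(fields[k]) for k in "DFPCA"))
```

Note that the intervals are carried in microseconds on the wire, while the show command prints them in seconds or milliseconds.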

 

 

State and Diag

In traces and show commands the state is sometimes given as an enumeration value. The tables below decode the State and Diag numbers.

 

STATE:

 

 

State Explanation
State = 0 Admin down (configured)
State = 1 DOWN (loss reported)
State = 2 INIT (starting up)
State = 3 UP (BFD session running normally)

 

 

DIAG:

 

 

DIAG Explanation
Diag = 0 DIAG_NONE/reserved
Diag = 1 Control Expired
Diag = 2 Echo Failed
Diag = 3 Neighbor down
Diag = 4 Forwarding plane reset
Diag = 5 Path down
Diag = 6 Concatenated path down
Diag = 7 Admin down
Diag = 8 Reverse concatenated path down
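When processing traces in a script, the two tables can be kept as lookup dictionaries. The sketch below is a convenience helper only; the names `STATE_NAMES`, `DIAG_NAMES` and `describe` are illustrative and do not correspond to any IOS XR tool.

```python
STATE_NAMES = {
    0: "Admin down",
    1: "DOWN",
    2: "INIT",
    3: "UP",
}

DIAG_NAMES = {
    0: "None",
    1: "Control Expired",
    2: "Echo Failed",
    3: "Neighbor down",
    4: "Forwarding plane reset",
    5: "Path down",
    6: "Concatenated path down",
    7: "Admin down",
    8: "Reverse concatenated path down",
}


def describe(state: int, diag: int) -> str:
    """Turn numeric State/Diag values from a trace into readable text."""
    state_name = STATE_NAMES.get(state, f"unknown state {state}")
    diag_name = DIAG_NAMES.get(diag, f"unknown diag {diag}")
    return f"{state_name} (diag: {diag_name})"


print(describe(1, 2))
print(describe(3, 0))
```

Values outside the tables are reported as unknown rather than raising an error, which is convenient when scanning large traces.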

 

 

 

14. Monitoring BFD Via SNMP

 

For SNMP monitoring use CISCO-IETF-BFD-MIB OIDs.

 

 

Multi-Hop And Multi-Path BFD

 

A multi-hop BFD session runs between two endpoints, across one or more layer 3 forwarding devices. A key aspect of a BFD multi-hop session is that the return path may be asymmetric, hence it is also a multi-path session by definition.

In ASR 9000 nV Edge solutions (aka cluster), BFD multi-path sessions are supported starting with XR release 5.2.2.

 

15. BFD Async Mode In Multi-Hop BFD Session

On the ASR 9000, a BFD instance may only run on a line card CPU. The user must manually designate the line card CPU on which multi-hop BFD sessions will run. If multiple line card CPUs are designated, multi-hop BFD sessions are load-balanced between them. There is no mechanism to bind a specific BFD session to a specific LC CPU. The line card hosting the BFD session doesn't have to be the one that owns the transport interfaces through which BFD packets are exchanged.

 

BFD Async packets pertaining to a single BFD session are always processed by the same LC CPU. The discriminator field in the BFD packet header is used to determine the destination LC CPU for received BFD packets.

 

BFD packets are always originated with TTL=255. To effectively impose a limit on the number of hops that a BFD packet may traverse, a TTL threshold can be applied to received BFD packets. If the TTL is less than the configured threshold, the packet is dropped. The BFD TTL threshold is configurable only in global BFD configuration mode.
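The relationship between the TTL threshold and the permitted hop count can be shown with a short calculation. The sketch assumes that every layer 3 hop on the path decrements the TTL by exactly one; the function names are illustrative.

```python
ORIGINATED_TTL = 255  # BFD packets are always originated with TTL=255


def max_hops(ttl_drop_threshold: int) -> int:
    """Largest number of TTL decrements a packet can undergo and still
    be accepted (assumes one decrement per layer 3 hop)."""
    return ORIGINATED_TTL - ttl_drop_threshold


def is_accepted(received_ttl: int, ttl_drop_threshold: int) -> bool:
    """A packet is dropped if its TTL is less than the threshold."""
    return received_ttl >= ttl_drop_threshold


print(max_hops(240))          # 15
print(is_accepted(240, 240))  # True
print(is_accepted(239, 240))  # False
```

A threshold of 240 therefore tolerates up to 15 TTL decrements between the two BFD endpoints.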

 

16. BFD Echo Mode In Multi-Hop BFD Session

 

BFD Echo mode is not supported in multi-hop BFD.

 

17. Configuring A Multi-Hop BFD Session

 

Sample multi-hop BFD configuration for BGP:

 

bfd
 multihop ttl-drop-threshold 240        !<-- optional
 multipath include location 0/1/CPU0    !<-- mandatory
!
router bgp 100
 address-family ipv4 unicast
 !
 address-family vpnv4 unicast
 !
 neighbor 10.0.0.22
  remote-as 100
  bfd fast-detect
  bfd multiplier 3
  bfd minimum-interval 1000

 

Since echo mode is not supported, configured timers are applied to the async mode.

 

18. QOS Considerations In A Multi-Hop BFD Session

 

Multipath BFD session packets are not injected into the high priority egress queue. To ensure that BFD packets are treated as high priority, a suitable QoS policy must be created and applied not only at the source of the BFD session, but also along the path.

 

ipv4 access-list BFD
 5 permit udp any any eq 4784
!
class-map match-any BFDCLASS
 match access-group ipv4 BFD
!
policy-map OUT
 class BFDCLASS
  priority level 1
  police rate 10 kbps
!
interface GigabitEthernet0/2/0/1
 service-policy output OUT

 

19. Monitoring A Multi-Hop BFD Session

 

RP/0/RSP0/CPU0:ASR9K#show bfd session detail

Mon Apr 15 16:27:28.991 UTC

 

Location: 0/1/CPU0    ! <------ LC CPU hosting the BFD session

Dest: 10.0.0.22

Src: 10.0.0.52

VRF Name/ID: default/0x60000000

State: UP for 0d:0h:45m:14s, number of times UP: 1

Session type: SW/V4/MH    ! <------ Multi-Hop

<...>

Timer Values:

Local negotiated async tx interval: 1 s   ! <------ min interval

Remote negotiated async tx interval: 1 s

async detection time: 3 s(1 s*3)   ! <------ multiplier

<...>

MP download state: BFD_MP_DOWNLOAD_ACK    ! <------ Multi-Path

State change time: Apr 15 15:40:16.123

Session owner information:

                            Desired               Adjusted

  Client               Interval   Multiplier Interval   Multiplier

  -------------------- --------------------- ---------------------

  bgp-default          1 s        3          1 s        3      

 

 

 

BFD Hardware Offload

Introduction

When BFD Hardware Offload is enabled, async packets are no longer generated and received by the LC CPU but by the Network Processor (NP), which increases the BFD scale. BFD HW Offload is supported on Enhanced Ethernet line cards (aka "Typhoon"), on single-hop IPv4/IPv6 sessions over physical interfaces and sub-interfaces, and on MPLS-TP LSP Single-Path sessions.

Configuring BFD HW Offload

Configure "hw-module bfd-hw-offload enable" in the admin configuration mode and reload the line card for the command to take effect.

RP/0/RSP1/CPU0:9K(admin-config)#hw-module bfd-hw-offload enable location 0/0/CPU0

RP/0/RSP1/CPU0:9K(admin-config)#commit
RP/0/RSP1/CPU0:9K(admin-config)#end
RP/0/RSP1/CPU0:9K#hw-module location 0/0/CPU0 reload
WARNING: This will take the requested node out of service.
Do you wish to continue?[confirm(y/n)]y

Scale and Restrictions

Refer to XR release 5.3 routing configuration guide.

Monitoring BFD HW Offload

RP/0/RSP1/CPU0:9K#sh bfd session detail
I/f: TenGigE0/0/0/12.1, Location: 0/0/CPU0
Dest: 201.1.1.1
Src: 201.1.1.2
 State: UP for 0d:0h:0m:14s, number of times UP: 1
 Session type: PR/V4/SH
Received parameters:
 Version: 1, desired tx interval: 150 ms, required rx interval: 150 ms
 Required echo rx interval: 1 ms, multiplier: 3, diag: None
 My discr: 2148335618, your discr: 2147549186, state UP, D/F/P/C/A: 0/0/0/1/0
Transmitted parameters:
 Version: 1, desired tx interval: 300 ms, required rx interval: 300 ms
 Required echo rx interval: 0 us, multiplier: 3, diag: None
 My discr: 2147549186, your discr: 2148335618, state UP, D/F/P/C/A: 0/1/0/1/0
Timer Values:
 Local negotiated async tx interval: 300 ms
 Remote negotiated async tx interval: 300 ms
 Desired echo tx interval: 0 s, local negotiated echo tx interval: 0 us
 Echo detection time: 0 us(0 us*3), async detection time: 900 ms(300 ms*3)
Local Stats:
 Intervals between async packets:
   Tx: Number of intervals=2, min=5 ms, max=3469 ms, avg=1737 ms
       Last packet transmitted 14 s ago
   Rx: Number of intervals=6, min=5 ms, max=1865 ms, avg=444 ms
       Last packet received 13 s ago
 Intervals between echo packets:
   Tx: Number of intervals=0, min=0 s, max=0 s, avg=0 s
       Last packet transmitted 0 s ago
   Rx: Number of intervals=0, min=0 s, max=0 s, avg=0 s
       Last packet received 0 s ago
 Latency of echo packets (time between tx and rx):
   Number of packets: 0, min=0 us, max=0 us, avg=0 us
Session owner information:
                            Desired               Adjusted
  Client               Interval   Multiplier Interval   Multiplier
  -------------------- --------------------- ---------------------
  isis-Escalation      150 ms     3          300 ms     3

H/W Offload Info:
 H/W Offload capability : Y, Hosted NPU     : 0/0/CPU0/NPU4
 Async Offloaded        : Y, Echo Offloaded : N
 Async rx/tx            : 7/3

Platform Info:
NPU ID: 4
Async RTC ID        : 1          Echo RTC ID        : 0
Async Feature Mask  : 0x8        Echo Feature Mask  : 0x0
Async Session ID    : 0x0        Echo Session ID    : 0x0
Async Tx Key        : 0x801  Echo Tx Key        : 0x0
Async Tx Stats addr : 0x205ee800   Echo Tx Stats addr : 0x0
Async Rx Stats addr : 0x215ee800   Echo Rx Stats addr : 0x0
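
The timer values in the output above can be reproduced from the received and transmitted parameters. The following is a minimal sketch, assuming the standard BFD negotiation rules (a system transmits at the greater of its own desired TX interval and the peer's required RX interval, and the detection time is the peer's multiplier times the agreed peer transmit interval). The function names are illustrative only and are not part of IOS XR.

```python
def negotiated_tx_interval_ms(own_desired_tx_ms, peer_required_rx_ms):
    """Interval at which a system may send async control packets."""
    return max(own_desired_tx_ms, peer_required_rx_ms)


def async_detection_time_ms(own_required_rx_ms, peer_desired_tx_ms, peer_multiplier):
    """Time without received packets after which the session is declared down."""
    return peer_multiplier * max(own_required_rx_ms, peer_desired_tx_ms)


# Values taken from the "Transmitted parameters" (local) and
# "Received parameters" (remote) sections of the sample output.
local = {"desired_tx": 300, "required_rx": 300, "multiplier": 3}
remote = {"desired_tx": 150, "required_rx": 150, "multiplier": 3}

local_tx = negotiated_tx_interval_ms(local["desired_tx"], remote["required_rx"])
remote_tx = negotiated_tx_interval_ms(remote["desired_tx"], local["required_rx"])
detection = async_detection_time_ms(
    local["required_rx"], remote["desired_tx"], remote["multiplier"]
)

print(local_tx, remote_tx, detection)
```

With these inputs the sketch yields 300 ms for both negotiated intervals and 900 ms for the async detection time, matching the "Timer Values" section.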

 

BFD Over Bundle Member Interfaces

20. Introduction

 

Implementation of various BFD flavours over bundle interfaces in IOS XR was carried out in 3 phases:

 

  1. IPv4 BFD sessions over individual bundle sub-interfaces. This feature was called “BFD over VLAN over bundle”.
  2. IOS XR releases 4.0.1 and beyond: the “BFD Over Bundle (BoB)” feature was introduced.
  3. IOS XR releases 4.3.0 and beyond: full support for IPv4 and IPv6 BFD sessions over bundle interfaces and sub-interfaces. For disambiguation from the BoB feature, this implementation is called BLB, and sessions are often referred to as native BFD sessions over bundle interfaces and/or sub-interfaces. BFD multipath must be enabled for any of these native BFD sessions to work.

 

 

21. “BFD Over VLAN Over Bundle” Feature Operation

 

“BFD Over VLAN Over Bundle” refers to a static or dynamic IPv4 BFD session over a bundle-ethernet sub-interface. No other flavours of BFD over bundle interfaces or sub-interfaces were supported.

 

From a configuration and monitoring perspective, this feature was identical to BFD over a physical interface or sub-interface.

 

Example of a “BFD over VLAN over bundle” protecting a static route:

 

interface Bundle-Ether1.10

ipv4 address 10.0.10.1 255.255.255.0

encapsulation dot1q 10

!

router static

address-family ipv4 unicast

  172.16.1.1/32 Bundle-Ether1.10 10.0.10.2 bfd fast-detect minimum-interval 500 multiplier 3

 

Example of a “BFD over VLAN over bundle” session with IGP peer:

 

interface Bundle-Ether1.20

ipv4 address 10.0.20.1 255.255.255.0

encapsulation dot1q 10

!

router isis lab

net 49.0111.0111.0111.0111.00

address-family ipv4 unicast

  redistribute connected

!

interface Bundle-Ether1.20

  bfd minimum-interval 500

  bfd multiplier 3

  bfd fast-detect ipv4

 

 

22. “BFD Over Bundle (BoB)” Feature Operation

 

With the “BFD Over Bundle (BoB)” feature, an IPv4 BFD session runs over every active bundle member.

 

bob.png

Bundlemgr considers BFD states, in addition to the existing L1/L2 states, to determine member link usability. Bundle member state is a function of:

  1. L1 state (physical link)
  2. L2 state (LACP)
  3. L3 state (BFD)

 

The BFD agent still runs on the line card. BFD states of bundle member links are consolidated on the RP.

 

Member links must be connected back-to-back, without any L2 switches in between.

 

Async Mode

 

BoB supports both async mode and echo mode. Async mode is very similar to BFD over a physical interface, as async packets run over each individual bundle member.

 

bob async.png

 

Echo Mode

 

Echo mode is also very similar to BFD over a physical interface, with some special considerations for the TX/RX path. One member link per line card is designated as the Echo Active member. The Echo Active member simultaneously sends a sufficient number of Echo packets to ensure that at least one is returned over each active member link. The BoB Echo packet header contains a discriminator field identifying the originating rack and slot, and a sequence number that the peer uses for hashing the Echo packets on the return path. Returned Echo packets are punted to the LC CPU only if they were originated by the local Echo Active member.
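
The idea that a single Echo Active member sends several echo packets so that every member link carries one on the return path can be illustrated with a small simulation. This is only a sketch: the modulo hash below is a hypothetical stand-in, since the hash the peer actually applies to the sequence number is not described here, and the function names are made up for illustration.

```python
def return_link(sequence, member_links):
    # Hypothetical hash: the peer picks the return link from the
    # sequence number. The real hash function is an assumption here.
    return member_links[sequence % len(member_links)]


def sequences_covering_all_links(member_links, max_sequence=1024):
    """Find one echo sequence number per member link on the return path."""
    covered = {}
    for sequence in range(max_sequence):
        link = return_link(sequence, member_links)
        covered.setdefault(link, sequence)
        if len(covered) == len(member_links):
            break
    return covered


links = ["Gi0/0/0/2", "Gi0/0/0/3"]
coverage = sequences_covering_all_links(links)
print(coverage)
```

Under this assumed hash, a bundle with two active members needs two echo packets per interval, all transmitted by the Echo Active member.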

 

bob echo.png

By default, BFD Over Bundle member link runs in echo mode.

 

L2 Encapsulation

 

Ethernet frames carrying BoB packets are sent with CDP destination MAC address 0100.0CCC.CCCC.

 

On POS bundles, only the cHDLC encapsulation is supported.

 

BFD Over Bundle (BoB) Configuration

 

BFD per member link is enabled or disabled in the bundle interface configuration submode.

 

interface Bundle-Ether1

bfd address-family ipv4 timers start 60

bfd address-family ipv4 timers nbr-unconfig 3600

bfd address-family ipv4 multiplier 3

bfd address-family ipv4 destination 1.2.3.4

bfd address-family ipv4 fast-detect

bfd address-family ipv4 minimum-interval 300

 

There is no need to configure BFD in the IGP. When a BFD session fails, the bundle manager is notified immediately. If the number of active members falls below the configured minimum, the bundle is immediately brought down, which in turn brings down the IGP peering.
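
The relationship between member link state and bundle state described above can be sketched as follows. This is an illustrative model only, not the bundle manager implementation; the `Member` class and the `minimum_active` parameter are hypothetical names standing in for the member state inputs (L1, L2, BFD) and the configured minimum number of active links.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    l1_up: bool   # physical link
    l2_up: bool   # LACP
    bfd_up: bool  # BoB session on this member

    @property
    def usable(self):
        # A member is usable only if all three layers report up.
        return self.l1_up and self.l2_up and self.bfd_up


def bundle_is_up(members, minimum_active):
    """The bundle stays up while enough members remain usable."""
    active = sum(1 for member in members if member.usable)
    return active >= minimum_active


members = [
    Member("Gi0/0/0/2", l1_up=True, l2_up=True, bfd_up=True),
    Member("Gi0/0/0/3", l1_up=True, l2_up=True, bfd_up=False),  # BFD failure
]

print(bundle_is_up(members, minimum_active=1))
print(bundle_is_up(members, minimum_active=2))
```

In this model a single BFD failure leaves the bundle up when one active link is enough, and brings it down (together with any IGP adjacency running over it) when two are required.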

 

The fast-detect and destination commands are mandatory for bringing the session up.

 

The nbr-unconfig option was created to allow disabling BFD per member link without interrupting the bundle interface.

 

Async mode can be forced by disabling echo mode in any of the following sub-modes:

  • global BFD configuration → applies to all sessions.
  • bundle interface configuration → applies to all member link sessions for this bundle interface.
  • member link configuration → applies only to that member link session.

 

The echo packet source IP address is configurable with the same granularity.

 

Monitoring BFD Over Bundle (BoB)

Monitoring BoB Via CLI

In the example output below, BFD runs on Bundle-Ether1, with members Gi0/0/0/2 and Gi0/0/0/3:

 

RP/0/RP0/CPU0:ASR9K#show bfd session

Interface    Dest Addr           Local det time(int*mult)      State   

                                   Echo            Async

----------- --------------- ---------------- ---------------- -----

Gi0/0/0/2   192.168.1.2     1350ms(450ms*3)  450ms(150ms*3)   UP      

Gi0/0/0/3   192.168.1.2     1350ms(450ms*3)  450ms(150ms*3)   UP      

BE1         192.168.1.2     n/a              n/a              UP      

 

 

RP/0/RP0/CPU0:ASR9K#sh bfd count packet private detail location 0/0/CPU0

Bundle-Ether1             Recv     Rx Invalid   Xmit   Delta

    Async:                 0        0              0        

    Echo:                  0        0              0    0

GigabitEthernet0/0/0/2    Recv     Rx Invalid   Xmit   Delta

     Async:                1533     0            1528

     Echo:                (14976  ) 0            29952  14976

GigabitEthernet0/0/0/3    Recv     Rx Invalid   Xmit   Delta

    Async:                 1533     0            1529

    Echo:                  (14976 ) 0            0      14976    

 

Monitoring BoB Via SNMP

 

Since the BFD session states are consolidated on the RP, SNMP get works only for basic CISCO-IETF-BFD-MIB OIDs. OIDs that can be used for monitoring BoB are:

  • ciscoBfdSessIndex
  • ciscoBfdSessDiscriminator
  • ciscoBfdSessState
  • ciscoBfdSessAddrType
  • ciscoBfdSessAddr
  • ciscoBfdSessVersionNumber
  • ciscoBfdSessType
  • ciscoBfdSessInterface
  • ciscoBfdSessUpTime
  • ciscoBfdSessPerfLastSessDownTime
  • ciscoBfdSessPerfSessUpCount

 

 

23. Native BFD Over VLAN Over Bundle Interfaces And Sub-Interfaces (BLB)

 

IOS XR release 4.3.0 introduced full support for IPv4 and IPv6 BFD sessions over bundle interfaces and sub-interfaces. For disambiguation from the BoB feature, this implementation is called BLB, and sessions are often referred to as native BFD sessions over bundle interfaces and/or sub-interfaces.

 

BLB operation relies on BFD multipath. Hence, echo mode is not supported.

 

Configuring BLB

BFD multipath must be enabled for any of these BFD flavours to work. All other configuration aspects are identical to “BFD over VLAN over bundle” configuration.

 

bfd

multipath include location 0/1/CPU0

!

interface Bundle-Ether1.10

ipv4 address 10.52.61.1 255.255.255.0

encapsulation dot1q 10

!

router static

address-family ipv4 unicast

  10.52.61.2/32 Bundle-Ether1.10 10.52.61.2 bfd fast-detect minimum-interval 900 multiplier 3

 

BLB QoS Considerations

As in any multipath BFD session, egress BFD packets are not injected into the high priority egress queue. To ensure that BFD packets are treated as high priority, a suitable QoS policy must be created and applied to the bundle (sub)interface.

 

Monitoring BLB

 

RP/0/RSP0/CPU0:ASR9K#sh bfd session interface bundle-e1.10 detail

Fri Apr 19 12:19:34.330 UTC

I/f: Bundle-Ether1.10, Location: 0/1/CPU0 ! <------ LC CPU hosting the BFD session

Dest: 10.52.61.2

Src: 10.52.61.1

State: UP for 0d:2h:6m:58s, number of times UP: 1

Session type: SW/V4/SH/BL    ! <------ BLB session

< ... >

Timer Values:

Local negotiated async tx interval: 900 ms   ! <--- configured timers applied to async

Remote negotiated async tx interval: 900 ms

Desired echo tx interval: 0 s, local negotiated echo tx interval: 0 s

Echo detection time: 0 s(0 s*3), async detection time: 2700 ms(900 ms*3)

< ... >

MP download state: BFD_MP_DOWNLOAD_ACK    ! <------ Multi-Path

State change time: Apr 19 10:12:32.593

Session owner information:

                            Desired               Adjusted

  Client               Interval   Multiplier Interval   Multiplier

  -------------------- --------------------- ---------------------

  ipv4_static          900 ms     3          900 ms     3        

 

 

24. BoB-BLB Coexistence

 

BoB-BLB coexistence is enabled using the “bundle coexistence bob-blb logical” global BFD configuration command:

 

bfd

multipath include location 0/1/CPU0

bundle coexistence bob-blb logical

 

 

25. BFD Session Inheritance In BFD Over Bundle Member Implementations

 

Due to scaling considerations, only IPv4 BFD sessions run over member links. Application IPv6 sessions inherit their state from the IPv4 BFD session for the same bundle.

 

When the BoB feature is enabled, “BFD over VLAN over bundle” sessions (i.e. sessions on bundle sub-interfaces) of all address families also inherit their state from the IPv4 BFD session for the corresponding bundle interface.

 

                                BFD Session State
                                Up to IOS XR release 4.2.3                   Starting with IOS XR release 4.3.0
BFD Session Type                With BoB        Without BoB                  With BoB        Without BoB  BoB-BLB coexistence
------------------------------  --------------  ---------------------------  --------------  -----------  -------------------
IPv4 over bundle interface      Active          Dormant                      Active          Native       Active
IPv4 over bundle sub-interface  Inherits state  "BFD Over VLAN Over Bundle"  Inherits state  Native       Native
IPv6 over bundle interface      Inherits state  Dormant                      Inherits state  Native       Native
IPv6 over bundle sub-interface  Inherits state  Dormant                      Inherits state  Native       Native

 

 

 

Troubleshooting BFD

26. BFD Session Not Coming Up

 

Was the session created by the application?

  • Use the "show bfd all session" command to see if the session has been created. If the session is not visible using the bfd show command, check that the adjacency for which the BFD session is being created exists in the application.

 

Can you ping the destination address for which the BFD session is to be created?

  • The BFD session will only be created by an application for which the application adjacency/neighbour exists.

 

Has BFD been configured on both sides?

 

Has BFD been configured with correct destination address?

  • Using non-default IP addresses with BFD requires special attention. The source IP address being used for BFD sessions can be verified with "show bfd all session detail" command.

 

If the session is visible using the BFD show commands but is not UP, and the destination IP address of the BFD peer responds to ping:

  • Execute the "show bfd all session detail" command and check whether the "Local Stats" section of the output indicates that packets are being transmitted and received.
  • If not, check which NP corresponds to the interface in question and whether that NP is receiving BFD packets by running:

sh controllers np ports all location <location>

sh controllers np counters <np> location <location> | inc "Rate|BFD"

sh uidb data location <location> <interface> ingress

sh uidb data location <location> <interface> ing-extension

 

If the MAX-PPS limit is reached, the BFD session is kept in the ADMIN_DOWN state. In this scenario the message BFD-6-SESSION_NO_RESOURCES is reported in the syslog, e.g.:

 

RP/0/RP1/CPU0:Mar 13 09:51:06.556 : bfd[143]: %L2-BFD-6-SESSION_NO_RESOURCES : No resources for session to neighbor 10.10.10.2 on interface TenGigE0/1/0/2, interval=300 ms
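
When many sessions are affected, it can help to extract the neighbor, interface and interval from these messages and relate the interval to a packet-rate budget. The sketch below assumes the message format shown in the sample above. The helper names are hypothetical, and the 16000 pps figure is used purely as an example budget; the actual limit depends on the line card and release.

```python
import re

NO_RESOURCES = re.compile(
    r"%L2-BFD-6-SESSION_NO_RESOURCES : No resources for session to neighbor "
    r"(?P<neighbor>\S+) on interface (?P<interface>[^,\s]+), "
    r"interval=(?P<interval_ms>\d+) ms"
)


def parse_no_resources(line):
    """Return neighbor, interface and interval from a syslog line, or None."""
    match = NO_RESOURCES.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["interval_ms"] = int(fields["interval_ms"])
    return fields


def max_sessions(budget_pps, interval_ms):
    """Sessions that fit in a pps budget when each sends 1000/interval_ms pps."""
    return budget_pps * interval_ms // 1000


log_line = (
    "RP/0/RP1/CPU0:Mar 13 09:51:06.556 : bfd[143]: "
    "%L2-BFD-6-SESSION_NO_RESOURCES : No resources for session to neighbor "
    "10.10.10.2 on interface TenGigE0/1/0/2, interval=300 ms"
)

parsed = parse_no_resources(log_line)
print(parsed)
print(max_sessions(16000, 150))  # example budget, not a platform constant
```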

 

Is the allowed policer rate for BFD packets reached?

  • Run “show controllers np counters <np|all> location <location>” and look for BFD packet and drop counters.

 

Does this BFD session require BFD multi-path?

  • Look for %L2-BFD-6-SESSION_NO_RESOURCES message in the syslog
  • Run 'show bfd session detail | inc "^(Location|MP)"' to confirm that a line card CPU was designated to host the BFD session.

 

27. BFD Session Going Down

Determine if there really is a connectivity issue that BFD has rightly discovered.

  • Run a continuous ping to local i/f address (which is used by local echo and remote async) and remote i/f address (used by remote echo and local async) with a timeout of 1s and repeat the test.

 

Determine if the application adjacency has flapped (for example after RSP failover), which would cause the BFD session to be removed by the application.

  • Examine the "show logging | inc bfd_agent" output on the neighbour to confirm whether the session was removed by the application.

 

Determine if the outgoing interface for a next-hop IP address changed (for example after APS link switch), which would cause the BFD session to be removed by the application, followed by creation of the BFD session with new outgoing interface.

  • Examine the "show logging | inc bfd_agent" output on the neighbour to confirm whether the session was removed by the application.

 

If the log says "Nbor signalled down", check the logs on the peer to verify if the application has removed the session or if the routing adj has flapped.

 

If BFD is flapping continuously because of echo failure, it can be due to:

  • The link is over-subscribed. Async and echo packets are sent to the high priority queue by the sender, but on the remote router, which switches the echo packets back, they are treated as regular data packets. With over-subscription they can therefore be dropped. The workaround is to add a QoS policy at each end that sends UDP port 3785 packets to the high priority queue.
  • uRPF is configured on the interface. In this case the local interface address (self-ping) and the remote router ID must be reachable.
  • There could be timing issues where (e.g. because of SPF delay in IGPs) the remote router-id is installed in CEF after the BFD session comes up. This also causes flaps due to echo packet drops. Good practice is to configure BFD dampening (delay of the BFD session initialisation):

bfd

dampening secondary-wait 7500

dampening initial-wait 3000

dampening maximum-wait 180000

 

  • BFD flapping due to echo failure but no echo packet loss. This is usually due to delay in the echo packets, and one of the possible causes is traffic shaping. To check whether echo packets are delayed or dropped, run "show bfd counters packet private location <location>". If the tx and rx counts differ, echo packets are being dropped. If the tx and rx counts are the same but the "Rx Invalid" count is non-zero, echo packets are being delayed or are coming back on the wrong interface.
  • BFD flapping after configuration change related to the BFD session. Clear the stale data by executing clear bfd persistent-data unassociated location <location>.
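
The echo counter check from the list above can be written down as a small decision function. This is a sketch of the reasoning only; the function name and return strings are illustrative, and the counter values have to be read manually from the "show bfd counters packet private" output. For BoB, the echo Xmit count of the Echo Active member is compared against the sum of the echo Recv counts of all members.

```python
def diagnose_echo(tx, rx, rx_invalid):
    """Interpret echo Xmit, Recv and Rx Invalid counters for one session."""
    if tx != rx:
        return "echo packets are being dropped"
    if rx_invalid > 0:
        return "echo packets are delayed or returning on the wrong interface"
    return "no echo loss or delay indicated"


# Counters from the BoB sample output: Gi0/0/0/2 is the Echo Active member
# (Xmit 29952) and both members received 14976 echo packets each.
print(diagnose_echo(tx=29952, rx=14976 + 14976, rx_invalid=0))
```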

 

BFD flaps on a clear route.

  • This is expected behaviour: the "clear route" command clears all FIB entries in software and hardware, so a forwarding outage occurs and BFD is expected to fail. This can be confirmed by running continuous local and remote pings. See TRG 17921. A similar issue arises when fib_mgr crashes, which can cause a forwarding disruption due to shmem init.

 

28. Essential Information To Collect Before Contacting TAC

 

Collect the following on both peers terminating the BFD session:

 

show tech-support routing bfd file

show logging

 

 

BFD Packet Format

 

BFD has two versions: Version 0 and Version 1. The difference in Version 1 is that the “STA” bits replace the “H” bit of Version 0.

29. BFD V0 Control Packet

bfd header v0.png

 

30. BFD V1 Control Packet

bfd header v1.png

 

31. Relevant Header Fields

  • Vers: Version of BFD control header. XR runs version 1 as default, but can run version 0 as well.
  • Diag: A diagnostic code specifying the local system's reason for the last change in session state, detection time expired, echo failed, etc.
  • Sta: The current BFD session state as seen by the transmitting system.
  • P: Poll bit, if set, the transmitting system is requesting verification of connectivity, or of a parameter change, and is expecting a packet with the Final (F) bit in reply.
  • F: Final bit, if set, the transmitting system is responding to a received BFD Control packet that had the Poll (P) bit set.
  • C: Set if BFD is independent of the Control plane. In the Cisco implementation this bit is set.
  • A: Set if Authentication is used. In the Cisco implementation authentication is not supported.
  • D: Set if Demand Mode is used.
  • Detect Mult: Detection time multiplier.  The negotiated transmit interval, multiplied by this value, provides the Detection Time for the transmitting system in Asynchronous mode.
  • My Discriminator: A unique, nonzero discriminator value generated by the transmitting system, used to demultiplex multiple BFD sessions between the same pair of systems. Rack, Slot and Instance are encoded in the upper 16 bits:

 

RRRR RRRR SSSS SSII XXXX XXXX XXXX XXXX
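
The bit layout above can be decoded with a few shifts and masks. The following is a minimal sketch that takes the layout exactly as drawn (8 bits of rack, 6 bits of slot, 2 bits of instance) and treats the lower 16 bits as an opaque value; the function name and the sample discriminator are made up for illustration and do not come from a real session.

```python
def decode_discriminator(discriminator):
    """Split a 32-bit discriminator following RRRR RRRR SSSS SSII XXXX..."""
    return {
        "rack": (discriminator >> 24) & 0xFF,      # upper 8 bits
        "slot": (discriminator >> 18) & 0x3F,      # next 6 bits
        "instance": (discriminator >> 16) & 0x03,  # next 2 bits
        "lower": discriminator & 0xFFFF,           # remaining 16 bits
    }


# Synthetic example: rack 1, slot 2, instance 1, lower bits 5.
sample = (1 << 24) | (2 << 18) | (1 << 16) | 5
decoded = decode_discriminator(sample)
print(hex(sample), decoded)
```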

 

  • Your Discriminator: The discriminator received from the corresponding remote system. This field reflects back the received value of My Discriminator, or is zero if that value is unknown.
  • Desired Min TX Interval: This is the minimum interval, in microseconds, that the local system would like to use when transmitting BFD Control packets.
  • Required Min RX Interval: This is the minimum interval, in microseconds, between received BFD Control packets that this system is capable of supporting.
  • Required Min Echo RX Interval: This is the minimum interval, in microseconds, between received BFD Echo packets that this system is capable of supporting.
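
To make the field list above concrete, the mandatory 24-byte section of a version 1 control packet can be packed and unpacked with Python's struct module. This is a sketch based on the RFC 5880 field layout, without the optional authentication section; the function names are illustrative, and the values are taken from the "Transmitted parameters" of the HW offload sample output, with intervals converted to microseconds.

```python
import struct

BFD_HEADER = "!BBBBIIIII"  # network byte order, 24 bytes in total
STATES = {0: "AdminDown", 1: "Down", 2: "Init", 3: "Up"}


def pack_control(diag, state, poll, final, cpi, auth, demand, mult,
                 my_disc, your_disc, tx_us, rx_us, echo_rx_us, version=1):
    byte0 = (version << 5) | (diag & 0x1F)         # Vers (3 bits), Diag (5 bits)
    byte1 = ((state << 6) | (poll << 5) | (final << 4) |
             (cpi << 3) | (auth << 2) | (demand << 1))  # Sta, P, F, C, A, D, M=0
    length = struct.calcsize(BFD_HEADER)
    return struct.pack(BFD_HEADER, byte0, byte1, mult, length,
                       my_disc, your_disc, tx_us, rx_us, echo_rx_us)


def unpack_control(data):
    b0, b1, mult, length, my_disc, your_disc, tx, rx, echo = struct.unpack(
        BFD_HEADER, data[:struct.calcsize(BFD_HEADER)])
    return {
        "version": b0 >> 5, "diag": b0 & 0x1F,
        "state": STATES[b1 >> 6],
        "poll": (b1 >> 5) & 1, "final": (b1 >> 4) & 1, "cpi": (b1 >> 3) & 1,
        "auth": (b1 >> 2) & 1, "demand": (b1 >> 1) & 1,
        "detect_mult": mult, "length": length,
        "my_disc": my_disc, "your_disc": your_disc,
        "desired_min_tx_us": tx, "required_min_rx_us": rx,
        "required_min_echo_rx_us": echo,
    }


# D/F/P/C/A = 0/1/0/1/0, state UP, 300 ms intervals, echo RX interval 0.
packet = pack_control(diag=0, state=3, poll=0, final=1, cpi=1, auth=0,
                      demand=0, mult=3, my_disc=2147549186,
                      your_disc=2148335618, tx_us=300000, rx_us=300000,
                      echo_rx_us=0)
fields = unpack_control(packet)
print(len(packet), fields)
```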
Comments
Cisco Employee

hi Mat,

Did the asr9k also declare the BFD session down? Or only the asr1k?

It's been a long time since I did any troubleshooting on asr1k, but I'm sure IOS XE must have some tracing facility akin to 'show bfd trace location ...' on IOS XR platforms. If asr1k supports BFD dampening, the traces should give a clue as to why the session went down. Can you look for that kind of info?

Regards,

Aleksandar

Hi,

i have problem with flapping BFD session on a bundle-interface. have the following config.

###

bfd
multipath include location 0/0/CPU0
multipath include location 0/1/CPU0
bundle coexistence bob-blb logical
!
router isis TEST
interface bundle-ether 105
bfd minimum-interval 20
bfd multiplier 3
bfd fast-detect ipv4
bfd fast-detect ipv6
!

RP/0/RSP1/CPU0:Feb 3 15:12:45.823 : isis[1006]: %ROUTING-ISIS-5-ADJCHANGE : Adjacency to x-x-x-x (Bundle-Ether105) (L2) Down, BFD session DOWN
LC/0/1/CPU0:Feb 3 15:12:45.825 : bfd_agent[125]: %L2-BFD-6-SESSION_REMOVED : BFD session to neighbor 1.1.1.1 on interface Bundle-Ether105 has been removed
RP/0/RSP1/CPU0:Feb 3 15:12:50.325 : isis[1006]: %ROUTING-ISIS-5-ADJCHANGE : Adjacency to x-x-x-x (Bundle-Ether105) (L2) Up, New adjacency
LC/0/1/CPU0:Feb 3 15:12:52.608 : bfd_agent[125]: %L2-BFD-6-SESSION_STATE_UP : BFD session to neighbor 1.1.1.1 on interface Bundle-Ether105 is up
LC/0/2/CPU0:Feb 3 15:12:55.293 : bfd_agent[125]: %L2-BFD-6-SESSION_STATE_UP : BFD session to neighbor fe80::b615:1111:1111:1111 on interface Bundle-Ether105 is up

sh bfd counters packet private interface bundle-ether 105
Wed Feb 3 15:16:27.356 CET
Bundle-Ether105 Recv Rx Invalid Xmit Delta
Async: 10179863 15190 12910224
Echo: 0 0 0 0

sh bfd session interface bundle-ether 105
Wed Feb 3 15:16:51.777 CET
Interface Dest Addr Local det time(int*mult) State
Echo Async H/W NPU
------------------- --------------- ---------------- ---------------- ----------
BE105 1.1.1.1 0s(0s*0) 60ms(20ms*3) UP
No n/a

###

Problem is not clear to me, because flap are only happening every one or two hours. there are some interface that are working with same config, some interface are not. i'm using typhoon linecard and version 5.1.2. I also had not that aggressive timers, but it's still the same problem.

regards

Cisco Employee

hi Marco,

on BoB we support 50ms interval, with multiplier 3.

Please refer to this CCO doc:

http://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k_r5-3/routing/configuration/guide/b_routing_cg53xasr9k/b_routing_cg53xasr9k_chapter_0100.html#ID1712

Does the session still flap after increasing the interval?

regards,

Aleksandar

Hi Aleksander,

yes, we have that problem also when configuring 50ms. I also tried QoS, but when i understand it correctly, BFD packets were sent into prio level 1 by default.

that bundle in my example is a bundle-ether with only one interface. sounds strange, but we use it because future extensions are easier. on the other hand, we have the same problem also on physical interfaces.

we also have one bundle with 6x10G between two ASR9k running some software and flapping BFD session. From my understanding all six interface must have a problem in that case, because the BFD packet are sent over all six interface in parallel. Also those routers are connected only with patch cable (no leased line or DWDM), i would assume that not all six interface can have a problem at the same time.

regards

Marco

Cisco Employee

hi Marco,

that doesn't sound right. To look into this further I would have to look into the BFD session details and traces to figure out what's going on. For that type of interaction I hope you won't mind if I asked you to open a TAC service request.

Regards,

Aleksandar

Beginner

Hello Aleksandar,

We are using XR 5.3.3 and not configured BFD dampening,but we still receiving dampening logs ? Isn't  BFD dampening enabled by default ?

LC/0/5/CPU0:Jun 8 05:03:58.022 : bfd_agent[125]: %L2-BFD-6-ADJACENCY_DELETE : Adjacency to neighbor x.x.x.x on interface GigabitEthernet0/5/0/3.89 was deleted
LC/0/5/CPU0:Jun 8 05:03:58.022 : bfd_agent[125]: %L2-BFD-6-ADJACENCY_DELETE : Adjacency to neighbor y.y.y.y on interface GigabitEthernet0/5/0/3.88 was deleted
LC/0/5/CPU0:Jun 8 05:03:58.058 : bfd_agent[125]: %L2-BFD-6-SESSION_REMOVED : BFD session to neighbor x.x.x.x on interface GigabitEthernet0/5/0/3.89 has been removed
LC/0/5/CPU0:Jun 8 05:03:58.077 : bfd_agent[125]: %L2-BFD-6-SESSION_STATE_DOWN : BFD session to neighbor y.y.y.y on interface GigabitEthernet0/5/0/3.88 has gone down. Reason: Control timer expired
LC/0/5/CPU0:Jun 8 05:04:32.566 : bfd_agent[125]: %L2-BFD-6-SESSION_DAMPENING_ON : Session to neighbor y.y.y.y on interface GigabitEthernet0/5/0/3.88 entered Dampened state (initial: 2000 ms,secondary: 5000 ms,maximum: 120000 ms).
LC/0/5/CPU0:Jun 8 05:04:32.569 : bfd_agent[125]: %L2-BFD-6-SESSION_DAMPENING_ON : Session to neighbor x.x.x.x on interface GigabitEthernet0/5/0/3.89 entered Dampened state (initial: 2000 ms,secondary: 5000 ms,maximum: 120000 ms).
LC/0/5/CPU0:Jun 8 05:04:34.525 : bfd_agent[125]: %L2-BFD-6-SESSION_DAMPENING_OFF : Session to neighbor y.y.y.y on interface GigabitEthernet0/5/0/3.88 moved out of Dampened state.

Cisco Employee

hi tural,

bfd dampening is running by default, but you can customize its behavior.

here is the detail on that:

  • If a session is brought down, then dampening is applied before a session is allowed to transition to initial/up states.
  • Length of time a session is dampened grows exponentially with continuous session flap.
  • If a session remains up for minimum two minutes, then the length of time a session dampens with the next session flap is reset to the initial dampening value.

BFD on bundle member applies dampening, only if the detected failure is specific to Layer 3. BFD dampening is not invoked for L1 or L2 failures. BFD is started after Layer 1 and Layer 2 (LACP) is up to prevent a race condition and false triggers. BFD is notified to stop/ignore when L1 or L2 goes down and must be notified to start/resume when L1 or L2 recovers for a given/affected link/member.

BFD applies dampening till the session transitions from up to down state and the session is not removed. Whenever there is a failure detected at L1 or L2 , the bundle manager removes BFD session on a member.

When dampening is removed a syslog message ‘Exponential backoff dampening for BFD session has been cleared for specified BFD session. When/if same session gets created by application(s), only calculated initial wait time will be applied’ is generated. If this is the desired behaviour, then dampening can be enabled by configuring the BFD configuration and by using the command bfd dampening bundle-member l3-failure-only .

Beginner

Hi Xander,

Thank you very much for detailed information.

Will it affect to bfd session torn down if I disable dampening ?

Cisco Employee

not necessarily. dampening merely makes sure that a session is stable up before signaling it is up, so that when it goes down afterwards this is a real down.

however if a session is "penalized" by dampening, then a notification up or down is ignored for that penalty time.

it's like bgp route dampening: to prevent to propagate route oscillations down the network. bfd dampening prevents excessive signaling to the applications that asked for a notification.

cheers!

xander

Beginner

Thank you very much Xander, great explanation.

Cisco Employee

Just FYI: disabling BFD dampening is not supported before 6.1.1

Beginner

Ops, Really ?

Cisco Employee

Hi Aleks,

very nice and thorough article. I believe I found a small mistake in the BFD Scale calculation.

Correct calculation should be:

  • BFD pps rate = 16000 pps
  • Echo packet interval = 150ms 
  • Packets per session at 150ms interval: 1000ms / 150ms = 6.666
  • ==> 16000 / 6.666 = 2400 sessions

Cheers,

Jan

Cisco Employee

hi jan,

ah that's why they call it support-community :) thanks so much for that correction and yes you are right! thanks for the enhancement/correction!

xander

Beginner

Hi Xander,

One more question,please how to verify BFD dampening is running? 

In my case "show bfd session dampening" command output does not display all the interfaces for which bfd dampening is configured.only Bundle interface is displaying.

But in the logs I see dampening logs for those interfaces.
