BGP ECMP and BFD between onprem to AWS

atsukane
Level 3

Hi All,

We have 2 x 1 Gbps Direct Connect circuits between AWS and an HA pair of FPR2140s (FMC-managed) running FTD 7.4.2.1, with BGP and ECMP to load-balance the traffic. There are L2 switches in between to split each circuit to the 2 firewalls.

atsukane_0-1749215802398.png

AWS's default BGP keepalive is 30 sec and hold time is 90 sec, so in order to detect failures faster I've enabled BFD on our FTD with the following parameters. (BFD is enabled by default on the AWS side: Enable BFD for a Direct Connect connection | AWS re:Post.)

Unlike Azure, AWS does not appear to have Echo mode enabled on its side (ironically).

In any case, I'm not sure whether these settings are too sensitive, but we are now seeing frequent BGP neighbour down alerts from SolarWinds.

"show bgp summary" confirms that BGP is actually going down, albeit for a very short time; it comes back up quickly, within a minute or so.

In an effort to increase the failure detection time and minimize what appear to be false positives (or excessive sensitivity), I changed the interval values, the multiplier value, and the slow-timer value in the lab, but when I run a simulated test the detection time does not change. I'd have thought raising the multiplier from 3 to 50 would definitely increase the detection time, but it looks like the multiplier value needs to match on both sides.

I enabled debugging for BFD packets and BFD events, but to be honest I can't really tell what's going wrong.

Can someone advise which value would increase the failure detection time? 

bfd-template single-hop BFD_Template1
bfd slow-timers 10000
bfd template BFD_Template1
bfd template BFD_Template1
bfd template BFD_Template1
bfd template BFD_Template1
 neighbor 192.168.1.2 fall-over bfd single-hop
 neighbor 192.168.1.25 fall-over bfd single-hop
> show bfd neighbors details 

IPv4 Sessions
NeighAddr                                       LD/RD         RH/RS       State   Int
192.168.1.2                                    44/9493       Up          Up      AWS-2
Session state is UP and not using echo function.
Session Host: Software
OurAddr: 192.168.1.1
Handle: 3
Local Diag: 0, Demand mode: 0, Poll bit: 0
MinTxInt: 300000, MinRxInt: 300000, Multiplier: 3
Received MinRxInt: 300000, Received Multiplier: 3  <<< received from AWS
Holddown (hits): 0(0), Hello (hits): 300(143222)
Rx Count: 125697, Rx Interval (ms) min/max/avg: 1/2001/300 last: 293 ms ago
Tx Count: 143225, Tx Interval (ms) min/max/avg: 1/1548/264 last: 237 ms ago
Elapsed time watermarks: 0 0 (last: 0)
Registered protocols: BGP 
Template: BFD_Template1
Uptime: 10:29:40
Last packet: Version: 1                  - Diagnostic: 0
             State bit: Up               - Demand bit: 0
             Poll bit: 0                 - Final bit: 0
             C bit: 1                                       
             Multiplier: 3               - Length: 24
             My Discr.: 9493             - Your Discr.: 44
             Min tx interval: 300000     - Min rx interval: 300000

IPv4 Sessions
NeighAddr                                       LD/RD         RH/RS       State   Int
192.168.1.25                                   41/5436       Up          Up      AWS-1
Session state is UP and not using echo function.
Session Host: Software
OurAddr: 192.168.1.26
Handle: 4
Local Diag: 0, Demand mode: 0, Poll bit: 0
MinTxInt: 300000, MinRxInt: 300000, Multiplier: 3
Received MinRxInt: 300000, Received Multiplier: 3  <<< received from AWS
Holddown (hits): 0(0), Hello (hits): 300(307541)
Rx Count: 270142, Rx Interval (ms) min/max/avg: 275/1418/300 last: 213 ms ago
Tx Count: 307543, Tx Interval (ms) min/max/avg: 31/1418/263 last: 194 ms ago
Elapsed time watermarks: 0 0 (last: 0)
Registered protocols: BGP 
Template: BFD_Template1
Uptime: 22:32:35
Last packet: Version: 1                  - Diagnostic: 0
             State bit: Up               - Demand bit: 0
             Poll bit: 0                 - Final bit: 0
             C bit: 1                                       
             Multiplier: 3               - Length: 24
             My Discr.: 5436             - Your Discr.: 41
             Min tx interval: 300000     - Min rx interval: 300000
             Min Echo interval: 0       
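
As a sanity check on the timers shown above, here is a minimal sketch of the detection-time arithmetic from RFC 5880 section 6.8.4 (asynchronous mode, no echo). It assumes the FTD and AWS both follow the RFC here, which this thread does not confirm. The function name and the "hypothetical" values are illustrative only; the 300000 us intervals and the multiplier of 3 are taken from the output.

```python
def detection_time_ms(receiver_min_rx_us, sender_min_tx_us, sender_multiplier):
    """Detection time (ms) at the receiver, per RFC 5880 section 6.8.4.

    Asynchronous mode without echo: the sender's multiplier times the
    greater of the receiver's MinRx and the sender's MinTx.
    """
    agreed_interval_us = max(receiver_min_rx_us, sender_min_tx_us)
    return sender_multiplier * agreed_interval_us / 1000


# Values taken from the output above (microseconds).
FTD_MIN_TX = FTD_MIN_RX = 300_000
AWS_MIN_TX = AWS_MIN_RX = 300_000
AWS_MULTIPLIER = 3

# How long the FTD waits before declaring AWS down.
ftd_now = detection_time_ms(FTD_MIN_RX, AWS_MIN_TX, AWS_MULTIPLIER)

# Hypothetical lab change: local multiplier 50. Under the RFC this only
# affects how long AWS waits for the FTD, not the FTD's own detection time.
aws_with_mult_50 = detection_time_ms(AWS_MIN_RX, FTD_MIN_TX, 50)

# Hypothetical lab change: local min-rx raised to 1000 ms (assumed value).
ftd_relaxed = detection_time_ms(1_000_000, AWS_MIN_TX, AWS_MULTIPLIER)

print(ftd_now, aws_with_mult_50, ftd_relaxed)
```

Under these assumptions the FTD's own detection time stays at 900 ms whatever the local multiplier is, and the local min-rx is the value the RFC lets the receiver use to lengthen it. Whether AWS honors a larger advertised min-rx is not established in this thread.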

Debug:
atsukane_1-1749218096911.png

2 Replies

M02@rt37
VIP

Hello @atsukane 

"...but it looks like the multiplier value needs to match on both sides..."

Yes, both sides must agree on the BFD parameters (multiplier...); you can't just increase it on your side and expect a change in detection time unless AWS supports and mirrors the change...

From my point of view, and judging by your outputs, you are locked into AWS's aggressive timer; you can't slow it down from your side.

Best regards

Thanks M02@rt37, that's something I was half expecting

In that case, I'd have to somehow determine whether the frequent drops are false positives or not.
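
One rough way to look for evidence is to compare the maximum Rx interval reported by "show bfd neighbors details" with the detection time. The sketch below assumes the line format shown in the original post and a 900 ms detection time (3 x 300 ms); the function name is hypothetical. A maximum above the detection time is only a hint, because the counters may include the session setup phase, when packets are sent more slowly.

```python
import re

# Matches e.g. "Rx Interval (ms) min/max/avg: 1/2001/300"
RX_LINE = re.compile(r"Rx Interval \(ms\) min/max/avg: (\d+)/(\d+)/(\d+)")


def rx_gap_exceeds_detection(show_output, detection_ms):
    """Return (flag, max_rx_ms): flag is True if the largest observed gap
    between received BFD packets is longer than the detection time."""
    match = RX_LINE.search(show_output)
    if match is None:
        raise ValueError("no Rx Interval line found")
    _rx_min, rx_max, _rx_avg = map(int, match.groups())
    return rx_max > detection_ms, rx_max


# Line copied from the AWS-2 session in the original post.
sample = "Rx Count: 125697, Rx Interval (ms) min/max/avg: 1/2001/300 last: 293 ms ago"
flagged, max_gap = rx_gap_exceeds_detection(sample, detection_ms=900)
print(flagged, max_gap)
```

If the maximum keeps growing past the detection time while the session is up, that would point to real gaps in received BFD packets instead of a monitoring false positive.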

Interestingly, Azure has Echo mode enabled, and Microsoft also says that in certain cases the minimum interval can be set to a higher value of 750 ms. See: Azure ExpressRoute: Configure BFD | Microsoft Learn
