Pseudowires Drop When BFD Restores Links after Link Failure

ben.wiechman
Level 4


We are seeing an issue where MPLS pseudowires drop when BFD restores a link after a failure while LFA fast reroute is in use. When there is a physical link failure and restoration, the pseudowires do not drop. This only appears to happen when there is an L1 failure (ROADM) that is not reflected as a physical link failure; instead, BFD brings the link down and later restores it to service.

Based on the logs, it appears the router attempts to use the link as soon as the BFD session is restored, but this causes the pseudowires to fail until the local LDP adjacency is restored. IGP sync delay is configured; however, it does not appear to take effect in this case.

There is a redundant path for the pseudowires, and the routing table and MPLS forwarding table both show that the LFA-protected path is available before and after the link failure.

This appears similar to the issue reported in this forum post: https://supportforums.cisco.com/discussion/11822666/bfd-ldp

I have not been able to determine why the author believed BFD dampening would resolve the issue.

This bug also appears to exactly describe the symptoms: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCtf46692

Hardware is ASR9001 and ASR9010 running 5.2.2 and using Typhoon line cards. 

The issue has been observed on both physical interfaces and bundles.

Are we missing something in the configuration to bind this all together and force LDP to wait for the underlying IGP and LDP adjacencies to be restored? As described at the end of the post, I removed MPLS LDP graceful restart and configured BFD dampening. With those changes the pseudowires no longer dropped every time a BFD failure and restoration was replicated in the lab, but the failure was not eliminated in every case.

The following is representative of the relevant configuration on the routers.

bfd
 interface Bundle-Ether10
 !
 interface Bundle-Ether20
 !
!
! Redundant path
interface Bundle-Ether10
 bfd address-family ipv4 multiplier 3
 bfd address-family ipv4 destination {remote_ip}
 bfd address-family ipv4 fast-detect
 bfd address-family ipv4 minimum-interval 15
 mtu 9100
 service-policy output PMAP-QOS-TRANSPORT
 ipv4 point-to-point
 ipv4 address {local_ip} 255.255.255.252
 bundle minimum-active links 2
 load-interval 30
!
! Active path for pseudowires noted in logs
interface Bundle-Ether20
 bfd address-family ipv4 multiplier 3
 bfd address-family ipv4 destination {remote_ip}
 bfd address-family ipv4 fast-detect
 bfd address-family ipv4 minimum-interval 15
 mtu 9100
 service-policy output PMAP-QOS-TRANSPORT
 ipv4 point-to-point
 ipv4 address {local_ip} 255.255.255.252
 bundle minimum-active links 2
 load-interval 30
!
!
router isis Arvig
 set-overload-bit on-startup 30
 net 49.XXXX.XXXX.XXXX.XXXX.00
 nsf ietf
 log adjacency changes
 lsp-gen-interval maximum-wait 30000 initial-wait 30000 secondary-wait 30000
 lsp-password text encrypted SUPERSECRETPASSWORD level 2
 address-family ipv4 unicast
  metric-style wide
  metric 16000000
  microloop avoidance protected
  ispf
  mpls traffic-eng level-1-2
  mpls traffic-eng router-id Loopback0
  mpls traffic-eng igp-intact
 !
 address-family ipv4 multicast
  metric-style wide
  metric 16000000
  ispf
 !
 address-family ipv6 unicast
  metric-style wide
  metric 16000000
  ispf
 !
 address-family ipv6 multicast
  metric-style wide
  metric 16000000
  ispf
 !
 interface Bundle-Ether10
  circuit-type level-2-only
  bfd minimum-interval 10
  bfd multiplier 3
  bfd fast-detect ipv4
  bfd fast-detect ipv6
  point-to-point
  hello-padding sometimes
  hello-password text encrypted SUPERSECRETPASSWORD
  address-family ipv4 unicast
   fast-reroute per-prefix
   fast-reroute per-prefix remote-lfa tunnel mpls-ldp
   metric 200
   mpls ldp sync
  !
  address-family ipv4 multicast
   metric 200
  !
  address-family ipv6 unicast
   metric 200
  !
  address-family ipv6 multicast
   metric 200
  !
 !
 interface Bundle-Ether20
  circuit-type level-2-only
  bfd minimum-interval 10
  bfd multiplier 3
  bfd fast-detect ipv4
  bfd fast-detect ipv6
  point-to-point
  hello-padding sometimes
  hello-password text encrypted SUPERSECRETPASSWORD
  address-family ipv4 unicast
   fast-reroute per-prefix
   fast-reroute per-prefix remote-lfa tunnel mpls-ldp
   metric 200
   mpls ldp sync
  !
  address-family ipv4 multicast
   metric 200
  !
  address-family ipv6 unicast
   metric 200
  !
  address-family ipv6 multicast
   metric 200
  !
 !
!
l2vpn
 load-balancing flow src-dst-ip
 logging
  bridge-domain
  pseudowire
  nsr
  vfi
 !
 pw-class PW-CLASS-FAT-PW
  encapsulation mpls
   control-word
   transport-mode ethernet
   load-balancing
    flow-label both
   !
  !
 !
!
mpls ldp
 log
  adjacency
  neighbor
  nsr
  graceful-restart
  session-protection
 !
 nsr
 graceful-restart
 igp sync delay on-session-up 30
 router-id 10.xxx.xxx.1
 address-family ipv4
  label
   local
    advertise
     explicit-null
    !
   !
  !
 !
 interface Bundle-Ether10
 !
 interface Bundle-Ether20
 !
!
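
For reference, the BFD timers in the configuration above imply the nominal failure-detection times computed below. This is only a rough sketch: it assumes the common asynchronous-mode rule of detection time = interval x multiplier, and that both ends negotiate exactly the configured values, neither of which has been verified on these routers. The function name and dictionary labels are my own shorthand.

```python
def detection_time_ms(minimum_interval_ms: int, multiplier: int) -> int:
    """Nominal BFD detection time, assuming interval x multiplier applies."""
    return minimum_interval_ms * multiplier


# (minimum-interval in ms, multiplier) as they appear in the posted config.
# The labels are shorthand for where each pair is configured.
CONFIGURED_TIMERS = {
    "interface Bundle-Ether (bfd address-family ipv4)": (15, 3),
    "router isis interface (bfd minimum-interval / multiplier)": (10, 3),
}

for where, (interval_ms, multiplier) in CONFIGURED_TIMERS.items():
    print(f"{where}: {detection_time_ms(interval_ms, multiplier)} ms")
```

Either way, detection is well under 100 ms on paper, so BFD should react to the ROADM failure long before any IS-IS or LDP hold timer expires.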

See the following logs for clarification. In this case BE20 is the preferred path and BE10 is the backup path.

10.80.255.79 is directly adjacent to the ASR9010 in question on BE20. The remaining 10.132.255.x neighbors are beyond 10.80.255.79 or the next hop on BE10.

! BFD detects an underlying failure and forces BE20 down
!
LC/0/6/CPU0:2016 Sep 20 01:17:07.051 UTC: ifmgr[210]: %PKT_INFRA-LINEPROTO-5-UPDOWN : Line protocol on Interface TenGigE0/6/1/3, changed state to Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:09.074 UTC: BM-DISTRIB[1171]: %L2-BM-6-MBR_BFD_STARTING : The BFD session on link TenGigE0/6/1/3 in Bundle-Ether20 is starting. Waiting indefinitely for peer to establish session.
RP/0/RSP0/CPU0:2016 Sep 20 01:17:09.074 UTC: BM-DISTRIB[1171]: %L2-BM-6-MBR_BFD_STARTING : The BFD session on link TenGigE0/7/1/3 in Bundle-Ether20 is starting. Waiting indefinitely for peer to establish session.
LC/0/7/CPU0:2016 Sep 20 01:17:31.084 UTC: bfd_agent[125]: %L2-BFD-6-SESSION_STATE_UP : BFD session to neighbor {remote_ip} on interface TenGigE0/7/1/3 is up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:31.088 UTC: BM-DISTRIB[1171]: %L2-BM-6-MBR_BFD_SESSION_UP : The BFD session on link TenGigE0/7/1/3 in Bundle-Ether20 has gone UP.
RP/0/RSP0/CPU0:2016 Sep 20 01:17:31.088 UTC: BM-DISTRIB[1171]: %L2-BM-6-ACTIVE : TenGigE0/7/1/3 is Active as part of Bundle-Ether20
LC/0/6/CPU0:2016 Sep 20 01:17:32.600 UTC: bfd_agent[125]: %L2-BFD-6-SESSION_STATE_UP : BFD session to neighbor {remote_ip} on interface TenGigE0/6/1/3 is up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:32.604 UTC: BM-DISTRIB[1171]: %L2-BM-6-MBR_BFD_SESSION_UP : The BFD session on link TenGigE0/6/1/3 in Bundle-Ether20 has gone UP.
RP/0/RSP0/CPU0:2016 Sep 20 01:17:32.604 UTC: BM-DISTRIB[1171]: %L2-BM-6-ACTIVE : TenGigE0/6/1/3 is Active as part of Bundle-Ether20
!
!
! BFD session restored
RP/0/RSP0/CPU0:2016 Sep 20 01:17:32.608 UTC: bfd[1172]: %L2-BFD-6-SESSION_STATE_UP : BFD session to neighbor {remote_ip} on interface Bundle-Ether20 is up
!
! ISIS is unhappy...
LC/0/7/CPU0:2016 Sep 20 01:17:32.611 UTC: netio[278]: %ROUTING-CLNS-3-DROP_PKT : Unable to get src MAC addr. Dropping packet
!
! Pseudowires drop
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.167 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.80.255.79, id 875, state is changed to: Down
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.167 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.80.255.79, id 987, state is changed to: Down
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.174 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.4, id 451, state is changed to: Down
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.174 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.4, id 101322554, state is changed to: Down
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.174 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.4, id 855, state is changed to: Down
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.174 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.4, id 856, state is changed to: Down
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.175 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.3, id 454, state is changed to: Down
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.175 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.3, id 101322553, state is changed to: Down
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.178 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.8, id 489, state is changed to: Down
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.178 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.8, id 101322558, state is changed to: Down
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.178 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.7, id 472, state is changed to: Down
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.179 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.7, id 101322557, state is changed to: Down
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.179 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.7, id 551, state is changed to: Down
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.179 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.7, id 522, state is changed to: Down
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.179 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.7, id 436, state is changed to: Down
!
!
! MPLS LDP local adjacency is restored
RP/0/RSP0/CPU0:2016 Sep 20 01:17:33.201 UTC: mpls_ldp[1047]: %ROUTING-LDP-5-HELLO_ADJ_CHANGE : VRF 'default' (0x60000000), Adjacency ({remote_ip}) UP with Nbr 10.80.255.79:0 on Bundle-Ether20
!
!
! Pseudowires come back up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.299 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.4, id 451, state is changed to: Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.299 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.4, id 101322554, state is changed to: Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.299 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.4, id 855, state is changed to: Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.299 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.4, id 856, state is changed to: Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.300 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.3, id 454, state is changed to: Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.300 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.3, id 101322553, state is changed to: Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.302 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.7, id 472, state is changed to: Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.302 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.7, id 101322557, state is changed to: Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.302 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.7, id 551, state is changed to: Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.303 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.7, id 522, state is changed to: Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.303 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.7, id 436, state is changed to: Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.303 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.8, id 489, state is changed to: Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.303 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.132.255.8, id 101322558, state is changed to: Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.322 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.80.255.79, id 875, state is changed to: Up
RP/0/RSP0/CPU0:2016 Sep 20 01:17:35.322 UTC: l2vpn_mgr[1186]: %L2-L2VPN_PW-3-UPDOWN : Pseudowire with address 10.80.255.79, id 987, state is changed to: Up
!
!
! And finally... the ISIS adjacency is restored.
RP/0/RSP0/CPU0:2016 Sep 20 01:17:42.387 UTC: isis[1006]: %ROUTING-ISIS-5-ADJCHANGE : Adjacency to XXXX-ASR9010-Core1 (Bundle-Ether20) (L2) Up, New adjacency
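
To make the ordering in the log excerpt easier to see, here is a small sketch that computes how long after the Bundle-Ether20 BFD session came up each of the key events occurred. The timestamps are copied from the logs above; the event labels and the helper name `offsets_ms` are my own and purely illustrative.

```python
from datetime import datetime

# Timestamps copied from the log excerpt; labels are shorthand, not log text.
EVENTS = [
    ("BFD session up on Bundle-Ether20", "01:17:32.608"),
    ("first pseudowire down", "01:17:33.167"),
    ("LDP hello adjacency up", "01:17:33.201"),
    ("first pseudowire back up", "01:17:35.299"),
    ("IS-IS adjacency up", "01:17:42.387"),
]


def offsets_ms(events):
    """Return (label, milliseconds since the first event) for each event."""
    parsed = [(label, datetime.strptime(ts, "%H:%M:%S.%f")) for label, ts in events]
    start = parsed[0][1]
    return [(label, round((t - start).total_seconds() * 1000)) for label, t in parsed]


for label, ms in offsets_ms(EVENTS):
    print(f"+{ms:>5} ms  {label}")
```

On these timestamps the pseudowires drop about 0.56 s after BFD reports the bundle up, roughly 34 ms before the LDP hello adjacency returns, and the IS-IS adjacency does not come back until almost 10 s after BFD.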

Configuring BFD dampening with the following settings and removing graceful restart eliminated roughly half of the failures in the lab, but did not eliminate the issue in every case.

bfd
 dampening secondary-wait 7500
 dampening initial-wait 3000
 dampening maximum-wait 120000
!
!
mpls ldp
 log
  adjacency
  neighbor
  nsr
  graceful-restart
  session-protection
 !
 nsr
 igp sync delay on-session-up 30
 session protection

Any thoughts or assistance would be greatly appreciated. 
