ravi rajani · ‎03-07-2017

Hello All,

We have implemented BGP with our two ISPs and are running a conditional advertise-map to switch over. Our end routers are ASR1001s.

As soon as one ISP goes down, the hold timer of 90 seconds (configured manually) runs. After it expires, BGP waits for a seemingly random amount of time before advertising the network to the other ISP. This time varies from 6 seconds to 55 seconds. The log says "BPG(4): Condition DEFAULT_INTERNAL_MAP_ISP2 changes to Advertise", after which the advertisement starts.

We are clear on the process, but we are not sure what determines this random time before the advertise-map condition is evaluated. It adds to the downtime (we expected only the 90-second hold time). We have tested this several times, and each time the advertise-map runs after a different delay.

Is there any Cisco document or knowledge-base article that gives the exact timing with which the advertise-map condition is evaluated? It would be very helpful if someone could share a hint on where to find the answer. Thanks.

grabonlee · ‎03-07-2017

Hello Ravi,

To understand how conditional advertisement (CA) works, you also need to understand the BGP Scanner process, as that's what triggers the CA process.

The default time for BGP scanning is 60 secs, and CA can kick in sooner, depending on when the tracked route is removed from the BGP table and when the next instance of the scanning occurs.

You can go through the CA feature in the link below, and further down, you will see reference to BGP Scanner.

http://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/16137-cond-adv.html
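
To put rough numbers on this: if the condition is only re-evaluated on the next scanner run after the tracked route leaves the BGP table, the extra wait after the 90-second hold timer can fall anywhere between about 0 and 60 seconds with the default scan time. That would be consistent with the 6 to 55 seconds observed. The snippet below is only a minimal sketch of a non-exist-map setup with a shorter scan time. The AS numbers, addresses, prefix-list names and route-map names are all hypothetical, and the lower limit accepted by bgp scan-time may depend on the software release.

```
! Hypothetical example: every name, AS number and address is made up
router bgp 65000
 ! Run the BGP scanner every 15 seconds instead of the default 60
 bgp scan-time 15
 network 203.0.113.0 mask 255.255.255.0
 neighbor 192.0.2.1 remote-as 64501
 neighbor 198.51.100.1 remote-as 64502
 ! Advertise the local /24 to ISP2 only while the tracked ISP1 route is absent
 neighbor 198.51.100.1 advertise-map ADV-TO-ISP2 non-exist-map ISP1-ROUTE-PRESENT
!
ip prefix-list PL-LOCAL seq 5 permit 203.0.113.0/24
ip prefix-list PL-ISP1-TRACKED seq 5 permit 198.18.0.0/24
!
route-map ADV-TO-ISP2 permit 10
 match ip address prefix-list PL-LOCAL
!
route-map ISP1-ROUTE-PRESENT permit 10
 match ip address prefix-list PL-ISP1-TRACKED
```

With this sketch, the expected worst case would be roughly the hold time plus one scan interval, rather than a fixed value.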

2colin-cant · ‎08-16-2017

Hi Ravi,

I have a similar situation I would like to solve:

router bgp XXXX
timers bgp 1 3
bgp scan-time 5
neighbor X.X.X.1 advertise-map RMP-ISP-2-OUT exist-map RMP-EXIST-MAP-CORE-LOOPBACKS

However, lowering the BGP scan timer to 5 seconds does not seem to affect the conditional advertisement (CA) timer.

Is there an undocumented Ninja-style TAC command which could help override standard IOS CA related timers?

Thanks

Colin

grabonlee · ‎08-17-2017

Colin,

The BGP scanner sequence is as follows: General scan, followed by Unicast RIB scan, followed by Next_Hop scan.

The scan time is the interval between scans and not the length of time for each scan.

Use "debug ip bgp events" to observe the scanner activity and the time interval.
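
As a rough sketch of how that observation could be done: the commands below timestamp the debug lines, watch the scanner events, and then check the condition status and what is actually being sent to the neighbor. The neighbor address 198.51.100.1 is hypothetical, and the exact wording of the output (including whether a "Condition" line is shown) may differ between releases.

```
! Global configuration: millisecond timestamps on debug lines
service timestamps debug datetime msec
!
! Privileged EXEC: watch scanner and condition events
debug ip bgp events
!
! Check the condition status and the prefixes advertised to the neighbor
show ip bgp neighbors 198.51.100.1 | include Condition
show ip bgp neighbors 198.51.100.1 advertised-routes
!
! Turn debugging off again when finished
undebug all
```

Comparing the timestamp of the withdraw for the tracked route with the timestamp of the "Condition ... changes to" line gives the delay added by the condition check.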

Georg Pauwen · ‎08-17-2017

Hello,

In addition to the other posts, you can set the neighbor advertisement-interval to 0 (30 seconds is the default for eBGP):

neighbor x.x.x.x advertisement-interval 0

Also, check if your ISP supports BFD (see the link below for details). Here is what the configuration of your local BGP router should look like:

interface GigabitEthernet0/1
ip address 192.168.10.1 255.255.255.0
bfd interval 50 min_rx 50 multiplier 3
!
router bgp 1
bgp log-neighbor-changes
neighbor 192.168.10.2 remote-as 2
neighbor 192.168.10.2 fall-over bfd

https://www.cisco.com/c/en/us/td/docs/ios/iproute_bgp/command/reference/irg_book/irg_bgp3.html#wp1105562

2colin-cant · ‎08-17-2017

Hi guys,

thanks for your replies.

I am aware of BFD and I am using BFD where required, mainly on routed links between devices, and if required I use IPv4 BFD multihop.

However, what I am trying to solve is described below: I am trying to run BGP end to end through my perimeter firewalls.

This to ensure that the Border routers do not cause a black hole outbound, while ensuring the core only uses the path towards the Border router which is still available end to end.

I would like to have the "control plane" end to end rather than relying on static routes / IP SLA constructs, as the peer may respond to ICMP while its routing engine is down and/or its neighbor is down.

Another thing I would like to cover is that I can ensure the firewall is only routing traffic while its ACL firewall engine and NAT address translation service is still operational.

This is probably the biggest problem currently, as I am not aware of any firewall vendor that injects the operational status of its ACL / NAT engines into the routing process.

Meaning the firewall may still route traffic while it is no longer firewalling, hence I could still attract and drop traffic.

If someone has a solution to that, please share!

However, coming back to BGP advertise-map / exist-map / non-exist-map.

It seems to be hard-coded within IOS, and unless some TAC guy shares a Ninja command, this will not change from a 60-second delay, whether or not you configure the BGP advertisement-interval as 0. It did not work for me: I had DEBUG IP BGP UPDATES running and could see the update being advertised up to the border routers, but the border routers would wait until the CONDITION was met, after approx. 60 seconds, and only then withdraw the /24 outbound to ISP A/B.

However that is not fast enough for most customers today, with more and more services going to the cloud.

---> Below solution will not converge faster than 60 seconds !!

(Please prove me wrong)

----------------------------

Which brings me to this: once again, end-to-end BGP through the firewalls.

However, in this instance the CORE devices inside generate the outside /24 prefix via "x.x.x.x/24 Null 0" routes.

Those /24s are then picked up by BGP and advertised outbound to the ISPs, ensuring there is no black hole in front of or behind the firewalls.

Also, the default routes received from the ISPs are learned and pushed down to the cores, end to end.

This is probably my new solution, however this still needs to be labbed up and tested.
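
A minimal sketch of the core side of that idea, assuming a hypothetical AS number and prefix and leaving out the peering through the firewalls. Like the design itself, this has not been labbed.

```
! Hypothetical example: AS number and prefix are made up
! Static route to Null0 so the outside /24 exists in the core's routing table
ip route 203.0.113.0 255.255.255.0 Null0
!
router bgp 65000
 ! Originate the /24 into BGP from the core, so it only reaches
 ! the border routers while the BGP path through the firewall is up
 network 203.0.113.0 mask 255.255.255.0
```

The point of originating on the core is that the border routers lose the /24 as soon as their BGP session towards the inside drops, without any conditional advertisement.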

--------------------------------------------------

A few guys may have noticed the MPLS component and asked, WHY the heck?

Basically, we operate two independent firewalls and one decided to partially die, so the idea now is not to bind any services to the physical interfaces of the firewalls. Meaning we inject /32s into the MPLS "transit vrf".
As a result, we can use both border routers for failover with two independent firewalls. If a firewall fails, we can easily shut down the injected loopbacks on the partially failing firewall and enable the same /32s on the other, still-working firewall, while the default route and the outbound advertisements for the /24s keep operating.

We use MPLS L3 VPNs to conserve public IP address space for the transit networks required between DCs, as we can span multiple DCs without losing valuable IP addressing.

That's some of the background info needed to understand my end-to-end designs.

Thanks for your replies, comments, and feedback!

THX

Colin

grabonlee · ‎08-17-2017

Colin,

The neighbor advertisement-interval on the Border routers should work effectively for you. There is no special TAC command required.

See the output below for how quickly the updates were sent, both for advertise and withdraw. Scan-time was set to 5 secs.

Obviously, my BGP table is small. Yours may have many prefixes.

R3(config-router)#neighbor 155.1.13.1 advertisement-interval 1
R3(config-router)#
01:14:40: BGP(0): 155.1.37.7 rcv UPDATE about 112.0.0.0/8 -- withdrawn
01:14:40: BGP(0): no valid path for 112.0.0.0/8
01:14:40: BGP: topo global:IPv4 Unicast:base Remove_fwdroute for 112.0.0.0/8
01:14:40: BGP(0): (base) 155.1.13.1 send unreachable (format) 112.0.0.0/8
R3(config-router)#
01:14:45: BPG(0): Condition exis changes to Withdraw
01:14:45: BGP(0): net 33.33.33.0/24 matches ADV MAP adv: bump version to 113
01:14:45: BGP: topo global:IPv4 Unicast:base Remove_fwdroute for 33.33.33.0/24
01:14:45: BGP(0): (base) 155.1.13.1 send unreachable (format) 33.33.33.0/24

R3(config-router)#
R3(config-router)#neighbor 155.1.13.1 advertisement-interval 0
R3(config-router)#
01:15:45: BGP(0): 155.1.37.7 rcvd UPDATE w/ attr: nexthop 155.1.37.7, origin i, merged path 300 54 50 60, AS_PATH
01:15:45: BGP(0): 155.1.37.7 rcvd 112.0.0.0/8
01:15:45: BGP(0): Revise route installing 1 of 1 routes for 112.0.0.0/8 -> 155.1.37.7(global) to main IP table
01:15:45: BGP(0): (base) 155.1.13.1 send UPDATE (format) 112.0.0.0/8, next 155.1.13.3, metric 0, path 300 54 50 60
01:15:45: BPG(0): Condition exis changes to Advertise
01:15:45: BGP(0): net 33.33.33.0/24 matches ADV MAP adv: bump version to 115
01:15:45: BGP: topo global:IPv4 Unicast:base Remove_fwdroute for 33.33.33.0/24
R3(config-router)#
01:15:45: BGP(0): (base) 155.1.13.1 send UPDATE (format) 33.33.33.0/24, next 155.1.13.3, metric 0, path Local
R3(config-router)#

R3(config-router)#
01:19:14: BGP(0): 155.1.37.7 rcv UPDATE about 112.0.0.0/8 -- withdrawn
01:19:14: BGP(0): no valid path for 112.0.0.0/8
01:19:14: BGP: topo global:IPv4 Unicast:base Remove_fwdroute for 112.0.0.0/8
01:19:14: BGP(0): (base) 155.1.13.1 send unreachable (format) 112.0.0.0/8
R3(config-router)#
01:19:16: BPG(0): Condition exis changes to Withdraw
01:19:16: BGP(0): net 33.33.33.0/24 matches ADV MAP adv: bump version to 117
01:19:16: BGP: topo global:IPv4 Unicast:base Remove_fwdroute for 33.33.33.0/24
01:19:16: BGP(0): (base) 155.1.13.1 send unreachable (format) 33.33.33.0/24

Are you importing the loopbacks in a VRF? The import scanner for VRF tables runs every 15 secs by default. Also, if your neighbor peering is activated in VPNv4, then you may have to configure the neighbor advertisement-interval in the VPNv4 address family.
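
A rough sketch of what that could look like, assuming a hypothetical AS number and VPNv4 neighbor address. Whether bgp scan-time import is accepted in this mode, and with what minimum value, may depend on the release.

```
! Hypothetical example: AS number and neighbor address are made up
router bgp 65000
 address-family vpnv4
  neighbor 10.0.0.2 activate
  ! Send VPNv4 updates to this neighbor without the per-neighbor delay
  neighbor 10.0.0.2 advertisement-interval 0
  ! Shorten the VRF import scanner from its 15-second default
  bgp scan-time import 5
 exit-address-family
```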

bgp advertise-map/non-exist-map timer