BGP Problem

David Leudem · ‎07-29-2018

Hello all;

I have a BGP peering between a Cisco router and an Alcatel router. The two routers are directly connected. Each time the physical link between them goes down and comes back up, the BGP session stays in the Idle state, and I have to reconfigure the peering to recover it. Is there any configuration that would bring the peering back up automatically once the link is restored?

Georg Pauwen · ‎07-29-2018

Hello,

on the Cisco side, you can configure 'bgp fast-external-failover', or set the BGP keepalive and hold timers to something really low, e.g. 'timers bgp 3 15'.
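
As a rough sketch of where these commands would go, assuming classic IOS syntax and the local AS 300 that appears in the configuration posted further down the thread:

```
router bgp 300
 ! On by default; it only affects directly connected external (eBGP) peers
 bgp fast-external-failover
 ! Keepalive 3 s, hold time 15 s for all neighbors of this BGP process
 timers bgp 3 15
```

New timer values should only take effect once the session has been reset, so expect to clear the neighbor after changing them.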

David Leudem · ‎07-29-2018

Thanks for your reply! I tried both solutions, but there was no change.

Georg Pauwen · ‎07-29-2018

Hello,

can you run 'debug bgp *' when this occurs and post the output? I have a feeling that the problem could be related to the Alcatel side of the BGP implementation. What Alcatel device is this on?

In the meantime, as a workaround, you can configure an EEM script on the Cisco that automatically reconfigures your neighbor, so at least you don't have to do it manually.

Post the config of your Cisco router...

David Leudem · ‎07-29-2018

Here is the debug output.

The other end is an ALU 7750.

10.120.120.9 IPv4 Unicast:base (0x7FD7EA86DB10:0) Not scheduling for GR processing [Peer did not advertise GR cap]
*Jul 29 16:04:15.553: %BGP-5-NBR_RESET: Neighbor 10.120.120.9 active reset (Peer closed the session)
*Jul 29 16:04:15.553: BGP: ses global 10.120.120.9 (0x7FD7EA86DB10:0) act Reset (Peer closed the session).
*Jul 29 16:04:15.553: BGP: nbr_topo global 10.120.120.9 IPv4 Unicast:base (0x7FD7EA86DB10:0) NSF delete stale NSF not active
*Jul 29 16:04:15.553: BGP: nbr_topo global 10.120.120.9 IPv4 Unicast:base (0x7FD7EA86DB10:0) NSF no stale paths state is NSF not active
*Jul 29 16:04:15.553: BGP: nbr_topo global 10.120.120.9 IPv4 Unicast:base (0x7FD7EA86DB10:0) Resetting ALL counters.
*Jul 29 16:04:15.553: BGP: 10.120.120.9 active closing
*Jul 29 16:04:15.553: BGP: ses global 10.120.120.9 (0x7FD7EA86DB10:0) act Session close and reset neighbor 10.120.120.9 topostate
*Jul 29 16:04:15.553: BGP: nbr_topo global 10.120.120.9 IPv4 Unicast:base (0x7FD7EA86DB10:0) Resetting ALL counters.
*Jul 29 16:04:15.553: BGP: 10.120.120.9 active went from OpenSent to Idle
*Jul 29 16:04:15.554: %BGP_SESSION-5-ADJCHANGE: neighbor 10.120.120.9 IPv4 Unicast topology base removed from session Peer closed the session
*Jul 29 16:04:15.554: BGP: ses global 10.120.120.9 (0x7FD7EA86DB10:0) act Removed topology IPv4 Unicast:base
*Jul 29 16:04:15.554: BGP: ses global 10.120.120.9 (0x7FD7EA86DB10:0) act Removed last topology
*Jul 29 16:04:15.554: BGP: nbr global 10.120.120.9 Open active delayed 7168ms (35000ms max, 60% jitter)
*Jul 29 16:04:15.554: BGP: nbr global 10.120.120.9 Active open failed - open timer running
*Jul 29 16:04:19.612: BGP_Router: unhandled major event code 128, minor 0
*Jul 29 16:04:22.719: BGP: 10.120.120.9 active went from Idle to Active
*Jul 29 16:04:22.719: BGP: 10.120.120.9 open active, local address 41.244.10.40
*Jul 29 16:04:22.720: BGP: ses global 10.120.120.9 (0x7FD7E5AEE0A0:0) act Adding topology IPv4 Unicast:base
*Jul 29 16:04:22.720: BGP: ses global 10.120.120.9 (0x7FD7E5AEE0A0:0) act Send OPEN
*Jul 29 16:04:22.720: BGP: ses global 10.120.120.9 (0x7FD7E5AEE0A0:0) act Building Enhanced Refresh capability
*Jul 29 16:04:22.720: BGP: 10.120.120.9 active went from Active to OpenSent
*Jul 29 16:04:22.720: BGP: 10.120.120.9 active sending OPEN, version 4, my as: 37620, holdtime 180 seconds, ID 29F40A28
*Jul 29 16:04:22.721: BGP: ses global 10.120.120.9 (0x7FD7E5AEE0A0:0) act Remote close.
*Jul 29 16:04:22.721: BGP: nbr_topo global 10.120.120.9 IPv4 Unicast:base (0x7FD7E5AEE0A0:0) Not scheduling for GR processing [Peer did not advertise GR cap]
*Jul 29 16:04:22.721: %BGP-5-NBR_RESET: Neighbor 10.120.120.9 active reset (Peer closed the session)
*Jul 29 16:04:22.721: BGP: ses global 10.120.120.9 (0x7FD7E5AEE0A0:0) act Reset (Peer closed the session).
*Jul 29 16:04:22.721: BGP: nbr_topo global 10.120.120.9 IPv4 Unicast:base (0x7FD7E5AEE0A0:0) NSF delete stale NSF not active
*Jul 29 16:04:22.721: BGP: nbr_topo global 10.120.120.9 IPv4 Unicast:base (0x7FD7E5AEE0A0:0) NSF no stale paths state is NSF not active
*Jul 29 16:04:22.721: BGP: nbr_topo global 10.120.120.9 IPv4 Unicast:base (0x7FD7E5AEE0A0:0) Resetting ALL counters.
*Jul 29 16:04:22.721: BGP: 10.120.120.9 active closing
*Jul 29 16:04:22.722: BGP: ses global 10.120.120.9 (0x7FD7E5AEE0A0:0) act Session close and reset neighbor 10.120.120.9 topostate
*Jul 29 16:04:22.722: BGP: nbr_topo global 10.120.120.9 IPv4 Unicast:base (0x7FD7E5AEE0A0:0) Resetting ALL counters.
*Jul 29 16:04:22.722: BGP: 10.120.120.9 active went from OpenSent to Idle
*Jul 29 16:04:22.722: %BGP_SESSION-5-ADJCHANGE: neighbor 10.120.120.9 IPv4 Unicast topology base removed from session Peer closed the session
*Jul 29 16:04:22.722: BGP: ses global 10.120.120.9 (0x7FD7E5AEE0A0:0) act Removed topology IPv4 Unicast:base
*Jul 29 16:04:22.722: BGP: ses global 10.120.120.9 (0x7FD7E5AEE0A0:0) act Removed last topology
*Jul 29 16:04:22.722: BGP: nbr global 10.120.120.9 Open active delayed 13312ms (35000ms max, 60% jitter)
*Jul 29 16:04:22.722: BGP: nbr global 10.120.120.9 Active open failed - open timer running

Georg Pauwen · ‎07-30-2018

Hello,

post the full configs of the Cisco (show run) and the Alcatel (admin display-config)...

paul driver · ‎07-29-2018

@Georg Pauwen wrote:

Hello,

on the Cisco side, you can configure 'bgp fast-external-failover', or set the BGP keepalive and hold timers to something really low, e.g. 'timers bgp 3 15'.

@georg this wouldn't make any difference, as the commands you mention basically reduce the hold time of an already established BGP session so that it fails over quickly. They won't bring the peer session back up any quicker if the problem is in the initial BGP states.

Please rate and mark as an accepted solution if you have found any of the information provided useful.
This then could assist others on these forums to find a valuable answer and broadens the community’s global network.

Kind Regards
Paul

paul driver · ‎07-29-2018

Hello

If the physical link is going down, I would question the physical connection between the two devices, as the BGP Idle state is basically saying that the peers cannot see each other.

How are you configuring the peering: with the connected IP addressing or between loopbacks? If the latter, and seeing as you are peering between different vendors, try disabling the check that BGP performs for directly connected peers with 'neighbor xxx disable-connected-check', or use the 'neighbor xxx ebgp-multihop' command.

The above are obviously Cisco commands, so you may have to check whether an equivalent feature exists on the other vendor.

David Leudem · ‎07-29-2018

Yes, the peering uses the connected IP addressing, and it's an iBGP peering.

Georg Pauwen · ‎07-29-2018

Hello,

if it's iBGP, that makes a difference. Can you post the relevant parts of the configs from both the Cisco and the Alcatel?

Do you have both sides configured under address families?

David Leudem · ‎07-29-2018

//cisco

router bgp 300
 no bgp enforce-first-as
 bgp log-neighbor-changes
 neighbor 10.120.120.9 remote-as 300
 neighbor 10.120.120.9 description TO_ALU_ROUTER
 neighbor 10.120.120.9 update-source Loopback1
 !
 address-family ipv4
  neighbor 10.120.120.9 activate
  neighbor 10.120.120.9 next-hop-self
  neighbor 10.120.120.9 soft-reconfiguration inbound
  neighbor 10.120.120.9 route-map LOCAL_PREF out
 exit-address-family

//ALU

configure router bgp
    local-as 300
    group "TO_CAM"
        description "TO_CAM"
        family ipv4
        local-preference 700
        cluster 10.120.120.9
        neighbor 10.120.120.10
            type internal
            remove-private
            export "TO_CAM"
            local-as 300
            peer-as 300
        exit
    exit
    no shutdown

Georg Pauwen · ‎07-29-2018

Hello,

config looks good as far as I can tell. I only have the ALCATEL 7750 in GNS3 to test with, but you might want to make sure that the keepalive and holdtime values on both devices match (30/90 on Cisco). Also try and enable peer tracking on the Alcatel:

configure router bgp
    local-as 300
    group "TO_CAM"
        description "TO_CAM"
        family ipv4
        hold-time 90
        keepalive 30
        local-preference 700
        cluster 10.120.120.9
        enable-peer-tracking
        neighbor 10.120.120.10
            type internal
            remove-private
            export "TO_CAM"
            local-as 300
            peer-as 300
        exit
    exit
    no shutdown
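
On the Cisco side, the same values could be pinned to this one neighbor so that both ends are set explicitly. This is a minimal sketch assuming classic IOS syntax. Note that the debug output posted earlier shows the Cisco sending "holdtime 180 seconds" in its OPEN, so it is worth checking what is actually in effect there:

```
router bgp 300
 ! Keepalive 30 s, hold time 90 s for this neighbor only
 neighbor 10.120.120.9 timers 30 90
```

The values in use can be checked afterwards with 'show ip bgp neighbors 10.120.120.9'. The new timers should only apply once the session has been reset.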

David Leudem · ‎07-29-2018

As I read in the ALU documentation, the default hold-time and keepalive values are the same as on Cisco (90/30). I tried this anyway, but there was no change!

Georg Pauwen · ‎07-29-2018

Hello,

as stated, you can implement an EEM script that automatically reactivates your neighbor in case it is not reachable anymore. Below is a sample (change the source interface to match the one you have configured):

ip sla 1
 icmp-echo 10.120.120.9 source-interface GigabitEthernet0/0
!
ip sla schedule 1 life forever start-time now
ip sla reaction-configuration 1 react timeout threshold-type immediate
ip sla enable reaction-alerts
!
track 1 ip sla 1 reachability
 delay down 5 up 10
!

event manager applet BGP_NEW
 event ipsla operation-id 1 reaction-type timeout
 action 1.0 if $_ipsla_condition eq "Occurred"
 action 1.1  cli command "enable"
 action 1.2  cli command "conf t"
 action 1.3  cli command "router bgp 300"
 action 1.4  cli command "address-family ipv4"
 action 1.5  cli command "neighbor 10.120.120.9 activate"
 action 1.6  cli command "exit-address-family"
 action 1.7  cli command "end"
 action 1.8 end

David Leudem · ‎07-31-2018

I really like the idea, but I'm having a problem with it: the 'event ipsla' command does not exist on my Cisco router, as you can see below.

R1(config)#event manager applet BGP_CONFIG
R1(config-applet)#ev
R1(config-applet)#event ?
  application         Application specific event
  cli                 CLI event
  config              Configuration policy event
  counter             Counter event
  env                 Environmental event
  gold                GOLD event
  interface           Interface event
  ioswdsysmon         IOS WDSysMon event
  neighbor-discovery  Neighbor Discovery event
  none                Manually run policy event
  oir                 OIR event
  resource            Resource event
  rf                  Redundancy Facility event
  routing             Routing event
  rpc                 Remote Procedure Call event
  snmp                SNMP event
  snmp-notification   SNMP Notification Event
  snmp-object         SNMP object event
  syslog              Syslog event
  tag                 event tag identifier
  timer               Timer event
  track               Tracking object event
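
Since 'track' does appear in the list, one option might be to trigger the applet from the tracking object in the sample (track 1, which follows IP SLA 1) rather than from 'event ipsla'. This is a minimal, untested sketch: the applet name BGP_TRACK is made up, and the actions are simply the ones from the sample, run when the peer becomes reachable again.

```
event manager applet BGP_TRACK
 ! Fires when track 1 (reachability of 10.120.120.9 via IP SLA 1) goes up
 event track 1 state up
 action 1.0 cli command "enable"
 action 1.1 cli command "conf t"
 action 1.2 cli command "router bgp 300"
 action 1.3 cli command "address-family ipv4"
 action 1.4 cli command "neighbor 10.120.120.9 activate"
 action 1.5 cli command "exit-address-family"
 action 1.6 cli command "end"
```

Whether re-entering 'neighbor ... activate' is enough to recover the session would still have to be confirmed on the router. The applet only automates what is otherwise typed by hand.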