High CPU load after a BGP session goes down

aweer1234
Level 1

Dear All,

Our network has two iBGP sessions running between a Cisco router and a Juniper router. One session has a higher local preference than the other, and incoming traffic also arrives via that first session. Every time we shut down that BGP session, the CPU load climbs to about 60%; normally it is around 21%.

Both iBGP sessions are built on directly connected interfaces, and the Juniper does not set next-hop-self towards the Cisco router.

The topology is as follows:

Internal network -> Cisco router -> Juniper router -> Internet

Could you please help check whether anything in this setup could be causing the high CPU load?

  Thanks!

Regards,

Rex

13 Replies

Philip D'Ath
VIP Alumni

What is the Cisco device and how many prefixes are in the BGP routing table?

And I am assuming it is the Cisco CPU that is going up to 60%.  How long does it stay at 60%?

Hi Philip,

We are using a Cisco 7609-S, and the iBGP sessions carry a full view of around 646,886 routes.

Yes, it is the Cisco CPU that goes up to 60%, and this lasts around 10 minutes.

Any help would be highly appreciated, thanks!

Regards,

Rex

Hello Rex,

With 646,886 routes, CPU load on the C7600S is to be expected, as it has to install the new BGP paths via the remaining iBGP session for all of them.

I wonder if you have enabled features that can help in a scenario like yours, such as BGP next-hop tracking.

See

http://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_bgp/configuration/xe-3s/irg-xe-3s-book/bgp-support-for-next-hop-address-tracking.html#d110411e615a1635

If supported, this feature would be of great help.

The fact that the CPU is at 21% in normal conditions makes me think that you may have exceeded the limits of the TCAM table on the 7600S, unless you have an RSP720-3CXL, which can support 1M IPv4 routes. Also, all DFCs on the line cards must match the RSP720-3CXL, that is, they must be DFC3CXL.

see

http://www.cisco.com/c/en/us/products/collateral/routers/7600-series-routers/product_data_sheet0900aecd8057f3b6.html?cachemode=refresh

Hope to help

Giuseppe
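
To illustrate why the load scales with the table size, here is a minimal Python sketch of best-path selection by local preference alone. It is not how IOS implements it: the peer addresses and local-preference values are illustrative assumptions, and `Path`, `best_path` and `drop_session` are hypothetical names. When the preferred session is removed, every prefix learned over it needs a new best path, so the work grows linearly with the number of prefixes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    neighbor: str    # iBGP peer the path was learned from (illustrative address)
    local_pref: int  # higher wins; real BGP compares more attributes after this


def best_path(paths):
    """Return the path with the highest local preference, or None if empty."""
    return max(paths, key=lambda p: p.local_pref, default=None)


def drop_session(table, neighbor):
    """Remove all paths learned from `neighbor`.

    Returns the number of prefixes whose best path changed, i.e. the
    prefixes that would have to be reprogrammed in the RIB/FIB.
    """
    changed = 0
    for prefix, paths in table.items():
        old_best = best_path(paths)
        remaining = [p for p in paths if p.neighbor != neighbor]
        table[prefix] = remaining
        if best_path(remaining) != old_best:
            changed += 1
    return changed


# Toy table: every prefix is learned over both sessions (assumed values).
table = {
    f"10.{i}.0.0/16": [Path("192.0.2.1", 351), Path("192.0.2.5", 300)]
    for i in range(5)
}

changed = drop_session(table, "192.0.2.1")
print(changed, "of", len(table), "prefixes need a new best path")
```

With a full view of 646,886 routes, as in this thread, the same loop would touch every one of them.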

Dear Giuseppe,

Thanks for the useful information! I have read the document you suggested about BGP next-hop tracking, but I still cannot figure out what value of the BGP next-hop delay I should set to remedy this situation.

We don't have an IGP running inside the network.

  Thank you!

Regards,

Rex

Accepted Solution

Hello Rex,

>>

Adjusting the Delay Interval for BGP Next-Hop Address Tracking

Perform this task to adjust the delay interval between routing table walks for BGP next-hop address tracking.

You can increase the performance of this feature by tuning the delay interval between full routing table walks to match the tuning parameters for the Interior Gateway protocol (IGP). The default delay interval is 5 seconds. This value is optimal for a fast-tuned IGP. In the case of an IGP that converges more slowly, you can change the delay interval to 20 seconds or more, depending on the IGP convergence time.

BGP next-hop address tracking significantly improves the response time of BGP to next-hop changes in the RIB. However, unstable Interior Gateway Protocol (IGP) peers can introduce instability to BGP neighbor sessions. We recommend that you aggressively dampen unstable IGP peering sessions to reduce the possible impact to BGP.

But you just noted:

>> We don't have an IGP running inside the network

So it looks like the suggested feature is not applicable to your scenario.

I see why you can't find the right value for the delay.

I'm afraid the feature is not of great help to you.

Probably the right feature here is BGP PIC (Prefix Independent Convergence), which refreshes the BGP tables in a smarter way.

But I don't know if BGP PIC is supported on your C7609S.

For the feature BGP PIC Edge (IP/MPLS), Feature Navigator at

http://www.cisco.com/go/fn

gives me a list of IOS images for the C7600 with RSP720.

I have attached the list of files to this post.

If your hardware is different you may need a different image.

This time it should be the right feature for your environment, because BGP PIC keeps a backup path ready to install and also performs a more intelligent recursion, using a table of BGP next hops and tracking all the prefixes learned via each next hop.

Hope to help

Giuseppe

Hi Giuseppe,

Thanks for your solution! Our equipment is running release 122-33.SRE.
However, when I went to http://www.cisco.com/go/fn and looked for the BGP PIC feature in this release, I could not find it. Maybe it is time to upgrade IOS to 15.xx?

When I joined this company, I found the network design rather weird.

It looks like this:

1.1.1.1 (Cisco router) --- 1.1.1.2 (Juniper router) --- 2.2.2.2 (upstreams) --- 5.5.5.5 (their customers)

From 1.1.1.1, the next hop for 5.5.5.5 is 2.2.2.2, and the next hop for the route to 2.2.2.2 is 1.1.1.2.

Regarding the feature you suggested earlier, does it mean that running an IGP such as OSPF between the Cisco and the Juniper could help in this case?

Thanks!

--

Regards,

Rex
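
The two-step lookup described above (5.5.5.5 via 2.2.2.2, then 2.2.2.2 via 1.1.1.2) can be sketched in Python as a recursive next-hop resolution. This is only an illustration: the /32 and /30 prefix lengths, the `ROUTES` table and the `lookup` and `resolve` helpers are assumptions, not data taken from the real routers.

```python
import ipaddress

# Hypothetical routing table mirroring the addresses in the post.
# Prefix lengths are assumed for illustration.
ROUTES = {
    "5.5.5.5/32": {"next_hop": "2.2.2.2"},  # BGP route, next hop not connected
    "2.2.2.2/32": {"next_hop": "1.1.1.2"},  # route to the BGP next hop
    "1.1.1.0/30": {"connected": True},      # direct Cisco-Juniper link
}
TABLE = {ipaddress.ip_network(p): entry for p, entry in ROUTES.items()}


def lookup(dest):
    """Longest-prefix match; returns the route entry or None."""
    addr = ipaddress.ip_address(dest)
    matches = [net for net in TABLE if addr in net]
    if not matches:
        return None
    return TABLE[max(matches, key=lambda net: net.prefixlen)]


def resolve(dest, max_depth=8):
    """Follow next hops until a connected route is reached.

    Returns the chain of addresses looked up, starting with `dest`.
    """
    chain = [dest]
    current = dest
    for _ in range(max_depth):
        entry = lookup(current)
        if entry is None:
            raise LookupError(f"no route to {current}")
        if entry.get("connected"):
            return chain
        current = entry["next_hop"]
        chain.append(current)
    raise RuntimeError("next-hop recursion too deep")


print(" -> ".join(resolve("5.5.5.5")))
```

Every BGP prefix whose next hop is 2.2.2.2 depends on the single route to 2.2.2.2, which is why a change to that one route affects all of them at once.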

Hello Rex,

>> Both iBGP sessions are built on directly connected interfaces, and the Juniper does not set next-hop-self towards the Cisco router.

AND

>> Regarding the feature you suggested earlier, does it mean that running an IGP such as OSPF between the Cisco and the Juniper could help in this case?

For BGP next-hop tracking I would say yes, introducing an IGP would help.

On the other hand:

>> Our equipment is running release 122-33.SRE.
However, when I went to http://www.cisco.com/go/fn and looked for the BGP PIC feature in this release, I could not find it. Maybe it is time to upgrade IOS to 15.xx?

You can also take this problem as a good reason to upgrade to a 15.x IOS image. It is your choice.

Hope to help

Giuseppe

Hello Giuseppe,

It seems I have found a document saying that BGP PIC is supported in 12.2(33)SRE, but I cannot find the feature at http://www.cisco.com/go/fn.
Could you please advise whether it is safe to enable this feature? Thanks!
http://www.cisco.com/c/en/us/td/docs/ios/mpls/configuration/guide/15_0s/mp_15_0s_book/irg_bgp_mp_pic.html
"In 12.2(33)SRE, this feature was introduced on the Cisco 7200 and Cisco 7600 routers."

The document also says:

Prerequisites for BGP PIC

Ensure that the backup/alternate path has a unique next hop that is not the same as the next hop of the best path.

-- Unfortunately, we don't have unique next hops at the first lookup of a BGP route.
For example:

From the Cisco router to one of the upstream subnets:

show ip bgp 1.0.6.0 (sorry, I have hidden some next-hop IPs for security)
Here both paths have the same next hop:

  6939 4826 38803 56203 56203 56203
    3.3.3.3 from 1.1.1.2 (1.1.1.2)
      Origin IGP, metric 0, localpref 351, valid, internal, best
  6939 4826 38803 56203 56203 56203, (received-only)
    3.3.3.3 from 1.1.1.2 (1.1.1.2)
      Origin IGP, metric 0, localpref 300, valid, internal

show ip bgp 3.3.3.3
Here the paths have unique next hops:

  Local
    1.1.1.2 from 1.1.1.2 (1.1.1.2)
      Origin IGP, localpref 351, valid, internal, best
  Local
    1.1.1.6 from 1.1.1.6 (1.1.1.6)
      Origin IGP, localpref 300, valid, internal

How BGP PIC Improves Convergence

When the BGP PIC feature is enabled, BGP calculates a backup/alternate path per prefix and installs it into BGP RIB, IP RIB, and FIB. This improves convergence after a network failure. There are two types of network failures that the BGP PIC feature detects:
• Core node/link failure (internal Border Gateway Protocol [iBGP] node failure): If a PE node/link fails, then the failure is detected through IGP convergence. IGP conveys the failure through the RIB to the FIB.

--- We have no IGP running inside the network, so this does not work for us?

• Local link/immediate neighbor node failure (external Border Gateway Protocol [eBGP] node/link failure): To detect a local link failure or eBGP single-hop peer node failure in less than a second, you must enable BFD. Cisco Express Forwarding looks for BFD events to detect a failure of an eBGP single-hop peer.

--- We are running iBGP only, so BFD does not work for us in this case?

And only the following commands are available on the 7609-S router:
bgp additional-paths install -- available
bgp recursion host -- available

neighbor ip-address fall-over -- not available


Many thanks for your kind advice!

--

Regards,

Rex

Hello Rex,

Considering all the facts that you have reported, and mainly the following:

>>

Ensure that the backup/alternate path has a unique next hop that is not the same as the next hop of the best path.

-- Unfortunately, we don't have unique next hops at the first lookup of a BGP route.

I'm afraid you cannot achieve better results by enabling BGP PIC in your scenario.

If most of the routes share the same BGP next hop for the best path and the backup path, you are not in a position to use BGP PIC.

You are not running an IGP, probably because you don't need one: you are doing iBGP peering on connected interfaces instead of loopbacks (otherwise you would need an IGP or static routes to resolve the loopback addresses).

You could run BFD over the iBGP sessions; this should be possible.

>>

And only the following commands are available on the 7609-S router:
bgp additional-paths install -- available
bgp recursion host -- available

neighbor ip-address fall-over -- not available

This could be a sign that BGP PIC is not fully supported in your current 12.2(33)SRE IOS image.

I really hoped your topology would allow the use of BGP PIC, but the fact that several routes have the same BGP next hop in both the best path and the backup path is an issue.

Hope to help

Giuseppe
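
The prerequisite discussed above can be checked mechanically. Below is a minimal Python sketch, assuming only the next hops Rex posted; `pic_backup_usable` is a hypothetical helper, not an IOS function. Note also that the second entry for 1.0.6.0 is tagged received-only, which on IOS normally marks the pre-policy copy kept for inbound soft reconfiguration rather than an independent path; the sketch simply takes the next hops as posted.

```python
def pic_backup_usable(best_next_hop, backup_next_hop):
    """True if a backup path exists and its next hop differs from the best path's."""
    return backup_next_hop is not None and backup_next_hop != best_next_hop


# (best-path next hop, second-path next hop) as posted in the thread
posted = {
    "1.0.6.0": ("3.3.3.3", "3.3.3.3"),
    "3.3.3.3": ("1.1.1.2", "1.1.1.6"),
}

result = {prefix: pic_backup_usable(*hops) for prefix, hops in posted.items()}
for prefix, usable in result.items():
    print(prefix, "backup usable" if usable else "no distinct backup next hop")
```

Only the route to 3.3.3.3 itself has two distinct next hops; the Internet prefix resolved through it does not, which matches the conclusion above.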

Dear Giuseppe,

Thanks, I will try to make the above solution work!

Regards,

Rex

Hello Giuseppe,

Sorry to reopen this case. I have tried the BGP next-hop tracking solution by building a lab with a Cisco and a Juniper router, two iBGP sessions, and OSPF now running between them; the first link has the lower OSPF cost and is the preferred path. However, the situation is still the same: when I shut down the first BGP session, the CPU load spikes up to 99%. I have left the BGP next-hop delay interval untouched at 5 seconds.

Could you please advise how we can further tune this value, or whether I am missing anything regarding this solution?

Thanks!

Regards,

Rex

Hello Rex,

I'm sorry that the BGP next-hop tracking feature does not provide any improvement in the router behaviour when the primary iBGP session fails.

I don't think that tuning the bgp next-hop delay can be of any help at this point.

You are probably facing platform limitations related to the high number of BGP prefixes involved (about 646,000 routes).

I'm afraid that at this point you can either accept this behaviour or think about a hardware upgrade, which may end up being a change of platform if you already have an RSP720.

Hope to help

Giuseppe

Hello Giuseppe,

Thanks for your reply. I have just changed the setup to have only one iBGP session from the Cisco to the Juniper;
the Cisco receives the full view from the Juniper router, with OSPF running between them.
When we shut down the BGP session with the Juniper, the Cisco router's CPU load goes very high, up to 100%.
Is this to be expected, given that it needs to send a mass withdrawal of routes to the Juniper for the full Internet table?

Could the CPU load be confined to the Juniper router if the shutdown is done on the Juniper side?
Say we have a single Internet connection with an ISP: if the ISP has an incident and shuts down the BGP session,
our side would be affected and see high CPU load, even though the mistake was not ours.

I think we have already dealt with the BGP routing table limit: we are using an RSP720-3CXL and have enough memory to store the full Internet table.

FIB TCAM maximum routes :
=======================
Current :-
-------
IPv4 - 700k
MPLS - 4k (default)
IPv6 + IP Multicast - 160k (default)


Any reply would be highly appreciated, thanks!

Regards,

Rex
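
As a rough check of the headroom behind those figures, here is a small Python calculation. It is a sketch under stated assumptions: the 646,886 route count comes from earlier in the thread, the "700k" limit is read both as 700 x 1000 and as 700 x 1024 because the output does not say which applies, and only the BGP routes are counted, ignoring connected, OSPF and other entries that also consume FIB space. `headroom` is a hypothetical helper.

```python
BGP_ROUTES = 646_886  # route count reported earlier in the thread


def headroom(limit, used=BGP_ROUTES):
    """Return (free entries, utilisation as a fraction) for a given limit."""
    return limit - used, used / limit


# "700k" is ambiguous in the posted output; both readings are assumptions.
for label, limit in (("700 x 1000", 700 * 1000), ("700 x 1024", 700 * 1024)):
    free, util = headroom(limit)
    print(f"{label}: {free} entries free, {util:.1%} used")
```

Under either reading the table fits, but with less than 10% of the configured IPv4 space left over.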
