How to maximize available BGP upstream routers?

kinwai
Level 1

In this setup there will be 2 x BGP routers (non-Cisco) capable of up to 400G of routing and 2 more Cisco BGP routers capable of multiple 10G of routing. In day-to-day operation there definitely won't be such a crazy amount of traffic.

 

Just from a design perspective, how can I ensure that I'm able to maximize all 4 routers' routing capacity, at least in theory?

(Ignore the uneven traffic distribution caused by AS-path selection; this is a purely academic design question.)

 

Original plan: pass the DC traffic (assuming it is public IP, no NAT required) from the Nexus switches to the 2 x 400G routers via load balancing (i.e. BGP multipath using a default route injected into the Nexus core). However, traffic destined for Router 2's upstream might end up at Router 1 first before being re-routed to Router 2.

-> This is limited by the available bandwidth of the 2 x 400G routers. Am I unable to use the 2 x 10G (ASR1002-X) routers, which are capable of routing as well?
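
Roughly what I had in mind on the Nexus core, sketched in NX-OS syntax (the AS number and neighbor addresses are placeholders, assuming iBGP sessions to the two 400G routers, each advertising a default route):

feature bgp
router bgp 65000
  address-family ipv4 unicast
    maximum-paths ibgp 2
  neighbor 10.0.0.1
    remote-as 65000
    description 400G-ROUTER-1
    address-family ipv4 unicast
  neighbor 10.0.0.2
    remote-as 65000
    description 400G-ROUTER-2
    address-family ipv4 unicast

With two equal iBGP default routes and maximum-paths, the core ECMPs outbound traffic across the two big routers, but the two ASR1002-X routers sit idle.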

 

Question: How can I design a more scalable network and reduce the need for traffic to land on a random router first, only for that router to find the route can be fulfilled only by another router?

Hopefully this can be scalable, in the sense that the design will continue to work even if we go beyond 4 BGP routers with even more upstreams.

 

Is a full BGP table the only way to go? The core N9K barely has enough capacity to hold the full IPv4 table.

 

Are there any tips on how to perform route summarization so that the Nexus switches can receive all the BGP entries without getting overloaded? And how do I ensure the packets reach the correct router the first time?

Any alternative suggestions? Use ECMP? VRRP/GLBP?

 

I just want to hear from the experts here how to solve this issue creatively.

Accepted Solution

Giuseppe Larosa
Hall of Fame

Hello @kinwai ,

Having multiple full BGP tables on the Nexus 9000 is not a viable option with about 880,000 IPv4 prefixes.

 

Considering that you have:

400 Gbps on the two primary routers, and only 10 Gbps or 2 x 10 GE on the two Cisco routers.

 

You can have a default route pointing to the 400 Gbps capable devices.

To use the two Cisco routers at least a little, you can consider injecting some specific routes into the Nexus, so that for those prefixes the most specific route wins and traffic goes out via the Cisco routers.

For example, you can choose to use the Cisco routers for prefixes originated in the AS of the upstream provider and in its direct customers, using an appropriate AS-path access list.

 

In the following example, 12345 is the supposed upstream AS connected to a Cisco router (the access-list number 10 is just an example value, since the command requires a list number):

ip as-path access-list 10 permit ^12345$
ip as-path access-list 10 permit ^12345_[1-9][0-9]*$

 

^ matches the beginning of the AS path

_ matches the separator between two AS numbers (it is actually a space)

$ matches the end of the AS path
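
Putting it together, here is a minimal sketch of how that filter could be applied where the Cisco edge router advertises its routes to the Nexus (the local AS 65000, the route-map name and the neighbor address 10.0.0.10 are only placeholders for illustration):

route-map UPSTREAM-SPECIFICS permit 10
 match as-path 10
!
router bgp 65000
 neighbor 10.0.0.10 remote-as 65000
 neighbor 10.0.0.10 route-map UPSTREAM-SPECIFICS out

In this way the Nexus holds only the default route (toward the 400 Gbps routers) plus the more specific prefixes matched by AS-path access-list 10, and those specifics pull traffic toward the Cisco routers.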

 

>> Any alternative suggestions? Use ECMP? VRRP/GLBP?

 

GLBP is only good in client-facing VLANs, because its load-balancing capability is based solely on the AVG giving different answers to ARP requests for the default gateway: in round-robin or other strategies it hands out the MAC address of up to 4 forwarders.

In a case like yours, once the Nexus makes its ARP request (the same would happen with a firewall or router) it would use a single forwarder until it decides to send out another ARP request for the default gateway (OK, the ARP timeout is less than 5 minutes on Nexus to avoid unicast flooding, but the issue is still there).

 

VRRP, like HSRP, provides a single virtual gateway, and only by using multiple groups could you achieve something.

 

ECMP is not suitable here, as you have two routers with 400 Gbps and two with 2 x 10 Gbps, a ratio of 20:1.

 

The mentioned PfR is Cisco proprietary, and even if all devices were Cisco, I have doubts it is supported on carrier-grade routers like the Cisco CRS based on IOS XR.

In addition, PfR control-plane scalability is limited in comparison to 880,000 prefixes, but this is not a real issue.

Path optimization is hard work that pays back only for those prefixes or target ASes with a great volume of traffic associated with them.

 

There would be little sense in using such a technology for a destination that receives 1 MB per week.

 

 

Hope to help

Giuseppe

 


7 Replies

Joseph W. Doherty
Hall of Fame

"Any alternative suggestions? Use ECMP? VRRP/GLBP?"

In the past, one of the best solutions I found was using Cisco's PfR, which load balanced dynamically. (NB: it also didn't need full routing tables. In fact, it worked very well with just a default route! This is because Internet BGP usually picks the best path as the least number of AS hops, while PfR can monitor, and react to, actual performance.) This, though, then and likely still now, required all-Cisco edge routers.

Sounds good, but I'm unable to find the limitations of PfR, like how many routes/prefixes it is able to track. I saw something like a "top 100" kind of thing, which is not sufficient for a DC where thousands and thousands of users can come from anywhere.

 

In order to get the PfR feature it would have to be a router, and to support multiple 100G interfaces it is going to cost more than a bomb from Cisco...

 

Currently, the only way I figured out after pulling some hair is to use ECMP across multiple (2 or more) routers, where each supports the same amount of bandwidth (mixing some 100G and some 10G will not work well). Alternatively, I saw that GLBP might work, where it does some predefined traffic distribution, like 5% to the 10G router and 40% to the 100G router, that kind of thing.

 

Actually, I really hope to find a "cheat"/workaround to summarize the BGP table (reduce the entries) so that the Nexus can hold them comfortably.

 

 

Good question about number of routes PfR can track; that I don't know.

I have seen it use lots of memory tracking routes using Netflow and/or SLA (both of which it can use, "under the covers").

You're also correct about the cost of 100G capable routers.

Lastly, you're also correct that you can use GLBP proportionally. In fact, in one instance we had a remote site that was planned for two T1s but was mistakenly ordered with a T1 and a T3. PfR can proportionally load balance, which it did for that site, but I also had GLBP proportionally load balance based on the T1:T3 ratio, upon which PfR would "fine tune". The big difference, though, is that PfR looks at actual link loading while GLBP just round-robins flows (proportionally). So GLBP can still be quite a bit off. Further, although we didn't use these PfR features, PfR can also load balance based on other factors like link cost and/or QoS importance.
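
For reference, weighted GLBP is configured per group on the routers' shared interface; a minimal sketch (the interface, addresses, group number and weights are made up for illustration):

interface GigabitEthernet0/0
 ip address 192.0.2.2 255.255.255.0
 glbp 1 ip 192.0.2.1
 glbp 1 load-balancing weighted
 glbp 1 weighting 95

On the smaller router you would configure a proportionally lower value, e.g. "glbp 1 weighting 5", so the AVG hands out the forwarders' MAC addresses in roughly a 95:5 ratio.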

One "problem" we ran into, with PfR, when using on MPLS/VPN clouds, it rerouted around network issues faster than our monitoring equipment would note performance issues.  The solution to that was "monitoring" changes PfR made and/or having QA traffic not rerouted by PfR.


 

BTW . . .

"In addition PfR control plane scalability is limited in comparison to 880,000 prefixes, but this is not a real issue."

When I used PfR, the CPU impact seemed minimal because, as far as I know, it only analyzes active flows, again via NetFlow and/or SLA. Further, I think (?) such analysis can be restricted to just link load balancing, i.e. actual path-performance analysis per flow is not necessary.

"Path optimization is an hard work that pays back only for those prefixes or target ASes with a great volume of traffic associated to it.

There would be little sense to use such a technology for a destination that is a target for 1 MB per week."

It really depends on your goals. It's much like deciding whether QoS is worth its overhead. Assuming your goal is to not have a great volume of traffic that saturates a link be adverse to a destination with low bandwidth demand, "moving" the affected flow to a less utilized link can benefit that traffic.

The last example might also be addressed by QoS, yet consider if the problem for that traffic is somewhere in the WAN cloud.  In such cases, something like PfR, which can monitor actual flow performance, can redirect a flow to another egress path which may avoid the problem.

I've seen PfR deal rapidly with WAN cloud "brown outs" and "black outs". I've also seen it redirect traffic to far-side sites with multiple links, bypassing the link at that site whose cloud egress was congested.

Although, as you might surmise, I'm a fan of the technology, I neither suggest using it just because it's "cool" nor suggest avoiding it because there are negatives to using it. Like other technologies, you really need to consider its cost/benefit.

I will say this method works the best: selectively leaking routes into the Nexus routing table to steer traffic toward the selected routers.

 

In a network, the well-known 80/20 rule applies: 80% of the traffic will typically come from a few AS numbers.

 

 

mohAmed khAdr
Level 1

How do I ensure the packets reach the correct router the first time?

Cisco supports BGP next-hop address tracking by default; try adding a delay value:

Under router bgp:

 bgp nexthop trigger delay 5  (try a higher value if warranted)

!

Then use fast fall-over to improve BGP reaction time to adjacency changes:

Under router bgp:

 neighbor xx.xx.xx.xx update-source Loopback0
 neighbor xx.xx.xx.xx fall-over

From Cisco Doc: "the response time of BGP to adjacency changes is improved, detecting the loss of the peer is not based on the interface state but is event driven such as a loss of the path to the IGP address used for peering".

!

HTH

mohamed

!

 
