BGP local pref - Page 2

Jon Marshall · ‎12-05-2013

I understand what local pref does, but I have always been slightly confused by one aspect of the topology it is used in.

If two iBGP peers are using local pref to influence which path to take, is it best practice to have a direct physical link between the two routers, or is it perfectly normal to have traffic routed back out of the same interface it came in on? For example:

One L3 switch (SW1) shares a common subnet with two WAN routers (R1 and R2). The WAN routers peer with each other using iBGP. Local pref has been set up so that R1 is the preferred route for all networks learnt via eBGP. I have seen a number of posts on CSC where the recommendation for influencing outgoing traffic is to use local pref. SW1 can send traffic to either R1 or R2. If traffic goes to R2, that router will see that the preferred path is via R1, so it will have to send the traffic back out of the interface it was received on to reach R1.
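
To make the hairpin concrete, here is a minimal Python sketch of the topology described above. It is only an illustration: it compares local preference and nothing else from the BGP decision process, and the values 200 and 100 are assumed for the example (100 being the usual default).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    learned_by: str   # WAN router holding the eBGP session for this path
    local_pref: int   # higher value wins


def best_exit(paths):
    """Return the exit router, comparing local preference only."""
    return max(paths, key=lambda p: p.local_pref).learned_by


def hops(first_hop, paths):
    """Routers a packet from SW1 visits before it leaves the site."""
    exit_router = best_exit(paths)
    if first_hop == exit_router:
        return [first_hop]
    # The first router is not the preferred exit, so it forwards the
    # packet back across the shared subnet to its iBGP peer.
    return [first_hop, exit_router]


# Assumed values: R1 sets local pref 200, R2 keeps the default of 100.
paths = [Path("R1", 200), Path("R2", 100)]

print(hops("R1", paths))  # SW1 picked R1: one routed hop
print(hops("R2", paths))  # SW1 picked R2: extra hop back to R1
```

The second call shows the case in question: the packet enters and leaves R2 on the same LAN interface before reaching R1.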

To my mind it would be far easier to redistribute the eBGP-learned routes into OSPF (as type 1 externals from R1 and type 2 externals from R2) or EIGRP and influence the metrics, so that SW1 always knows that R1 is the best path.
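
The OSPF variant relies on the rule that a type 1 external route is always preferred over a type 2 external route for the same prefix. A rough Python sketch of that comparison follows; the class, field names and metric values are made up for illustration, and the model ignores forwarding addresses and other details of the real calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalRoute:
    asbr: str          # router that redistributed the prefix
    metric_type: int   # 1 = E1, 2 = E2
    ext_metric: int    # metric set at redistribution
    cost_to_asbr: int  # SW1's internal OSPF cost to reach the ASBR


def preference_key(route):
    """Lower key = more preferred."""
    if route.metric_type == 1:
        # E1: external metric and internal cost are added together.
        return (1, route.ext_metric + route.cost_to_asbr, 0)
    # E2: external metric is compared first, internal cost breaks ties.
    return (2, route.ext_metric, route.cost_to_asbr)


def best(routes):
    return min(routes, key=preference_key)


# Assumed values: both routers redistribute the same prefix with metric 20.
via_r1 = ExternalRoute("R1", metric_type=1, ext_metric=20, cost_to_asbr=1)
via_r2 = ExternalRoute("R2", metric_type=2, ext_metric=20, cost_to_asbr=1)

print(best([via_r1, via_r2]).asbr)  # R1 while its route is present
print(best([via_r2]).asbr)          # R2 takes over if R1's route is withdrawn
```

With this arrangement SW1 itself holds the preference, so it never needs to hand traffic to R2 while R1 is advertising the prefix.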

If there were a dedicated physical connection between R1 and R2, that would make more sense to me.

I only ask because, as I say, I have seen a number of posts with a similar setup where local pref was the recommendation, and I wondered whether I was missing something in my understanding.

Any comments welcome.

Jon

Joseph W. Doherty · ‎12-05-2013

Disclaimer

The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.

Liability Disclaimer

In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.

Posting

Jon, you're of course correct: bouncing some traffic off one BGP egress router to another BGP egress router is less than optimal. But how nonoptimal is it?

If your two BGP peers are physically close, redirected traffic might be subjected to one "needless" routing hop (the nonoptimal initial BGP egress router) and perhaps one or two "needless" L2 hops (to get to a nearby peer BGP egress router). How much additional latency (especially compared to end-to-end) and/or router load would this contribute? Is the extra latency and/or CPU load really enough to justify the work of making routing more optimal?

On the issue of sharing the same interface for redirected traffic, again you're correct: using a different interface, perhaps one dedicated between R1 and R2, would avoid this issue. However, the interior (LAN) interface often has more available bandwidth than the exterior (WAN) interface. If it does, there may be sufficient bandwidth to carry the redirected traffic too.

Sure, you could redistribute BGP into your IGP. However, perhaps setting up iBGP on some of your interior routers would be better. For example, maybe only SW1 needs to iBGP peer with R1 and R2.

PS:

BTW, setting up local prefs to manage path selection works, but it's so 20th century. If possible, a 21st century approach might be used, such as using Cisco's PfR to manage path selection based on policies and actual path performance. PfR modifies routes, so if you were using it to manage BGP, its route churn could impact your IGP if you were doing redistribution. (And if you don't redistribute the churn, your IGP wouldn't be making optimal routing decisions.)

Jon Marshall · ‎12-05-2013

Joseph

I know you are right, the added latency of using high bandwidth links to hop back from one router to the other is minimal, but it just doesn't feel right. I know it's me, but the idea of sending a packet to a router just for it to be sent back out the same interface to get to the right router is just wrong.

But of course you are right, and Rick alluded to the same thing. And there are alternatives which we have covered here, i.e. BGP to IGP redistribution with influenced metrics, or iBGP on the internal L3 switch. Personally I have only ever used BGP to IGP redistribution.

As for PfR, I have seen you refer to it a few times in posts, and every time I thought I should read up about it but then got sidetracked into something else. Maybe when I have finished with VSS and its benefits in campus design I'll get round to it.

Thanks for your comments.

Jon

Joseph W. Doherty · ‎12-05-2013

Jon, I know exactly how you feel - letting the traffic bounce back off a router just feels wrong. However, in the grand scheme it can be the more cost effective option, and again, depending on the situation, it actually might be minimally adverse to the traffic.

Another approach that hasn't been discussed would be to redistribute BGP into an IGP used just for that, not your "normal" IGP (i.e. two IGPs on select routers - the IGPs might be different protocols, or the same protocol in different processes, or in different VRFs). This is something unusual, but perhaps not all your interior IGP routers support BGP, or you don't want/need the BGP routes on all of them. It also might avoid the peering issues related to dealing with more than a few iBGP routers.

Regarding OER/PfR, it's something I'm very taken with. Its improvement over "ordinary" dynamic routing, is somewhat akin to dynamic routing improvements over static routing. Instead of just keeping track of best path(s) based on static metrics, it constantly tracks best paths for actual performance end-to-end and uses those for routing table updates.

I enabled OER within an international company across two L3 VPN MPLS clouds. The only "problem" we then had was that our performance monitoring tools stopped showing WAN cloud performance problems. OER would "see" a performance issue and redirect traffic before our performance monitoring "saw" the same issue.

The catalyst (pun not intended) for trying OER was that one day one of our cloud providers had a node, in England, black hole transit traffic, but routing looked fine. It took a while to figure out what the problem was, and longer to work up temporary policies to not use one provider for some prefixes.

With OER, the same situation (and many others) is generally detected and dealt with, by OER, within a few seconds.
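
The difference between choosing an exit on static metrics and choosing it on measured performance can be sketched in a few lines of Python. This is not how OER/PfR is implemented; the names, the 5% loss threshold and the sample numbers are all assumptions chosen to mirror the black hole scenario above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExitStats:
    name: str
    static_metric: int  # what ordinary routing looks at (lower is better)
    loss_pct: float     # measured packet loss through this exit
    delay_ms: float     # measured end-to-end delay through this exit


def static_choice(exits):
    """Pick the exit the way a routing table would: best metric wins."""
    return min(exits, key=lambda e: e.static_metric).name


def measured_choice(exits, max_loss_pct=5.0):
    """Pick the lowest-delay exit among those within the loss policy."""
    usable = [e for e in exits if e.loss_pct <= max_loss_pct]
    if usable:
        return min(usable, key=lambda e: e.delay_ms).name
    # Nothing meets the policy: fall back to the least lossy exit.
    return min(exits, key=lambda e: e.loss_pct).name


# Assumed sample: provider A looks best to routing but is dropping everything.
exits = [
    ExitStats("provider_a", static_metric=10, loss_pct=100.0, delay_ms=0.0),
    ExitStats("provider_b", static_metric=20, loss_pct=0.1, delay_ms=85.0),
]

print(static_choice(exits))    # still the black-holing provider
print(measured_choice(exits))  # moves to the provider that actually delivers
```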

OER/PfR, I guess, might now also be considered an entry into the world of SDN.

PS:

Ah, VSS, dual member "stacking" for 6500s (and now 4500s). Actually if you don't like bouncing "needlessly" through network devices, that's something you can easily bump into with VSS. For L3, VSS can work against you. For L2, I like it.

Jon Marshall · ‎12-06-2013

Joseph

I haven't been around on the forums for a while, but I don't see many questions on PfR. This doesn't necessarily mean it isn't widely used, but CSC can give you a good idea of how much a technology is in use. Do you have any idea how much it has been taken up by customers? Of course it could just be that it is so easy to configure that no one ever has to ask about it, but I suspect that isn't the case.

> Ah, VSS, dual member "stacking" for 6500s (and now 4500s). Actually if you don't like bouncing "needlessly" through network devices, that's something you can easily bump into with VSS. For L3, VSS can work against you. For L2, I like it.

Without wishing to turn this into a VSS thread, can you elaborate on that point, i.e. why VSS can work against you for L3?

Jon

Joseph W. Doherty · ‎12-06-2013

Jon, I had noticed your absence, and your relatively recent return. Very glad to see you back!

I suspect PfR hasn't had a lot of uptake. If that's true, I suspect it's because you need multiple "edge" paths, with an interior and exterior. When you first look at it, it also seems very complex, but once you begin to understand it, it's really not all that complex. Unfortunately, there's yet to be a Doyle book on the subject.

VSS can work against you in L3 because its very feature, presenting two chassis as a single device, hides better L3 paths.

Ideally VSS should have connections from each VSS member to all other devices. But in my experience, often they don't, especially for L3. For example, you might have a single "east" and "west" connection, each on a different VSS member. From one L3 device out, optimal path might be to the VSS member with the "east" connection, but the next L3 device sees the VSS pair as a single device. So, perhaps it sends its traffic to the "west" VSS member. Now that VSS member will redirect the traffic to its VSS mate (basically we're back to your concern about suboptimal traffic flows, on eBGP egress routers, in this post).

This situation can also arise in failure modes. Suppose a VSS member has a line card failure. From an L3 perspective, an adjoining L3 device won't know to avoid sending some of its traffic to that member, rather than sending it directly to the VSS member with the remaining egress path.

Cross traffic in a VSS pair is something you want to avoid because the VSL bandwidth is often limited (often just a pair of 10G links is used; consider that a pair of 3750Xs or 3850s often has more bandwidth between the two stack members), and if there's congestion on the VSL, you cannot define custom QoS prioritization for it. (Unlikely, but consider losing a 6900 line card [80 Gbps] whose egress now needs to be redirected to its VSS mate across the 20G[?] VSL.)
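
As a quick back-of-the-envelope check of that last example, the following Python snippet computes the oversubscription ratio using the figures quoted above (80 Gbps of redirected egress, an assumed VSL of two 10G links). The function name is just for illustration, and the worst case assumes the failed card was running at full rate.

```python
def vsl_oversubscription(redirected_gbps, vsl_links, gbps_per_link):
    """Ratio of traffic needing the VSL to the VSL's total capacity."""
    vsl_capacity_gbps = vsl_links * gbps_per_link
    return redirected_gbps / vsl_capacity_gbps


# Figures from the example: 80 Gbps line card, VSL of 2 x 10G.
ratio = vsl_oversubscription(redirected_gbps=80, vsl_links=2, gbps_per_link=10)
print(f"{ratio:.1f}:1 oversubscribed")
```

Anything above 1:1 means the VSL would be congested, and there is no custom QoS on it to decide what gets dropped.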

Another issue with VSS: a pair of separate L3 devices has a (very slightly) higher MTBF. This is because separate L3 devices don't share a single point of failure, the VSS OS. (NB: the difference in MTBF for dual L3 vs. VSS is sort of akin to the difference between VSS vs. a single chassis with redundancy for everything [except, of course, the chassis itself].)

Yet another issue with VSS: a VSS member will always use its own local egress path, to avoid using the VSL. With separate L3 devices, I could make the local device's egress and its peer's egress equal cost (or, with EIGRP, use unequal cost) to take advantage of the peer's egress bandwidth.

The above isn't an indictment against VSS. But I've found some engineers think VSS is a huge improvement vs. a fully redundant chassis or better than dual L3 in all aspects.

Jon Marshall · ‎12-06-2013

Joseph

> Jon, I had noticed your absence, and your relatively recent return. Very glad to see you back!

Many thanks. I'm using CSC to get back up to speed and see just how much I have forgotten before I look for work next year. I have been out for a while, so I'm trying to gauge how much I still know and whether I should continue in networking or look elsewhere.

Thanks for the VSS pointers as well. Obviously I would never connect an L3 device to just one member chassis, because of my aversion to suboptimal traffic paths.

I understood everything except -

> This situation can also arise in failure modes. Suppose a VSS member has a line card failure. From an L3 perspective, an adjoining L3 device won't know to avoid sending some of its traffic to that member, rather than sending it directly to the VSS member with the remaining egress path.

If the line card fails, doesn't the L3 device see the interface go down and immediately switch all its traffic to the other chassis? If the L3 device had an intermediary switch in between, it would have to wait for timers before realising it had lost its peering, but I was assuming a direct connection from the L3 device to the member chassis.

Jon

Joseph W. Doherty · ‎12-06-2013

> If the line card fails, doesn't the L3 device see the interface go down and immediately switch all its traffic to the other chassis? If the L3 device had an intermediary switch in between, it would have to wait for timers before realising it had lost its peering, but I was assuming a direct connection from the L3 device to the member chassis.

I don't mean the L3 device with a link to the failed card, but another L3 device with connections to both VSS members.

I.e.:

R1=VSS=R2

If the VSS loses one of its connections to R2, R1 will still send traffic that transits R2 to both VSS members.

vs.

R1-C1-R2

R1-C2-R2

If C1 loses its link to R2, R1 should only send traffic to C2.
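
The two cases can be compared with a small Python sketch. It is a simplified model, not VSS behaviour taken from any documentation: the function names are made up, C1 and C2 stand for either the two separate routers or the two VSS member chassis, and a "down" entry means that device has lost its own link to R2.

```python
def r1_next_hops_separate(link_to_r2_up):
    """C1 and C2 are independent routers.

    Each one offers a path to R2 only while its own link to R2 is up,
    so R1's routing learns which neighbor to stop using.
    """
    return [device for device, up in link_to_r2_up.items() if up]


def r1_next_hops_vss(link_to_r2_up):
    """C1 and C2 are members of one VSS pair, seen by R1 as one device.

    While any member still reaches R2, R1 keeps using every link it
    has to the pair, including the one to the member that lost its path.
    """
    return list(link_to_r2_up) if any(link_to_r2_up.values()) else []


def must_cross_vsl(member, link_to_r2_up):
    """Traffic arriving at a member with no link to R2 crosses the VSL."""
    return not link_to_r2_up[member]


# C1 (or the first VSS member) has lost its link to R2.
state = {"C1": False, "C2": True}

print(r1_next_hops_separate(state))  # R1 uses only C2
print(r1_next_hops_vss(state))       # R1 still uses both members
print(must_cross_vsl("C1", state))   # traffic landing on C1 takes the VSL
```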

Jon Marshall · ‎12-06-2013

Joseph

I understand now.

Thanks

Jon