
iBGP peer starts receiving routes, then drops most...

ferrari01
Level 1

Hey Guys, I have an interesting issue.

Setup:  Two 7000 series routers.

(Fictitious IPs used for example)

Router one: 5.5.5.1

                              eBGP peers with Level3 (approx. 474,000 routes)

                              iBGP peers with 5.5.5.2

Router two: 5.5.5.2

                              eBGP peers with Earthlink (approx. 474,000 routes)

                              iBGP peers with 5.5.5.1

Problem:

               The problem is with the iBGP session. When I start the session, Router one begins receiving routes from Router two, but when the count hits a certain amount, usually around 200,000, it drops to around 46,000 routes and stays there. I cannot for the life of me understand why this is happening. I should add that the session isn't dropping, just the routes.

Router two sees 474K routes from both Earthlink and 5.5.5.1.  No problem.

Both routers had 512 MB of memory. I figured Router one might be running out, so I upgraded it to an NPE-G1 with 1 GB of memory. The same problem still happens. And since Router two only has 512 MB and is getting full routes fine, I don't think it's a memory issue.

Now, if something goes wrong with Level3, the routes from 5.5.5.2 start increasing until they reach full routes. It's like I'm hitting some sort of maximum route limit, which is why I suspected a memory issue.

I really appreciate any help with this, as it's driving me crazy!

--John


Thank you Vishesh!  Applied the weight 1000 to both routers, and wham....look at uplink1 now:

Neighbor                V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd

Level3                  4  3356  555075    5302  2823713   30    0 3d16h      473892

Router2                 4  xxxx  626863  651686  2837483    0    0 3d12h      477535

Savvis (partial route)  4  3561   79786    5299  2837483    0    0 3d12h       35784
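
For readers following along, here is a minimal sketch of what "applying weight 1000" might look like on Router1. The thread does not show the exact commands, so this assumes the weight was set on the eBGP neighbor. The local AS 64500 and the neighbor address 192.0.2.1 are placeholders, not values from the original setup.

```
! Router1 (sketch; AS 64500 and 192.0.2.1 are placeholder values)
router bgp 64500
 ! Level3 eBGP session: paths learned here get weight 1000
 neighbor 192.0.2.1 remote-as 3356
 neighbor 192.0.2.1 weight 1000
 ! iBGP session to Router2: paths keep the default weight 0
 neighbor 5.5.5.2 remote-as 64500
 neighbor 5.5.5.2 next-hop-self
```

Weight is compared first in Cisco's best-path selection and is local to the router, so with this in place each router always selects its own eBGP path and therefore always has a path it can advertise over iBGP. A soft inbound reset (for example `clear ip bgp 192.0.2.1 soft in`) is typically needed before the new weight applies to routes that were already received.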

Milan: Call me old school, but I prefer to receive full routes at all sites.

***************** I really appreciate the help, you guys, as this had me stumped. ******************

I'm curious how this will affect the traffic balance now that the weights are equal. We try to keep about 70% going through Level3 and 30% through Earthlink. If Level3 fails, 99% goes through Earthlink. We run OSPF on the core routers internally, and use that together with a script mechanism to switch which border router the default route points at, so that if Router1 or Router2 goes down completely, traffic continues to flow.

You are welcome, John! If you want to load-balance, you first have to identify the traffic's destinations and the volume leaving through each router.

After that, you can apply a route-map on the iBGP session and increase the weight for specific routes until the traffic is balanced between the two routers the way you want.
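
A rough sketch of that idea on Router1, assuming the eBGP neighbor already carries weight 1000. The prefix 198.51.100.0/24, the names VIA-ROUTER2 and FROM-ROUTER2, and AS 64500 are all hypothetical placeholders chosen for illustration.

```
! Router1 (sketch; names, prefix, and AS 64500 are placeholders)
ip prefix-list VIA-ROUTER2 seq 5 permit 198.51.100.0/24
!
route-map FROM-ROUTER2 permit 10
 match ip address prefix-list VIA-ROUTER2
 ! Higher than the 1000 on the eBGP neighbor, so Router2's path wins
 set weight 2000
! Everything else from Router2 is accepted unchanged
route-map FROM-ROUTER2 permit 20
!
router bgp 64500
 neighbor 5.5.5.2 route-map FROM-ROUTER2 in
```

Note that for the prefixes matched here, Router1 now selects the iBGP path through Router2 as best, so it will stop advertising its own eBGP path for them to Router2.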

Refer to the following document; it contains some good case studies about BGP:

http://www.cisco.com/en/US/tech/tk365/technologies_tech_note09186a00800c95bb.shtml

And as mentioned in my earlier post, the following document describes the BGP best path selection algorithm:

http://www.cisco.com/en/US/tech/tk365/technologies_tech_note09186a0080094431.shtml

-Vishesh

Hi Milan,

I am sorry, I agree with you about the BGP split-horizon rule, but I didn't mention it in my post. What I wrote was "Just like split-horizon rule in EIGRP".

What I mean by that is: "A local router propagates only the routes that are selected as best to the eBGP and iBGP neighbors. However, the router never sends a route back on the same BGP session upon which it was received and selected as a best-path for that prefix. When it picks a neighbor as the best next-hop, the router makes sure that the neighbor is not pointing back to the local router. In order to make this backward destination unreachable, a withdraw message is sent to that neighbor." Which is exactly the case here.

But in Cisco's implementation this rule is bent a little: the route can be advertised to the neighbor that is itself the best path for the prefix if that neighbor is part of an update-group and the update-group contains other neighbors to which the route must be sent. The update-group method is implemented to save memory and CPU cycles on the routers.

The second part is that the ISPs are peering with each other, but that again won't affect the BGP route selection process, as these routers are not advertising the 470K routes back to the ISPs. Also, this is just one of the 470K routes.

Now the question is:  Should each of the routers prefer his eBGP neighbor for all prefixes received?

If yes, wouldn't it be easier to ask the providers to send just the default route instead of full BGP tables?

I think John needs to answer this question together with his LAN  topology - which router is used as the default GW by the LAN devices?

I agree with all these questions, but as a TAC engineer I provide a resolution based on the problem; I do not make recommendations on design changes unless explicitly required.

So, based on John's comment earlier in the thread, "I understand BGP will not send an update to a peer if it considers that neighbor a better path, but in that case it causes the problem that when Level3 goes down, it's missing a lot of routes that were not sent. Something isn't right here.", I think increasing the weight would work.

-Vishesh

Hi Vishesh,

1) Regarding "A local router propagates only the routes that are selected as best to the eBGP and iBGP neighbors. However, the router never sends a route back on the same BGP session upon which it was received and selected as a best-path for that prefix. When it picks a neighbor as the best next-hop, the router makes sure that the neighbor is not pointing back to the local router. In order to make this backward destination unreachable, a withdraw message is sent to that neighbor." Which is exactly the case here.

I don't think this is the case here.

Router2 is picking a prefix received from Router1 via iBGP as the best one, so the next hop is Router1 (using next-hop-self), not Router2 itself (the local router).

BTW, where does this mechanism come from? I've not found it in any BGP RFC, is it Cisco proprietary?

IMHO, the withdrawal is sent because the best path that Router2 originally advertised to Router1 was the eBGP one received from its ISP. Now the best path is an iBGP one received from Router1. As iBGP-received prefixes should not be advertised via iBGP, Router2 has no path left to advertise, so the previously advertised path has to be withdrawn, I guess?

2) I agree that preferring the eBGP-received prefixes makes good sense.

But as shown in the example above, not always.

Possibly it would make sense to prefer the eBGP prefixes received from the ISP, but not those containing the other ISP's AS number within the AS_PATH?

So to prefer eBGP prefixes not containing  _3356_  on Router2?

And vice versa, to prefer eBGP prefixes not containing  _13407_  on Router1?
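
One way this suggestion could be sketched on Router2, using an AS-path filter to raise the weight only on eBGP paths that do not traverse AS 3356. This is an untested illustration: AS 13407 for the Router2 upstream is taken from the line above, while the local AS 64500, the neighbor address 203.0.113.1, and the route-map name FROM-EARTHLINK are placeholders.

```
! Router2 (sketch; AS 64500 and 203.0.113.1 are placeholders)
ip as-path access-list 10 deny _3356_
ip as-path access-list 10 permit .*
!
route-map FROM-EARTHLINK permit 10
 ! Paths that do not traverse AS 3356: prefer the local eBGP exit
 match as-path 10
 set weight 1000
! Paths through AS 3356 keep weight 0 and go through normal selection
route-map FROM-EARTHLINK permit 20
!
router bgp 64500
 neighbor 203.0.113.1 remote-as 13407
 neighbor 203.0.113.1 route-map FROM-EARTHLINK in
```

Router1 would get the mirror image, with `_13407_` in the AS-path list and the route-map applied inbound on its Level3 session.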

Best regards,

Milan

Hi Milan,

      AS 3                         AS 4
       +                             +
       |                             |
       |                             |
       |ebgp                         |ebgp
       |                             |
  +----+-----+                 +-----+----+
  |          |                 |          |
  |          |                 |          |
  |    R1    |                 |    R2    |
  |          +-----------------+          |
  |          |      ibgp       |          |
  |          |      AS 12      |          |
  +----------+                 +----------+

Note that in this example BGP split horizon is disabled by configuring R1 as a route-reflector client; R2 still withdraws the prefixes from R1 when R2 chooses the path via R1 as best.

R2

R2#show ip bgp summary | be InQ

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd

12.0.0.1        4    12      32      38       19    0    0 00:00:03        3

24.0.0.4        4     4      26      19       19    0    0 00:04:11        3

R2#show ip bgp | be Net

   Network          Next Hop            Metric LocPrf Weight Path

* i3.3.3.1/32       12.0.0.1                 0    100      0 3 i

*>                  24.0.0.4                 0             0 4 i

* i3.3.3.2/32       12.0.0.1                 0    100      0 3 i

*>                  24.0.0.4                 0             0 4 i

* i3.3.3.3/32       12.0.0.1                 0    100      0 3 i

*>                  24.0.0.4                 0             0 4 i

R2#

*Mar  1 00:36:40.835: BGP(0): 24.0.0.4 rcvd UPDATE w/ attr: nexthop 24.0.0.4, origin i, metric 0, path 4

*Mar  1 00:36:40.835: BGP(0): 24.0.0.4 rcvd 3.3.3.3/32

*Mar  1 00:36:40.835: BGP(0): 24.0.0.4 rcvd 3.3.3.2/32

*Mar  1 00:36:40.839: BGP(0): 24.0.0.4 rcvd 3.3.3.1/32

*Mar  1 00:36:40.843: BGP(0): Revise route installing 1 of 1 routes for 3.3.3.1/32 -> 24.0.0.4(main) to main IP table

*Mar  1 00:36:40.843: BGP(0): Revise route installing 1 of 1 routes for 3.3.3.2/32 -> 24.0.0.4(main) to main IP table

*Mar  1 00:36:40.847: BGP(0): Revise route installing 1 of 1 routes for 3.3.3.3/32 -> 24.0.0.4(main) to main IP table

R2#

*Mar  1 00:36:40.847: BGP(0): 12.0.0.1 NEXT_HOP is set to self for net 3.3.3.1/32,

*Mar  1 00:36:40.851: BGP(0): 12.0.0.1 send UPDATE (format) 3.3.3.1/32, next 12.0.0.2, metric 0, path 4

*Mar  1 00:36:40.851: BGP(0): 12.0.0.1 NEXT_HOP is set to self for net 3.3.3.2/32,

*Mar  1 00:36:40.855: BGP(0): 12.0.0.1 send UPDATE (prepend, chgflags: 0x820) 3.3.3.2/32, next 12.0.0.2, metric 0, path 4

*Mar  1 00:36:40.855: BGP(0): 12.0.0.1 NEXT_HOP is set to self for net 3.3.3.3/32,

*Mar  1 00:36:40.855: BGP(0): 12.0.0.1 send UPDATE (prepend, chgflags: 0x820) 3.3.3.3/32, next 12.0.0.2, metric 0, path 4

Event: the routes from 24.0.0.4 are now advertised with a longer AS path, so R2 chooses the path via R1 as best and withdraws the prefixes it had advertised to R1.

R2#show ip bgp summary | be InQ

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd

12.0.0.1        4    12      38      46       25    0    0 00:06:21        3

24.0.0.4        4     4      34      25       25    0    0 00:10:29        3

R2#show ip bgp

   Network          Next Hop            Metric LocPrf Weight Path

*>i3.3.3.1/32       12.0.0.1                 0    100      0 3 i

*                   24.0.0.4                 0             0 4 4 3 i

*>i3.3.3.2/32       12.0.0.1                 0    100      0 3 i

*                   24.0.0.4                 0             0 4 4 3 i

*>i3.3.3.3/32       12.0.0.1                 0    100      0 3 i

*                   24.0.0.4                 0             0 4 4 3 i

R2#

*Mar  1 00:37:10.943: BGP(0): 24.0.0.4 rcvd UPDATE w/ attr: nexthop 24.0.0.4, origin i, metric 0, path 4 4 3

*Mar  1 00:37:10.943: BGP(0): 24.0.0.4 rcvd 3.3.3.3/32

*Mar  1 00:37:10.947: BGP(0): 24.0.0.4 rcvd 3.3.3.2/32

*Mar  1 00:37:10.947: BGP(0): 24.0.0.4 rcvd 3.3.3.1/32

*Mar  1 00:37:10.951: BGP(0): Revise route installing 1 of 1 routes for 3.3.3.1/32 -> 12.0.0.1(main) to main IP table

*Mar  1 00:37:10.951: BGP(0): Revise route installing 1 of 1 routes for 3.3.3.2/32 -> 12.0.0.1(main) to main IP table

*Mar  1 00:37:10.955: BGP(0): Revise route installing 1 of 1 routes for 3.3.3.3/32 -> 12.0.0.1(main) to main IP table

R2#

*Mar  1 00:37:10.955: BGP(0): 12.0.0.1 send unreachable 3.3.3.1/32

*Mar  1 00:37:10.959: BGP(0): 12.0.0.1 send UPDATE 3.3.3.1/32 -- unreachable

*Mar  1 00:37:10.959: BGP(0): 12.0.0.1 send UPDATE 3.3.3.2/32 -- unreachable

*Mar  1 00:37:10.959: BGP(0): 12.0.0.1 send UPDATE 3.3.3.3/32 -- unreachable

R2#show run | se router bgp

router bgp 12
 no synchronization
 bgp log-neighbor-changes
 neighbor 12.0.0.1 remote-as 12
 neighbor 12.0.0.1 route-reflector-client
 neighbor 12.0.0.1 next-hop-self
 neighbor 24.0.0.4 remote-as 4
 neighbor 24.0.0.4 prefix-list n0thing out
 no auto-summary

=================================================================================

R1

R1#show ip bgp summary | be InQ

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd

12.0.0.2        4    12      51      42        4    0    0 00:10:34        3

13.0.0.3        4     3      29      27        4    0    0 00:24:19        3

R1#show ip bgp  

   Network          Next Hop            Metric LocPrf Weight Path

* i3.3.3.1/32       12.0.0.2                 0    100      0 4 i

*>                  13.0.0.3                 0             0 3 i

* i3.3.3.2/32       12.0.0.2                 0    100      0 4 i

*>                  13.0.0.3                 0             0 3 i

* i3.3.3.3/32       12.0.0.2                 0    100      0 4 i

*>                  13.0.0.3                 0             0 3 i

After Event

R1#

BGP(0): 12.0.0.2 rcv UPDATE about 3.3.3.1/32 -- withdrawn

BGP(0): 12.0.0.2 rcv UPDATE about 3.3.3.2/32 -- withdrawn

BGP(0): 12.0.0.2 rcv UPDATE about 3.3.3.3/32 -- withdrawn

R1#show ip bgp summary | be InQ

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd

12.0.0.2        4    12      46      38        4    0    0 00:06:54        0

13.0.0.3        4     3      25      23        4    0    0 00:20:39        3

R1#show ip bgp         

   Network          Next Hop            Metric LocPrf Weight Path

*> 3.3.3.1/32       13.0.0.3                 0             0 3 i

*> 3.3.3.2/32       13.0.0.3                 0             0 3 i

*> 3.3.3.3/32       13.0.0.3                 0             0 3 i

So, believe it or not, what I explained is correct. I do not read RFCs, but one thing I am sure of is that this is how it works on Cisco platforms; I have never tested this on other platforms.

-Vishesh

Hi Vishesh,

you are right.

I made a simple test with eBGP in my lab and the withdrawal was also sent!

So split horizon with poison reverse was in fact implemented for BGP by Cisco?

I found this feature also mentioned in this Cisco Press book:

http://www.ciscopress.com/store/bgp-design-and-implementation-9781587051098

on page 283, e.g.

Interestingly, no other BGP book or RFC mentions that.

Good to discuss here to learn something new!

Thanks,

Milan
