Hello all, I can't believe I'm not finding anything on the subject of running multiple EIGRP autonomous systems on the same set of routers. I'm hoping my Google skills are just down. :)
Here's what we are trying to accomplish and the problem we are running into. Today we have two data centers connected together over a 1 Gb P2P link. We have remote sites connected to both data centers in a dual-hub DMVPN configuration. The whole network is running in EIGRP AS 1 today. The goal is to break up the DCs into their own EIGRP autonomous systems, and then deploy BGP between the DCs. The remote sites would then be configured with both ASes: tunnel interfaces going to DC1 would be in AS 1, and tunnels going to DC2 would be in AS 2. The plan is to roll out the configuration with as little downtime as possible.
We are using Nexus 7Ks as our core. They connect to a pair of ASRs that connect to the P2P link and terminate the DMVPN tunnels. The N7Ks also connect to downstream 4500s as our access layer.
So at DC2 we decided to overlay EIGRP AS 2 on AS 1 (that is, run the two in parallel), in the hope that both AS topologies would be close to the same in DC2. We would then redistribute AS 1 into BGP at DC1, and AS 2 into BGP at DC2. That way, when we light up BGP over the P2P link, the EIGRP AS 1 routes would lose to BGP at DC2. Then we would start removing AS 1 from all the devices at DC2.
With all the information I can find, I don't see a problem with that deployment. But we are running into a problem where the IOS devices (the ASRs and 4500s) are poisoning the routes in AS 2. AS 1 is fine. The NX-OS devices are NOT doing this: on them, both AS 1 and AS 2 look fine for all routes in both topologies. But the AS 2 routes advertised from the N7Ks are being poisoned at the ASRs and 4500s (and, of course, not being advertised any further).
Here is an example of the problem:
Does anyone know why this is happening? Can someone point me to the documentation that describes this behavior?
Thanks for your time and help,
It has been my experience on IOS-based devices that if a particular EIGRP process is unable to install its own learned variant of a route into the routing table, it will not advertise that route further. In somewhat simplified and imprecise terms, a router should not advertise a route it is not using itself. This rule is specifically followed by RIP and EIGRP. OSPF and IS-IS cannot observe this rule because, as link-state protocols, they flood topology information regardless of what ends up in the routing table. BGP does not observe this rule by default (because advertising such routes is usually what we want), but it can be forced to.
So what happened in your case was that EIGRP instance "as2" was unable to install its own routes into the routing table. Therefore, this instance considered itself ineligible to even propagate these routes further because within its own "world" (realm of knowledge), the router is not using that particular route, and so propagating it further could introduce suboptimal routes or routing loops. Just think of this: Routers A---B---C are connected in a row, with RouterA advertising a network N to RouterB. RouterB has a static route for N pointing to RouterC. Obviously, RouterB won't be able to install the EIGRP-learned route to N into its routing table because the static route has a lower AD. However, if RouterB still advertised this network to RouterC, a routing loop would occur: RouterC would point back to RouterB thanks to EIGRP, and RouterB would point back to RouterC because of the static route.
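The A---B---C example can be traced with a small simulation. What follows is only a minimal Python sketch, not a model of any real EIGRP implementation: the function names (`build_next_hops`, `trace`) and the `suppress_uninstalled` flag are hypothetical, and the administrative distances (1 for static, 90 for internal EIGRP) are the usual Cisco defaults.

```python
# Toy model of the RouterA---RouterB---RouterC example.
STATIC_AD = 1   # default administrative distance of a static route
EIGRP_AD = 90   # default administrative distance of an internal EIGRP route


def build_next_hops(suppress_uninstalled):
    """Return each router's next hop toward network N.

    suppress_uninstalled=True models the behavior described above:
    RouterB does not advertise an EIGRP route it could not install.
    """
    next_hop = {"A": "N"}  # RouterA is directly attached to N

    # RouterB has two candidates for N: static via C, EIGRP via A.
    candidates_b = [(STATIC_AD, "static", "C"), (EIGRP_AD, "eigrp", "A")]
    _, best_source, best_hop = min(candidates_b)  # lowest AD wins
    next_hop["B"] = best_hop

    eigrp_installed_on_b = best_source == "eigrp"
    if eigrp_installed_on_b or not suppress_uninstalled:
        next_hop["C"] = "B"  # RouterC learns N from RouterB via EIGRP
    # Otherwise RouterC has no route to N at all.
    return next_hop


def trace(next_hop, start, max_hops=8):
    """Follow next hops from start; report 'delivered', 'no route', or 'loop'."""
    node, seen = start, []
    for _ in range(max_hops):
        if node == "N":
            return "delivered", seen
        if node in seen:
            return "loop", seen + [node]
        seen.append(node)
        if node not in next_hop:
            return "no route", seen
        node = next_hop[node]
    return "loop", seen


print(trace(build_next_hops(suppress_uninstalled=False), "C"))
print(trace(build_next_hops(suppress_uninstalled=True), "C"))
```

Without suppression, a packet from RouterC bounces between C and B. With suppression, RouterC simply has no route to N, which is less harmful than a loop.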
Routes that could not be installed into the routing table because some other routing source was deemed more trustworthy or more preferred are called zero-successor routes, and EIGRP always flags them with an infinite Feasible Distance, which causes them to be route-poisoned. You can print exactly these routes from the topology table using the show eigrp address-family ... topology zero-successors command.
I am not acquainted with the NX-OS implementation, but it is possible that NX-OS does not perform this sanity check, which would explain why your Nexus devices behaved differently.
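To make the zero-successor idea concrete for the two-AS overlay in this thread, here is a minimal Python sketch. It is only an illustration: the `Candidate` type, the `rib_winner` and `zero_successors` helpers, the prefix, and the metric values are all made up, and the tie-break between two EIGRP processes with the same administrative distance (lower metric, then lower AS number) is an assumption, not a statement of how IOS actually chooses.

```python
from dataclasses import dataclass

INFINITY = float("inf")  # stands in for EIGRP's infinite Feasible Distance


@dataclass(frozen=True)
class Candidate:
    prefix: str
    asn: int     # EIGRP autonomous system of the process offering the route
    ad: int      # administrative distance
    metric: int  # composite metric computed by that process


def rib_winner(cands):
    # Assumed preference order: AD, then metric, then AS number.
    return min(cands, key=lambda c: (c.ad, c.metric, c.asn))


def zero_successors(cands):
    """Map each losing (prefix, asn) pair to the FD it would be flagged with."""
    by_prefix = {}
    for c in cands:
        by_prefix.setdefault(c.prefix, []).append(c)

    flagged = {}
    for prefix, group in by_prefix.items():
        winner = rib_winner(group)
        for c in group:
            if c != winner:
                # This process could not install its route, so it marks the
                # route unreachable and does not advertise it further.
                flagged[(prefix, c.asn)] = INFINITY
    return flagged


# Hypothetical prefix learned identically by both processes on one router.
offers = [
    Candidate("10.2.0.0/16", asn=1, ad=90, metric=3072),
    Candidate("10.2.0.0/16", asn=2, ad=90, metric=3072),
]
print(zero_successors(offers))
```

Under these assumptions the AS 2 copy of the prefix is the one flagged, which matches the symptom described in the original post: AS 1 looks fine while AS 2 routes are poisoned.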
Would this fit your observations?
Thank you for your reply. Yes, showing zero-successors is how we learned that AS 2 was killing the routes. I also know the rules on how EIGRP will process the routes into the RIB. Your statements about how EIGRP will kill routes that it cannot install into the RIB are correct, as that is what we are experiencing. The question is where is the documentation on this action/function/feature/rule? If I can find that, maybe we can see a way of disabling it, or some workaround.
After all, this behavior does not happen in NX-OS.
Again thank you, you're getting me closer to the answer,
I am sorry to respond so late.
"The question is where is the documentation on this action/function/feature/rule?"
I was not able to find any definitive documentation on this. I know about this behavior from my own experiments and from discussions with people who had been involved in EIGRP development in the past. I also do not believe that this behavior can be configured.