
EIGRP equal-cost load balancing problem: one link saturated, with drops

Dr.X
Level 2

Hello Team,

We have been running EIGRP equal-cost load balancing instead of an EtherChannel between:

Cisco ASR ---> Cisco 3560.

We are balancing 4G of traffic across six 1G links.

Recently, with no configuration change, we began to see one of the EIGRP links running at 1G while the others run at about 400 Mbps, which causes drops on the network all the time.

In EIGRP, all routes have the same AD/metric, and all six links appear as equal-cost paths in the routing table.

I tried clearing the BGP neighbors and the routing tables and flushing everything; one link always stays at 1G while the others sit at about 500 Mbps.

If I remove one link from the group, another link fills up to 1G and the same issue occurs.

I added a 7th link to the EIGRP load balancing and the issue was solved: all links now get an equal share.

Based on my searching, I did not expect this issue to occur, since EIGRP is supposed to be a much better L3 solution than EtherChannel.

My EIGRP configuration is simple:

router eigrp 4
maximum-paths 10
variance 128

Here is a sample of the routing table:

D EX 10.14.2.152/32
[170/2560000768] via 172.26.40.2, 00:07:35, GigabitEthernet0/1/3
[170/2560000768] via 172.25.40.2, 00:07:35, GigabitEthernet0/1/6
[170/2560000768] via 172.24.40.2, 00:07:35, GigabitEthernet0/1/5
[170/2560000768] via 172.23.40.2, 00:07:35, GigabitEthernet0/1/2
[170/2560000768] via 172.22.40.2, 00:07:35, GigabitEthernet0/1/1
[170/2560000768] via 172.21.40.2, 00:07:35, GigabitEthernet0/0/2
[170/2560000768] via 172.20.40.2, 00:07:35, GigabitEthernet0/0/1

 

Here is my router info:

Gateway-ASR1002#sh version
Cisco IOS XE Software, Version 03.13.00.S - Extended Support Release
Cisco IOS Software, ASR1000 Software (PPC_LINUX_IOSD-ADVENTERPRISEK9-M), Version 15.4(3)S, RELEASE SOFTWARE (fc11)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2014 by Cisco Systems, Inc.
Compiled Mon 28-Jul-14 04:11 by mcpre


Cisco IOS-XE software, Copyright (c) 2005-2014 by cisco Systems, Inc.
All rights reserved. Certain components of Cisco IOS-XE software are
licensed under the GNU General Public License ("GPL") Version 2.0. The
software code licensed under GPL Version 2.0 is free software that comes
with ABSOLUTELY NO WARRANTY. You can redistribute and/or modify such
GPL code under the terms of GPL Version 2.0. For more details, see the
documentation or "License Notice" file accompanying the IOS-XE software,
or the applicable URL provided on the flyer accompanying the IOS-XE
software.


ROM: IOS-XE ROMMON

Gateway-ASR1002 uptime is 24 weeks, 4 days, 17 hours, 39 minutes
Uptime for this control processor is 24 weeks, 4 days, 17 hours, 43 minutes
System returned to ROM by reload at 18:39:44 UTC Sun Aug 6 2017
System image file is "bootflash:asr1000rp1-adventerprisek9.03.13.00.S.154-3.S-ext.bin"
Last reload reason: PowerOn

 

This product contains cryptographic features and is subject to United
States and local country laws governing import, export, transfer and
use. Delivery of Cisco cryptographic products does not imply
third-party authority to import, export, distribute or use encryption.
Importers, exporters, distributors and users are responsible for
compliance with U.S. and local country laws. By using this product you
agree to comply with applicable laws and regulations. If you are unable
to comply with U.S. and local laws, return this product immediately.

A summary of U.S. laws governing Cisco cryptographic products may be found at:
http://www.cisco.com/wwl/export/crypto/tool/stqrg.html

If you require further assistance please contact us by sending email to
export@cisco.com.

cisco ASR1002 (2RU) processor (revision 2RU) with 1650497K/6147K bytes of memory.
Processor board ID FOX1807GBZW
12 Gigabit Ethernet interfaces
32768K bytes of non-volatile configuration memory.
4194304K bytes of physical memory.
7757823K bytes of eUSB flash at bootflash:.

Configuration register is 0x2102

Can anyone help determine whether this could be a hardware issue or a bug?

 

Thanks 

 

52 Replies

Very helpful, thank you. As suspected, EIGRP is in fact load balancing the routes equally. The fields to check in the output below are the traffic share count and the Loading value of each path.

 

 

Gateway-ASR1002#Sh ip route 10.14.212.164 255.255.255.255
Routing entry for 10.14.212.164/32
Known via "eigrp 30", distance 170, metric 2560000768, type external
Redistributing via eigrp 30
Last update from 172.23.40.2 on GigabitEthernet0/1/2, 11:17:32 ago
Routing Descriptor Blocks:
172.26.40.2, from 172.26.40.2, 11:17:32 ago, via GigabitEthernet0/1/3
Route metric is 2560000768, traffic share count is 1
Total delay is 30 microseconds, minimum bandwidth is 1 Kbit
Reliability 1/255, minimum MTU 1 bytes
Loading 106/255, Hops 2
172.25.40.2, from 172.25.40.2, 11:17:32 ago, via GigabitEthernet0/1/6
Route metric is 2560000768, traffic share count is 1
Total delay is 30 microseconds, minimum bandwidth is 1 Kbit
Reliability 1/255, minimum MTU 1 bytes
Loading 138/255, Hops 2
172.24.40.2, from 172.24.40.2, 11:17:32 ago, via GigabitEthernet0/1/5
Route metric is 2560000768, traffic share count is 1
Total delay is 30 microseconds, minimum bandwidth is 1 Kbit
Reliability 1/255, minimum MTU 1 bytes
Loading 94/255, Hops 2
172.23.40.2, from 172.23.40.2, 11:17:32 ago, via GigabitEthernet0/1/2
Route metric is 2560000768, traffic share count is 1
Total delay is 30 microseconds, minimum bandwidth is 1 Kbit
Reliability 1/255, minimum MTU 1 bytes
Loading 104/255, Hops 2
* 172.22.40.2, from 172.22.40.2, 11:17:32 ago, via GigabitEthernet0/1/1
Route metric is 2560000768, traffic share count is 1
Total delay is 30 microseconds, minimum bandwidth is 1 Kbit
Reliability 1/255, minimum MTU 1 bytes
Loading 103/255, Hops 2
172.21.40.2, from 172.21.40.2, 11:17:32 ago, via GigabitEthernet0/0/2
Route metric is 2560000768, traffic share count is 1
Total delay is 30 microseconds, minimum bandwidth is 1 Kbit
Reliability 1/255, minimum MTU 1 bytes
Loading 101/255, Hops 2
172.20.40.2, from 172.20.40.2, 11:17:32 ago, via GigabitEthernet0/0/1
Route metric is 2560000768, traffic share count is 1
Total delay is 30 microseconds, minimum bandwidth is 1 Kbit
Reliability 1/255, minimum MTU 1 bytes
Loading 113/255, Hops 2

 

As you can see, the traffic share count is 1 across the board, which means EIGRP is giving every path an equal share. Also note the load of each interface: the values are close to one another, so the links are roughly on par. EIGRP is doing exactly what it is expected to do. I didn't see it mentioned in any post, but what exactly are you using to determine that one link is carrying 1 Gbps while the others are carrying 400 Mbps?

We also don't have your full config. There could be other configurations affecting the link utilization such as QoS or static routes.

 

Lastly, you say the network drops. Does that mean you lose your EIGRP neighborships? All of them? Are there any logs to indicate what the failure could be?

 

-David

Hello David, there is no problem right now. As I already mentioned, it was solved after I increased the number of EIGRP neighbors from 6 to 7. I will try taking one of the links down, watch whether the issue comes back, and paste the output above again.

Again, I don't believe there's anything "unusual". EIGRP is distributing flows as equally as it can, but it doesn't analyze, or route based on, individual flow bandwidth usage.

@Dr.X, in your last posted results, although the loading is pretty "equal", it's not equal. I.e.:

Loading 106/255
Loading 138/255
Loading 94/255
Loading 104/255
Loading 103/255
Loading 101/255
Loading 113/255

Notice your least is 94 and your highest is 138, although four of the links have almost an equal loading, i.e.:

Loading 106/255
Loading 104/255
Loading 103/255
Loading 101/255

If you want load sharing which accounts for individual flow bandwidth usage, you'll need to use something like PfR.
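To put the Loading figures quoted above into more familiar units, here is a minimal sketch that converts an `x/255` value to an approximate rate. It assumes each member link is 1 Gbps (as stated in the original post) and that the Loading value reflects the interface load fraction, which is how the replies in this thread read it. The function name `load_to_mbps` is made up for illustration.

```python
# Assumption: every member link is 1 Gbps, per the original post.
LINK_MBPS = 1000


def load_to_mbps(load, link_mbps=LINK_MBPS):
    """Convert an IOS 'Loading x/255' value to an approximate rate in Mbps."""
    return load / 255 * link_mbps


# Loading values copied from the seven-path output quoted in this thread.
loads = [106, 138, 94, 104, 103, 101, 113]
rates = [round(load_to_mbps(x)) for x in loads]

print(rates)  # [416, 541, 369, 408, 404, 396, 443]
print(round(load_to_mbps(200)))  # 784, the Gi0/1/2 value from the later five-path output
```

Read this way, the seven-path case spans roughly 370 to 540 Mbps per link, while the later five-path output shows one link near 780 Mbps against roughly 390 to 440 Mbps on the others.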

Hello Cisco Team,

The issue happened again after my router rebooted due to a power issue.



Gateway-ASR1002#sh ip eigrp neighbors
EIGRP-IPv4 Neighbors for AS(4)
EIGRP-IPv4 Neighbors for AS(30)
H Address Interface Hold Uptime SRTT RTO Q Seq
(sec) (ms) Cnt Num
6 172.22.40.2 Gi0/1/1 14 02:40:56 9 100 0 43387
3 172.26.40.2 Gi0/1/3 11 02:40:57 9 100 0 43386
2 172.20.40.2 Gi0/0/1 13 02:40:58 11 100 0 43385
1 172.23.40.2 Gi0/1/2 13 02:40:58 57 342 0 43388
0 172.21.40.2 Gi0/0/2 14 02:40:59 11 100 0 43384
Gateway-ASR1002#sh ip eigrp neighbors
EIGRP-IPv4 Neighbors for AS(4)
EIGRP-IPv4 Neighbors for AS(30)
H Address Interface Hold Uptime SRTT RTO Q Seq
(sec) (ms) Cnt Num
6 172.22.40.2 Gi0/1/1 14 02:40:58 26 156 0 43391
3 172.26.40.2 Gi0/1/3 14 02:41:00 26 156 0 43390
2 172.20.40.2 Gi0/0/1 14 02:41:00 27 162 0 43392
1 172.23.40.2 Gi0/1/2 14 02:41:01 65 390 0 43389
0 172.21.40.2 Gi0/0/2 14 02:41:01 27 162 0 43393

 

 

 

Gateway-ASR1002#sh ip cef 10.240.13.0/24 internal
10.240.13.0/24, epoch 2, RIB[I], refcnt 6, per-longest-match-prefix sharing
sources: RIB
feature space:
IPRM: 0x00028000
Broker: linked, distributed at 4th priority
ifnums:
GigabitEthernet0/0/1(9): 172.20.40.2
GigabitEthernet0/0/2(10): 172.21.40.2
GigabitEthernet0/1/1(13): 172.22.40.2
GigabitEthernet0/1/2(14): 172.23.40.2
GigabitEthernet0/1/3(15): 172.26.40.2
path list 44205F28, 3463 locks, per-longest-match-prefix, flags 0x4D [shble, hvsh, rif, hwcn]
path 3015B1FC, share 1/1, type attached nexthop, for IPv4
nexthop 172.20.40.2 GigabitEthernet0/0/1, IP adj out of GigabitEthernet0/0/1, addr 172.20.40.2 39663180
path 30159FB4, share 1/1, type attached nexthop, for IPv4
nexthop 172.21.40.2 GigabitEthernet0/0/2, IP adj out of GigabitEthernet0/0/2, addr 172.21.40.2 448DC220
path 3015B53C, share 1/1, type attached nexthop, for IPv4
nexthop 172.22.40.2 GigabitEthernet0/1/1, IP adj out of GigabitEthernet0/1/1, addr 172.22.40.2 448DBEE0
path 3015AFF4, share 1/1, type attached nexthop, for IPv4
nexthop 172.23.40.2 GigabitEthernet0/1/2, IP adj out of GigabitEthernet0/1/2, addr 172.23.40.2 448DBBA0
path 3015B46C, share 1/1, type attached nexthop, for IPv4
nexthop 172.26.40.2 GigabitEthernet0/1/3, IP adj out of GigabitEthernet0/1/3, addr 172.26.40.2 3BADD720
output chain:
IP adj out of GigabitEthernet0/1/2, addr 172.23.40.2 448DBBA0
Gateway-ASR1002#sh ip route 10.240.13.0 255.255.255.0
Routing entry for 10.240.13.0/24
Known via "eigrp 30", distance 170, metric 2560000768, type external
Redistributing via eigrp 30
Last update from 172.21.40.2 on GigabitEthernet0/0/2, 00:07:25 ago
Routing Descriptor Blocks:
172.26.40.2, from 172.26.40.2, 00:07:25 ago, via GigabitEthernet0/1/3
Route metric is 2560000768, traffic share count is 1
Total delay is 30 microseconds, minimum bandwidth is 1 Kbit
Reliability 1/255, minimum MTU 1 bytes
Loading 110/255, Hops 1
172.23.40.2, from 172.23.40.2, 00:07:25 ago, via GigabitEthernet0/1/2
Route metric is 2560000768, traffic share count is 1
Total delay is 30 microseconds, minimum bandwidth is 1 Kbit
Reliability 1/255, minimum MTU 1 bytes
Loading 200/255, Hops 1
172.22.40.2, from 172.22.40.2, 00:07:25 ago, via GigabitEthernet0/1/1
Route metric is 2560000768, traffic share count is 1
Total delay is 30 microseconds, minimum bandwidth is 1 Kbit
Reliability 1/255, minimum MTU 1 bytes
Loading 106/255, Hops 1
172.21.40.2, from 172.21.40.2, 00:07:25 ago, via GigabitEthernet0/0/2
Route metric is 2560000768, traffic share count is 1
Total delay is 30 microseconds, minimum bandwidth is 1 Kbit
Reliability 1/255, minimum MTU 1 bytes
Loading 112/255, Hops 1
* 172.20.40.2, from 172.20.40.2, 00:07:25 ago, via GigabitEthernet0/0/1
Route metric is 2560000768, traffic share count is 1
Total delay is 30 microseconds, minimum bandwidth is 1 Kbit
Reliability 1/255, minimum MTU 1 bytes
Loading 100/255, Hops 1

 

 




I collected the output above. Note this neighbor:
1 172.23.40.2 Gi0/1/2 14 02:41:01 65 390 0 43389

This port always shows a higher load percentage than the other ports, so the distribution is unequal.

 

 

The issue is with CEF only; check my previous reply.

 

Please tell me which command output you need.

@Harold Ritter
Mr. Harold, @Dr.X has an issue with EIGRP ECMP.
After checking some Cisco documentation,
I found that the issue is with CEF (if you can, please check my previous reply and the lab I did).
So can he use
ip load-sharing per-packet to solve his problem?

Thanks a lot for your support
MHM

"ip load-sharing per-packet to solve his problem."

Even if it does solve the OP's problem, per-packet load sharing often causes its own set of problems due to out-of-order packet delivery (within an individual flow).
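The reordering concern can be illustrated with a toy model. This is only a sketch under made-up assumptions: one flow sends a packet every 1 ms, and the two links have hypothetical one-way delays of 5 ms and 1 ms. The function `arrival_order` is invented for this example and does not model any real forwarding engine.

```python
def arrival_order(num_packets, link_delays_ms, per_packet):
    """Return the sequence numbers of one flow in the order they arrive.

    Packet i is sent at time i ms. With per_packet=True the packets are
    sprayed round-robin over the links; otherwise the whole flow stays on
    link 0, as per-destination sharing would keep it on one path.
    """
    arrivals = []
    for seq in range(num_packets):
        link = seq % len(link_delays_ms) if per_packet else 0
        arrivals.append((seq + link_delays_ms[link], seq))
    return [seq for _, seq in sorted(arrivals)]


delays = [5, 1]  # hypothetical link delays in ms

print(arrival_order(6, delays, per_packet=False))  # [0, 1, 2, 3, 4, 5]
print(arrival_order(6, delays, per_packet=True))   # [1, 3, 0, 5, 2, 4]
```

Per-packet spraying balances the links well, but the receiver sees the flow out of order whenever path delays differ, which is the trade-off being cautioned about here.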

Hello,

"ip load-sharing per-packet to solve his problem."

It is not supported.

Gateway-ASR1002(config)#int gigabitEthernet 0/1/6

Gateway-ASR1002(config-if)#ip load-sharing ?
per-destination Deterministic distribution
Gateway-ASR1002(config-if)#ip load-sharing per-destination ?

<cr>

Gateway-ASR1002(config-if)#ip load-sharing per-destination

Then let me check the equivalent command for IOS XE.

Hello,

I downgraded the IOS to:
System image file is "bootflash:asr1000rp1-adventerprisek9.03.11.00.S.154-1.S-std.bin"


I added two more links and I can see the load is now spread equally.

Gateway-ASR1002#sh ip cef 1.11.11.1/32 internal
1.11.11.1/32, epoch 2, RIB[I], refcount 6, per-destination sharing
sources: RIB
feature space:
IPRM: 0x00028000
Broker: linked, distributed at 4th priority
ifnums:
GigabitEthernet0/0/1(9): 172.20.40.2
GigabitEthernet0/0/2(10): 172.21.40.2
GigabitEthernet0/1/1(13): 172.22.40.2
GigabitEthernet0/1/2(14): 172.23.40.2
GigabitEthernet0/1/3(15): 172.26.40.2
GigabitEthernet0/1/5(17): 172.24.40.2
GigabitEthernet0/1/6(18): 172.25.40.2
path 41C420B8, path list 41F17C38, share 1/1, type attached nexthop, for IPv4
nexthop 172.20.40.2 GigabitEthernet0/0/1, adjacency IP adj out of GigabitEthernet0/0/1, addr 172.20.40.2 42A11AE0
path 41C431C8, path list 41F17C38, share 1/1, type attached nexthop, for IPv4
nexthop 172.21.40.2 GigabitEthernet0/0/2, adjacency IP adj out of GigabitEthernet0/0/2, addr 172.21.40.2 42A11940
path 41C41E88, path list 41F17C38, share 1/1, type attached nexthop, for IPv4
nexthop 172.22.40.2 GigabitEthernet0/1/1, adjacency IP adj out of GigabitEthernet0/1/1, addr 172.22.40.2 42A11600
path 41C422E8, path list 41F17C38, share 1/1, type attached nexthop, for IPv4
nexthop 172.23.40.2 GigabitEthernet0/1/2, adjacency IP adj out of GigabitEthernet0/1/2, addr 172.23.40.2 42400AA0
path 41C43158, path list 41F17C38, share 1/1, type attached nexthop, for IPv4
nexthop 172.24.40.2 GigabitEthernet0/1/5, adjacency IP adj out of GigabitEthernet0/1/5, addr 172.24.40.2 424005C0
path 41C42048, path list 41F17C38, share 1/1, type attached nexthop, for IPv4
nexthop 172.25.40.2 GigabitEthernet0/1/6, adjacency IP adj out of GigabitEthernet0/1/6, addr 172.25.40.2 42400420
path 41C42208, path list 41F17C38, share 1/1, type attached nexthop, for IPv4
nexthop 172.26.40.2 GigabitEthernet0/1/3, adjacency IP adj out of GigabitEthernet0/1/3, addr 172.26.40.2 42400C40
output chain:
loadinfo 393E9DCC, per-session, 7 choices, flags 0003, 2748 locks
flags: Per-session, for-rx-IPv4
14 hash buckets
< 0 > IP adj out of GigabitEthernet0/0/1, addr 172.20.40.2 42A11AE0
< 1 > IP adj out of GigabitEthernet0/0/2, addr 172.21.40.2 42A11940
< 2 > IP adj out of GigabitEthernet0/1/1, addr 172.22.40.2 42A11600
< 3 > IP adj out of GigabitEthernet0/1/2, addr 172.23.40.2 42400AA0
< 4 > IP adj out of GigabitEthernet0/1/5, addr 172.24.40.2 424005C0
< 5 > IP adj out of GigabitEthernet0/1/6, addr 172.25.40.2 42400420
< 6 > IP adj out of GigabitEthernet0/1/3, addr 172.26.40.2 42400C40
< 7 > IP adj out of GigabitEthernet0/0/1, addr 172.20.40.2 42A11AE0
< 8 > IP adj out of GigabitEthernet0/0/2, addr 172.21.40.2 42A11940
< 9 > IP adj out of GigabitEthernet0/1/1, addr 172.22.40.2 42A11600
<10 > IP adj out of GigabitEthernet0/1/2, addr 172.23.40.2 42400AA0
<11 > IP adj out of GigabitEthernet0/1/5, addr 172.24.40.2 424005C0
<12 > IP adj out of GigabitEthernet0/1/6, addr 172.25.40.2 42400420
<13 > IP adj out of GigabitEthernet0/1/3, addr 172.26.40.2 42400C40
Subblocks:
None
Gateway-ASR1002#



I think there is something wrong with this IOS image:
asr1000rp1-adventerprisek9.03.13.00.S.154-3.S-ext.bin


Still not sure what's going on.

Hello,

"ip load-sharing per-packet to solve his problem."

is not supported.

I didn't endorse it as a solution, just cautioned against using it, but I'm a bit surprised it's not a supported option. Yet in @MHM Cisco World's referenced document there are notes about it not being supported on some platforms. The referenced document is "old", i.e. it predates ASRs, but as the ASR's QFPs are, I believe, the evolution of the PXFs, they may well have a similar limitation.

BTW, I have actually used CEF per-packet load sharing, and it often does a great job of balancing load.

Hi @MHM Cisco World ,

As others have mentioned, using per-packet load sharing is not recommended, as it can lead to out-of-order packet delivery. As far as the original issue is concerned, the CEF load distribution will never be perfect across all available next hops, but the more flows you have, the more evenly the traffic will be distributed.

Regards,
Harold Ritter, CCIE #4168 (EI, SP)
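The point about flow count can be sketched with a small simulation. It assumes equal-sized flows and uses a seeded pseudo-random choice as a stand-in for the per-flow hash; real CEF hashes packet header fields, so this only shows the statistical effect, not the platform's algorithm. The function name `max_over_mean` is hypothetical.

```python
import random
from collections import Counter


def max_over_mean(num_flows, num_links, seed=0):
    """Ratio of the busiest link's flow count to the average per-link count.

    1.0 would be a perfectly even spread. A uniform random pick per flow
    stands in for the hash (assumption, not the real CEF hash).
    """
    rng = random.Random(seed)
    per_link = Counter(rng.randrange(num_links) for _ in range(num_flows))
    return max(per_link.values()) / (num_flows / num_links)


# Seven links, as in the thread; the flow counts are arbitrary examples.
for flows in (10, 100, 10_000, 100_000):
    print(flows, round(max_over_mean(flows, 7), 2))
```

With only a handful of flows the busiest link carries well above the average, and the ratio approaches 1.0 as the number of flows grows. A few very large flows would skew the result again, since the hash balances flows rather than bytes.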

From what I have found, the problem is with CEF: when a link is removed, its load is moved onto only one other link instead of being distributed among the remaining links.


From my previous comment:

 

< 0 > IP adj out of GigabitEthernet0/0/1, addr 172.20.40.2 38C581C0
< 1 > IP adj out of GigabitEthernet0/0/2, addr 172.21.40.2 38C58EC0
< 2 > IP adj out of GigabitEthernet0/1/1, addr 172.22.40.2 38C59BC0
< 3 > IP adj out of GigabitEthernet0/1/2, addr 172.23.40.2 38C58D20
< 4 > IP adj out of GigabitEthernet0/0/1, addr 172.24.40.2 38C589E0
< 5 > IP adj out of GigabitEthernet0/0/2, addr 172.25.40.2 38C58500
< 6 > IP adj out of GigabitEthernet0/1/1, addr 172.26.40.2 38C596E0
< 7 > IP adj out of GigabitEthernet0/1/2, addr 172.20.40.2 38C581C0
< 8 > IP adj out of GigabitEthernet0/0/1, addr 172.21.40.2 38C58EC0
< 9 > IP adj out of GigabitEthernet0/0/2, addr 172.22.40.2 38C59BC0
<10 > IP adj out of GigabitEthernet0/1/1, addr 172.23.40.2 38C58D20
<11 > IP adj out of GigabitEthernet0/1/2, addr 172.24.40.2 38C589E0
<12 > IP adj out of GigabitEthernet0/0/1, addr 172.25.40.2 38C58500 <<-
<13 > IP adj out of GigabitEthernet0/0/2, addr 172.26.40.2 38C596E0 <<- each port is used three times, except Gi0/0/1 and Gi0/0/2, which are used four times.
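The bucket counts in the listing above follow from simple arithmetic. Here is a minimal sketch, assuming a table of 14 hash buckets filled round-robin across the paths. The 14 comes from the seven-path output earlier in the thread; whether the same table size is used with other path counts on this platform and release is an assumption. The function name `buckets_per_path` is made up.

```python
from collections import Counter


def buckets_per_path(num_paths, num_buckets=14):
    """Assign buckets to paths round-robin and return the count per path."""
    counts = Counter(bucket % num_paths for bucket in range(num_buckets))
    return [counts[path] for path in range(num_paths)]


for paths in (4, 5, 6, 7):
    print(paths, buckets_per_path(paths))
# 4 [4, 4, 3, 3]
# 5 [3, 3, 3, 3, 2]
# 6 [3, 3, 2, 2, 2, 2]
# 7 [2, 2, 2, 2, 2, 2, 2]
```

Under that assumption, only a path count that divides 14 evenly gives every path the same number of buckets; with four interfaces, two of them get four buckets and two get three, matching the listing.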

Hi @MHM Cisco World ,

This does not seem like normal behavior to me, but rather like a bug.

Regards,

Harold Ritter, CCIE #4168 (EI, SP)