
EIGRP equal-cost load balancing problem: one link saturated and dropping packets

Dr.X
Level 2

Hello Team,

We have been running EIGRP equal-cost load balancing, instead of EtherChannel, between:

Cisco ASR ---> Cisco 3560.

We are balancing 4G traffic across six 1G links.

Recently, with no configuration change, we began to see one of the EIGRP links running at 1G while the others run at 400 Mbps, which causes constant packet drops on the network.

In EIGRP, all routes have the same AD/metric, and the routing table shows the six links with equal load.

I tried clearing the BGP neighbors and the routing tables and flushing everything; one link always stays at 1G while the others are at 500 Mbps.

If I drop one link from the group, another link fills up to 1G and the issue remains.

When I added a 7th link to the EIGRP load balancing, the issue was solved and all links now get an equal share.

Based on what I have read, I did not expect this issue, as EIGRP is supposed to be a much better L3 solution than EtherChannel.

My EIGRP configuration is simple:

router eigrp 4
maximum-paths 10
variance 128

Here is a sample of the routing table:

D EX 10.14.2.152/32
[170/2560000768] via 172.26.40.2, 00:07:35, GigabitEthernet0/1/3
[170/2560000768] via 172.25.40.2, 00:07:35, GigabitEthernet0/1/6
[170/2560000768] via 172.24.40.2, 00:07:35, GigabitEthernet0/1/5
[170/2560000768] via 172.23.40.2, 00:07:35, GigabitEthernet0/1/2
[170/2560000768] via 172.22.40.2, 00:07:35, GigabitEthernet0/1/1
[170/2560000768] via 172.21.40.2, 00:07:35, GigabitEthernet0/0/2
[170/2560000768] via 172.20.40.2, 00:07:35, GigabitEthernet0/0/1

Here is my router info:

Gateway-ASR1002#sh version
Cisco IOS XE Software, Version 03.13.00.S - Extended Support Release
Cisco IOS Software, ASR1000 Software (PPC_LINUX_IOSD-ADVENTERPRISEK9-M), Version 15.4(3)S, RELEASE SOFTWARE (fc11)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2014 by Cisco Systems, Inc.
Compiled Mon 28-Jul-14 04:11 by mcpre


Cisco IOS-XE software, Copyright (c) 2005-2014 by cisco Systems, Inc.
All rights reserved. Certain components of Cisco IOS-XE software are
licensed under the GNU General Public License ("GPL") Version 2.0. The
software code licensed under GPL Version 2.0 is free software that comes
with ABSOLUTELY NO WARRANTY. You can redistribute and/or modify such
GPL code under the terms of GPL Version 2.0. For more details, see the
documentation or "License Notice" file accompanying the IOS-XE software,
or the applicable URL provided on the flyer accompanying the IOS-XE
software.


ROM: IOS-XE ROMMON

Gateway-ASR1002 uptime is 24 weeks, 4 days, 17 hours, 39 minutes
Uptime for this control processor is 24 weeks, 4 days, 17 hours, 43 minutes
System returned to ROM by reload at 18:39:44 UTC Sun Aug 6 2017
System image file is "bootflash:asr1000rp1-adventerprisek9.03.13.00.S.154-3.S-ext.bin"
Last reload reason: PowerOn

 

This product contains cryptographic features and is subject to United
States and local country laws governing import, export, transfer and
use. Delivery of Cisco cryptographic products does not imply
third-party authority to import, export, distribute or use encryption.
Importers, exporters, distributors and users are responsible for
compliance with U.S. and local country laws. By using this product you
agree to comply with applicable laws and regulations. If you are unable
to comply with U.S. and local laws, return this product immediately.

A summary of U.S. laws governing Cisco cryptographic products may be found at:
http://www.cisco.com/wwl/export/crypto/tool/stqrg.html

If you require further assistance please contact us by sending email to
export@cisco.com.

cisco ASR1002 (2RU) processor (revision 2RU) with 1650497K/6147K bytes of memory.
Processor board ID FOX1807GBZW
12 Gigabit Ethernet interfaces
32768K bytes of non-volatile configuration memory.
4194304K bytes of physical memory.
7757823K bytes of eUSB flash at bootflash:.

Configuration register is 0x2102

Can anyone help determine whether this could be a hardware issue or a bug?

Thanks

52 Replies

Yes, please check whether it is a bug.
Thanks for the support.

Regarding my previous solution, I relied on this Cisco document:
Per-packet load balancing allows the router to send successive data packets over paths without regard to individual hosts or user sessions. It uses the round-robin method to determine which path each packet takes to the destination. Per-packet load balancing ensures balancing over multiple links.
 Troubleshooting Load Balancing Over Parallel Links Using Cisco Express Forwarding - Cisco

"Per-packet load balancing ensures balancing over multiple links."

That's 100% correct.

What I tried to convey, which @Harold Ritter also noted, is that using this technique often causes out-of-order packet delivery, which, as I also noted, "causes its own set of problems". That is likely also why Harold wrote "per packet load sharing is not recommended".

Further, the referenced document lists this disadvantage of per-packet load balancing:

Packets for a given source-destination host pair might take different paths, which could introduce reordering of packets. This is not recommended for Voice over IP (VoIP) and other flows that require in-sequence delivery.

Although TCP does ensure in-order packet delivery, TCP's fast retransmit (using common dup ACK counts) is often unnecessarily triggered by out-of-order packet delivery.

Possibly, "the cure is worse than the disease".
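
To illustrate the reordering point, here is a rough Python sketch (not anything the router runs). It sends packets round-robin over six links, one of which is slower, and counts how often a simplified receiver model would see three duplicate ACKs. The link delays, the 1 ms send interval, and the function names are all assumptions made up for illustration; real TCP stacks (SACK, reordering tolerance) behave with more nuance.

```python
from itertools import cycle


def deliver_round_robin(num_packets, link_delays_ms, send_interval_ms=1.0):
    """Send packets 0..n-1 round-robin over the links; return arrival order."""
    arrivals = []
    links = cycle(range(len(link_delays_ms)))
    for seq in range(num_packets):
        link = next(links)
        arrival_time = seq * send_interval_ms + link_delays_ms[link]
        arrivals.append((arrival_time, seq))
    arrivals.sort()
    return [seq for _, seq in arrivals]


def count_fast_retransmits(arrival_order, dupack_threshold=3):
    """Count how often out-of-order arrivals reach the dup ACK threshold.

    Simplified model: every packet that arrives while an earlier one is
    still missing generates one duplicate ACK.
    """
    received = set()
    next_expected = 0
    dup_acks = 0
    fast_retransmits = 0
    for seq in arrival_order:
        received.add(seq)
        if seq == next_expected:
            while next_expected in received:
                next_expected += 1
            dup_acks = 0
        else:
            dup_acks += 1
            if dup_acks == dupack_threshold:
                fast_retransmits += 1
    return fast_retransmits


# Hypothetical delays: five links at 1 ms, one link at 6.5 ms.
equal_links = [1.0] * 6
one_slow_link = [1.0, 1.0, 1.0, 1.0, 1.0, 6.5]

in_order = deliver_round_robin(12, equal_links)
reordered = deliver_round_robin(12, one_slow_link)
print(in_order, count_fast_retransmits(in_order))
print(reordered, count_fast_retransmits(reordered))
```

In this toy run no packet is lost, yet the packet sent on the slow link arrives after several later ones, which is enough to reach the dup ACK threshold and trigger a retransmit that was not needed.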

Joseph W. Doherty
Hall of Fame

BTW, one of our earlier prolific posters, a former Cisco person, usually recommended strongly against using EIGRP's unequal-cost load balancing feature.

Looking at your OP, I see you're using a variance value of 128 rather than the default of 1 (which supports only ECMP).

I haven't reread all the replies, but has a variance of 1 been tried? It looks to me as though you only need ECMP.

BTW, I'm not saying that using a variance of 128 is wrong, or even "bad", but if it's not actually needed, disabling it might make a difference, i.e. it is perhaps worth trying.
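
As a rough illustration of what the variance multiplier does, here is a small Python sketch of the path selection rule as I understand it: the best-metric paths are always used, and any other path is added only if it passes the feasibility condition and its metric is less than variance times the best metric. The next hops and the metric 2560000768 come from the routing table in the OP; the reported distances and the extra 203.0.113.2 path are hypothetical values invented for the example, and `eligible_paths` is just an illustrative name.

```python
def eligible_paths(paths, variance=1):
    """Return next hops EIGRP would install for one prefix.

    paths: list of (next_hop, metric, reported_distance) tuples.
    """
    best_metric = min(metric for _, metric, _ in paths)
    selected = []
    for next_hop, metric, reported_distance in paths:
        is_successor = metric == best_metric
        # Feasibility condition: the neighbor's own distance must be
        # lower than our best metric (the feasible distance).
        is_feasible = reported_distance < best_metric
        if is_successor or (is_feasible and metric < variance * best_metric):
            selected.append(next_hop)
    return selected


BEST = 2560000768          # metric shown in the OP's routing table
HYPOTHETICAL_RD = 2560000512  # reported distance: assumed, not from the thread

equal_paths = [
    (f"172.{n}.40.2", BEST, HYPOTHETICAL_RD) for n in range(20, 27)
]
# A made-up worse path, three times the best metric.
mixed_paths = equal_paths + [("203.0.113.2", 3 * BEST, HYPOTHETICAL_RD)]

print(eligible_paths(equal_paths, variance=1))
print(eligible_paths(equal_paths, variance=128))
print(eligible_paths(mixed_paths, variance=1))
print(eligible_paths(mixed_paths, variance=128))
```

With seven equal-metric paths, variance 1 and variance 128 select the same set, which is why the larger value does not appear to be needed here. It only matters once a worse but feasible path exists.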

Hello Team,

A few minutes later, it seems the router changed this automatically, as shown below:

Gateway-ASR1002#sh ip cef 10.12.5.0/24 internal
10.12.5.0/24, epoch 2, RIB[I], refcount 6, per-longest-match-prefix sharing
sources: RIB
feature space:
IPRM: 0x00028000
Broker: linked, distributed at 4th priority
ifnums:
GigabitEthernet0/0/1(9): 172.20.40.2
GigabitEthernet0/0/2(10): 172.21.40.2
GigabitEthernet0/1/1(13): 172.22.40.2
GigabitEthernet0/1/2(14): 172.23.40.2
GigabitEthernet0/1/3(15): 172.26.40.2
GigabitEthernet0/1/5(17): 172.24.40.2
GigabitEthernet0/1/6(18): 172.25.40.2
path 41C41D38, path list 41F18358, share 1/1, type attached nexthop, for IPv4
nexthop 172.20.40.2 GigabitEthernet0/0/1, adjacency IP adj out of GigabitEthernet0/0/1, addr 172.20.40.2 42400900
path 41C40448, path list 41F18358, share 1/1, type attached nexthop, for IPv4
nexthop 172.21.40.2 GigabitEthernet0/0/2, adjacency IP adj out of GigabitEthernet0/0/2, addr 172.21.40.2 42400AA0
path 41C425F8, path list 41F18358, share 1/1, type attached nexthop, for IPv4
nexthop 172.22.40.2 GigabitEthernet0/1/1, adjacency IP adj out of GigabitEthernet0/1/1, addr 172.22.40.2 423FF0A0
path 41C43548, path list 41F18358, share 1/1, type attached nexthop, for IPv4
nexthop 172.23.40.2 GigabitEthernet0/1/2, adjacency IP adj out of GigabitEthernet0/1/2, addr 172.23.40.2 423FF3E0
path 41C43C48, path list 41F18358, share 1/1, type attached nexthop, for IPv4
nexthop 172.24.40.2 GigabitEthernet0/1/5, adjacency IP adj out of GigabitEthernet0/1/5, addr 172.24.40.2 42400760
path 41C407C8, path list 41F18358, share 1/1, type attached nexthop, for IPv4
nexthop 172.25.40.2 GigabitEthernet0/1/6, adjacency IP adj out of GigabitEthernet0/1/6, addr 172.25.40.2 423FFF40
path 41C42E48, path list 41F18358, share 1/1, type attached nexthop, for IPv4
nexthop 172.26.40.2 GigabitEthernet0/1/3, adjacency IP adj out of GigabitEthernet0/1/3, addr 172.26.40.2 423FF580
output chain: IP adj out of GigabitEthernet0/0/1, addr 172.20.40.2 42400900



And now it is no longer sharing equally again. Is it a bug, or could it be a hardware issue?

Hello Team,

I added link #8.

Unfortunately, the same issue persists: there is always one link that is 100% utilized.

Gateway-ASR1002#sh ip cef 176.58.74.58/32 internal
12.202.74.58/32, epoch 2, RIB[I], refcount 6, per-destination sharing
sources: RIB
feature space:
IPRM: 0x00028000
Broker: linked, distributed at 4th priority
ifnums:
GigabitEthernet0/0/1(9): 172.20.40.2
GigabitEthernet0/0/2(10): 172.21.40.2
GigabitEthernet0/1/1(13): 172.22.40.2
GigabitEthernet0/1/2(14): 172.23.40.2
GigabitEthernet0/1/3(15): 172.26.40.2
GigabitEthernet0/1/5(17): 172.24.40.2
GigabitEthernet0/1/6(18): 172.25.40.2
GigabitEthernet0/1/7(19): 172.27.40.2
path 41C42C18, path list 41F16F78, share 1/1, type attached nexthop, for IPv4
nexthop 172.20.40.2 GigabitEthernet0/0/1, adjacency IP adj out of GigabitEthernet0/0/1, addr 172.20.40.2 42400900
path 41C43628, path list 41F16F78, share 1/1, type attached nexthop, for IPv4
nexthop 172.21.40.2 GigabitEthernet0/0/2, adjacency IP adj out of GigabitEthernet0/0/2, addr 172.21.40.2 42400AA0
path 41C41478, path list 41F16F78, share 1/1, type attached nexthop, for IPv4
nexthop 172.22.40.2 GigabitEthernet0/1/1, adjacency IP adj out of GigabitEthernet0/1/1, addr 172.22.40.2 423FF0A0
path 41C426D8, path list 41F16F78, share 1/1, type attached nexthop, for IPv4
nexthop 172.23.40.2 GigabitEthernet0/1/2, adjacency IP adj out of GigabitEthernet0/1/2, addr 172.23.40.2 423FF3E0
path 41C41C58, path list 41F16F78, share 1/1, type attached nexthop, for IPv4
nexthop 172.24.40.2 GigabitEthernet0/1/5, adjacency IP adj out of GigabitEthernet0/1/5, addr 172.24.40.2 42400760
path 41C40758, path list 41F16F78, share 1/1, type attached nexthop, for IPv4
nexthop 172.25.40.2 GigabitEthernet0/1/6, adjacency IP adj out of GigabitEthernet0/1/6, addr 172.25.40.2 423FFF40
path 41C40988, path list 41F16F78, share 1/1, type attached nexthop, for IPv4
nexthop 172.26.40.2 GigabitEthernet0/1/3, adjacency IP adj out of GigabitEthernet0/1/3, addr 172.26.40.2 423FF580
path 41C409F8, path list 41F16F78, share 1/1, type attached nexthop, for IPv4
nexthop 172.27.40.2 GigabitEthernet0/1/7, adjacency IP adj out of GigabitEthernet0/1/7, addr 172.27.40.2 4201F4C0
output chain:
loadinfo 41A8AB04, per-session, 8 choices, flags 0003, 2884 locks
flags: Per-session, for-rx-IPv4
16 hash buckets
< 0 > IP adj out of GigabitEthernet0/0/1, addr 172.20.40.2 42400900
< 1 > IP adj out of GigabitEthernet0/0/2, addr 172.21.40.2 42400AA0
< 2 > IP adj out of GigabitEthernet0/1/1, addr 172.22.40.2 423FF0A0
< 3 > IP adj out of GigabitEthernet0/1/2, addr 172.23.40.2 423FF3E0
< 4 > IP adj out of GigabitEthernet0/1/5, addr 172.24.40.2 42400760
< 5 > IP adj out of GigabitEthernet0/1/6, addr 172.25.40.2 423FFF40
< 6 > IP adj out of GigabitEthernet0/1/3, addr 172.26.40.2 423FF580
< 7 > IP adj out of GigabitEthernet0/1/7, addr 172.27.40.2 4201F4C0
< 8 > IP adj out of GigabitEthernet0/0/1, addr 172.20.40.2 42400900
< 9 > IP adj out of GigabitEthernet0/0/2, addr 172.21.40.2 42400AA0
<10 > IP adj out of GigabitEthernet0/1/1, addr 172.22.40.2 423FF0A0
<11 > IP adj out of GigabitEthernet0/1/2, addr 172.23.40.2 423FF3E0
<12 > IP adj out of GigabitEthernet0/1/5, addr 172.24.40.2 42400760
<13 > IP adj out of GigabitEthernet0/1/6, addr 172.25.40.2 423FFF40
<14 > IP adj out of GigabitEthernet0/1/3, addr 172.26.40.2 423FF580
<15 > IP adj out of GigabitEthernet0/1/7, addr 172.27.40.2 4201F4C0
Subblocks:
None
Gateway-ASR1002#sh ip ei
Gateway-ASR1002#sh ip eigrp ne
Gateway-ASR1002#sh ip eigrp neighbors
EIGRP-IPv4 Neighbors for AS(4)
EIGRP-IPv4 Neighbors for AS(30)
H   Address        Interface   Hold  Uptime    SRTT   RTO   Q    Seq
                               (sec)           (ms)         Cnt  Num
7   172.27.40.2    Gi0/1/7     14    00:32:05  8      100   0    160175
6   172.23.40.2    Gi0/1/2     14    00:41:07  7      100   0    160180
5   172.26.40.2    Gi0/1/3     12    00:41:07  7      100   0    160179
4   172.24.40.2    Gi0/1/5     13    00:41:08  52     312   0    160182
3   172.25.40.2    Gi0/1/6     13    00:41:08  7      100   0    160177
2   172.22.40.2    Gi0/1/1     14    00:41:09  7      100   0    160178
1   172.20.40.2    Gi0/0/1     12    00:41:09  6      100   0    160181
0   172.21.40.2    Gi0/0/2     12    00:41:09  7      100   0    160176
Gateway-ASR1002#

There is something illogical here.

Sometimes I see per-longest-match-prefix sharing and sometimes I see per-destination sharing.

Does the ASR change these options automatically?

I want to confirm that the "issue/problem" is that occasionally one link goes to 100%, possibly dropping packets, while all the other links show a much lower utilization, all at about the same level. E.g. one link at 100%, possibly with drops incrementing, while all the other links are at about 40%.

Is the above, more or less, an accurate description?

Correct.

EIGRP is not doing equal load balancing, or what it is supposed to do.

Thanks

In that case, as I may have replied earlier, it can be normal behavior (and probably is)!

That is also why this multi-link routing property is called load sharing, not load balancing.

If you wish, I can explain why this can happen and/or alternative approaches to avoid this being a problem or to come closer to load-balanced results.
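
In the meantime, a rough Python sketch of the idea may help. It mirrors the "16 hash buckets" table from the `sh ip cef ... internal` output above: each source/destination pair is hashed to one bucket, and every packet of that flow follows the bucket's link. The CRC32 hash is only a stand-in (the router's real hash is not shown in this thread), and the flow list, including the single 600 Mbps flow, is invented for illustration.

```python
import zlib

# The eight adjacencies, in the order shown in the CEF bucket table.
PATHS = [
    "Gi0/0/1", "Gi0/0/2", "Gi0/1/1", "Gi0/1/2",
    "Gi0/1/5", "Gi0/1/6", "Gi0/1/3", "Gi0/1/7",
]


def build_buckets(paths, num_buckets=16):
    """Fill the bucket table by cycling through the paths."""
    return [paths[i % len(paths)] for i in range(num_buckets)]


def pick_path(src, dst, buckets):
    """Map a source/destination pair to one bucket (stand-in hash)."""
    key = f"{src}->{dst}".encode()
    return buckets[zlib.crc32(key) % len(buckets)]


def link_loads(flows, buckets):
    """Sum the offered Mbps per link; a flow never splits across links."""
    loads = {path: 0.0 for path in buckets}
    for src, dst, mbps in flows:
        loads[pick_path(src, dst, buckets)] += mbps
    return loads


BUCKETS = build_buckets(PATHS)

# Hypothetical traffic: 200 small flows plus one heavy flow.
flows = [(f"192.0.2.{i}", "10.14.2.152", 10.0) for i in range(1, 201)]
flows.append(("198.51.100.7", "10.14.2.152", 600.0))

loads = link_loads(flows, BUCKETS)
for path, mbps in sorted(loads.items(), key=lambda item: -item[1]):
    print(f"{path:8s} {mbps:7.1f} Mbps")
```

Whatever link the heavy flow hashes to carries at least its 600 Mbps on top of its share of the small flows, so one link runs hot while the rest look similar to each other. Adding or removing a link changes the bucket table and moves flows around, which could be why the hot link appears to move or the problem appears to go away for a while.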