xthuijs
Cisco Employee

The question

loadbalancing is one of the more complex topics in hardware forwarding. we have talked about it for many years at cisco live (id 2904) in ever-increasing detail, and there is also the support forum article on loadbalancing.

while how it works is then (hopefully) understood, that doesn't answer the question of which flow will take which path or member.

 

there are IOS XR tools like

show cef exact-route for ECMP 

and

bundle-hash BE ...

 

to help derive what path or member is taken for a given flow, but there are 2 possible issues with that.

1) these tools run in the control plane and effectively run a shadow calculation of what the hardware does.

over time many enhancements have been made to the hashing logic, but the show commands at hand do not take them into account.

 

2) one also needs to know precisely which scenario is being evaluated: l2transport with what encap, where does it go out of, etc. the user also needs to know what will be used for the hash and under what condition.

 

for example, pppoe transported over a VPWS would in the past fall back to a mac hash. today the logic is able to interpret the pppoe header, extract the ip info and do an L3/L4 hash. and is the traffic going out of an l2transport AC, or is it taking a pseudowire, and does that PW have different IGP paths to its t-ldp peer? there are some version dependencies here too: in some cases the VCID is used in preference to the computed ingress hash.

 

Deriving a member

this is a multistep process:

0) get the routerID from the device we are producing the hash for 

1) define which fields are going to be used for the hash

2) produce a hash from these fields (based on CRC32) and identify the "bucket" number

    from the "asr9000 loadbalancing architecture" article, these buckets are distributed over the paths and members.

    the tool is attached to the document below

3) extract from the bundle show details how the bucket IDs are assigned to the members, or from show cef the path id.

4) apply some basic modulo math to see which member the bucket number ends up on, based on its "LON".

 

you have now found the member taken.

Example

let's run through an actual example:

0) use the following command to identify the routerID:

RP/0/RP0/CPU0:XRdevice#show arm router-ids 
Wed Aug  4 09:11:49.444 EDT
Router-ID         Interface
66.109.2.98       Loopback0     

 

1) say we have an L3 bundle interface and we want to see which member is taken for a particular set of flows.

this means we'll feed in L3/L4 info.

 

2) let's produce a hash:

xthuijs@XTHUIJS-M-R3WC fatlabel % ./xrhash -r 66.109.2.98 -n 24.175.148.53 -o 24.175.148.157 -p 49152 -q 9177 -f -x
Hash calculator FAT (version 3.0),(c) Oct-2018 Aug-2020,
xander thuijs CCIE#6775 Cisco Systems Int.

SIP: 414159925l (18af9435)
DIP: 414160029l (18af949d)
RID: 1114440290l (426d0262)
Sport: 49152 (c000)
Dport: 9177 (23d9)
Buffer dump:
 18 af 94 35 18 af 94 9d c0 00 23 d9 00 00 00 00 "...5......#....."
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ""
CRC32 = 393450809
MD5 router id = 4223469357
RAWHASH = 464948748
Mpls label = 953868

HASH: IGP BGP LAG
BUCKET: 142 180 12

the tool, especially with the -x option, dumps a lot of junk for debugging etc.; without -x the info is a bit more concise.
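to sanity check the inputs before trusting the output, the decimal and hex values in the dump can be reproduced by hand. the snippet below is a minimal sketch, not the tool's code: the function names are hypothetical, and the buffer layout (SIP, DIP, sport, dport in network byte order, zero padded to 32 bytes) is only inferred from the "Buffer dump" lines above. it deliberately stops short of the CRC32/MD5 steps, since the exact hashing details of xrhash are not described here.

```python
import ipaddress
import struct


def ip_to_int(dotted: str) -> int:
    """Convert a dotted IPv4 address to the integer shown in the SIP/DIP/RID lines."""
    return int(ipaddress.IPv4Address(dotted))


def pack_l3l4(sip: str, dip: str, sport: int, dport: int, size: int = 32) -> bytes:
    """Pack the L3/L4 fields big-endian and zero pad (layout assumed from the dump)."""
    body = struct.pack("!IIHH", ip_to_int(sip), ip_to_int(dip), sport, dport)
    return body.ljust(size, b"\x00")


buf = pack_l3l4("24.175.148.53", "24.175.148.157", 49152, 9177)

print(ip_to_int("24.175.148.53"))   # 414159925, matches the SIP line
print(ip_to_int("66.109.2.98"))     # 1114440290, matches the RID line
print(buf[:16].hex(" "))            # 18 af 94 35 18 af 94 9d c0 00 23 d9 00 00 00 00
```

if the first 12 bytes don't match what the tool prints, the wrong addresses or ports were fed in.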

you can also use a grep to limit the output:

 

Example:

./xrhash -r 66.109.2.98 -n 24.175.148.53 -o 24.175.148.157 -p 49152 -q 2253 -f -x | egrep -A 1 LAG

HASH:   IGP     BGP     LAG

BUCKET: 56      233     40
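if the bucket values need to go into a script rather than be read by eye, the two lines can be parsed instead of grepped. this is a minimal sketch that assumes the output keeps the "HASH:" / "BUCKET:" layout shown above; parse_buckets is a hypothetical helper, not part of xrhash.

```python
from typing import Dict


def parse_buckets(output: str) -> Dict[str, int]:
    """Pair the names on the HASH: line with the numbers on the BUCKET: line."""
    rows = [line.split() for line in output.splitlines() if line.strip()]
    for i, tokens in enumerate(rows[:-1]):
        nxt = rows[i + 1]
        if tokens[0] == "HASH:" and nxt[0] == "BUCKET:":
            return dict(zip(tokens[1:], (int(v) for v in nxt[1:])))
    raise ValueError("no HASH:/BUCKET: pair found in output")


sample = """
HASH:   IGP     BGP     LAG

BUCKET: 56      233     40
"""

buckets = parse_buckets(sample)
print(buckets["LAG"])  # 40
```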

 

3) identify the LON table for a particular bundle. this is a "series" based on the number of members.

for instance, a 4 member bundle has LON IDs 0-3; an 8 member bundle has 0-7.

we want to know which LON ID maps to which member:

 

Checking the members:

RP/0/RP0/CPU0:XRdevice#show interfaces bundle-ether 26

< snip >

    No. of members in this bundle: 8

      HundredGigE0/0/0/27          Full-duplex  100000Mb/s   Active         

      HundredGigE0/0/0/28          Full-duplex  100000Mb/s   Active         

      HundredGigE0/1/0/27          Full-duplex  100000Mb/s   Active         

      HundredGigE0/2/0/27          Full-duplex  100000Mb/s   Active         

      HundredGigE0/3/0/27          Full-duplex  100000Mb/s   Active         

      HundredGigE0/4/0/27          Full-duplex  100000Mb/s   Active         

      HundredGigE0/5/0/27          Full-duplex  100000Mb/s   Active         

      HundredGigE0/6/0/27          Full-duplex  100000Mb/s   Active         

< snip >

 

checking the LON table:

the location doesn't really matter; it should be the same for all LCs.

 

RP/0/RP/CPU0:XRdevice#show bundle load-balancing bundle-ether 7 location 0/2/CPU0
Bundle-Ether7
  Type:                  Ether (L3)
  Members <current/max>: 8/64
  Total Weighting:       8
  Load balance:          Default
< snip >

  Member Information:
    Port:                 LON  ULID  BW
    --------------------  ---  ----  --
    Hu0/2/0/2               7     7   1
    Hu0/2/0/3               1     6   1
    Hu0/2/0/4               5     5   1
    Hu0/2/0/5               2     4   1
    Hu0/2/0/6               3     3   1
    Hu0/5/0/0               6     1   1
    Hu0/5/0/1               4     0   1
    Hu0/6/0/1               0     2   1
< snip >
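the "Member Information" table can be turned into a LON to port lookup for scripting. this is a minimal sketch that assumes the four-column layout (port, LON, ULID, BW) shown above; parse_lon_table is a hypothetical helper, and other releases may format the output differently.

```python
import re

# port name followed by three integer columns: LON, ULID, BW
ROW = re.compile(r"^\s*(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def parse_lon_table(output: str) -> dict:
    """Return {LON: port} from 'show bundle load-balancing' member rows."""
    table = {}
    in_members = False
    for line in output.splitlines():
        if line.strip().startswith("Port:"):
            in_members = True          # header row of the member table
            continue
        if not in_members:
            continue
        match = ROW.match(line)
        if match:
            table[int(match.group(2))] = match.group(1)
        elif table:
            break                      # first non-row after the table ends it
    return table


sample = """
  Member Information:
    Port:                 LON  ULID  BW
    --------------------  ---  ----  --
    Hu0/2/0/2               7     7   1
    Hu0/2/0/3               1     6   1
    Hu0/2/0/4               5     5   1
    Hu0/2/0/5               2     4   1
    Hu0/2/0/6               3     3   1
    Hu0/5/0/0               6     1   1
    Hu0/5/0/1               4     0   1
    Hu0/6/0/1               0     2   1
< snip >
"""

lon_to_member = parse_lon_table(sample)
print(lon_to_member[4])  # Hu0/5/0/1
```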

4) now that we have all the pieces together, the hash bucket and the LON table, we can derive the member to be taken.

the series 0-7 is repeated across the whole bucket range (0-255).
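to picture what "repeated" means: bucket 0 lands on LON 0, bucket 7 on LON 7, bucket 8 on LON 0 again, and so on. the snippet below is a minimal sketch of that distribution, assuming equal member weights (BW 1 for all, as in the table above). the 3 member case is only what the same modulo arithmetic would give; whether the hardware behaves that way for member counts that don't divide 256 is an assumption.

```python
from collections import Counter

NUM_BUCKETS = 256  # bucket range 0-255


def lon_for_bucket(bucket: int, num_members: int) -> int:
    """Assumed mapping: the LON series repeats, i.e. bucket modulo member count."""
    return bucket % num_members


def buckets_per_lon(num_members: int) -> dict:
    """Count how many of the 256 buckets land on each LON."""
    counts = Counter(lon_for_bucket(b, num_members) for b in range(NUM_BUCKETS))
    return dict(sorted(counts.items()))


print([lon_for_bucket(b, 8) for b in range(10)])  # [0, 1, 2, 3, 4, 5, 6, 7, 0, 1]
print(buckets_per_lon(8))  # each of the 8 LONs gets 32 buckets
print(buckets_per_lon(3))  # {0: 86, 1: 85, 2: 85}
```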

 

In our earlier example the hash bucket was 12.

  • now take 12 modulo the number of members, which is 8:
  • 12 % 8 = 4 (12 / 8 = 1 with remainder 4).
  • this means our LON is 4.
  • LON 4 maps to Hu0/5/0/1
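the same lookup as a few lines of python, for when many flows need checking. this is a minimal sketch: the LON table is copied from the show output above, member_for_bucket is a hypothetical helper, and it assumes equal member weights.

```python
# LON -> member, copied from the "Member Information" table above
LON_TO_MEMBER = {
    7: "Hu0/2/0/2",
    1: "Hu0/2/0/3",
    5: "Hu0/2/0/4",
    2: "Hu0/2/0/5",
    3: "Hu0/2/0/6",
    6: "Hu0/5/0/0",
    4: "Hu0/5/0/1",
    0: "Hu0/6/0/1",
}


def member_for_bucket(bucket: int, lon_to_member: dict) -> tuple:
    """Return (LON, member) for a LAG hash bucket in the range 0-255."""
    if not 0 <= bucket <= 255:
        raise ValueError("bucket must be in the range 0-255")
    lon = bucket % len(lon_to_member)
    return lon, lon_to_member[lon]


print(member_for_bucket(12, LON_TO_MEMBER))  # (4, 'Hu0/5/0/1')
print(member_for_bucket(40, LON_TO_MEMBER))  # (0, 'Hu0/6/0/1')
```

bucket 40 is the LAG bucket from the second xrhash run (dport 2253), so that flow would land on Hu0/6/0/1.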

to verify, you can use netflow with the outphyint option, which resolves the physical bundle member in the NF record.

Conclusion

deriving the path is still complex, and that is fully recognized. many have asked to fix the on-box tools (bundle-hash notably), but their behavior is very release dependent. therefore I thought it was a better idea to have an offbox tool do the computation, so that it no longer depends on the sw algorithm or the XR release.

 

Comments
MariaSousa48787
Level 1

Hi @xthuijs ,

 

I'm evaluating the best solution for L2VPN service configuration where we have 2x100G L3 bundle interfaces PE to P and P to P, with A9K-MOD400-SE/A9K-MPA-2X100GE LCs at the PE side and A9K-16X100GE-TR at the P side. After reading all the documentation about load balancing over bundles and using this offbox tool, I still couldn't work out which bundle member the router will choose, because I can see at the PE side that the traffic is passing on the Hu0/6/0/0 interface:

A9K-LAB02 Monitor Time: 04:51:20 SysUptime: 3572:54:02

Protocol:General
Interface In(bps) Out(bps) InBytes/Delta OutBytes/Delta
Hu0/6/0/0 15000/ 0% 105.9M/ 0% 943.6T/1481 732.0T/26.5M
Hu0/7/0/0 105.0M/ 0% 139000/ 0% 409.5T/26.8M 825.1T/34573
BE122.2 102.4M/ 0% 104.6M/ 0% 1213T/0 1293T/0
Te0/5/0/14 98.9M/ 0% 99.0M/ 0% 79.9T/25.8M 66.1T/25.7M

 

But the bundle-hash command seems to point to the other member:

RP/0/RSP0/CPU0:A9K-LAB02#bundle-hash members hundredGigE 0/6/0/0 HundredGigE 0/7/0/0 location 0/7/CPU0
Thu Nov 4 14:13:29.791 WET
Calculate Bundle-Hash for L2 or L3 or sub-int based: 2/3/4 [3]:
Enter traffic type (1:IPv4-inbound, 2:MPLS-inbound, 3:IPv6-inbound, 4:IPv4-MGSCP, 5:IPv6-MGSCP): [1]: 2
Entropy label: y/n [n]:
Number of ingress MPLS labels is 4 or less: y/n [y]:
Enter MPLS payload type (1:IPv4, 2:IPv6, 3:other): [1]: 3
Enter the bottom label in decimal (20-bit value) :24098

Link hashed [hash:9] to is HundredGigE0/7/0/0 ICL () LON 1 ifh 0x120000c0

 

and with this tool, it seems that LON 1 will be chosen:

 

[p@automationbox hashing]$ ./xrhash -v 24098 -r 172.25.200.6 -x

Hash calculator FAT (version 3.0),(c) Oct-2018 Aug-2020,
xander thuijs CCIE#6775 Cisco Systems Int.

SIP: 0l (0)
DIP: 0l (0)
RID: 2887370758l (ac19c806)
Sport: 0 (0)
Dport: 0 (0)
MD5 router id = 2288895513
RAWHASH = 951434135
HASH: IGP BGP LAG
BUCKET: 187 173 151

 

And here it is the LON assignment:

RP/0/RSP0/CPU0:A9K-LAB02#show bundle load-balancing bundle-ether 122.2 location 0/7/CPU0 | be LON
Port: LON ULID BW
-------------------- --- ---- --
Hu0/6/0/0 0 0 1
Hu0/7/0/0 1 1 1

Sub-interface Information:
Sub-interface Type Load Balance Locality
Hash Threshold
---------------------------- ---- ------------ ---------
Bundle-Ether122.2 L3 Default 65

 

In the attachment I've made a drawing of the scenario, with the several tests and their respective results.

 

Question: am I doing the correct evaluation, or shouldn't I be using all these commands/tools?

 

Cenario_&_Several_Test_Results

Thank you in advance

Maria

xthuijs
Cisco Employee
the on-box bundle-hash command is not well maintained and is to be deprecated, as it is near useless now.
xrhash is up to date, but you need to make sure the right info is fed into the tool for it to calculate correctly.

PE to P, depending on the release, will use l3/l4 info.
P to P will use the label for sure for l2vpn (but ensure you have control word enabled)

cheers!
xander
MariaSousa48787
Level 1
 

sorry, but it seems I'm probably not using the correct parameters with your tool, so please see the detailed drawing and help me understand the 2 identified points A and B:

A?: when the traffic goes to the core-facing L3 BE it already has a VC label, so I thought I should use MPLS with Ethernet traffic. Am I wrong?

B?: when traffic goes out from P to PE over the L3 BE, because of PHP the packet will carry only the VC label, and because of that I thought MPLS with Ethernet should be used. Am I wrong here too?

 

Thank you so much

Maria

 

doubts

 

xthuijs
Cisco Employee
may be a good idea to check out the cisco live id 2904 sessions from sanfran 2014 to vegas 17 and the BCN 2019 one,
which have a lot of detail on various LB scenarios where we made some improvements over time as well.

a standard l2vpn ingress pe would use the vcid prior to 65.
from 65 onwards it can use the l3/l4 info for the fat label and egress lb.

a P router will always use the inner label, be it the vc label or the fat label.

an egress pe would, for a bundle AC, use the l2vpn loadbalancing directive:
l2 mac is the default, l3 ip is optional (recommended).

xander
MariaSousa48787
Level 1

Hi Xander,

 

if we have an L2VPN VPWS and the traffic the customer sends us is MACsec, we can't use FAT or Entropy Label! In this case, what would be the best solution to prevent this customer from consuming a whole bundle member?

 

Thanks in advance

Maria

xthuijs
Cisco Employee
yeah with macsec we're screwed, there is not much more than the l2 mac addresses we can use.
but if this is router to router over an l2vpn vpws, then that also means everything will be polarized because the macs don't vary.
there is nothing we can do about that.
decryption would be worse (it is a violation and defeats the purpose of macsec)
xander