L2 Bundle interface load-balancing problem on Cisco ASR9K
08-15-2019 09:14 AM - edited 08-15-2019 10:07 AM
Hello everyone,
We're having a little issue which we've been struggling with over the past few days.
We have a PE with an LACP L2 Bundle-Ether interface made up of two GigabitEthernet physical interfaces facing customer equipment. We provide a tagged L2VPN service, so the bundle acts as one end of a pseudowire over MPLS.
The problem is that we can't manage to balance the traffic over both members of the bundle in the outbound direction; only one link is being used while the other remains idle regardless of the traffic load on the "working" interface, which ends up saturated. The ingress traffic, however, is correctly balanced.
According to our current l2vpn and bundle-ether configurations, the load-balancing hash should be calculated on src-dst-ip, as pointed out in the following article:
asr9000-xr-load-balancing-architecture-and-characteristics/ta-p/3124809
So please, if anyone could shed some light on this matter, we'd be very grateful.
We share our current PE configuration scripts (of the interfaces and l2vpn) in the following lines:
interface Bundle-Ether100
 mtu 9216
 bundle load-balancing hash dst-ip
!
interface Bundle-Ether100.200 l2transport
 encapsulation dot1q 200
!
interface GigabitEthernet0/0/1/0
 bundle id 100 mode active
!
interface GigabitEthernet0/0/1/1
 bundle id 100 mode active
!
l2vpn
 load-balancing flow src-dst-ip
 pw-class mpls
  encapsulation mpls
  !
 !
 xconnect group Transport
  p2p Transport
   interface Bundle-Ether100.200
   neighbor ipv4 10.1.200.2 pw-id 200
    pw-class mpls
   !
  !
 !
!
And here's some troubleshooting info that could be useful:
- Monitor interface, where we can see how the traffic is correctly balanced at the ingress but not at the egress:
RP/0/RSP0/CPU0:router-pe#monitor interface
Protocol:General
Interface     In(bps)        Out(bps)       InBytes/Delta   OutBytes/Delta
BE100.200     (statistics not available)
BE100         46.1M/ 2%      545.7M/ 27%    53.2T/13.4M     642.9T/133.5M
Gi0/0/1/0     23.1M/ 2%      547.3M/ 54%    26.6T/6.0M      638.3T/145.9M
Gi0/0/1/1     23.2M/ 2%      21000/ 0%      26.6T/6.0M      4.6T/518
- Some info about the load-balancing:
RP/0/RSP0/CPU0:router-pe#show bundle load-balancing bundle-ether 100 detail location 0/0/CPU0

Bundle-Ether100
  Type:                   Ether (L3)
  Members <current/max>:  2/64
  Total Weighting:        2
  Load balance:           Dst IP
  Locality threshold:     65
  Avoid rebalancing?      False
  Sub-interfaces:         1

  Member Information:
    Port:                 LON  ULID  BW
    --------------------  ---  ----  --
    Gi0/0/1/0             0    0     1
    Gi0/0/1/1             1    1     1

  Sub-interface Information:
    Sub-interface                 Type  Load Balance  Locality
                                        Hash          Threshold
    ----------------------------  ----  ------------  ---------
    Bundle-Ether100.200           L2    Dst IP        65

Platform Information:
=====================

  * Bundle Summary Information *
  --------------------------
  Interface               : Bundle-Ether100
  Ifhandle                : 0x000001a0
  Lag ID                  : 3
  Virtual Port            : 255
  Number of Members       : 2
  Local to LC             : Yes
  Hash Modulo Index       : 2
  MGSCP Operational Mode  : No

  Member Information:
    LON    Interface        ifhandle    SFP  port  slot  remote/rack_id
    -----  ---------------  ----------  ---  ----  ----  --------------
    0      Gi0/0/1/0        0x040002c0  96   16    2     0/0
    1      Gi0/0/1/1        0x04000300  99   17    2     0/0

  Preroute Member Information:
    LON    Interface        ifhandle    SFP  port  slot  remote/rack_id
    -----  ---------------  ----------  ---  ----  ----  --------------
    1      Gi0/0/1/1        0x04000300  99   17    2     0/0

  * Bundle Table Information *
  ------------------------
  [NP 0]:
  ----------------------------------------------------------------------
  Unicast (Global) LAG table       | Unicast (Rack) LAG table
  ----------------------------------------------------------------------
  idx  local  VQI  port  VQI-LB    | idx  local  VQI  port  VQI-LB
  ----------------------------------------------------------------------
  1    0      96   16    0           1    0      96   16    0
  2    0      99   17    0           2    0      99   17    0

  [NP 1]:
  -------------------------------------------------------------------------------------------------------------
  Unicast (Global) LAG table       | Multicast (Local) LAG table      | Unicast (Rack) LAG table
  -------------------------------------------------------------------------------------------------------------
  idx  local  VQI  port  VQI-LB    | idx  local  VQI  port  VQI-LB    | idx  local  VQI  port  VQI-LB
  -------------------------------------------------------------------------------------------------------------
  1    1      96   16    0           1    1      96   16    0           1    1      96   16    0
  2    1      99   17    0           2    1      99   17    0           2    1      99   17    0

  * SW LAG Table Information *
  ------------------------
  --------------------------------------------------------------------------------------
  Global/Rack LAG table                 | Local LAG table
  --------------------------------------------------------------------------------------
  idx  local  LON  VQI  port  VQI-LB    | idx  local  LON  VQI  port  VQI-LB
  --------------------------------------------------------------------------------------
  1    1      0    96   16    0           1    1      0    96   16    0
  2    1      1    99   17    0           2    1      1    99   17    0
===============================================================================
- And some information about the pseudowire:
RP/0/RSP0/CPU0:router-pe#show l2vpn xconnect pw-id 200 det

Group Transport, XC Transport, state is up; Interworking none
  Description: Transport
  AC: Bundle-Ether100.200, state is up
    Type VLAN; Num Ranges: 1
    VLAN ranges: [200, 200]
    MTU 9206; XC ID 0xa0000019; interworking none
    Statistics:
      packets: received 256535207083, sent 498000190258
      bytes: received 53258327407296, sent 643037489071235
      drops: illegal VLAN 0, illegal length 0
  PW: neighbor 10.1.200.2, PW ID 200, state is up ( established )
    PW class mpls, XC ID 0xc000000a
    Encapsulation MPLS, protocol LDP
    Source address 10.2.200.2
    PW type Ethernet, control word disabled, interworking none
    PW backup disable delay 0 sec
    Sequencing not set
    Load Balance Hashing: src-dst-ip

    PW Status TLV in use
      MPLS         Local                          Remote
      ------------ ------------------------------ -----------------------------
      Label        16000                          16000
      Group ID     0x1a0                          0x4000100
      Interface    Bundle-Ether100.200            TenGigE0/0/0/1.200
      MTU          9206                           9206
      Control word disabled                       disabled
      PW type      Ethernet                       Ethernet
      VCCV CV type 0x2                            0x2
                   (LSP ping verification)        (LSP ping verification)
      VCCV CC type 0x6                            0x6
                   (router alert label)           (router alert label)
                   (TTL expiry)                   (TTL expiry)
      ------------ ------------------------------ -----------------------------
    Incoming Status (PW Status TLV):
      Status code: 0x0 (Up) in Notification message
    Outgoing Status (PW Status TLV):
      Status code: 0x0 (Up) in Notification message
    MIB cpwVcIndex: 3221225482
    Create time: 28/03/2019 00:10:01 (20w0d ago)
    Last time status changed: 14/08/2019 23:02:09 (13:34:43 ago)
    Last time PW went down: 14/08/2019 23:02:09 (13:34:43 ago)
    Statistics:
      packets: received 498000190258, sent 256535207083
      bytes: received 643037489071235, sent 53258327407296
Thank you so much for your help.
Labels: MPLS
08-17-2019 12:08 AM
Hello. It should work according to https://community.cisco.com/t5/service-providers-documents/asr9000-xr-load-balancing-architecture-and-characteristics/ta-p/3124809#case1.
But I noticed that you use the dst-ip hash on the bundle-ether. I think you should remove this setting and use the default. You could also exclude L4 info from the hash by configuring
cef load-balancing fields L3 global
08-30-2019 09:38 AM
Hello Evgeniy. Thank you so much for the reply and sorry for having taken so long to respond.
Unfortunately, excluding L4 info from the hash calculation didn't do the trick. I also tried every configuration suggested in the link you provided, which I had already looked into before posting in the forum, but couldn't find a way to balance the traffic.
I even configured a FAT PW between both ends of the L2VPN but it didn't work either:
l2vpn
 pw-class transporte_class
  encapsulation mpls
   control-word
   load-balancing
    flow-label both
   !
  !
 xconnect group Transporte
  p2p Transporte
   interface Bundle-Ether100.200
   neighbor ipv4 10.1.200.2 pw-id 200
    pw-class transporte-class
   !
  !
 !
!
I don't know, maybe the cause is that there aren't any L3 (sub)interfaces configured on the bundle-ether, so the hashing calculation can't be done.
So, I'm still trying to find the configuration and any information anyone has would be helpful.
Thank you.
08-31-2019 10:17 PM - edited 08-31-2019 10:19 PM
What kind of traffic flows into the L2VPN?
The FAT label is used to load-balance MPLS traffic over bundle-ethers between MPLS routers, so it will not help here.
09-03-2019 10:47 AM
On both ends of the PW we have an AC that accepts incoming L2 frames tagged with VLAN 200; traffic is then encapsulated into MPLS and sent through the L2VPN. Here's the output of the l2vpn xconnect status of the PW, taken from the customer-facing end of the L2VPN:
RP/0/RSP0/CPU0:RPE-01#sh l2vpn xconnect pw-id 200 det
Tue Sep 3 14:39:01.241 ART

Group Transporte, XC Transporte, state is up; Interworking none
  Description: Transporte
  AC: Bundle-Ether100.200, state is up
    Type VLAN; Num Ranges: 1
    VLAN ranges: [200, 200]
    MTU 9206; XC ID 0xa0000019; interworking none
    Statistics:
      packets: received 298210575515, sent 572150150837
      bytes: received 62781404682410, sent 736782573729841
      drops: illegal VLAN 0, illegal length 0
  PW: neighbor 10.1.200.2, PW ID 200, state is up ( established )
    PW class mpls, XC ID 0xc000000a
    Encapsulation MPLS, protocol LDP
    Source address 10.2.200.2
    PW type Ethernet, control word disabled, interworking none
    PW backup disable delay 0 sec
    Sequencing not set

    PW Status TLV in use
      MPLS         Local                          Remote
      ------------ ------------------------------ -----------------------------
      Label        16000                          16000
      Group ID     0x1a0                          0x4000100
      Interface    Bundle-Ether100.200            TenGigE0/0/0/1.200
      MTU          9206                           9206
      Control word disabled                       disabled
      PW type      Ethernet                       Ethernet
      VCCV CV type 0x2                            0x2
                   (LSP ping verification)        (LSP ping verification)
      VCCV CC type 0x6                            0x6
                   (router alert label)           (router alert label)
                   (TTL expiry)                   (TTL expiry)
      ------------ ------------------------------ -----------------------------
    Incoming Status (PW Status TLV):
      Status code: 0x0 (Up) in Notification message
    Outgoing Status (PW Status TLV):
      Status code: 0x0 (Up) in Notification message
    MIB cpwVcIndex: 3221225482
    Create time: 28/03/2019 00:10:01 (22w5d ago)
    Last time status changed: 27/08/2019 05:46:40 (1w0d ago)
    Last time PW went down: 27/08/2019 05:46:38 (1w0d ago)
    Statistics:
      packets: received 572150150837, sent 298210575515
      bytes: received 736782573729841, sent 62781404682410
Thank you so much for your help!

09-02-2019 01:24 AM - edited 09-02-2019 01:43 AM
Hello,
You have the dst-ip hash algorithm configured under the bundle interface. Are you sure that there are multiple destination IPs on the CE side? Have you tried configuring the src-ip hash?
interface Bundle-Ether100
 mtu 9216
 bundle load-balancing hash dst-ip
What kind of router platform is it - NCS or A9K?
Best Regards,
P.
09-03-2019 11:11 AM
I've also tried configuring src-ip hash but it didn't help either...
I'm not sure about the number of destination IPs behind the CE because we don't have control over it; the CE is managed by our customer, and we only provide an L2L service from PE to PE.
Lastly, the router platform we're using is ASR9K, ASR9001 with Typhoon based LC to be specific.
Thanks for your reply.
Best regards,
09-05-2019 01:58 AM
Try turning on the control-word under the pw-class on both PEs, so that you can see it is enabled/negotiated. Please post the output of "sh l2vpn xconnect pw-id 200 de" after this change.
Remove the "bundle load-balancing hash" command under the "interface Bundle-Ether100", so it will return to the default, which should be Src and Dst MAC based.
Under l2vpn configuration change the "load-balancing flow src-dst-ip" to "load-balancing flow src-dst-mac"
Best Regards,
P.
09-05-2019 05:31 AM
I've tried turning on the control-word on both PEs and applied the other configurations you suggested but traffic is still not balanced. Here's the output of the status of the PW:
RP/0/RSP0/CPU0:RPE-01#sh l2vpn xconnect pw-id 200 det
Thu Sep 5 08:38:08.201 ART

Group Transporte, XC Transporte, state is up; Interworking none
  Description: Transporte
  AC: Bundle-Ether100.200, state is up
    Type VLAN; Num Ranges: 1
    VLAN ranges: [200, 200]
    MTU 9206; XC ID 0xa0000019; interworking none
    Statistics:
      packets: received 301548464958, sent 578043723749
      bytes: received 63527046452555, sent 744178739937323
      drops: illegal VLAN 0, illegal length 0
  PW: neighbor 10.1.200.2, PW ID 200, state is up ( established )
    PW class transporte_class, XC ID 0xc000000a
    Encapsulation MPLS, protocol LDP
    Source address 10.2.200.2
    PW type Ethernet, control word enabled, interworking none
    PW backup disable delay 0 sec
    Sequencing not set
    Load Balance Hashing: src-dst-mac

    PW Status TLV in use
      MPLS         Local                          Remote
      ------------ ------------------------------ -----------------------------
      Label        16000                          16000
      Group ID     0x1a0                          0x4000100
      Interface    Bundle-Ether100.200            TenGigE0/0/0/1.200
      MTU          9206                           9206
      Control word enabled                        enabled
      PW type      Ethernet                       Ethernet
      VCCV CV type 0x2                            0x2
                   (LSP ping verification)        (LSP ping verification)
      VCCV CC type 0x7                            0x7
                   (control word)                 (control word)
                   (router alert label)           (router alert label)
                   (TTL expiry)                   (TTL expiry)
      ------------ ------------------------------ -----------------------------
    Incoming Status (PW Status TLV):
      Status code: 0x0 (Up) in Notification message
    Outgoing Status (PW Status TLV):
      Status code: 0x0 (Up) in Notification message
    MIB cpwVcIndex: 3221225482
    Create time: 28/03/2019 00:10:01 (23w0d ago)
    Last time status changed: 05/09/2019 08:35:26 (00:00:42 ago)
    Last time PW went down: 05/09/2019 08:35:23 (00:00:46 ago)
    Statistics:
      packets: received 578043723749, sent 301548464958
      bytes: received 744178739937323, sent 63527046452555
I've also changed the "load-balancing flow src-dst-ip" to "load-balancing flow src-dst-mac" in both PEs and removed "bundle load-balancing hash" from interface Bundle-Ether100.
In case it helps, here is the detail of the current load-balancing status:
RP/0/RSP0/CPU0:RPE-01#sh bundle load-balancing bundle-ether 100 detail location 0/0/cpu0
Thu Sep 5 09:01:23.411 ART

Bundle-Ether100
  Type:                   Ether (L3)
  Members <current/max>:  2/64
  Total Weighting:        2
  Load balance:           Default
  Locality threshold:     65
  Avoid rebalancing?      False
  Sub-interfaces:         1

  Member Information:
    Port:                 LON  ULID  BW
    --------------------  ---  ----  --
    Gi0/0/1/0             0    0     1
    Gi0/0/1/1             1    1     1

  Sub-interface Information:
    Sub-interface                 Type  Load Balance  Locality
                                        Hash          Threshold
    ----------------------------  ----  ------------  ---------
    Bundle-Ether100.200           L2    Default       65

Platform Information:
=====================

  * Bundle Summary Information *
  --------------------------
  Interface               : Bundle-Ether100
  Ifhandle                : 0x000001a0
  Lag ID                  : 3
  Virtual Port            : 255
  Number of Members       : 2
  Local to LC             : Yes
  Hash Modulo Index       : 2
  MGSCP Operational Mode  : No

  Member Information:
    LON    Interface        ifhandle    SFP  port  slot  remote/rack_id
    -----  ---------------  ----------  ---  ----  ----  --------------
    0      Gi0/0/1/0        0x040002c0  96   16    2     0/0
    1      Gi0/0/1/1        0x04000300  99   17    2     0/0

  Preroute Member Information:
    LON    Interface        ifhandle    SFP  port  slot  remote/rack_id
    -----  ---------------  ----------  ---  ----  ----  --------------
    1      Gi0/0/1/1        0x04000300  99   17    2     0/0

  * Bundle Table Information *
  ------------------------
  [NP 0]:
  ----------------------------------------------------------------------
  Unicast (Global) LAG table       | Unicast (Rack) LAG table
  ----------------------------------------------------------------------
  idx  local  VQI  port  VQI-LB    | idx  local  VQI  port  VQI-LB
  ----------------------------------------------------------------------
  1    0      96   16    0           1    0      96   16    0
  2    0      99   17    0           2    0      99   17    0

  [NP 1]:
  -------------------------------------------------------------------------------------------------------------
  Unicast (Global) LAG table       | Multicast (Local) LAG table      | Unicast (Rack) LAG table
  -------------------------------------------------------------------------------------------------------------
  idx  local  VQI  port  VQI-LB    | idx  local  VQI  port  VQI-LB    | idx  local  VQI  port  VQI-LB
  -------------------------------------------------------------------------------------------------------------
  1    1      96   16    0           1    1      96   16    0           1    1      96   16    0
  2    1      99   17    0           2    1      99   17    0           2    1      99   17    0

  * SW LAG Table Information *
  ------------------------
  --------------------------------------------------------------------------------------
  Global/Rack LAG table                 | Local LAG table
  --------------------------------------------------------------------------------------
  idx  local  LON  VQI  port  VQI-LB    | idx  local  LON  VQI  port  VQI-LB
  --------------------------------------------------------------------------------------
  1    1      0    96   16    0           1    1      0    96   16    0
  2    1      1    99   17    0           2    1      1    99   17    0
===============================================================================
And here's the output of the "monitor interfaces" command from the customer-facing PE. Note how the traffic is correctly balanced on the inbound direction but not on the outbound:
Protocol:General
Interface     In(bps)        Out(bps)       InBytes/Delta   OutBytes/Delta
BE100.200     (statistics not available)
BE100         54.5M/ 2%      396.4M/ 19%    63.5T/13.2M     744.2T/100.3M
Gi0/0/1/0     27.1M/ 2%      396.4M/ 39%    31.7T/6.6M      739.6T/103.5M
Gi0/0/1/1     27.2M/ 2%      0/ 0%          31.7T/6.5M      4.6T/226
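To put a number on the imbalance, here is a minimal Python sketch that computes each member's share of the bundle traffic from the rates in the monitor output above. The dictionary names and the `member_share` helper are made up for illustration; only the bps figures come from that output.

```python
# Rates in bits per second, copied from the "monitor interface" output above.
in_bps = {"Gi0/0/1/0": 27.1e6, "Gi0/0/1/1": 27.2e6}
out_bps = {"Gi0/0/1/0": 396.4e6, "Gi0/0/1/1": 0.0}


def member_share(rates):
    """Return each member's fraction of the total rate (hypothetical helper)."""
    total = sum(rates.values())
    return {member: (rate / total if total else 0.0)
            for member, rate in rates.items()}


for label, rates in (("inbound", in_bps), ("outbound", out_bps)):
    for member, share in member_share(rates).items():
        print(f"{label:8s} {member}: {share:.1%}")
```

Inbound comes out at roughly half per member, while outbound sits entirely on Gi0/0/1/0, which is what you would expect if all egress traffic hashes to a single bucket.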
Please, let me know if there's anything else that comes to your mind.
Thank you so much for your time.
Best regards,
09-10-2019 12:58 AM
Hi Renzo,
I have checked the available documentation, but did not find any other way to configure it.
You may try to configure the following command to adjust the hash algorithm in the global configuration mode:
cef load-balancing algorithm adjust
before opening a TAC case.
Best Regards,
P.
09-27-2019 07:24 AM - edited 09-27-2019 07:31 AM
Try configuring "bundle load-balancing hash auto" under each l2transport sub-interface on the bundle. I've seen that work in the past. At that time "l2vpn load-balancing flow src-dst-ip" was also configured, but I'm not sure it is necessary. You can try using the hash auto command by itself first.
09-30-2019 07:34 AM
10-02-2019 08:44 AM
The customer may be using the same source/destination device resulting in the traffic you're seeing. I have no idea how per-packet load-balancing can be configured for the bundle. In our situation we had several circuits using the bundle.
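A rough illustration of why a single source/destination pair would produce this pattern: with per-flow hashing, every packet of one flow lands on the same member, however much traffic that flow carries. The sketch below is only a toy model in Python. It uses CRC32 over a made-up flow key, which is an assumption for illustration and not the hash the ASR9K actually computes; `pick_member` and `spread` are hypothetical helpers, and the addresses are documentation-range examples.

```python
import zlib

MEMBERS = ["Gi0/0/1/0", "Gi0/0/1/1"]


def pick_member(src, dst, members=MEMBERS):
    """Toy per-flow hash: the same (src, dst) pair always maps to the same member.

    CRC32 is a stand-in chosen for illustration, not the platform's real hash.
    """
    key = f"{src}->{dst}".encode()
    return members[zlib.crc32(key) % len(members)]


def spread(flows):
    """Return the set of bundle members used by an iterable of (src, dst) flows."""
    return {pick_member(src, dst) for src, dst in flows}


# One flow repeated for many packets: only one member is ever selected.
single_flow = [("198.51.100.1", "192.0.2.1")] * 1000
print(len(spread(single_flow)))

# Several distinct destinations: the hash has a chance to use both members.
many_flows = [("198.51.100.1", f"192.0.2.{i}") for i in range(1, 9)]
print(len(spread(many_flows)))
```

If the customer's traffic really is one router talking to one router, no change of hash fields on the PE will split it; the traffic itself has to contain more than one distinct value in whatever fields are hashed.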
12-05-2019 12:20 PM
Have you been able to solve this issue?
I am facing exactly the same problem.
06-26-2021 07:20 AM
Have you been able to solve this issue?
I am facing the same issue.
