shrnair
Cisco Employee

Introduction

Support for Line Card (LC) based subscribers in IOS-XR BNG was introduced in Release 5.1.1. Prior releases supported only RP based subscribers (access interfaces as Bundle interfaces).

With RP based subscribers, the entire control plane runs on the RSP. The important processes, such as iedged, dhcpd, subdb_svr and ipsub, run on the RSP. Also, the feature control plane (<feature>_ma) runs on the RSP, while the hardware programming part (<feature>_ea) runs on the LC.

The problems associated with RP based subscribers are as follows:

  • Memory usage limits the number of sessions that can be brought up to about 128K IPv4 sessions, since physical and shared memory resources on the RSP are limited.

  • It does not use the full potential of the ASR9K. Even an ASR9K with many LCs relies on the RSP to do the processing, resulting in a vertical scaling bottleneck.

  • CPS is limited as well, due to the above points.

  • Poor fault isolation.

     

LC based Subscribers

Due to the above challenges with Bundle based subscribers, Line Card based subscribers were introduced in Release 5.1.1.

The entire BNG control plane is designed to run in a distributed manner, on all the line cards. This allows the BNG control plane to be easily distributed and to terminate any session initiation protocol directly on the line card, as opposed to terminating the protocol on the RSP.
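
One way to observe this distribution is to confirm that the subscriber control plane processes are running on the LC node itself. This is a hedged check, assuming the process names mentioned in the introduction (iedged, dhcpd) and an LC in slot 0:

show processes iedged location 0/0/CPU0
show processes dhcpd location 0/0/CPU0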

Running the control plane in such a distributed manner allows for better fault isolation on the router: certain LCs can fail without affecting subscriber sessions on the other LCs in the system. Only the subscriber sessions built on the failed LC are lost.

In this model, the subscriber control plane is replicated to every LC in the system. Every LC offers 8 GB of physical memory, so as more LCs are added to the system, horizontal scaling becomes possible. This also frees up memory and CPU on the RP, which can then be used by the Multi-Service Edge (MSE) profile, allowing the ASR9K to provide significant cost benefits to the service provider. In an MSE environment, not only must the router support BNG functionality, it must also provide other services, such as high-scale BGP and other routing protocols, L3VPN/L2VPN services, and other edge/aggregation functionality.
 

Features supported

All features supported for Bundle based (RSP) subscribers up to Release 4.3.1 are supported for LC based subscribers too, except for the features listed below.

  1. Per-User QoS

  2. Multicast

  3. Service Accounting.

  4. Local DHCPv4 Server (support for RP based subscribers came in 5.1.0; it works for LC based subscribers too, but is untested as of now)

 

Hardware Support

Any Typhoon line card, such as the A9K-MOD80-SE, A9K-24x10, 36x10, and MOD160.

Note that the line card should be of the "SE" or Service Edge variant.

The ASR9001 also supports BNG.

Trident line cards are not supported for subscriber termination, but they can be used as core-facing line cards for transport.

 

Memory implications

Unlike RP based subscribers, where each line card holds only data-plane-related information, with LC based subscribers the subscriber information is stored primarily on the line cards. LCs on the ASR9K are quite powerful and have nearly the same amount of physical memory as the RSP.

For every LC subscriber, a certain amount of data is stored on the RSP as well, primarily:

  1. IM’s G3P database

  2. SADB database in iedge for CoA handling

  3. RIB route

  4. DHCPv4 and v6 store a shadow database on the RSP for every binding on the LC.

 

IM’s G3P database is used by processes running on the RSP to obtain information about all interfaces, typically by routing protocol processes to be able to run their protocol, or by show commands to resolve interface name-to-handle mappings.

The IM process running on an LC polls only the interfaces local to that LC. The IM process on the RP polls interfaces on every node of the system; this is needed for show commands, for converting interface handles (ifh) to names, and because protocols running on the RSP need to know what interfaces exist on the LCs. The IM process on the RP therefore keeps a copy of each LC's interface database, which limits how much higher the system can scale.
 

SADB database in iedge is needed primarily for two reasons.

  • For managing Change of Authorization (CoA) requests. Only the iedge process running on the RP has the command handler. iedge looks up the session on the RSP using the replicated SADB and uses that to direct the CoA to the iedge on the right LC.

  • For the LC OIR case: iedge on the RP sends the final statistics from the SADB.

 

RIB route addition is unavoidable and has to go via the RSP.

 

DHCP on the RSP maintains shadow bindings. On LC OIR, DHCP on the RSP replays the bindings back to the LC after it comes back up.
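
To inspect what DHCP has stored, the standard binding show command can be used; a hedged example, assuming a DHCPv4 proxy deployment (server mode has an equivalent command), together with the usual session summary:

show dhcp ipv4 proxy binding
show subscriber session all summary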

 

In comparison to the data stored on the RSP for each bundle subscriber, the amount of data stored on the RSP for LC subscribers is minimal. Distributing subscriber data to the LC where the subscriber is hosted allows the router to scale to a much higher number of subscriber sessions.

 

Scale Numbers

  • 64K Single Stack subscribers per LC

  • 32K Dual Stack subscribers per LC

  • 256K Single Stack subscribers per system

  • 128K Dual Stack subscribers per system
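
As a quick sanity check, the per-LC and per-system numbers are consistent: reaching the 256K single stack system limit requires at least 256K / 64K = 4 subscriber-facing LCs, and the 128K dual stack limit likewise requires 128K / 32K = 4 LCs.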

 

Performance

Calls-per-second (CPS) is defined as the number of sessions that can be brought up in one second.

With Bundle subscribers, the control plane runs entirely on the RSP, while the data plane runs entirely on the LC. As a result, control plane and data plane provisioning happen on different CPUs and can therefore proceed in parallel. At the same time, this separation requires a significant number of inter-node IPCs between the RSP and the LC.

For subscribers terminating on physical interfaces, both the control plane and the data plane run on the same node, sharing the same CPU. As a result, the amount of CPU available to bring up a session is limited, and the CPS achieved on a single LC is therefore lower than the CPS achieved for Bundle subscribers. On the other hand, the two planes do not have to communicate using inter-node IPCs, which frees up some of that CPU for session processing.

A significant performance gain is achieved for LC subscribers by distributing the control plane across multiple LCs. This allows the router to use each LC's processing power to process more sessions. As a result, even though the LC CPU is less powerful and has to run both the control plane and the data plane, in aggregate the entire chassis can reach much higher scale and performance than RSP based subscribers.

Approximate target CPS is listed in the table below, assuming 64K sessions per LC and adding as many LCs as needed to reach the listed scale.

 

Scale/Call Flow      IPoE    PPPoE
256K IPv4            200     200
128K Dual Stack      60      60
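
As a worked example of what these targets mean in practice: at 200 CPS, bringing up the full 256K IPv4 session load takes about 256,000 / 200 = 1,280 seconds, a little over 21 minutes; at 60 CPS, the 128K dual stack load takes about 128,000 / 60, roughly 2,133 seconds, or around 36 minutes.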


High Availability

Process restart

  • Control plane: For bundle subscribers, subscriber session state is maintained; new subscriber bringup is delayed by a small time, depending on the component being restarted. Line card subscribers behave the same as bundle subscribers.

  • Data plane: No impact to traffic, for either bundle or line card subscribers.

LC OIR

  • Control plane: For bundle subscribers with multi-member bundles, no impact is seen; with single-member bundles, the control plane cannot function since no control packets are received, but session state is not lost since it is stored on the RP. For line card subscribers, the control plane is down for new sessions and all session state is lost for existing sessions.

  • Data plane: For bundle subscribers with multi-member bundles, no impact is seen; with single-member bundles, data traffic is lost but no session state is lost. For line card subscribers, all traffic is lost.

RP FO (failover)

  • Control plane: For bundle subscribers, there is a significant quiet time (currently more than 10 minutes) before new sessions can be set up; existing session state is not lost. For line card subscribers, there is a very small impact (about 30-60 seconds) before new sessions can be set up; the delay is in reconnecting to RP based servers, like RIB. Existing session state is not lost.

  • Data plane: No impact to traffic, for either bundle or line card subscribers.


 

External Interaction

  • Load balancing: With Bundle based subscribers, the user need not worry about load balancing, since the entire control plane runs on one node (the RSP), which takes care of it. With LC based subscribers, each LC control plane functions independently, so a global configuration of RADIUS and DHCP servers will not result in load-balanced usage; it is quite possible that all LCs end up using the same RADIUS server. As a result, it is up to the user to do some manual load balancing. This can be achieved by creating different AAA groups and method lists using different sets of RADIUS servers, assigning the AAA groups to different service profiles, and then assigning those service profiles to the access interfaces on different LCs (the Configs section below shows such AAA groups and method lists). Similarly for DHCP servers, access interfaces on different LCs need different profiles, each pointing at a different DHCP server, as sketched after this list.
     

  • RADIUS: The RADIUS client needs to be set up such that the entire BNG router shows up as one BNG (one NAS-IP-Address) to the RADIUS server. This is important because currently CoAs can only be handled by the iedge on the RSP, which has the command handler module to deal with them. The command handler looks up the session on the RSP using the replicated SADB, finds the LC associated with the subscriber, and pushes the CoA to the iedge on the appropriate LC.
     

  • Address pools: Since the LCs operate independently of each other, when operating in server mode, if the addresses given out by the control plane on different LCs need to be in the same pool, proper coordination is required to ensure that the LCs don't assign the same address to different subscribers. This is managed by the DAPS server component on the RSP. It is currently preferable to give different pools to different LCs so that they can work completely independently of each other, without the need for significant messaging across nodes; see the sketch below.
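
A minimal sketch of this per-LC separation follows: a single RADIUS source interface so the router presents one NAS IP, plus disjoint address pools and DHCP server profiles per LC. All names, interfaces, and address ranges here are illustrative assumptions rather than a tested configuration, and the exact syntax can vary by release:

! Illustrative sketch; names, interfaces and ranges are assumptions
radius source-interface Loopback0
!
pool vrf default ipv4 POOL_LC0
 address-range 10.10.0.1 10.10.63.254
!
pool vrf default ipv4 POOL_LC1
 address-range 10.10.64.1 10.10.127.254
!
dhcp ipv4
 profile SERVER_LC0 server
  pool POOL_LC0
 !
 profile SERVER_LC1 server
  pool POOL_LC1
 !
 interface TenGigE0/0/0/0.1 server profile SERVER_LC0
 interface TenGigE0/1/0/0.1 server profile SERVER_LC1
!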

 

Configs

(The configs are the same as those for Bundle based subscribers.)
 

Radius configuration

radius-server host 8.0.0.172 auth-port 2001 acct-port 2002
 key 7 121A0C041104
!
radius-server host 8.0.0.172 auth-port 3001 acct-port 3002
 key 7 14141B180F0B
!
radius-server host 8.0.0.172 auth-port 4001 acct-port 4002
 key 7 14141B180F0B
!
radius-server host 8.0.0.172 auth-port 5001 acct-port 5002
 key 7 070C285F4D06
!
radius-server host 8.0.0.172 auth-port 6001 acct-port 6002
 key 7 05080F1C2243
!
radius-server host 8.0.0.172 auth-port 7001 acct-port 7002
 key 7 121A0C041104
!
radius-server host 8.0.0.172 auth-port 8001 acct-port 8002
 key 7 0822455D0A16
!
radius-server host 8.0.0.172 auth-port 9001 acct-port 9002
 key 7 1511021F0725
!

 

aaa group server radius aaa-group1
 server 8.0.0.172 auth-port 2001 acct-port 2002
 server 8.0.0.172 auth-port 3001 acct-port 3002
 server 8.0.0.172 auth-port 4001 acct-port 4002
 server 8.0.0.172 auth-port 5001 acct-port 5002
 load-balance method least-outstanding batch-size 1
!
aaa group server radius aaa-group2
 server 8.0.0.172 auth-port 6001 acct-port 6002
 server 8.0.0.172 auth-port 7001 acct-port 7002
 server 8.0.0.172 auth-port 8001 acct-port 8002
 server 8.0.0.172 auth-port 9001 acct-port 9002
 load-balance method least-outstanding batch-size 1
!
aaa accounting subscriber methodlist1 group aaa-group1
aaa accounting subscriber methodlist2 group aaa-group2
aaa authorization subscriber methodlist1 group aaa-group1
aaa authorization subscriber methodlist2 group aaa-group2
aaa authentication subscriber methodlist1 group aaa-group1
aaa authentication subscriber methodlist2 group aaa-group2
aaa authentication ppp methodlist1 group aaa-group1
aaa authentication ppp methodlist2 group aaa-group2

 

Access interface config

interface TenGigE0/0/0/0
!
interface TenGigE0/0/0/0.1
 service-policy output vlan_policy_egress subscriber-parent resource-id 0
 service-policy type control subscriber PM_CNTRL_1
 pppoe enable
 encapsulation ambiguous dot1q 10 second-dot1q any
!

 

Class Map and Policy Map

class-map type control subscriber match-any PPP_CM
 match protocol ppp
end-class-map
!
policy-map type control subscriber PM_CNTRL_1
 event session-start match-first
  class type control subscriber PPP_CM do-until-success
   10 activate dynamic-template PTA_TEMPLATE_1
  !
 !
 event session-activate match-first
  class type control subscriber PPP_CM do-until-success
   10 authenticate aaa list methodlist1
  !
 !
end-policy-map
!
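
The control policy above references dynamic template PTA_TEMPLATE_1, which is not shown here, and uses only methodlist1. Below is a minimal, untested sketch that fills both gaps under stated assumptions: a PPP/PTA template with CHAP authentication and an unnumbered Loopback0 (the loopback choice is an assumption), plus a second control policy bound to methodlist2 and applied to an access interface assumed to sit on a different LC (TenGigE0/1/0/0.1), completing the per-LC RADIUS load balancing described under External Interaction:

! Assumed PTA template; Loopback0 is illustrative
dynamic-template
 type ppp PTA_TEMPLATE_1
  ppp authentication chap
  ipv4 unnumbered Loopback0
 !
!
! Assumed second control policy, identical except for methodlist2
policy-map type control subscriber PM_CNTRL_2
 event session-start match-first
  class type control subscriber PPP_CM do-until-success
   10 activate dynamic-template PTA_TEMPLATE_1
  !
 !
 event session-activate match-first
  class type control subscriber PPP_CM do-until-success
   10 authenticate aaa list methodlist2
  !
 !
end-policy-map
!
! Applied to an access interface hosted on a different LC
interface TenGigE0/1/0/0.1
 service-policy type control subscriber PM_CNTRL_2
 pppoe enable
 encapsulation ambiguous dot1q 10 second-dot1q any
!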

Comments
victor.lyapunov
Level 1

First, thanks for the detailed explanation.

For better redundancy, DSLAMs in our network are connected to the ASR9K through Bundle-Ethernet interfaces (we do not plan to change this soon, since we need protection from link failures). Will PPPoE sessions terminated on Bundle-Ethernet interfaces see any benefit from migrating to IOS XR 5.1.1 (e.g. in terms of scalability), or do the improvements concern only PPPoE sessions terminated on single physical interfaces?

 

 

shrnair
Cisco Employee

Hello Victor,

There were quite a few enhancements made in XR 5.1.1, primarily in the areas of memory utilization and targeted performance improvements, which apply to both RSP and LC based subscribers. Hence XR 5.1.1 can get you better CPS/performance compared to previous releases.
With LC based subscribers, the supported system scale doubles compared to Bundle based subscribers (256K compared to 128K). LCs run a quad-core PPC while RSPs run a quad-core Intel, which is more powerful. Hence there are advantages to both architectures.

With 128K subscribers on 2 LCs, RP subscribers could give you a higher CPS, but further scale isn't currently supported with RP subscribers. Where LC subscribers do well is when you add more LCs to the system, as this allows you to increase the overall scale and performance horizontally until you hit the maximum limit supported by the system. At lower scale, RP subscribers will provide higher CPS; but in the centralized/RP model, as scale goes up, the CPS tends to drop. With the distributed model, you can add another LC and get both higher scale and higher CPS.

Hope I answered your question.

 

Regards

Shrijit

victor.lyapunov
Level 1

Yes, thank you for the clarification.

Zhichun Jiang
Level 1

Shrijit,

Nice Article!!

To clarify one point: the article says that "Per-User QoS" is not supported, which is a little confusing. I prefer the term "Parameterized QoS" instead, meaning that downloading QoS parameters via RADIUS attributes is not supported when you are running LC based sessions. But you can still download the name of a service-policy, defined on the 9K box, to apply QoS to a session.

 

 

---- The following is not supported for LC based sessions, but is supported for RP based sessions ----

cisco-avpair += "ip:qos-policy-out=add-class(sub,(class-default),shape(106496))",

 

cisco-avpair += "ip:qos-policy-out=add-class(sub,(class-default,voip),pri-level(1),police(13600,9216,transmit,drop),queue-limit(8),set-cos(5))",

 

 

---- The following is supported for both LC and RP based sessions ----

Cisco-AVPair = "sub-qos-policy-in=S99_IN_POLICING_256K",

Cisco-AVPair = "sub-qos-policy-out=S99_OUT_POLICING_512K"

 

Referenced configuration on the BNG:

policy-map S99_IN_POLICING_256K
 class class-default
  police rate 256 kbps
 !
end-policy-map
!

policy-map S99_OUT_POLICING_512K
 class class-default
  police rate 512 kbps
 !
end-policy-map
!

 

BR/Roy

 
Zhichun Jiang
Level 1

RADIUS issue and its fix.

The following config may not work in the LC based context: the 9K fails to send RADIUS messages out when there is no other UP physical interface within the VRF on the LC terminating the session.

###############
radius-server host x.x.x.x auth-port 4444 acct-port 5555
 key 7 adgscasdwegtwrqsx
!
radius-server source-port extended
!
aaa group server radius radius-broadband
 vrf VRF1
 server-private x.x.x.x auth-port 1812 acct-port 1813
  key 7 24252524dadadada
 !
 source-interface Loopback1
!
interface Loopback1
 vrf VRF1
 ipv4 address y.y.y.y 255.255.255.255

 

The symptom is that no RADIUS message is sent out when the FSOL arrives.

You can see the following debug output:

 

LC/0/1/CPU0:Jun 12 17:46:26.723 : PPP-MA[303]: LCP: TenGigE0/1/1/1.2255.pppoe49: [Open]: Report This-Layer-Up
LC/0/1/CPU0:Jun 12 17:46:26.849 : radiusd[320]: Received request [handle 0x5015b688] with server-group   : radius-broadband
LC/0/1/CPU0:Jun 12 17:46:26.849 : radiusd[320]: Building header for the Authentication request
LC/0/1/CPU0:Jun 12 17:46:26.850 : radiusd[320]: radius_get_prfrd_srvr_info: Retrive Preferred Server info from attr list
LC/0/1/CPU0:Jun 12 17:46:26.850 : radiusd[320]: radius_get_prfrd_srvr_info: Preferred server handle is set to NULL
LC/0/1/CPU0:Jun 12 17:46:26.850 : radiusd[320]: (handle_nas_req) Couldn't retrive the preferred server info
LC/0/1/CPU0:Jun 12 17:46:26.850 : radiusd[320]: Trying to find the first radius server to use.
LC/0/1/CPU0:Jun 12 17:46:26.850 : radiusd[320]: Created transaction_id (800002A) for server group F000001
LC/0/1/CPU0:Jun 12 17:46:26.850 : radiusd[320]: Picking the rad id 142:0 sockfd 0x5004A108
LC/0/1/CPU0:Jun 12 17:46:26.850 : radiusd[320]: interface valid 0 flags 0x0, state 17
LC/0/1/CPU0:Jun 12 17:46:26.850 : radiusd[320]: rctx 0x50118778 added successfully
LC/0/1/CPU0:Jun 12 17:46:26.850 : radiusd[320]: rib lookup result found  1073729784
LC/0/1/CPU0:Jun 12 17:46:26.850 : radiusd[320]: Cannot get IP address, error: No error
LC/0/1/CPU0:Jun 12 17:46:26.850 : radiusd[320]: No interface found for vrf id 0x60000007
LC/0/1/CPU0:Jun 12 17:46:26.850 : radiusd[320]: Failed to send the request to radius server :'Subsystem(1167)' detected the 'fatal' condition 'Code(3)'

 

 

Workaround

You can configure a dummy physical interface on the LC and put it in the VRF. The interface can be protocol-down and the IP address can be anything, since it will not be used as the source address of the RADIUS messages; the source IP of the RADIUS messages is still the one under Loopback1. It's just a trick to satisfy the IOS XR system :)


interface TenGigE0/1/1/1.444
 vrf VRF1
 ipv4 address z.z.z.z 255.255.255.0

 


RP/0/RSP0/CPU0:#sh ipv4 vrf VRF1 interface brief
Thu Jun 12 18:41:30.305 UTC

Interface                      IP-Address      Status                Protocol
Loopback1                      y.y.y.y         Up                    Up
TenGigE0/1/1/1.444             z.z.z.z         Up                    Down

 

 

This issue does not normally occur when your RADIUS server is in the default VRF, since you always have an interface on the LC in the default VRF; but if your RADIUS server is in a non-default VRF, you may run into this issue.

 

Enjoy

BR/Roy

Alireza Karimi
Level 1

Hi

First, thank you so much for this article.

Could you please tell me what the minimum requirements are (hardware + software + license(s)) to be able to have 128,000 PPPoE sessions on the ASR9000?

Best regards

Zhichun Jiang
Level 1

Important update: LC based session service accounting has been supported since the 5.3.2 release.

BR/Roy

PauloHirakawa
Level 1

Hello,

Can I have both RP and LC based subscribers running in the same chassis? I mean, I have two A9K-RSP-440-SE and two A9K-MOD-160-SE; is it possible to have 64K subscribers running on each RSP plus 64K subscribers running on each MOD, for a total of 256K subscribers? I ask because I saw that the MOD-160 line card has a memory limitation and supports only 64K subscribers.

In summary, the client needs to achieve 256K subscribers in this chassis; is that possible, or will it be necessary to have two more MODs? If I can distribute the subscribers between the RSP and the MODs, what would the configs look like?
