04-17-2016 05:43 AM - edited 01-29-2019 08:04 AM
Geo redundancy is a powerful new technology for XR BNG that allows for session synchronization between two nodes. A session that is active on one node has a fully programmed shadow session on a standby node, so that when the active chassis fails, the standby BNG can take over and continue forwarding for the session WITHOUT service interruption to the user.
Geo redundancy overcomes some of the restrictions of other redundancy models, which makes it a very compelling solution.
Some of the existing models include the use of PPPoE smart server selection, ASR9K nv Cluster, ISSU, MC-LAG/MSTAG. This section outlines their operation and pros/cons.
Smart server selection relies on the PPPoE discovery exchange: a host sends a PADI (discovery), which is broadcast to multiple devices/BNGs. Normally all BNGs send a PADO (offer) back to the client, which then connects to one of the offering BNGs for a single connection. By controlling the response time of the PADOs from all BNGs we can make one node primary for a particular VLAN, and the other(s!) standby.
The solution is stateless, meaning that if the active node dies, the client needs to rediscover and will find one or more standby BNGs to connect to.
Pro is that this is simple and useful, and it provides N+1 redundancy (multiple BNG nodes can be used on the segment to share more of the load).
Con is that this is stateless, clients have to reconnect, and it works on a per-VLAN basis and for PPPoE only (not usable for DHCP), though a similar concept can be leveraged for IP sessions by delaying the offer timers of the DHCP server.
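The PADO-delay idea above can be modelled in a few lines. This is only a rough Python sketch, not BNG code: the BNG names, the delay values and the `pick_bng` helper are all hypothetical, and it assumes the client simply connects to whichever live BNG's PADO arrives first.

```python
def pick_bng(pado_delay_ms, alive):
    """Return the BNG whose PADO reaches the client first, or None.

    pado_delay_ms: hypothetical per-VLAN PADO delay configured on each BNG.
    alive: set of BNG names that are currently up and answering PADIs.
    """
    candidates = {name: delay for name, delay in pado_delay_ms.items() if name in alive}
    if not candidates:
        return None  # nobody answered the PADI
    return min(candidates, key=candidates.get)


# Assumed delays for one VLAN: bng-a answers immediately, the others wait.
delays = {"bng-a": 0, "bng-b": 500, "bng-c": 1000}

primary = pick_bng(delays, {"bng-a", "bng-b", "bng-c"})
after_failure = pick_bng(delays, {"bng-b", "bng-c"})  # bng-a died, client rediscovers
print(primary, after_failure)
```

Because nothing is synced between the nodes, the second call only models the rediscovery; the session itself still has to be rebuilt from scratch on the new BNG.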
Clustering two devices by linking their brains together via what we call the EOBC (Ethernet out-of-band channel) makes the two chassis a perfect mirror of each other. This automatically means that you have stateful redundancy.
It relies on the fact that you are dual homed with a connection into both racks of the cluster. If one cluster device (or rack, as we call it) fails, the other chassis takes over as sole primary and forwarding over the bundle continues without any disruption.
Pro: powerful, stateful, high scale
Con: sw upgrades, hw restrictions for cluster, requires bundle interfaces and dual homing into both nodes of the cluster, costly/license
Using standard redundancy technologies like MC-LAG or MST-AG provides a lot of simplicity. These technologies allow for dual homing and rely on the ICCP (MC-LAG) or STP (MST-AG) protocols to detect loops and keep only one active link forwarding.
This means that a session is only available and active on one node at a time.
Pro: very simple, low cost
Con: long convergence times and stateless
How nice would it be to have the best of all these solutions without so many cons? That is where GeoRed comes into play :)
Geo redundancy provides for a very powerful M:N or N+1 redundancy model depending on how you like to implement it.
Flexible redundancy models via pairing across routers on Access Link basis
- 1:1 (both active/active with load sharing or active/standby) (like nv Cluster)
- M:N (active/standby roles and load is split across multiple routers)
- N:1 (1 backup for N active)
Full circle standby (M:N)
Designated backup (N:1)
There is no special connection required between the BNGs, just IP connectivity for the redundancy protocol (to be discussed later).
One big advantage is also that the different BNG nodes may be placed in different geo-locations without any limitations!
Complements existing BNG high-availability, redundancy and geo-redundancy mechanisms.
Geo redundancy in a nutshell:
A typical design could look like this:
CPEs are agnostic to redundancy and they see “one BNG / Gateway”. Any switchover is transparent to them. With the redundancy model used, the CPE peers with the same MAC address and node ID, hence if a failover is required the CPE doesn't even know that it is talking to a different physical device.
Access Nodes are dual/multi-homed for redundancy using a variety of technologies such as MCLAG, Dual Homed (MST-AG), Ring (MST-AG or G.8032), xSTP, Seamless MPLS (PWs), etc. Heartbeat mechanisms like E-OAM, BFD, etc. are used for faster fault detection/isolation.
BNG is not just a gateway router, it has subscriber state, policies and accounting/authorization details and subscriber features. Redundancy and synchronization also require sharing of protocol state like DHCP and PPP.
A good redundancy solution also should employ seamless integration with external servers like DHCP/Radius and backend policy/billing systems.
The concept for geo-redundancy is built on top of a sync protocol that is also used in MCLAG: ICCP (Inter-Chassis Communication Protocol). It is a reliable protocol that allows for state and info sync between two chassis.
One of the basic pieces to that is the definition of what we in GEORED will call the Subscriber Redundancy Group (SRG).
Taking the picture from above, that shows the M:N or N:1 redundancy topologies, an SRG is the equivalent of the "X" or "Y" arrows:
Synchronization from “master” to “slave” is done over TCP on per SRG basis between routers using proprietary mechanism – BNG Sync
This mechanism serves the following purposes:
When BNG SRG peers connect, the master/slave determination is done first, after which state is synced from master to slave, followed by regular mirroring that happens without delay and without holding up session provisioning on the master.
Session mirroring takes care of the complete state once the session is up, whenever there is any change, and when it is deleted.
Master/Slave roles are defined per SRG and not per BNG router. This simply means that SRG1 can be active on router ONE while SRG2 is active on router TWO, with SRG1 standby on router TWO.
active/active – (eg the M:N) BNG could be master for one SRG and slave for another
active/standby – (eg the N:1) dedicated backup BNG could be slave for multiple SRGs from different active BNGs which are masters for those respective SRGs
Role negotiated via BNG sync between routers on per SRG level
Where possible, role can be determined by the underlying access technology
In master role BNG will handle and process all control traffic it receives
In slave role BNG will ignore all BNG and related protocols traffic. It will receive state notifications of the session via the ICCP communication from the active node serving that SRG.
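To make the "role per SRG, not per router" point concrete, here is a small illustrative Python model. It is a sketch only: the `Srg` class, its methods and the router names are hypothetical and do not correspond to any IOS XR API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Srg:
    """One Subscriber Redundancy Group and the routers holding each role."""
    name: str
    master: str
    slave: str

    def role_on(self, router):
        if router == self.master:
            return "master"
        if router == self.slave:
            return "slave"
        return "none"

    def switchover(self):
        # Swap roles for this SRG only; other SRGs are unaffected.
        return Srg(self.name, master=self.slave, slave=self.master)


# Active/active pairing: each router is master for one SRG and slave for the other.
srgs = [Srg("SRG1", master="ONE", slave="TWO"), Srg("SRG2", master="TWO", slave="ONE")]
roles_on_two = {g.name: g.role_on("TWO") for g in srgs}
print(roles_on_two)

# Router ONE loses SRG1's access link: only SRG1 changes hands.
srg1_after = srgs[0].switchover()
print(srg1_after.role_on("TWO"))
```

In this model a router handles control traffic only for the SRGs where `role_on` returns "master", which mirrors the master/slave behaviour described above.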
GEORED can operate in two distinct redundancy modes: hot standby and warm standby.
Hot-Standby Mode (default)
Sessions provisioned on slave in sync with setup on master
Since the sessions are actively programmed on the standby, this consumes hardware resources on the slave. Proper planning is necessary here: if BNG nodes X and Y serve 50k sessions each, the slave node needs to be able to support 100k sessions when they are actively programmed!
Minimal action on switchover; the data plane is already set up, which gives sub-second traffic impact. This is the highest level of redundancy you can achieve.
It is especially useful in deployments requiring a high and tight SLA.
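The hot-standby sizing rule is simple arithmetic. A minimal sketch follows, assuming every mirrored session costs the same hardware resources as an active one; the function and SRG names are hypothetical.

```python
def hot_standby_hw_sessions(master_sessions, slave_sessions):
    """Sessions programmed in hardware on one node in hot-standby mode.

    master_sessions: {srg_name: sessions} for SRGs where this node is master.
    slave_sessions:  {srg_name: sessions} for SRGs where this node is slave.
    In hot-standby both sets are fully programmed, so they simply add up.
    """
    return sum(master_sessions.values()) + sum(slave_sessions.values())


# Dedicated backup node: master for nothing, slave for the SRGs of X and Y.
backup_needed = hot_standby_hw_sessions({}, {"srg-x": 50_000, "srg-y": 50_000})

# Active/active pair: X is master for its own 50k and slave for Y's 50k.
node_x_needed = hot_standby_hw_sessions({"srg-x": 50_000}, {"srg-y": 50_000})
print(backup_needed, node_x_needed)
```

Either way the node ends up needing room for 100k programmed sessions, which is the planning point made above.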
Warm-Standby Mode (for over-subscription)
Sessions data kept in “shadow” database on slave in sync with setup on master
Only consumes some additional memory in control plane for the shadow copy – no provisioning in hardware
Upon failover trigger, sessions are setup at rapid pace from shadow copy
This allows for over-provisioning of subscribers on the backup. It still provides a high level of redundancy, but the "outage" or forwarding loss is determined by the time it takes to program the sessions served by the SRG into hardware, so a failover can result in some session loss (if the SRG serves a high number of sessions that take longer to program than the keepalive/timeout of the session).
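The warm-standby trade-off can be estimated with back-of-the-envelope arithmetic. The sketch below is illustrative only: it assumes a constant hardware programming rate and that a session is lost if it is not programmed before its keepalive/timeout expires. The rate and timeout in the example are made-up numbers, not measured ASR9000 values.

```python
def warm_failover_estimate(sessions, program_rate_per_s, session_timeout_s):
    """Return (seconds to program all sessions, sessions at risk of timing out)."""
    total_time_s = sessions / program_rate_per_s
    # Sessions that can be programmed before the client-side timeout fires.
    programmed_in_time = min(sessions, int(program_rate_per_s * session_timeout_s))
    at_risk = sessions - programmed_in_time
    return total_time_s, at_risk


# Hypothetical SRG: 64k sessions, 2000 sessions/s programming rate, 30 s timeout.
total_time_s, at_risk = warm_failover_estimate(64_000, 2_000, 30)
print(total_time_s, at_risk)
```

With these assumed numbers the last 4000 sessions would be programmed after their timeout. Splitting subscribers over smaller SRGs, or using hot-standby for the SLA-sensitive ones, would shrink that tail.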
Example scenario with an active/standby, N:1 model:
Example scenario with an active/active, M:N model:
some important notes regarding radius accounting and authorization information
One of the big advantages of GEORED, and one that overcomes a pain point of nV cluster, is software upgrades.
In cluster, orchestration is necessary to separate the cluster nodes, upgrade one, and make a quick switchover to upgrade the other one.
In Geo Redundancy, the BNG nodes can even run different sw versions, and that is no problem. We wouldn't recommend too much version disparity between the devices, though; for ease of deployment, keep all BNG nodes in the network, whether or not they are part of the GEORED, on the same sw version with the same SMU set as much as possible.
The SW upgrade procedure is independent of the redundancy model chosen (N:1, M:N, active/active or active/standby).
Basically the steps include:
And do this for all the BNG nodes part of the SRG interaction.
NOTE: you can even setup GEO red just for the upgrade procedure. A node that is synchronizing its sessions during this setup is not affected whatsoever.
The following section graphs out the call flow and messaging between BNG SRG devices and the session.
The MSTP protocol is used here to block the standby path so that we have only one active path.
In this case each BNG has its own MAC, which is used for MST and other Ethernet protocols. In this scenario we need to set up an SRG vMAC for BNG sessions, which acts in a similar fashion to an HSRP/VRRP virtual MAC. The BNGs use their own MAC for the STP communication, and use the vMAC towards the sessions as their peering/communication point.
For dual homing, two MST instances are required, with VLANs split across them to enable active/active load balancing to each of the two BNGs.
MST provides “preempt delay” knobs to throttle switchovers and allow stabilization of subscribers on top of it after failure recovery.
Failure detection, or rather improved failure detection, is done via CFM sessions (at least one per MST instance, in any of its VLANs). The CFM session is used to monitor connectivity and to detect which BNG has the forwarding path and which one has the standby/drop path (i.e. the CFM session will be UP on active and DOWN on standby).
Coupling the CFM session via EFD with each of the BNG L3 access sub-interfaces on that interface will result in that sub-interface status tracking UP on active side and DOWN on standby side.
Access tracking object monitoring this sub-interface status (which is in turn controlled via EFD based on CFM session) is used for determining SRG role as well as controlling the subscriber subnet route advertisement
In event of failures, as MST re-converges and switches paths, the CFM session status changes and the L3 BNG sub-interfaces get notified of status via EFD such that the SRG role can be switched
MST and CFM timers can be as aggressive as supported by the access devices with stable operations even with full subscriber load
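The chain described above (CFM session, EFD, sub-interface state, access tracking object, SRG role and route advertisement) can be sketched as a tiny Python function. This is a deliberate simplification: it treats the tracking state as the only input to the role, and the function and key names are hypothetical.

```python
def srg_state_from_cfm(cfm_session_up):
    """Follow the chain CFM session -> EFD -> sub-interface -> tracking object."""
    subif_up = cfm_session_up  # EFD brings the L3 sub-interface up/down with the CFM session
    track_up = subif_up        # the access-tracking object follows the sub-interface state
    return {
        "srg_role": "master" if track_up else "slave",
        "advertise_subscriber_subnet": track_up,
    }


# MST forwards towards BNG-1 and blocks the path to BNG-2 ...
before = {"bng1": srg_state_from_cfm(True), "bng2": srg_state_from_cfm(False)}
# ... then a failure makes MST re-converge and the CFM sessions flip.
after = {"bng1": srg_state_from_cfm(False), "bng2": srg_state_from_cfm(True)}
print(before["bng1"]["srg_role"], "->", after["bng1"]["srg_role"])
```

The same tracking result drives both the SRG role and the subscriber subnet advertisement, so traffic from the core follows the node that currently holds the sessions.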
MC-LAG provides consistency of MAC and IP address across the two PoA (i.e. BNG routers). In this scenario there is no need for SRG vMAC since it is managed by MCLAG natively already.
The failure is induced by an object that directly tracks MC-LAG bundle interface status and signals to both SRG (for role determination) & the routing entity (to control the subnet/pool advertisement).
MC-LAG provides knobs to throttle switchovers and allow stabilization of subscribers on top of it in event of link flaps and after failure recovery
Parameters to consider when using MCLAG:
mlacp switchover recovery-delay – to ensure bundle remains slave after recovery from failure and allows subscribers to get sync and stabilized on it in slave mode
mlacp switchover type revertive – means that when the primary comes back, it will assume the primary role again and basically pull everything back from the standby. Like HSRP preempt.
lacp switchover suppress-flaps – to avoid switchover for transient link-flaps
BFD or CFM with EFD can be used for faster detection of failures in addition to LACP protocol mechanisms
Now that you know everything about GEORED you want to go set it up, right?! Here is a config piece and an explanation of what it is for.
Enable BNG Geo Redundancy:
group 1
 peer 1.1.1.1

Set up SRG and define which group holds which interface. Multiple groups can be defined:
subscriber
 redundancy
  group 1
   interface-list
    interface bundle-ether1 id 1

Set up Access Object Tracking for SRG and Summary Subscriber route. In this example we are tracking the interface bundle state that MCLAG is providing to us. If we see that the state is going down, that will result in a static route withdraw from the table. If we have redistribute static configured, the pool summary will be removed so that the previous standby, now active, can start advertising the summary to start pulling the traffic:
track access-mclag
 type line-protocol state
  interface bundle-ether1
subscriber
 redundancy
  group 1
   access-tracking access-mclag
router static
 address-family ipv4 unicast
  10.0.0.0/24 null0 track access-mclag desc sub-pool-summ

Optional SRG configuration to determine more deterministically what the preferred role is and what redundancy mode should be run:
subscriber
 redundancy
  preferred-role master
  slave-mode warm
  hold-timer 15
A little more detail of the subscriber redundancy configlet
As with everything in technology, there is always some trade-off. The table below lists the currently known restrictions for the GEORED solution as of XR 5.3.3.
Note that XR6 has quite a significant number of improvements, which will be documented separately. Since XR 5.3.3 is the going release today for ASR9000, I thought it important to know what you get and what you need to think about.
Limitation | Recommendation
With just Core tracking, if core interface goes down, SRG switchover is triggered causing traffic black hole on access | EEM script can be used to shut access when core goes down
RA will send with both SRG vMAC as well as interface MAC towards access | use RA preference CLI under dynamic template or access-interface
Accounting records may get lost if we do back-to-back switchover before they sync on master and slave | we should wait for 15 mins before doing Switchover (128k sessions)
Admin clear of sessions from the slave is prohibited | 1. If slave is out of sync from master, subscriber redundancy synchronize command can be issued from slave to replay; 2. SRG clear command can be issued either from slave or Master to get slave back to normal state
Master reload is not recommended on the access with non-revertive protocol support | Enable revertive configuration on the access-protocol
On flight vmac modification for IPv6 sessions is not supported |
Features not supported:
With great thanks to the GEORED dev team for some of the visualizations used in this paper.
PS. It is highly important not to use pppoe bba-group Global. This is a reserved keyword that is known to break certain SRG cases. Name your bba-group anything but global/Global.
Hi Alex,
We would like to do Geo-Red with two ASR9000 BNGs in different locations. A brief topology is below. Is Geo-Redundancy possible here, and which Geo-Red solution is the best and most suitable?
CPE---OLT--- Pre-Agg Router-- Agg Router---PE1|--- BNG (location 1)
|--PE2---BNG (location 2)
Thank!
Happy
you have a few options here. one is an mclag from the agg router PE1 to the 2 BNGs.
this will create bundle subscribers, meaning the scale is not as high as linecard based subs.
if you have multiple PEs and are using PWs, you could also consider pwhe, terminating a primary and backup pw on the BNGs. this is also rp based subs.
alternatively you can consider an mstag design to use phy subinterfaces which allows you to have linecard based subs for increased scale.
considering you mention that the BNGs are in different locations, I would assume that PE1 doesn't have easy (direct) access/link to the BNG in location 2. if that is the case it looks like PWHE is the easiest approach for you there.
cheers
xander
Hello Alex,
Thanks for the very useful information :)
I will try PWHE option first.
Happy
Hello,
DHCP Server is not supported as of yet in Geo Redundancy. Is that a hard "not supported"? :)
My use case is that I want to use my RADIUS server to push out Framed-IP-Address and ipv4:ipv4-unnumbered for highly efficient IP allocation. This works fine for regular BNG. I just push a dhcp-class attribute to match the dhcp config so that it can set the default gateway for the connection.
Should this still work under Geo Redundancy using the local DHCP-server?
Example:
I send via RADIUS
Framed-IP-Address: 192.168.1.10
Framed-IP-Netmask: 255.255.255.0
ipv4:ipv4-unnumbered: Loopback1921681
dhcp-class: 192_168_1_0
vrf-id: SUB-INTERNET
Then I have a DHCP server config on the ASR as follows:
dhcp ipv4
profile BNG_RADIUS server
lease 0 1 0
dns-server 192.168.20.2 192.168.20.3
subnet-mask 255.255.255.0
class 192_168_1_0
default-router 192.168.1.1
!
!
!
Thanks!
Fred
hi fred,
the "issue" with geored and a local dhcp server is that the dbase/bindings are not synced.
so if there is a failover to the standby geored mate, it may be handing out addresses that were already allocated.
you could technically bypass it by having 2 separate pools for each mate in the geored pair, but that is likely not desirable either.
cheers!
xander
Hi Xander,
Awesome! Thanks for the clarification on that.
As I'm using my RADIUS server to handle all IP address allocation on the BNG setup, that should still work fine.
Thanks!
Fred
yeah Fred, if you are using the local dhcp server only as a "converter" between radius access-accept and dhcp offer to the client, you should be fine!!
xander
Ok i don't know if it's just me but i really can't get my head wrapped around setting a BNG pair up in a MC-LAG configuration, Xander do you by chance have any basic examples of the mclag+iccp+subscriber examples...?
For instance the loopback that we're using for the gateway of the PPPoE subs, should that be the same IP on both POA's? Should i be using 2 loopbacks, 1 for ICCP and one for PPPoE/BNG? Do i even need a staticroute+loopback for the ICCP if i'm only having a 2 node cluster direct connected to each other for iccp (can't i just use a /30)?
I have one ASR working with PPPoE Subs, i copied that configuration over to the second ASR, modified the public ip interface ip everything else so far is the same...
I've got EtherBundle100 for access on both boxes (currently in separate bundles on the remote switch but will move them into the same remote group once the ICCP is ready i suppose) (EtherBundle100 = TenGigE0/0/2/0 0/0/2/1)
I have EtherBundle200 for core (internet) on both boxes (separate bundles on the remote switch)
(EtherBundle200 = TenGigE0/0/0/0 0/0/0/1)
And i'm going to create EtherBundle 999 for the ICCP traffic directly between the 2 ASR's
(EtherBundle999 = TenGigE0/0/0/2 0/0/0/3)
Please help as i've been beating my head for days here, and now that i finally got my ADV licenses i really want to put the cluster into service, but i have to get the ICCP/SRG/OSPF working before i can even think of moving forward.
Hi,
I am trying to understand GEORED feature to implement it on our network.
This is what I see in the latest 6.3 sw documentation:
These are planned to be fully qualified only in future releases of Cisco IOS XR Software:
• Warm-standby slave mode.
• Line card (LC) based subscribers (that is, using physical port sub-interfaces).
• DHCP server mode.
• Pseudowire Headend (PWHE), G.8032 (dual-home and ring) access technologies
So warm-standby is not supported at this time?
And I see that DHCP server mode is in plans. Can you share the roadmap, when it's planned to be supported?
Hi,
I have set up and tested GeoRed for IP subscribers, but ran into a problem with PPPoE on 5.3.3 with SP10.
The redundancy group is up but pppoe sessions do not install on slave.
"show pppoe interfaces" on Slave shows them as "Incomplete" without Outer VLAN ID and Tags. "show ppp interface" shows nothing at all on Slave. I've turned on debug pppoe with the most suspicious parts being
RP/0/RSP0/CPU0:Aug 20 18:39:45.821 : pppoe_ma[369]: Session: Bundle-Ether2.2.pppoe4567: Received AAA Session Update Response cb: 'iEdge' detected the 'informational' condition 'iEdge Disconnect Pending error'
RP/0/RSP0/CPU0:Aug 20 18:39:45.821 : pppoe_ma[369]: Session: [ERROR] Bundle-Ether2.2.pppoe4567: AAA Session Update callback with error: 'iEdge' detected the 'informational' condition 'iEdge Disconnect Pending error'
RP/0/RSP0/CPU0:Aug 20 18:39:45.821 : pppoe_ma[369]: Session: [ERROR] Bundle-Ether2.2.pppoe4567: Session being cleaned up, trigger 4
and
RP/0/RSP0/CPU0:Aug 20 18:39:45.920 : pppoe_ma[369]: Session: 0x08002ca0: Received FINAL notification
RP/0/RSP0/CPU0:Aug 20 18:39:45.920 : pppoe_ma[369]: Session: [ERROR] 0x08002ca0: Session being cleaned up, trigger 11
Also I cannot complete the command "debug srg".
What should I look at to solve this problem?
Hi Xander,
We are using SRG but found an issue with the traffic between 2 subscribers from the same subnet. Could you please help? We are using IPoE. We are currently migrating customers from a "normal" IPoE subscriber environment (ASR9001, IOSXR 4.3.4) to IOS XR 6.3.2 64-bit with SRG's. Now some customers are complaining they can't reach other customers in the same subnet. In the old setup we use the same BE-config as below, with the /21 subnet on the loopback. In the new setup we use a similar config as below. When we ping from client modem 1 to client modem 2 we see it working perfectly in the old setup but not in the new setup. Any idea what the problem could be?
New setup:
interface Bundle-Ether1.111
service-policy output svlan_shape subscriber-parent resource-id 0
ipv4 point-to-point
ipv4 unnumbered Loopback111
arp learning disable
arp gratuitous ignore
ipv4 unreachables disable
service-policy type control subscriber IP_PM_INT
load-interval 30
encapsulation dot1q 111
ipsubscriber ipv4 l2-connected
initiator dhcp
!
!
subscriber
redundancy
source-interface Loopback0
group 27
preferred-role master
virtual-mac 0001.0001.0001
slave-mode hot
peer 2.2.2.2
peer route-disable
core-tracking core-int-BE1000
access-tracking wap-int-BE1
state-control-route ipv4 3.3.0.0/21 vrf default tag 30
revertive-timer 60 maximum 150
interface-list
interface Bundle-Ether1.111 id 111
!
!
Hi Xander,
is there any restriction to use synchronization of subscriber account session id's for N+1 or 1:1 redundancy model?
subscriber manager srg sync-acct-session-id
Dear Xthuijs,
I have an issue during the recovery of an SRG group. There is an outage when the Master gets the sessions back, after downlink recovery and the grace period.
When failover happens the outage is less than 1 sec.
In case of recovery the outage is ~10 sec (if keepalive 10 sec.).
The aggregation switch updates its MAC address table after the 1st packet comes from the BNG side, usually a PPP LCP keepalive.
How can we force BNG to send something in case of recovery?
In case of failover the MAC deleted and renewed immediately because the interface went down.
cXR 6.5.3, PPPoE LC subscribers IPv4.
Thanks, Imre
Workaround from Xander:
Configure a dummy IP address on the interface, so that a gratuitous ARP is sent out.
Trick is to configure the same mac address to the main interface as the virtual MAC of the SRG. Use the same vMAC for all groups.
Thanks for the help.
We are using BNG with GeoRed for more than a year in configuration of two ASR9010 with MST-AG over multi-vlan subinterfaces of Bundles of 2*10G links. Subscribers are PPPoE and IPoE on separate subinterfaces.
We have several thousands of subscribers on each subinterface.
While GeoRed works perfectly in test cases with a small number of subscribers on a test subinterface, there has not been a single time that it worked as intended during a real network failure on production subinterfaces.
Even worse, most of the times I had to reload one or both ASRs to get things working again.
We've upgraded 5.3.4 to 6.4.2 in this time with no success.
A TAC case of almost a year in length was closed with the conclusion that the combination of DHCP-Radius proxy with GeoRed is not supported, although it is the very same configuration praised by Xander on 06-06-2017 in this thread.
We tried the SERG pool synchronisation feature for LC based PPPoE sessions but it does not work.
Configured with RP based sessions it works fine (cXR 6.6.3). So is this an undocumented restriction or a bug?
What we found: the pools are kind of synchronised between SRG groups (sessions are synced on the slave and the IPs are reserved there).
If the Pool-IDs are the same on the pair, the group slave reserves IPs in the pool.
So a group slave does not allocate the same IP in other groups where it is the master. Duplicate IPs can occur during the "sync delay"...
Need to consider that pool-ids may be renumbered after reboot (names are ordered alphabetically and IDs start from 0). When new pools are added and one BNG is rebooted, the pair may need a reboot too...