xthuijs · ‎04-17-2016

Introduction

Geo redundancy is a powerful new technology for XR BNG that synchronizes subscriber sessions between two nodes. A session that is active on one node has a fully programmed shadow session on a standby node, so that when the active chassis fails, the standby BNG can take over and continue forwarding the session's traffic WITHOUT service interruption to the user.

Geo redundancy overcomes some of the restrictions of other redundancy models, which makes it a very compelling solution.

Existing redundancy models

Some of the existing models include the use of PPPoE smart server selection, ASR9K nv Cluster, ISSU, MC-LAG/MSTAG. This section outlines their operation and pros/cons.

Smart Server selection

Smart server selection relies on PPPoE discovery: a host sends a PADI (discovery), which is broadcast to multiple devices/BNGs. Normally all BNGs send a PADO (offer) back to the client, which then connects to one of the offering BNGs for a single connection. By controlling the response time of the PADOs from all BNGs, we can make one node primary for a particular VLAN and the other(s!) standby.

The solution is stateless, meaning that if the active node dies, the client needs to rediscover and will find one or more standby BNGs to connect to.
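To make the PADO-delay idea concrete, here is a minimal Python sketch. It assumes the client simply connects to the BNG whose PADO arrives first (common, but client dependent); the BNG names and delay values are made up for illustration and are not taken from any real configuration.

```python
def select_bng(pado_delay_ms, alive):
    """Return the BNG whose PADO would reach the client first, or None.

    Assumption: the client connects to the first PADO it receives.
    """
    candidates = {bng: delay for bng, delay in pado_delay_ms.items() if bng in alive}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Hypothetical per-VLAN PADO delays in milliseconds.
delays = {"bng1": 0, "bng2": 500, "bng3": 1000}

primary = select_bng(delays, alive={"bng1", "bng2", "bng3"})
# bng1 dies: the client has to rediscover and lands on the next fastest responder.
after_failure = select_bng(delays, alive={"bng2", "bng3"})
print(primary, after_failure)
```

Because nothing is synchronized between the BNGs, the second lookup models a brand new session: the subscriber reconnects from scratch.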

Pro: simple and useful; it provides N+1 redundancy (multiple BNG nodes can be used on the segment to share the load).

Con: stateless, so clients have to reconnect; it works on a per-VLAN basis and for PPPoE only (not usable for DHCP), though a similar concept can be leveraged for IP sessions by delaying the offer timers of the DHCP server.

nV Cluster

Clustering two devices by linking their brains together via what we call the EOBC (Ethernet out-of-band channel) makes the two chassis a perfect mirror of each other. This automatically means that you have stateful redundancy.

It relies on you being dual homed with a connection into both racks of the cluster. If one cluster device (or rack, as we call it) fails, the other chassis takes over as sole primary, and forwarding over the bundle continues without any disruption.

Pro: powerful, stateful, high scale

Con: sw upgrades, hw restrictions for cluster, requires bundle interfaces and dual homing into both nodes of the cluster, costly/license

MCLAG/MSTAG

Using standard redundancy technologies like MCLAG or MSTAG provides a lot of simplicity. These technologies allow for dual homing and rely on the ICCP (MCLAG) or STP (MSTAG) protocols to detect loops and keep only one link actively forwarding.

This means that a session is only available and active on one node at a time.

Pro: very simple, low cost

Con: long convergence times and stateless

Summary

How nice would it be to have the best of all these solutions without so many cons? That is where GeoRed comes into play :)

How to use GeoRed

Geo redundancy provides for a very powerful M:N or N+1 redundancy model depending on how you like to implement it.

Flexible redundancy models via pairing across routers on Access Link basis

- 1:1 (both active/active with load sharing or active/standby) (like nv Cluster)

- M:N (active/standby roles and load is split across multiple routers)

- N:1 (1 backup for N active)

Full circle standby (M:N)

Designated backup (N:1)

There is no special connection required between the BNGs, just IP connectivity for the redundancy protocol (to be discussed later).

One big advantage is also that the different BNG nodes may be placed in different geo-locations without any limitations!

Complements existing BNG high-availability, redundancy and geo-redundancy mechanisms.

Geo redundancy in a nutshell:

A typical design could look like this:

CPEs are agnostic to redundancy and they see “one BNG / Gateway”. Any switchover is transparent to them. With the redundancy model used, the CPE peers with the same MAC address and node ID, hence if a failover is required the CPE doesn't even know that it is talking to a different physical device.

Access Nodes are dual/multi-homed for redundancy using a variety of technologies such as MCLAG, Dual Homed (MST-AG), Ring (MST-AG or G.8032), xSTP, Seamless MPLS (PWs), etc., with heartbeat mechanisms like E-OAM, BFD, etc. used for faster fault detection/isolation.

BNG is not just a gateway router, it has subscriber state, policies and accounting/authorization details and subscriber features. Redundancy and synchronization also require sharing of protocol state like DHCP and PPP.

A good redundancy solution also should employ seamless integration with external servers like DHCP/Radius and backend policy/billing systems.

Implementation details

The concept for geo-redundancy is built on top of a sync protocol that is also used in MCLAG: ICCP (Inter-Chassis Communication Protocol). It is a reliable protocol that allows for state and info sync between two chassis.

One of the basic pieces to that is the definition of what we in GEORED will call the Subscriber Redundancy Group (SRG).

Taking the picture from above, that shows the M:N or N:1 redundancy topologies, an SRG is the equivalent of the "X" or "Y" arrows:

Synchronization

Synchronization from “master” to “slave” is done over TCP on a per-SRG basis between routers using a proprietary mechanism: BNG Sync.

This mechanism serves the following purposes:

Signaling failures and role changes
Synchronization of subscriber sessions’ control plane states
Communication of other events and commands

When BNG SRG peers connect, the master/slave determination is done first, after which state is synced from master to slave. This is followed by regular mirroring, which happens without delay and without holding up session provisioning on the master.

Session mirroring takes care of the complete state once the session is up, whenever there is any change, and when the session is deleted.
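The two phases of synchronization (an initial bulk sync once roles are determined, then incremental mirroring of session creation, change and deletion) can be pictured with a small Python sketch. This is only a conceptual model: the class, method and event names are hypothetical and say nothing about how BNG Sync is actually encoded on the wire.

```python
class SlaveShadow:
    """Conceptual model of the slave-side copy of an SRG's session state."""

    def __init__(self):
        self.sessions = {}

    def bulk_sync(self, snapshot):
        # Initial sync after the master/slave determination: copy everything.
        self.sessions = dict(snapshot)

    def apply(self, event, session_id, state=None):
        # Regular mirroring: one event per session change on the master.
        if event in ("create", "update"):
            self.sessions[session_id] = state
        elif event == "delete":
            self.sessions.pop(session_id, None)
        else:
            raise ValueError(f"unknown event: {event}")


master_sessions = {"sub-1": "up", "sub-2": "up"}

shadow = SlaveShadow()
shadow.bulk_sync(master_sessions)
shadow.apply("create", "sub-3", "up")
shadow.apply("update", "sub-1", "up, CoA applied")
shadow.apply("delete", "sub-2")
print(sorted(shadow.sessions))
```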

Roles

Master/Slave roles are defined per SRG and not for a BNG router as a whole. This simply means that SRG1 can be active on router ONE while SRG2 is active on router TWO, with SRG1 standby on router TWO.

active/active – (eg the M:N) BNG could be master for one SRG and slave for another

active/standby – (eg the N:1) dedicated backup BNG could be slave for multiple SRGs from different active BNGs which are masters for those respective SRGs

Role negotiated via BNG sync between routers on per SRG level

Where possible, role can be determined by the underlying access technology

In master role the BNG will handle and process all control traffic it receives

In slave role the BNG will ignore all BNG and related protocol traffic. It will receive state notifications of the session via the ICCP communication from the active node serving that SRG.
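Since roles live on the SRG rather than on the router, a per-router view is just a regrouping of the per-SRG roles. The following Python sketch illustrates that with made-up SRG and router names; the data layout is an assumption chosen for readability, not a representation of any router data structure.

```python
from collections import defaultdict

# Hypothetical role assignment: SRG name -> {router: role}
srg_roles = {
    "SRG1": {"router-ONE": "master", "router-TWO": "slave"},
    "SRG2": {"router-ONE": "slave", "router-TWO": "master"},
}


def roles_by_router(srg_roles):
    """Regroup per-SRG roles into a per-router view."""
    view = defaultdict(dict)
    for srg, roles in srg_roles.items():
        for router, role in roles.items():
            view[router][srg] = role
    return dict(view)


def handles_control_traffic(srg_roles, router, srg):
    """Only the master for an SRG processes control traffic for that SRG."""
    return srg_roles[srg].get(router) == "master"


print(roles_by_router(srg_roles))
print(handles_control_traffic(srg_roles, "router-TWO", "SRG1"))
```

In this active/active layout each router is master for one SRG and slave for the other, so neither box is "the standby" as a whole.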

Modes of operation

GEORED can operate in two distinct redundancy modes: hot standby and warm standby.

Hot-Standby Mode (default)

Sessions provisioned on slave in sync with setup on master

Since the sessions are actively programmed on the standby, this will consume hardware resources on slave. Proper planning is necessary here, since if we have BNG node X and Y both serving 50k sessions each, the slave node needs to be able to support 100k sessions when they are actively programmed!
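The planning rule for hot standby boils down to simple addition: a node must have hardware room for its own active sessions plus every session it shadows. A minimal Python sketch of that arithmetic follows; the function name is made up, and the 50k figures are the ones from the example in the text.

```python
def required_hot_standby_capacity(own_active, shadowed):
    """Sessions a node must be able to program in hardware in hot-standby mode.

    own_active: sessions the node serves as master
    shadowed:   list of session counts it holds as slave for other SRGs
    """
    return own_active + sum(shadowed)


# Dedicated backup for node X and node Y, each serving 50k sessions.
dedicated_backup = required_hot_standby_capacity(0, [50_000, 50_000])

# X and Y backing each other up: each serves 50k and shadows the other's 50k.
mutual_backup = required_hot_standby_capacity(50_000, [50_000])

print(dedicated_backup, mutual_backup)  # 100000 100000
```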

Minimal action on switchover; the data plane is already set up, so traffic impact is sub-second. This is the highest level of redundancy you can achieve.

It is especially useful in deployments requiring high and tight SLAs.

Warm-Standby Mode (for over-subscription)

Sessions data kept in “shadow” database on slave in sync with setup on master

Only consumes some additional memory in control plane for the shadow copy – no provisioning in hardware

Upon failover trigger, sessions are setup at rapid pace from shadow copy

This allows for over-provisioning of subscribers on the backup. It still provides a high level of redundancy, but the "outage" or forwarding loss is determined by the time it takes to program the sessions served by the SRG in hardware. The failover will result in some session loss if the SRG serves a high number of sessions that take longer to program than the keepalive/timeout of the session.
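A rough way to reason about warm standby is to compare the time needed to program an SRG's sessions with the session keepalive/timeout. The Python sketch below does that under loudly stated assumptions: programming happens at a constant rate, and every session not yet programmed when the timeout expires is counted as at risk. The rate and timeout values are invented for illustration and are not platform figures.

```python
def warm_failover_estimate(sessions, programming_rate_per_s, keepalive_timeout_s):
    """Return (seconds to program all sessions, sessions at risk of timing out).

    Assumes a constant hardware programming rate (a simplification).
    """
    total_time_s = sessions / programming_rate_per_s
    programmed_in_time = min(sessions, int(programming_rate_per_s * keepalive_timeout_s))
    return total_time_s, sessions - programmed_in_time


# Hypothetical numbers: 32k sessions, 1000 sessions/s, 30 s keepalive timeout.
total_time, at_risk = warm_failover_estimate(32_000, 1_000, 30)
print(total_time, at_risk)  # 32.0 2000

# A smaller SRG finishes well within the timeout.
print(warm_failover_estimate(8_000, 1_000, 30))  # (8.0, 0)
```

Under these assumptions, splitting subscribers over more, smaller SRGs keeps each failover inside the timeout window.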

Session distribution

Example scenario with an active/standby, N:1 model:

Sessions are associated with partitions (vlan 1,2,3,4) on BNG1 with each VLAN mapped to different SRG configured with master role
BNG2 is acting as backup for all VLANs
Each VLAN has 8k sessions terminated

Example scenario with an active/active, M:N model:

Sessions are associated with partitions (vlan 1,2) on BNG1 with each VLAN mapped to different SRG configured to Master role
Sessions are associated with partitions (VLAN 3,4) on BNG2 with each VLAN mapped to different SRG configured to Master role
Each VLAN has 8k sessions terminated
Each BNG has 16k sessions terminated
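The session counts in the active/active scenario, and what they become after a node failure, can be worked out with a few lines of Python. This is an illustrative sketch: the dictionary layout and function name are assumptions, while the VLAN-to-BNG mapping and the 8k sessions per VLAN come from the scenario described here.

```python
def active_sessions(srgs, sessions_per_srg, failed=frozenset()):
    """Return {bng: active session count} given (master, slave) per SRG."""
    load = {}
    for master, slave in srgs.values():
        owner = slave if master in failed else master
        if owner in failed:
            continue  # both nodes of the SRG are down: those sessions are lost
        load[owner] = load.get(owner, 0) + sessions_per_srg
    return load


# One SRG per VLAN: VLAN 1,2 are master on BNG1, VLAN 3,4 are master on BNG2.
active_active = {
    "vlan1": ("BNG1", "BNG2"),
    "vlan2": ("BNG1", "BNG2"),
    "vlan3": ("BNG2", "BNG1"),
    "vlan4": ("BNG2", "BNG1"),
}

normal = active_sessions(active_active, 8_000)
bng1_down = active_sessions(active_active, 8_000, failed={"BNG1"})
print(normal)     # {'BNG1': 16000, 'BNG2': 16000}
print(bng1_down)  # {'BNG2': 32000}
```

The second result is the number to size each BNG for: after a failure the survivor carries all 32k sessions.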

Radius Interaction

Some important notes regarding Radius accounting and authorization information:

Authentication/Authorization is done only from the Master and all profile information is synced to the Slave. The Slave does not reach out to Radius and relies on the session information received from the active node.
On failover, an Accounting Stop message is sent from the old Master and an Accounting Start from the new Master.
NOTE: The Accounting Stop from the old master is sent on a best-effort basis and ordering is not guaranteed between it and the Accounting Start sent by the new master. A failed-over session should be handled as two separate sessions by Radius.
Radius (Accounting) messages from the BNG are paced with jitter (especially around switchover) to avoid load on the server.
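For the backend this means accounting logic should not depend on the Stop arriving before the Start. A minimal Python sketch of order-independent handling is shown below; the record fields, session IDs and NAS addresses are all hypothetical, and real accounting servers will have their own data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcctRecord:
    user: str
    acct_session_id: str
    nas_ip: str
    kind: str        # "start" or "stop"
    octets: int = 0  # usage reported in a stop record


def open_sessions(records):
    """Sessions that have a Start but no Stop, keyed by (nas_ip, acct_session_id)."""
    started = {(r.nas_ip, r.acct_session_id) for r in records if r.kind == "start"}
    stopped = {(r.nas_ip, r.acct_session_id) for r in records if r.kind == "stop"}
    return started - stopped


def total_usage(records):
    """Sum reported usage per user across all of that user's accounting sessions."""
    usage = {}
    for r in records:
        if r.kind == "stop":
            usage[r.user] = usage.get(r.user, 0) + r.octets
    return usage


# The Start from the new master arrives BEFORE the Stop from the old master.
records = [
    AcctRecord("alice", "A-0001", "10.0.0.1", "start"),
    AcctRecord("alice", "B-0042", "10.0.0.2", "start"),
    AcctRecord("alice", "A-0001", "10.0.0.1", "stop", octets=1234),
]
print(open_sessions(records))
print(total_usage(records))
```

Keying on the NAS address plus accounting session ID treats the failed-over subscriber as two separate sessions, so the late Stop closes only the old one.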

Managing upgrades

One of the big advantages of GEORED is that it overcomes a pain point of nV cluster: sw upgrades.

In cluster, an orchestration is necessary to separate the cluster nodes, upgrade one and make a quick switchover to upgrade the other one.

In Geo Redundancy, the BNG nodes can even run different sw versions, and that is no problem. We wouldn't recommend too much version disparity between the devices, though; for ease of deployment, keep all BNG nodes in the network, whether or not they are part of the GEORED, on the same sw version with the same SMU set as much as possible.

The SW upgrade procedure is independent of the redundancy model chosen (N:1, M:N, active/active or active/standby).

Basically the steps include:

1. Fail over the SRGs running active on the BNG node to their standby, one by one
2. If hot standby, step 1 will be quick. If warm standby, allow some time for the sessions to be programmed
3. Upgrade the BNG to the desired sw level
4. Pull back all sessions for the SRGs that need to be running active on this BNG

Do this for all the BNG nodes that are part of the SRG interaction.

NOTE: you can even set up GEO red just for the upgrade procedure. A node that is synchronizing its sessions during this setup is not affected whatsoever.
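The upgrade steps lend themselves to a simple runbook generator. The Python sketch below only produces an ordered list of human-readable steps for one BNG; it does not talk to any router, and the SRG names, the dictionary layout and the "target-release" label are placeholders invented for this example.

```python
def upgrade_plan(bng, srgs, target_version):
    """Build the ordered upgrade steps for one BNG.

    srgs: name -> {"master": bng, "slave": bng, "mode": "hot" or "warm"}
    """
    steps = []
    owned = [name for name, srg in sorted(srgs.items()) if srg["master"] == bng]
    for name in owned:
        standby = srgs[name]["slave"]
        steps.append(f"fail over {name} from {bng} to {standby}")
        if srgs[name]["mode"] == "warm":
            # Warm standby needs time to program the sessions in hardware.
            steps.append(f"wait for {name} sessions to be programmed on {standby}")
    steps.append(f"upgrade {bng} to {target_version}")
    for name in owned:
        steps.append(f"pull {name} back to {bng}")
    return steps


srgs = {
    "SRG1": {"master": "BNG1", "slave": "BNG2", "mode": "hot"},
    "SRG2": {"master": "BNG1", "slave": "BNG2", "mode": "warm"},
    "SRG3": {"master": "BNG2", "slave": "BNG1", "mode": "hot"},
}

plan = upgrade_plan("BNG1", srgs, "target-release")
for number, step in enumerate(plan, start=1):
    print(number, step)
```

Running the same function for BNG2 afterwards gives the plan for the second node, which is the "do this for all the BNG nodes" part of the procedure.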

Session set up and call flow details

The following section graphs out the call flow and messaging between BNG SRG devices and the session.

Initial Session Setup

Only Master does Radius/Policy server interactions for the subscriber using its NAS IP, NAS Port and accounting session ID
In addition to protocol state, subscriber profile (including any further changes as result of CoA) are sync-ed across from master to slave
Slave sets up the same subscriber with a different accounting session ID – it has different NAS IP and likely different NAS Port
Redundancy design and Slave are invisible to the Radius/policy server before the switchover. That is, the Radius/PCRF has no awareness of the fact that a session is synchronized.

Failure scenario

Subscriber is already provisioned on the slave and ready to forward traffic even before switchover; loss on failover depends on access network failover or convergence and on the core network design (fast reroute, BGP PIC, core convergence)
BNG Sync channel used to signal failures and trigger switchovers between BNG routers; this is control plane sync.
Accounting updates – start/stop/interims
DHCP state machine on slave takes over without any client/server interactions
Lease will continue on slave from when master started it
PPPOE/PPP state machine on slave takes over from where master left without any client impact
PPP keep-alive will start flowing from new master on takeover

Use cases

MSTAG

The MSTP protocol is used here to block the standby path so we have only one active path.

In this case each BNG has its own MAC, which is used for MST and other Ethernet protocols. In this scenario we need to set up an SRG vMAC for BNG sessions, which acts in the same fashion as an HSRP/VRRP virtual MAC. The BNGs use their own MAC for the STP communication, and we use the vMAC towards the sessions as their peering/communication point.
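When picking an SRG vMAC it can help to check the two flag bits in the first octet of the address: the least significant bit marks a group (multicast) address and the next bit marks a locally administered address (standard IEEE 802 addressing). The helper below is a small Python sketch for that check; the function name is made up, and whether a particular value is accepted as a vMAC on a given release is something to verify on the platform.

```python
def mac_flags(mac):
    """Return the multicast and locally-administered flags of a MAC address.

    Accepts Cisco dotted (0200.0000.0001), colon or dash separated notation.
    """
    digits = "".join(ch for ch in mac if ch not in ".:-")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    first_octet = int(digits[:2], 16)
    return {
        "multicast": bool(first_octet & 0x01),
        "locally_administered": bool(first_octet & 0x02),
    }


print(mac_flags("0200.0000.0001"))  # unicast, locally administered
print(mac_flags("0001.0002.0000"))  # unicast, universally administered
print(mac_flags("0100.5e00.0001"))  # multicast
```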

For dual homing, two MST instances are required, with VLANs split across them to enable active/active load balancing to each of the 2 BNGs

MST provides “preempt delay” knobs to throttle switchovers and allow stabilization of subscribers on top of it after failure recovery.

Failure detection, or rather improved detection, is done via CFM sessions (at least one per MST instance, in any of its VLANs). The CFM session is used to monitor connectivity and to detect which BNG has the forwarding path and which one has the standby/drop path (i.e. the CFM session will be UP on active & DOWN on standby)

Coupling the CFM session via EFD with each of the BNG L3 access sub-interfaces on that interface will result in that sub-interface status tracking UP on active side and DOWN on standby side.

Access tracking object monitoring this sub-interface status (which is in turn controlled via EFD based on CFM session) is used for determining SRG role as well as controlling the subscriber subnet route advertisement

In event of failures, as MST re-converges and switches paths, the CFM session status changes and the L3 BNG sub-interfaces get notified of status via EFD such that the SRG role can be switched

MST and CFM timers can be as aggressive as supported by the access devices with stable operations even with full subscriber load
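The chain of dependencies in this use case (CFM session, EFD, L3 sub-interface, access tracking object, SRG role and route advertisement) can be summarized in a few lines of Python. This is a conceptual sketch only: the function and its return values are invented for illustration and ignore timers, preempt delay and negotiation with the peer.

```python
def srg_state_from_cfm(cfm_session_up):
    """Derive SRG role and subnet advertisement from the CFM session state."""
    # EFD couples the L3 access sub-interface state to the CFM session.
    subif_state = "up" if cfm_session_up else "down"
    # The access tracking object follows the sub-interface state.
    track_state = subif_state
    # The tracking object drives both the SRG role and the route advertisement.
    role = "master" if track_state == "up" else "slave"
    advertise_subnet = track_state == "up"
    return role, advertise_subnet


# MST forwards towards this BNG: CFM is up, so it is master and advertises.
print(srg_state_from_cfm(True))   # ('master', True)
# MST blocks the path towards this BNG: CFM is down, so it is slave and silent.
print(srg_state_from_cfm(False))  # ('slave', False)
```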

MCLAG

MC-LAG provides consistency of MAC and IP address across the two PoA (i.e. BNG routers). In this scenario there is no need for SRG vMAC since it is managed by MCLAG natively already.

Failover is triggered by an object that directly tracks the MC-LAG bundle interface status and signals to both the SRG (for role determination) & the routing entity (to control the subnet/pool advertisement).

MC-LAG provides knobs to throttle switchovers and allow stabilization of subscribers on top of it in event of link flaps and after failure recovery

Parameters to consider when using MCLAG:

mlacp switchover recovery-delay – to ensure bundle remains slave after recovery from failure and allows subscribers to get sync and stabilized on it in slave mode

mlacp switchover type revertive – means that when the primary comes back, it will assume the primary role also and basically pull everything from the standby back. Like HSRP preempt.

lacp switchover suppress-flaps – to avoid switchover for transient link-flaps

BFD or CFM with EFD can be used for faster detection of failures in addition to LACP protocol mechanisms

LAC

Configuration and setup

Now that you know everything about GEORED you want to go set it up right?! Here is a config piece and explanation what it is for.

Enable BNG Geo Redundancy

subscriber redundancy
 group 1
  peer 1.1.1.1

Set up SRG and define which group holds which interface. Multiple groups can be defined.

subscriber redundancy
 group 1
  interface-list
   interface bundle-ether1 id 1

Set up Access Object Tracking for SRG and Summary Subscriber route.

In this example we are tracking the interface bundle state that MCLAG is providing to us. If we see that the state is going down, that will result in a static route withdraw from the table. If we have redistribute static configured, the pool summary will be removed so that the previous standby, now active, can start advertising the summary to start pulling the traffic.

track access-mclag
 type line-protocol state
  interface bundle-ether1

subscriber redundancy
 group 1
  access-tracking access-mclag

router static
 address-family ipv4 unicast
  10.0.0.0/24 null0 track access-mclag desc sub-pool-summ

Optional SRG configuration to determine more deterministically what the preferred role is and what redundancy mode should be run.

subscriber redundancy
 preferred-role master
 slave-mode warm
 hold-timer 15

A little more detail of the subscriber redundancy configlet

Restrictions and limitations

As with everything in technology, there is always some trade-off. The table below lists the currently known restrictions for the GEORED solution as of XR 5.3.3.

Note that XR6 has quite a significant amount of improvements, which will be documented separately. Since XR 5.3.3 is the going release today for ASR9000, I thought it important to know what you get and what you need to think about.

Limitation	Recommendation
With just core tracking, if the core interface goes down, an SRG switchover is triggered, causing a traffic black hole on access	An EEM script can be used to shut access when core goes down
RA is sent with both the SRG vMAC and the interface MAC towards access	Use the RA preference CLI under the dynamic template or access-interface
Accounting records may get lost if back-to-back switchovers are done before they sync on master and slave	Wait 15 minutes before doing a switchover (128k sessions)
Admin clear of sessions from the slave is prohibited	1. If the slave is out of sync with the master, the subscriber redundancy synchronize command can be issued from the slave to replay 2. The SRG clear command can be issued from either slave or master to get the slave back to a normal state
Master reload is not recommended on access with non-revertive protocol support	Enable revertive configuration on the access protocol
On-the-fly vMAC modification for IPv6 sessions is not supported

Features not supported:

Static subscriber
DHCP Routed subscriber
Packet trigger Sessions
Multicast on subscriber and Qos Correlation
SLAAC for subscriber
BNG as DHCP server
IPv6 ND as SRG client
Diameter & Geo-redundancy interworking (6.2.x)

XR6 enhancements details

Miscellaneous

With great thanks to the GEORED dev team for some of the visualizations used in this paper.

PS. It is highly important not to use pppoe bba-group Global. This is a reserved keyword that is known to break certain SRG cases. Name your bba-group anything but global/Global.

xthuijs · ‎03-01-2017

hi josh,

you'd need the MST(AG) support here because the switch dual homing to BNG-x and BNG-y needs to block one link towards either BNG.

to have the switch block an uplink and not use both at the same time, MSTAG coordinates with geored which BNG is the STP root, so that the switch will forward the virtual mac towards the BNG that has the subscriber active.

with HSRP, the standby router can equally forward.

with geored bng the standby router for the session cannot forward the traffic, hence we need the switch to know/learn what link to take hence MSTAG to the rescue :)

cheers!

xander

joshuacmoore · ‎03-01-2017

I thought the purpose of "peer route-disable" was to allow both active/active links but not insert the session routes into the FIB? Hence, no forwarding on the access side for the "slave" BNG and no issue. I am using "peer route-disable" in conjunction with "core-tracking" and "access-tracking" on both BNGs. Would this scenario not be OK for active/active PPPoE termination on each BNG?

I am also seeing a strange behavior as mentioned earlier with Master/Slave roles not working properly. I have "core-tracking" enabled and it detects a "down" state from my track object but the SRG role is not changing from "Master" to "Slave". The result is both BNGs are active on the virtual mac and my layer 2 network is detecting MAC flaps.

BNG 1 config:

RP/0/RSP0/CPU0:Alma-CO-LAB-BRAS#sh run subscriber redundancy
Wed Mar 1 11:33:31.608 EST
subscriber
 redundancy
  source-interface Loopback10
  group 1
   preferred-role master
   virtual-mac 0200.0000.0001
   peer 10.200.200.18
   peer route-disable
   core-tracking WAN
   access-tracking PPPOE
   interface-list
    interface Bundle-Ether500.500 id 500
   !
  !
  group 2
   preferred-role slave
   virtual-mac 0200.0000.0002
   peer 10.200.200.18
   peer route-disable
   core-tracking WAN
   access-tracking PPPOE
   interface-list
    interface Bundle-Ether500.550 id 550
   !
  !

BNG 2 config:

RP/0/RSP0/CPU0:Patterson-CO-LAB-BRAS#sh run subscriber redundancy
Wed Mar 1 11:35:30.099 EST
subscriber
 redundancy
  source-interface Loopback10
  group 1
   preferred-role slave
   virtual-mac 0200.0000.0001
   peer 10.200.200.27
   peer route-disable
   core-tracking WAN
   access-tracking PPPOE
   interface-list
    interface Bundle-Ether500.500 id 500
   !
  !
  group 2
   preferred-role master
   virtual-mac 0200.0000.0002
   peer 10.200.200.27
   peer route-disable
   core-tracking WAN
   access-tracking PPPOE
   interface-list
    interface Bundle-Ether500.550 id 550
   !
  !

See below outputs while I have BNG 1 core-tracking in "down" state:

BNG 2:

RP/0/RSP0/CPU0:Patterson-CO-LAB-BRAS#sh subscriber redundancy group 1
Wed Mar 1 11:36:40.272 EST
Subscriber Redundancy Group ID: 1
 Description : <<not-configured>>

 Status : Enabled
 Init-Role : Slave
 Negotiated-Role : Master Current-Role : Master

 Slave-mode : Hot Hold Time : <<not-configured>>

 Virtual MAC Address : 0200.0000.0001
 L2TP Source Address : <<not-configured>>

 Core-Tracking : WAN
 Status : Up
 Access-Tracking : PPPOE
 Status : Up
 Tracking Status : Enabled

 Peer:
 10.200.200.27 Status : Connecting
 Role(Init/Neg/Cur): Master/Master/Master
 Tracking Status : Up

 Last Neg-Time : 2017 Mar 1 11:16:41
 Last Up-Time : 2017 Mar 1 11:15:46
 Last Down-Time : 2017 Mar 1 11:16:38

 Switchover:
 Last Switchover : 2017 Mar 1 11:16:41 Reason : Peer Down
 Switchover Count : 9
 Hold Time : Not-Running

 Subscriber Session Statistics:
 Count : 0 Slave-Upd-Fail : 0
 Pending Update : 0 Pending Delete : 0
 Tunnel Count : 0

 Interface Count : 1
 Bundle-Ether500.500 Map-ID : 500

BNG 1:

RP/0/RSP0/CPU0:Alma-CO-LAB-BRAS#sh subscriber redundancy group 1
Wed Mar 1 11:30:28.995 EST
Subscriber Redundancy Group ID: 1
 Description : <<not-configured>>

 Status : Enabled
 Init-Role : Master
 Negotiated-Role : Slave Current-Role : Master

 Slave-mode : Hot Hold Time : <<not-configured>>

 Virtual MAC Address : 0200.0000.0001
 L2TP Source Address : <<not-configured>>

 Core-Tracking : WAN
 Status : Down
 Access-Tracking : PPPOE
 Status : Up
 Tracking Status : Enabled

 Peer:
 10.200.200.18 Status : Listening
 Role(Init/Neg/Cur): Slave/Slave/Slave
 Tracking Status : Up

 Last Neg-Time : 2017 Mar 1 11:14:57
 Last Up-Time : 2017 Mar 1 11:14:18
 Last Down-Time : 2017 Mar 1 11:14:54

 Switchover:
 Last Switchover : 2017 Mar 1 11:14:57 Reason : Object Tracking Status Change
 Switchover Count : 5
 Hold Time : Not-Running

 Subscriber Session Statistics:
 Count : 0 Slave-Upd-Fail : 0
 Pending Update : 0 Pending Delete : 0
 Tunnel Count : 0

 Interface Count : 1
 Bundle-Ether500.500 Map-ID : 500

xthuijs · ‎03-01-2017

with geored hotstandby the session *is* programmed on the secondary BNG also, so theoretically it could do the forwarding.

however the primary (SRG), owning the session is responsible for all the qos, accounting etc. it wouldn't have knowledge on what the secondary (SRG) has forwarded or done.

the peer route disable helps with the traffic from north, southbound to the BNG so that the routing naturally goes towards the primary SRG. that addresses the concern that traffic from up north would not go to the client via the secondary SRG.

that has no relation to the upstream from client northbound to the bng: from the switch perspective, each BNG shares the same VMAC as with HSRP. we need the traffic to go through the primary SRG, so we need a way to force the switch to do that, and that is where STP blocking of the link towards the secondary assists with that.

xander

joshuacmoore · ‎03-01-2017

So you are saying that with SRG, the VMAC is always active on both BNGs regardless of the Master/Slave status? With HSRP, there is only one forwarding router from client perspective through the switching environment because only the active router advertises the VMAC.

If this is the case, wouldn't it be a lot easier to just sync the VMAC usage with the master/slave status compared to being forced to configure STP just to block VMAC from the slave group?

Any thoughts as to why BNG 1 is showing as "Master" for the SRG group even though core-tracking is showing "down" state?

joshuacmoore · ‎03-03-2017

Ok, it seems upgrading to XR 5.3.4 fixed the "Master" status when core tracking is "down". I was on 5.3.3.

What I am observing though is that BNG is not utilizing vMAC properly. When I define "virtual-mac-prefix" I expect the virtual mac to be automatically assigned to each SRG based on the group number. This is not happening. The vMAC is only being used if I explicitly define it under each SRG group. I can validate this by checking the CAM table of my downstream switch. I see the vMAC when configured under the group, don't see it when I use the prefix command. Also, the output of "show subscriber redundancy group" does not show a vMAC assigned when using the virtual-mac-prefix command.

This leads me to my other issue. Even with explicit vMACs assigned, failover is not occurring as described here. I am not seeing a gratuitous ARP or CAM update in my downstream switch when the master changes to slave role and therefore the subscriber session starts dropping packets. Definitely not "seamless".

It seems there is a vMAC issue here.

joshuacmoore · ‎03-06-2017

Xander, If you are interested I have opened TAC Case # 681914939 on this issue.

xthuijs · ‎03-07-2017

hi josh,

few things: yeah if the core interface tracked is down, the srg should failover, that part seems to be taken care of.

if the vmac is not properly activated on the standby that seems amiss. it needs to be used on the standby same as on the primary, otherwise the sessions will start to drop.

let me have a look at the tac case you have going and we can continue there.

cheers

xander

xthuijs · ‎03-07-2017

oh one extra thing I wanted to comment/mention: while this model of geored is somewhat similar to hsrp, obviously it is not 100% the same, since we have state of subscribers that need to sync. we have to enforce the subs from an srg to take the master end for that vlan(set) otherwise accounting will be a complete mess (and qos too btw).

so the implementation is solid, and we use STP to help force that directional aspect, but there are other options too, like mclag. (all doing the same thing: keeping one link (soft) down).

cheers

xander

xthuijs · ‎03-11-2017

hi josh, when I looked over some traces that rahul has in your case, it occurred to me that the v-mac addr is starting with 02, that is a "reserved" one for e64 (atm address mapping?). might be good to try a different vmac that is not in the reserved/pre-assigned ranges. 01 is mcast 00-01/2 is vrrp

etc.

cheers!

xander

joshuacmoore · ‎03-13-2017

Xander,

In the TAC case I am using the example vMAC prefix provided in Cisco documentation. http://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k_r5-2/bng/command/reference/b-bng-cr52xasr9k/b-bng-cr52xasr9k_chapter_010000.html#wp2962357073

0001.0002.0000

xthuijs · ‎03-13-2017

hi josh, rahul will contact you and set up some time with one of our dev engs to have a look at this in real time.

cheers

xander

puddingtech · ‎03-15-2017

Xander dude, i hit "XR6 enhancements details" on my phone excited to read what was new and then..... nothing lol Apparently xr6 was a let down :P

BTW are there any actual examples of Geo-Red + MCLAG/LACP? i'm trying to wrap my head around it but everytime i flip back and forth between geored and mclag i get a little more lost.

I mean all I want to do is to have 2x 2 port lags on my 9001 one for WAN and one for LAN, and geo-red that config to a second ASR9k1, but for the life of me the more i read the more i get confused trying to figure out how it fits together.

xthuijs · ‎03-16-2017

oooh sorry to hear!!! you didnt find any goodies in xR6? :)

think of GEORED being an HSRP/VRRP pair.

now the thing is however from a north to south path, in HSRP both routers can forward the traffic down. in the south to north, in HSRP, one router owns the virtual MAC and will pull that traffic towards him.

GEORED is almost the same. in the north->south we use address advertisement to make sure we send the traffic to the right master for that address so the accounting for the subscriber and everything remains the same.

for the south->north we need some mechanism, along with the vmac to force the subscriber traffic to the master. MCLAG, keeping one link standby for that vlan on the bundle, OR mstag that uses stp to block a link will aid in directing traffic in the access part of the design.

xander

gogie · ‎05-09-2017

We are trying to make Georedundancy with MC-LAG. Sessions are not created on Slave if parametrised qos is used, slave reporting "Policy-map not found" under debug ipsub ma error. Without parametrised qos, it looks ok.

Is it some limitation of Geored and parametrized qos, or we are missing something in config?

Regards

George

xthuijs · ‎05-09-2017

hi george,

this may be a bug. please do file a tac case for this.

xander

ASR9000/XR Using and understanding BNG GEO-Redundancy
