Re: SD-WAN Traffic Engineering (TE) with primary & backup hubs

JohnG2020 · ‎04-06-2022

Hello,

I have a challenging routing failover scenario that traffic engineering (TE) is not able to solve for me and wanted to share this with everyone in the Cisco community to see if anyone has experienced this before and knows a possible solution for it.

Attached is a network diagram that walks you through the routing scenario described below.

SD-WAN high-level design:

The topology is a hub and spoke with primary & backup hubs
Each hub has 2 MPLS and 2 Internet WAN circuits using the following color transports: mpls, metro-ethernet, biz-internet & public-internet
The remote sites have 1 MPLS and 1 Internet WAN circuit using the following colors: mpls & biz-internet
The remote site and hubs all have 2 vEdge routers
2 vSmart, 1 vManage and 2 vBond controllers
BFD sessions are only built between the remote sites' primary and backup hubs over their respective color transports - mpls to mpls/metro-ethernet & biz-internet to biz-internet/public-internet
A central control policy is used to set the hubs' tloc-list for all remote routes
The tloc-list sets a higher preference for the primary hub & a lower preference for the backup hub
Service FW is enabled on primary & backup hubs
When the primary hub goes down and loses all of its color transports, the remote sites automatically failover to the backup hub with the next hub's tloc being available

Failover challenge with the above design:

The remote site A loses the mpls color transport & only the biz-internet color transport is available
The primary hub loses the biz-internet & public-internet transports (all internet links are down due to a fiber cut); only the mpls & metro-ethernet color transports are up
The remote site A uses the backup hub to route the traffic over the remaining biz-internet color-transport
Remote site B responds back to remote site A using the tloc that is still pointing to the primary hub over the mpls color-transport because it doesn't know that remote site A lost the local mpls
This creates asymmetric routing, and the traffic gets dropped by the primary hub's FW due to the service chaining
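
The failover challenge above can be sketched as a small Python model. This is only an illustration, not how vSmart computes anything: the "private"/"internet" family grouping, the preference values, and the function names are assumptions made for the example. It assumes each site independently picks the highest-preference hub it still has a BFD session to.

```python
# Illustrative model only: names and preference values are assumptions.
# BFD sessions only form between colors of the same family.
FAMILY = {
    "mpls": "private",
    "metro-ethernet": "private",
    "biz-internet": "internet",
    "public-internet": "internet",
}

# Hub preference as set through the tloc-list (higher wins).
HUB_PREFERENCE = {"primary": 200, "backup": 100}


def reachable_hubs(site_colors, hub_colors):
    """Return the hubs that share at least one transport family with the site."""
    site_families = {FAMILY[c] for c in site_colors}
    return {
        hub
        for hub, colors in hub_colors.items()
        if site_families & {FAMILY[c] for c in colors}
    }


def chosen_hub(site_colors, hub_colors):
    """Pick the highest-preference hub the site can still reach, if any."""
    candidates = reachable_hubs(site_colors, hub_colors)
    return max(candidates, key=HUB_PREFERENCE.get) if candidates else None


# State after the failures described above.
hub_colors = {
    "primary": {"mpls", "metro-ethernet"},  # both internet links are down
    "backup": {"mpls", "metro-ethernet", "biz-internet", "public-internet"},
}
site_a_colors = {"biz-internet"}          # site A lost its mpls transport
site_b_colors = {"mpls", "biz-internet"}  # site B is unaffected

a_to_b_hub = chosen_hub(site_a_colors, hub_colors)
b_to_a_hub = chosen_hub(site_b_colors, hub_colors)
print(a_to_b_hub, b_to_a_hub, a_to_b_hub != b_to_a_hub)
# prints: backup primary True
```

Site B's choice depends only on its own sessions, which is why the two directions end up on different hubs and different firewalls.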

Flavio Miranda · ‎04-07-2022

That's a very interesting scenario, and I believe you have already put a lot of thought into it. Honestly, the only idea that crosses my mind is perhaps to allow asymmetric routing on the firewall. But I'll keep thinking, and hopefully someone here knows how to overcome it.

Kanan Huseynli · ‎04-07-2022

Hi,

I don't think that there is a direct and dynamic way to handle these types of scenarios.

You could prepare a policy in advance and apply it when this type of issue happens.

HTH,
Please rate and mark as an accepted solution if you have found any of the information provided useful.

Octavian Szolga · ‎04-08-2022

Hi,

I played a bit with service chaining and quickly tested your setup.

My assumption is that you're using a control policy in which you're doing something like match (VPN "A" + Sites "Spokes") and set (Service FW and TLOC List with both hubs, where Primary has a better preference).

What if you add to the same set of action statements a 'tloc-action primary' so that vSmart can track your transport end to end?

BR,

Octavian

JohnG2020 · ‎04-08-2022

Hi Octavian,

I played with the tloc-action primary/backup options on the control policy where I am setting the tloc-list and service fw. The end-to-end tracking only sets the ultimate tloc to be the local remote site's tloc. In a hub tloc-list, if the list contains only tlocs for 1 hub and not a 2nd hub, it doesn't work in a way where remote site B can then update site A's omp route tloc automatically, because it tracks the ultimate-tloc as being the local remote tloc A, not the hub. This is how I understood TE was intended to be used. I have never seen it used with primary and secondary hubs in a tloc-list. However, we don't have BFD sessions from remote to remote, only to the hubs, so tracking the ultimate-tloc at a remote site doesn't help. It will never route directly between remote sites because the BFD sessions are not formed.

Let's take a look at a common simpler scenario:

- Remote A has only 1 Internet transport, not MPLS

- Remote B has 1 MPLS & 1 Internet transport

- Remote A & B only have BFD sessions to the primary and backup hubs

- Remote A routes to B via the primary hub using the Internet and service fw chaining

- Remote B has 2 color transport options to route back to A and also uses the service fw chaining in the primary hub

Failover broken:

- Primary hub loses the Internet

- Remote A's BFD Internet session is down to the primary hub but has Internet BFD remaining up to the backup hub

- Remote B still has all its color BFD sessions up to the primary & backup hubs

- Remote B is seeing A's OMP route tagged with the tloc still pointing at the primary hub because it doesn't know that A has 100% failed over to the backup hub

In the above failover scenario, I would think TE should track remote A's tloc that is set for the primary hub, not the local tloc itself. This would then tell Remote B that A is using the backup hub's tloc and update the route in the OMP routing table for it. This is what TE should be doing because normally you have more than one hub in your tloc-list and are not forming BFD sessions between all your remote sites.
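
The difference between today's behavior and the behavior wished for above can be sketched in Python. This is a hypothetical model, not an existing SD-WAN feature: `hub_aware_choice`, the preference values, and the set-based session state are all assumptions for illustration.

```python
# Hypothetical model: preference values and function names are assumptions.
HUB_PREFERENCE = {"primary": 200, "backup": 100}


def current_choice(own_hubs_up):
    """Observed behavior: a site picks the best hub it can reach itself."""
    return max(own_hubs_up, key=HUB_PREFERENCE.get) if own_hubs_up else None


def hub_aware_choice(own_hubs_up, peer_hubs_up):
    """Wished-for behavior: only consider hubs the peer site can also reach."""
    common = own_hubs_up & peer_hubs_up
    return max(common, key=HUB_PREFERENCE.get) if common else None


# Primary hub lost the Internet; Remote A only has an Internet transport.
a_hubs_up = {"backup"}
b_hubs_up = {"primary", "backup"}

a_to_b = current_choice(a_hubs_up)                      # backup
b_to_a_today = current_choice(b_hubs_up)                # primary (asymmetric)
b_to_a_wished = hub_aware_choice(b_hubs_up, a_hubs_up)  # backup (symmetric)
print(a_to_b, b_to_a_today, b_to_a_wished)
# prints: backup primary backup
```

In this model, tracking the hub TLOC on behalf of Remote A is what lets Remote B move its return traffic to the backup hub.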

Thanks

John

Kanan Huseynli · ‎04-09-2022

Hi John,

How would it track remote A's TLOC, and in case of its failure, would there be a change in service chaining?

regards,


Octavian Szolga · ‎04-11-2022

Hi John,

Now I see your point. You're right.

When I quickly tested your setup, I still had BFD sessions UP between spoke sites even though those weren't used.

I've tried multiple scenarios/policies, but the result is the same. The end-to-end path is not tracked.

BR,

Octavian

Octavian Szolga · ‎04-12-2022

Hi John,

Although it defeats the whole point of using SD-WAN (advertising service routes, having the flexibility of influencing who goes where up to a point), I think the only option you have to overcome the lack of end-to-end path tracking is to use a somewhat classical design on the hub-to-firewall segment:

- each hub has 2x Layer3 interconnects with the firewall;

- each interconnect runs a dynamic routing protocol;

- each interconnect belongs to a separate SVPN;

- branch-A and branch-B LANs are part of separate SVPN segments which can communicate only through the firewall (hub-A or hub-B)

Hub-A receives from OMP the branch-A LAN (SVPN 10) and advertises the prefix to the firewall.

Hub-A receives branch-A LAN from the same fw, but on a separate L3 interconnect that belongs to branch-B SVPN 20.

This prefix is redistributed into OMP so that branch-B gets to know that branch-A LAN is available over that path.

If branch-A loses BFD (transport) to Hub-A, the route is never advertised into OMP SVPN20 of branch-B.

Of course, it may get complicated..
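
A rough Python sketch of why this design withdraws the route: the branch-A prefix only reaches SVPN 20 if it survives every step of the chain described above. The hub names follow the text; the preference values and function names are assumptions for illustration, and the dynamic routing protocol on the interconnects is reduced to simple booleans.

```python
# Illustrative model only: preference values and names are assumptions.
HUB_PREFERENCE = {"hub-A": 200, "hub-B": 100}


def hubs_advertising_branch_a(bfd_to_branch_a):
    """Return the hubs that redistribute branch-A LAN into OMP SVPN 20."""
    advertising = set()
    for hub, bfd_up in bfd_to_branch_a.items():
        learned_in_svpn10 = bfd_up                # OMP route usable only with transport up
        sent_to_firewall = learned_in_svpn10      # advertised on the SVPN 10 interconnect
        received_in_svpn20 = sent_to_firewall     # firewall returns it on the SVPN 20 interconnect
        if received_in_svpn20:
            advertising.add(hub)                  # redistributed into OMP for branch-B
    return advertising


def branch_b_next_hub(bfd_to_branch_a):
    """Branch-B picks the best hub that still advertises branch-A LAN."""
    candidates = hubs_advertising_branch_a(bfd_to_branch_a)
    return max(candidates, key=HUB_PREFERENCE.get) if candidates else None


normal = branch_b_next_hub({"hub-A": True, "hub-B": True})
after_failure = branch_b_next_hub({"hub-A": False, "hub-B": True})
print(normal, after_failure)
# prints: hub-A hub-B
```

Because the advertisement depends on the hub actually reaching branch-A, branch-B follows branch-A to the other hub without any end-to-end tracking in the overlay.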

BR,

Octavian

JohnG2020 · ‎04-15-2022

Thanks for your design idea, I appreciate the lab testing. The only problem for my customer's environment would be the number of routes received in both SVPN10 & 20. Some of their smaller vEdges have already reached the maximum number of supported OMP received routes (30k for 100B). Plus, they have 8 total hubs around the world using a hub and spoke topology for their closest region. The hubs all form a full mesh with each other, but the remotes only form BFD to the hubs.

The whole point of service TE is to monitor end-to-end the remote site behind a hub. Cisco should enhance this feature to include the hub TLOC as the ultimate on behalf of the remote sites. In a true hub and spoke scenario you don't form BFD sessions between the remote sites so monitoring a remote ultimate TLOC is pointless in my customer's design.

Here is another interesting problem with service FW chaining in the hubs and route summarization. I really wanted to implement this design, but every time I tested it I couldn't come up with a solution to keep the traffic flows symmetric:

- Remote A receives default, RFC1918 & specific routes (AWS/Azure) from hub-A

- Remote A advertises its SVPN10 LAN routes to all hubs (A-C)

- Remote A sends all traffic to hub-A using service FW chaining based on the TLOC-LIST setting the hub’s preferences

- Remote B set up the same way to hub-A

- Remotes A & B can communicate without any problems via hub-A service FW chaining

- Remote C joins the fun using hub-C

- Remote A sends a packet to C via hub-A because of the RFC1918 summary

- Hub A sends the packet to C

- Remote C responds back via hub-C because of the RFC1918 summary and is now asymmetric via the FW

The above design would be really nice if the hub could figure out how to become a transit. Only Remotes A and B, which use hub-A, would send this communication to the FW; everyone else would be transit, not using the local FW. The question now becomes: how do you make the vEdge in a hub do this? The service FW chaining is set in a control policy outbound to the remote sites using specific remote subnets.
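
The summarization problem can be reproduced with a short Python sketch. It assumes each remote follows the RFC1918 summary to its own hub and that the hub's firewall inspects whatever arrives that way. The host addresses and the `HOME_HUB` mapping are hypothetical values chosen for the example.

```python
import ipaddress

# RFC1918 summaries that every remote receives from its own hub.
RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# Hypothetical hub assignment and host addresses, for illustration only.
HOME_HUB = {"remote-A": "hub-A", "remote-B": "hub-A", "remote-C": "hub-C"}
HOST = {"remote-A": "10.1.0.10", "remote-B": "10.2.0.10", "remote-C": "10.3.0.10"}


def first_hop_hub(src_site, dst_ip):
    """Hub (and firewall) a site sends to when only the summary matches."""
    addr = ipaddress.ip_address(dst_ip)
    if any(addr in net for net in RFC1918):
        return HOME_HUB[src_site]
    return None


def firewalls_seen(src_site, dst_site):
    """Return (forward firewall hub, return firewall hub) for a flow."""
    forward = first_hop_hub(src_site, HOST[dst_site])
    reverse = first_hop_hub(dst_site, HOST[src_site])
    return forward, reverse


a_b = firewalls_seen("remote-A", "remote-B")
a_c = firewalls_seen("remote-A", "remote-C")
print(a_b, a_c)
# prints: ('hub-A', 'hub-A') ('hub-A', 'hub-C')
```

The A to B flow sees the same firewall in both directions, while the A to C flow is split across the hub-A and hub-C firewalls, which is the asymmetry described in the list above.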

The easiest solution is to tell my customer you can only have one active hub & spoke design and the rest of the hubs will be backups.