08-19-2019 06:28 PM - last edited on 08-20-2019 05:16 PM by Hilda Arteaga
This topic is a chance to discuss SD-WAN: its foundations and inner mechanisms, as well as its correct design and implementation to achieve desired business outcomes. Software-Defined WAN (SD-WAN) is a popular technology, and this event aims to help engineers, customers, and partners understand the benefits that its implementation can bring.
To participate in this event, please use the button below to ask your questions.
Ask questions from Monday 19th to Friday 30th of August, 2019
Featured expert
David Samuel Peñaloza Seijas works as a Senior Network Consulting Engineer at Verizon Enterprise Solutions in the Czech Republic. Previously, he worked as a Network Support Specialist in the IBM Client Innovation Center in the Czech Republic. David is an expert interested in all topics related to networks. However, he focuses mainly on data centers, enterprise networks, and network design, including software-defined networking (SDN). David has a long relationship with Cisco. He has been a Cisco Instructor for the Cisco Academy and was recognized as a Cisco Champion and a Cisco Designated VIP for 2017, 2018 and 2019. David holds a CCNP R&S, CCDP, CCNA Security, CCNA CyberOps and a CCNA SP certification. Currently, he is preparing for a CCDE.
David might not be able to answer each question due to the volume expected during this event. Remember that you can continue the conversation on the SD-WAN community.
Find other events https://community.cisco.com/t5/custom/page/page-id/Events?categoryId=technology-support
**Helpful votes encourage participation!**
Please be sure to rate the Answers to Questions
08-26-2019 05:30 AM
Question around control machine limits and ztp:
We’re looking at a fairly large SD-WAN rollout and I was wondering about the limits of the controllers around BFD sessions, control connections, etc. We’ll most likely have a hub-and-spoke type configuration, as “branches” do not need connectivity between each other. What I’m trying to find information on is how many BFD sessions a “hub vEdge” device can accommodate, in addition to the capacity of vSmarts around control connections, so we can begin to size things appropriately (including failover of one “Hub” or vSmart device and how this plays into the overall design of the overlay control plane).
With respect to ztp, if we would like to deploy our own certificates (in house CA) would we need to “touch” each vEdge before shipping to remote site (or have on-site personnel install a certificate on the device) before the vEdge contacts vBond? Is ztp possible with self-signed certificate requirement?
Thanks,
08-27-2019 03:51 AM - edited 08-27-2019 07:47 AM
Hello @kenneth.meyers
Effectively, as you have mentioned, one of the ways to scale the solution is to rely on a hierarchical model to restrict the tunnels between sites - the solution works in an any-to-any fashion which taxes scalability as the state is held in the network even if those tunnels are not needed.
Quoting a previous reply in this thread:
The main drawback of it being scalability: as each vSmart controller supports a limit of around 5400 control connections (and those are shared when deployed in multitenancy mode), please note that each TLOC will establish a control connection. Furthermore, doing the math by increasing the number of TLOCs in each vEdge will cut down that limit substantially:
One TLOC - 5400 vEdges
Two TLOCs - 2700 vEdges
Three TLOCs - 1800 vEdges
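The arithmetic behind that list can be sketched in a few lines of Python. This is only an illustration of the simplified model quoted above (one control connection per TLOC, roughly 5400 connections per vSmart); the function and constant names are made up for this example and are not part of any Cisco tool.

```python
# Approximate per-vSmart control connection limit quoted in this thread.
VSMART_CONTROL_CONNECTION_LIMIT = 5400


def max_vedges_per_vsmart(tlocs_per_vedge, limit=VSMART_CONTROL_CONNECTION_LIMIT):
    """Return how many vEdges fit under the limit if every TLOC
    consumes one control connection (the model used in this thread)."""
    if tlocs_per_vedge < 1:
        raise ValueError("a vEdge needs at least one TLOC")
    return limit // tlocs_per_vedge


if __name__ == "__main__":
    for tlocs in (1, 2, 3):
        print(f"{tlocs} TLOC(s): {max_vedges_per_vsmart(tlocs)} vEdges")
```

Treat the result as a rough upper bound: multitenancy and other design choices reduce the headroom further.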
Regarding the vEdge BFD session limits:
As far as I know, the ZTP process relies on certificates signed by a CA, being Symantec or your enterprise root CA chain, which is then installed in vManage (and all vEdges would need to have the root certificate as well - which means touching them). Have not seen this being accomplished with a self-signed certificate.
Hope that helps.
David
08-27-2019 04:40 AM
Thanks David,
Would it be safe to assume that a "HUB" type vEdge device would have the same scaling limitations as the vSmart controllers previously mentioned?
One TLOC - 5400 vEdges
Two TLOCs - 2700 vEdges
Three TLOCs - 1800 vEdges
In the hierarchical model we're wondering how many "spoke vEdges" can connect to the "HUB vEdge" before we start taxing the capabilities of the Hub device with respect to BFD and IPSEC sessions.
Thanks again.
Regards
08-27-2019 06:00 AM
Kenneth,
The scaling for the Edge devices is different than for the controllers. For the controllers, the scaling factor is mainly the number of control sessions. For the Edge devices, it's about the number of IPsec tunnels, and hence the number of BFD sessions. How many sites are you planning to deploy? What kind of device were you thinking of using at the Hub location(s)?
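To get a feel for why the topology matters for Edge scaling, here is a rough Python sketch comparing tunnel counts. It assumes one IPsec tunnel (and one BFD session) per local/remote TLOC pair, and it ignores color restrictions and hub-to-hub tunnels; the function names are hypothetical and only serve the illustration.

```python
def tunnels_per_device_full_mesh(sites, tlocs_per_site):
    """Tunnels on one device when every site peers with every other site."""
    return (sites - 1) * tlocs_per_site * tlocs_per_site


def tunnels_on_hub(spokes, hub_tlocs, spoke_tlocs):
    """Tunnels terminated on a single hub serving the given spokes."""
    return spokes * hub_tlocs * spoke_tlocs


def tunnels_on_spoke(hubs, hub_tlocs, spoke_tlocs):
    """Tunnels on one spoke that only peers with the hubs."""
    return hubs * hub_tlocs * spoke_tlocs


if __name__ == "__main__":
    # Example figures only: 1000 sites, two TLOCs everywhere, two hubs.
    print("full mesh, per device:", tunnels_per_device_full_mesh(1000, 2))
    print("hub and spoke, per hub:", tunnels_on_hub(1000, 2, 2))
    print("hub and spoke, per spoke:", tunnels_on_spoke(2, 2, 2))
```

Under these assumptions the spokes stay small, while the hub carries a tunnel load that grows with the number of spokes and TLOCs.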
08-27-2019 06:45 AM
Hi Daniel,
How many sites are you planning to deploy?
We're looking at between 7,000 and 8,000 sites.
What kind of device were you thinking of using at the Hub location(s)?
We're trying to find a datasheet or some other document that outlines the capability of each hardware vEdge device. I have not been able to find a datasheet that identifies number of ipsec tunnels (or BFD sessions). What I've found is throughput information, number of interfaces etc. but yet to stumble upon the information around IPSEC tunnel capabilities of each device. Once we have that information we'll be in a better position to figure out the design aspects given we'll know how many sites can peer with each HUB, how to ensure successful failover in case a HUB goes offline etc.
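Until official per-device numbers are available, the sizing exercise can at least be parameterized. The Python sketch below uses the roughly 5400 control connection figure mentioned earlier in this thread for vSmart; the hub tunnel capacity is a pure placeholder (an assumption, not a published value), and all function names are hypothetical.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def vsmarts_needed(sites, tlocs_per_vedge, limit=5400):
    """vSmarts required if each TLOC uses one control connection."""
    return ceil_div(sites * tlocs_per_vedge, limit)


def hubs_needed(sites, tunnels_per_spoke, hub_tunnel_capacity, spare_hubs=1):
    """Hubs required for the given spoke count, plus spare hubs so the
    remaining ones can absorb the spokes of a failed hub.
    hub_tunnel_capacity is a placeholder to be replaced with real data."""
    return ceil_div(sites * tunnels_per_spoke, hub_tunnel_capacity) + spare_hubs


if __name__ == "__main__":
    for sites in (7000, 8000):
        print(sites, "sites, 2 TLOCs:", vsmarts_needed(sites, 2), "vSmarts")
    # 2000 tunnels per hub is an invented number for illustration only.
    print("hubs:", hubs_needed(8000, 1, 2000))
```

Once the real tunnel capacity of the chosen hub platform is known, plugging it in shows how many spokes can peer with each hub and how much spare capacity failover requires.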
Thanks Daniel!
Ken
08-27-2019 08:49 AM
The number of BFD sessions is indeed convoluted and hard to find. I suppose it could be due to the variety of deployment options (e.g. HW vEdge, cEdge, vEdge Cloud). I have not seen official numbers yet, only throughput and interfaces.
The previous slide I have shared is from a Cisco Live session providing some overview of the solution. Hope that helps!
08-27-2019 08:42 AM
Just for the sake of accuracy: the picture I uploaded in my comment wasn't visible for some reason, so I just re-uploaded it. It shows an estimation of tunnels per device.
@kenneth.meyers - keep in mind you would need a high performance device to accommodate the number of tunnels a hub would entail in your design.
08-26-2019 03:36 PM
We are a SP looking to use a compute cluster where we deploy one vedge cloud per customer. We would like to put all customers in a shared underlay for the transport interface, that is a single vlan with a /24 and each vedge gets an IP in that subnet. We also would like our customers to have access to their own vManage to make changes. A danger I see here is a customer changing their vedge cloud transport IP to an IP that overlaps with another customer, allowing customer A to bring down customer B. What could I do to prevent that?
08-27-2019 02:50 AM
Hello @Seth Beauchamp
We would like to put all customers in a shared underlay for the transport interface, that is a single vlan with a /24 and each vedge gets an IP in that subnet. We also would like our customers to have access to their own vManage to make changes.
Being an MSP where your business is about offering transport and sharing the same infrastructure with all your customers, this is always a risk. That being said, there are techniques (mostly relying on virtualization) to segment your customers so their failure domain is contained and separated, hence not affecting other customers sharing the same infrastructure.
Are you trying to save IP addresses? Allowing customers to share the same broadcast domain is dangerous and involves fate sharing. You would have to enforce it somewhere else in the infrastructure (many access lists or similar tools), which can only be cumbersome and makes for a highly complex operational model. Is there a hard constraint? Is there any other reason behind this request? Can't you simply segment them through subnetting? PVLANs come to mind if you must go down this road; alas, they would not prevent a customer from using an unauthorized IP address and affecting another customer's operation. The best approach is always to keep them "together but not scrambled", each with their own playground.
David
08-27-2019 05:19 AM
@David Samuel Penaloza Seijas wrote: Hello @Seth Beauchamp
Are you trying to save IP addresses? Allowing customers to share the same broadcast domain is dangerous and involves fate sharing. You would have to enforce it somewhere else in the infrastructure (many access lists or similar tools), which can only be cumbersome and makes for a highly complex operational model. Is there a hard constraint? Is there any other reason behind this request? Can't you simply segment them through subnetting? PVLANs come to mind if you must go down this road; alas, they would not prevent a customer from using an unauthorized IP address and affecting another customer's operation. The best approach is always to keep them "together but not scrambled", each with their own playground.
David
We were thinking of putting the whole subnet behind a single public NAT address so we aren't burning tons of public IPs. Of course we can split them into /31s per customer, but that takes a bit more effort: burning another VLAN, setting up a sub-interface, etc. I think that's likely what we will do, but I was searching for any other option. Sounds like it's that or take the risk.
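For what it's worth, carving a /24 into per-customer /31s is easy to script. The following Python sketch uses the standard `ipaddress` module; the parent prefix `192.0.2.0/24` is a documentation range standing in for the real transport subnet, and the customer names and the gateway/vEdge address split are assumptions made for this example.

```python
import ipaddress


def allocate_customer_links(parent_cidr, customers):
    """Assign one /31 from the parent prefix to each customer.
    The first address is treated as the provider gateway and the
    second as the customer's vEdge transport IP (an assumed convention)."""
    parent = ipaddress.ip_network(parent_cidr)
    subnets = list(parent.subnets(new_prefix=31))
    if len(customers) > len(subnets):
        raise ValueError("not enough /31 subnets in the parent prefix")
    plan = {}
    for customer, subnet in zip(customers, subnets):
        plan[customer] = {
            "subnet": subnet,
            "gateway": subnet[0],
            "vedge": subnet[1],
        }
    return plan


if __name__ == "__main__":
    plan = allocate_customer_links("192.0.2.0/24", ["customer-a", "customer-b"])
    for name, link in plan.items():
        print(name, link["subnet"], "gw", link["gateway"], "vEdge", link["vedge"])
```

A /24 yields 128 such point-to-point subnets, so a customer who changes their transport IP can only break their own link.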
08-27-2019 06:23 AM
Seth,
I want to highlight that even though the solution is able to work behind NAT, there are still considerations for not turning this into an operational nightmare.
The vEdge routers form control sessions with the controllers using DTLS/TLS. If using DTLS, this is done by using UDP in the port range of 12346 to 12446, where 12346 is the base port. This means that you will have several devices trying to communicate through the NAT device with the same source port. When the NAT device tries to translate the source IP, it will not be able to maintain the source port for all of the Edge devices. Now, there are methods to select a different base port and to do port hopping, but it is more complex than if they were behind no NAT or 1:1 NAT.
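As a toy illustration of that source port clash, here is a small Python model of a port-translating NAT. It is not how any particular NAT device is implemented: the replacement port pool starting at 20000 and the inside addresses are invented for the example, and only the base port 12346 comes from the description above.

```python
BASE_PORT = 12346  # DTLS base port mentioned above


def translate_through_pat(inside_flows, pool_start=20000):
    """Map (inside_ip, source_port) flows to outside source ports.
    The original port is kept when it is still free on the shared
    public address; otherwise a port from a hypothetical pool is used."""
    used = set()
    table = {}
    next_free = pool_start
    for flow in inside_flows:
        _, port = flow
        if port not in used:
            outside = port
        else:
            while next_free in used:
                next_free += 1
            outside = next_free
        used.add(outside)
        table[flow] = outside
    return table


if __name__ == "__main__":
    same_port = [("10.0.0.11", BASE_PORT), ("10.0.0.12", BASE_PORT), ("10.0.0.13", BASE_PORT)]
    for flow, outside in translate_through_pat(same_port).items():
        print(flow, "->", outside)
```

In this model only the first vEdge keeps 12346; giving each device a distinct port inside the 12346 to 12446 range lets all of them keep their original source port.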
Also be very careful with symmetric NAT, that is, NAT devices that translate the source port to something other than the original port. If the original port was 12346 and it got translated to 23800, this can cause issues with the data plane, because the port the symmetric NAT reports to the vBond may differ from the actual port used between two vEdges. This can cause issues in forming tunnels between vEdge devices, depending on whether the other side is behind NAT or not.
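The symmetric NAT problem can be modeled in a few lines of Python. This is a deliberately simplified sketch: the class, the endpoint labels, and the way ports are handed out are assumptions for illustration, with 23800 reused from the example above as the first translated port.

```python
class SymmetricNat:
    """Toy symmetric NAT: every (inside source, destination) pair gets
    its own outside port, and inbound traffic is only accepted from the
    destination that the mapping was created for."""

    def __init__(self, first_port=23800):
        self._next_port = first_port
        self._mappings = {}  # (inside_src, destination) -> outside port

    def outbound(self, inside_src, destination):
        key = (inside_src, destination)
        if key not in self._mappings:
            self._mappings[key] = self._next_port
            self._next_port += 1
        return self._mappings[key]

    def accepts_inbound(self, outside_port, remote):
        return any(
            port == outside_port and dest == remote
            for (_, dest), port in self._mappings.items()
        )


if __name__ == "__main__":
    nat = SymmetricNat()
    vedge = ("10.0.0.10", 12346)
    port_seen_by_vbond = nat.outbound(vedge, "vbond")
    port_toward_peer = nat.outbound(vedge, "peer-vedge")
    print("port learned via vBond:", port_seen_by_vbond)
    print("port actually used toward peer:", port_toward_peer)
    print("peer reaches learned port:", nat.accepts_inbound(port_seen_by_vbond, "peer-vedge"))
```

The peer is told about the port the vBond saw, but the NAT opened a different port for the peer itself, so traffic sent to the learned port is dropped in this model.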
08-27-2019 07:46 AM
Thanks Daniel this is good information. If you don't mind let me clarify one thing so I'm sure I understand... The vEdges will communicate to the controllers sourcing port 12346? As opposed to the vEdge sourcing a high number port with a destination to 12346. The control plane will be hosted in AWS and will all get their own public IPs.
08-27-2019 10:32 AM
Yes. Both the source port and the destination port are in the range 12346 to 12446. It's not an ephemeral port. I'm guessing they designed it this way to make it more deterministic and easier to write firewall rules etc.
08-27-2019 11:18 AM
Dear @daniel.dib
Thanks for joining this session and sharing your knowledge. Your contributions are valuable and help many to solve their issues and doubts.
08-27-2019 12:31 PM
He has been a fantastic sparring partner, always around to support! Kudos to @daniel.dib!