It is rare to build a brand-new network from scratch. A much more common scenario is to introduce automation into a pre-existing network that is already delivering services to customers. NSO has been deployed in more than 200 customer networks, so this scenario has been encountered many times. In this document, we discuss the challenges it poses and some approaches that have been used.
Managing new service instances - “Ships that pass in the night”
The simplest approach is to use NSO only to deploy and manage new service instances. Pre-existing service instances are managed through other means, either using CLI towards the devices or using another management system. This could be the end state, or it could be a stepping stone towards more complete NSO management of the network and services.
When using NSO to partially manage networks and devices, the two main issues you will encounter are
When you connect NSO to a network device for the first time, you need to perform a sync-from operation. This causes NSO to retrieve the running configuration from the device, use the NED to parse the configuration into the device YANG model, and store the data in the NSO CDB. NSO then assumes that the configuration in CDB is in sync with the device configuration. The configuration held in CDB is used to calculate the configuration changes required for any given operation, such as service provisioning, modification or cease.
Whenever NSO initiates a transaction towards a device, the first step in the default transaction process is to check that the CDB representation is in sync with the device configuration. It is the NED’s responsibility to implement the check-sync operation in a way that is suitable for the managed device type, e.g. transaction-id, timestamp, etc. If NSO detects that the device configuration is out of sync with the CDB representation, then the transaction will fail and roll back any changes made so far to other devices included in the same transaction.
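As a rough illustration of the idea, and not of how any particular NED implements it, the sketch below models check-sync as comparing a fingerprint stored at the last sync with one computed from the device's current configuration. The hash-based fingerprint and the names `fingerprint` and `check_sync` are assumptions made for this example; a real NED may rely on a device-provided transaction ID or timestamp instead.

```python
import hashlib


def fingerprint(config_lines):
    """Return a stable digest of a device configuration.

    Hypothetical stand-in for a transaction ID or last-changed timestamp.
    """
    joined = "\n".join(line.rstrip() for line in config_lines)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def check_sync(stored_fingerprint, device_config_lines):
    """Report whether the device still matches what was stored at sync time."""
    current = fingerprint(device_config_lines)
    return "in-sync" if current == stored_fingerprint else "out-of-sync"


# Configuration as read during the last sync-from.
synced_config = ["interface Gi0/1", " description uplink"]
stored = fingerprint(synced_config)

# Someone then changes the device out of band via CLI.
changed_config = ["interface Gi0/1", " description uplink to core"]

print(check_sync(stored, synced_config))
print(check_sync(stored, changed_config))
```

With the sample data, the first call reports that the device is in sync and the second reports that it is out of sync, which is the condition that would cause the transaction to fail.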
In this brownfield scenario, where NSO is only managing new service instances, the assumption is that other service instances are managed through other means, either using CLI towards the devices or using another management system. This means that the CDB configuration is likely to be out of sync with the device configuration at any point in time.
There are a few different cases
In the first case, since the configurations do not overlap, you can execute the device transactions without the check-sync operation, on the assumption that the system will always be out of sync. Additionally, the “no-overwrite” option can be used to detect out-of-band changes to any of the configuration that is going to be changed in the transaction. Any such conflicts can then be dealt with in a fallout process, where someone works out which configuration is the correct one and accordingly either updates the service instance or redeploys the service to re-establish the service intent from NSO.
The second case can be dealt with in a similar manner, except that here everything will be fallout, so this is a much more expensive scenario operationally.
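The difference between a full sync check and a check limited to the configuration being changed can be sketched as follows. This is a simplified model, not NSO code: configurations are flat dictionaries keyed by made-up path strings, and `find_conflicts` is a hypothetical helper that looks only at the paths a transaction is about to touch.

```python
def find_conflicts(cdb_view, device_view, planned_changes):
    """Return the paths touched by the transaction whose device value differs
    from the locally stored value, i.e. an out-of-band change."""
    conflicts = []
    for path in sorted(planned_changes):
        if cdb_view.get(path) != device_view.get(path):
            conflicts.append(path)
    return conflicts


# What the local copy believes is on the device.
cdb_view = {
    "vrf/CUST-A/rd": "65000:10",
    "interface/Gi0/1/description": "uplink",
}

# What is actually on the device after some out-of-band changes.
device_view = {
    "vrf/CUST-A/rd": "65000:10",
    "interface/Gi0/1/description": "uplink to core",  # edited via CLI
    "vrf/LEGACY/rd": "65000:99",  # created outside the automation
}

# A new service instance that only touches its own VRF.
new_service = {"vrf/CUST-B/rd": "65000:20"}

# A change that touches the interface edited out of band.
interface_change = {"interface/Gi0/1/description": "customer B access"}

print(find_conflicts(cdb_view, device_view, new_service))
print(find_conflicts(cdb_view, device_view, interface_change))
```

The device as a whole is out of sync in both cases, but only the second change touches configuration that was modified out of band, so only that one would go to the fallout process.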
The NSO Developer Days video below explains this in more detail.
Allocation and Management of Resources
Most NSO service applications/function packs automatically allocate some types of resources. Resources have different scope – some need to be unique per port, per card, per device or (sub-)network wide. And IP addresses of course generally need to be unique globally.
Allocation of resources in a brownfield network must first exclude any resources that are already in use. If the service provider uses an external allocation system, then you need to integrate the NSO service application with this system. The NSO Resource Manager (RM) CFP is designed to be able to integrate with external systems, but it can also be used standalone. If NSO RM CFP allocates resources internally, then you need to mark the resources already in use as used. How you do that depends on where the service provider keeps track of such resources, which may be a manual system such as a spreadsheet. The last resort is to search CDB and use that data to populate the RM CFP with used resources.
Additionally, if the SP is going to create some services with NSO and possibly continue creating other services through other means (e.g. via CLI or another management system), then they need to ensure that they do not allocate overlapping resources. This means that they need to use the same external allocation system, define non-overlapping resource pools, or both.
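The two ideas above, excluding resources already in use and keeping pools disjoint, can be illustrated with a toy VLAN allocator. This is a minimal sketch and does not use or imitate the RM CFP API; the `VlanPool` class, the `pools_overlap` helper and the ID ranges are all invented for the example.

```python
class VlanPool:
    """Toy allocator that hands out the lowest free VLAN ID in an inclusive range."""

    def __init__(self, first, last, in_use=()):
        self.first = first
        self.last = last
        # IDs already found in use (e.g. from a spreadsheet or from device
        # configurations) are excluded before any allocation takes place.
        self.used = {v for v in in_use if first <= v <= last}

    def allocate(self):
        for vlan in range(self.first, self.last + 1):
            if vlan not in self.used:
                self.used.add(vlan)
                return vlan
        raise RuntimeError("pool exhausted")


def pools_overlap(a, b):
    """Return True if the two inclusive ranges share at least one ID."""
    return a.first <= b.last and b.first <= a.last


# Pool used by the automation, seeded with IDs discovered on the network.
automation_pool = VlanPool(100, 199, in_use=[100, 101, 103])
# Separate pool reserved for services still created by other means.
manual_pool = VlanPool(200, 299)

first_id = automation_pool.allocate()
second_id = automation_pool.allocate()
print(first_id, second_id)
print(pools_overlap(automation_pool, manual_pool))
```

The allocator skips the IDs marked as in use, and the overlap check confirms that the automated and manual ranges cannot collide.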
Managing Pre-Existing Service Instances
You may also want to manage your pre-existing service instances with NSO. This could involve
How uniform are the pre-existing services?
Imagine that you have deployed a function pack for service type A (e.g. L3VPN), and you have started creating instances of this service type. There are previous instances of service type A on the network, which have been either created manually, or using some other provisioning tool. How uniform are these?
Most likely there will be service instances that could not have been created with the existing function pack. Options include:
What are the pre-existing service instances?
To identify the pre-existing service instances, you first need to know how they were created. The service provider will probably have a record of the service instances
There is also the chance that the records are inaccurate relative to the configurations that exist in the network devices.
You can also try to discover the services from the network devices, but to state the obvious: you can only discover data that exists in the device configuration. Data such as service name, service ID and customer name may be stored in description fields, may be implicitly encoded in VRF name, VLAN ID, etc., or may not be stored in the devices at all.
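As an example of recovering service data from description fields, the sketch below scans configuration lines for a naming convention. The convention itself (`CUST:<customer> SVC:<service-id>`) is purely hypothetical, as are the function and field names; a real network will have its own conventions, often several, and some descriptions will carry no usable data at all.

```python
import re

# Hypothetical convention: "description CUST:<customer> SVC:<service-id>"
DESCRIPTION_PATTERN = re.compile(r"CUST:(?P<customer>\S+)\s+SVC:(?P<service_id>\S+)")


def extract_service_hints(config_lines):
    """Collect customer and service ID hints from description lines.

    Descriptions that do not follow the convention are kept with empty
    fields so that they can be reviewed manually.
    """
    hints = []
    for line in config_lines:
        stripped = line.strip()
        if not stripped.startswith("description "):
            continue
        match = DESCRIPTION_PATTERN.search(stripped)
        if match:
            hints.append(match.groupdict())
        else:
            hints.append({"customer": None, "service_id": None, "raw": stripped})
    return hints


sample_config = [
    "interface GigabitEthernet0/1",
    " description CUST:acme SVC:L3VPN-0042",
    "interface GigabitEthernet0/2",
    " description old link, do not touch",
]

for hint in extract_service_hints(sample_config):
    print(hint)
```

The second description yields no customer or service ID, which is the kind of entry that ends up in manual investigation.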
The network will probably also have orphaned service configurations, or parts thereof. These are service instances that should no longer be in use (but may or may not be), or remnants of ceased service instances.
Migrating pre-existing service instances under NSO management
Hence, migrating pre-existing services into NSO automation is a forensic exercise: you need to find sufficient data to create each instance, and then provision it, e.g. using dry-run, to see whether the generated configuration matches what is on the network.
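The comparison step can be pictured as follows. This is a simplified sketch that treats configuration as flat text lines and ignores hierarchy and ordering; the function name and the sample configuration are assumptions for illustration, and a real dry-run output is structured differently.

```python
def lines_dry_run_would_add(intended_lines, device_lines):
    """Return the intended configuration lines that are not yet on the device.

    An empty result suggests that the candidate service instance reproduces
    the configuration that already exists.
    """
    device = {line.strip() for line in device_lines if line.strip()}
    return [
        line.strip()
        for line in intended_lines
        if line.strip() and line.strip() not in device
    ]


# Configuration generated from the candidate service parameters.
intended = [
    "vrf definition CUST-A",
    " rd 65000:10",
    " route-target export 65000:10",
]

# Configuration currently on the device.
on_device = [
    "vrf definition CUST-A",
    " rd 65000:10",
    " route-target export 65000:11",
]

print(lines_dry_run_would_add(intended, on_device))
print(lines_dry_run_would_add(intended, intended))
```

In the first comparison the route-target differs, so either the candidate parameters are wrong or the device configuration is, and the instance needs further investigation before it is migrated.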
If the service application/function pack automatically allocates resources, then you need either to
For NSO to take ownership of the configuration associated with a pre-existing service instance, it needs to be tricked into believing that it created the configuration. All configuration that is owned by an NSO service instance is tagged with a reference count, which reflects how many NSO services depend on this configuration. This ensures, for example, that NSO does not delete the configuration before it is certain that the configuration is no longer in use.
If you don’t update the reference counts, then NSO will not own the pre-existing configuration. If you then delete the service in NSO, NSO will put the configuration back to the state it was in before NSO modified it, i.e. back to the pre-existing service.
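The effect of ownership on service deletion can be shown with a toy model. This sketch does not reproduce NSO's actual bookkeeping or counter values; the `ConfigStore` class, its methods and the path strings are invented to show why configuration that the service does not own survives deletion, while configuration that has been taken over is removed.

```python
class ConfigStore:
    """Toy model of configuration ownership by service instances."""

    def __init__(self, preexisting=()):
        # Paths that existed before any service was created have no owner yet.
        self.refcount = {path: 0 for path in preexisting}
        self.preexisting = set(preexisting)

    def create_service(self, paths):
        for path in paths:
            self.refcount[path] = self.refcount.get(path, 0) + 1

    def take_ownership(self, paths):
        """Treat pre-existing paths as if the service had created them."""
        self.preexisting -= set(paths)

    def delete_service(self, paths):
        for path in paths:
            self.refcount[path] -= 1
            # Only remove configuration that nobody else depends on.
            if self.refcount[path] == 0 and path not in self.preexisting:
                del self.refcount[path]

    def paths(self):
        return sorted(self.refcount)


service_paths = ["vrf/CUST-A", "vrf/CUST-A/rd"]

# Without taking ownership: deleting the service restores the earlier state.
not_owned = ConfigStore(preexisting=["vrf/CUST-A"])
not_owned.create_service(service_paths)
not_owned.delete_service(service_paths)
print(not_owned.paths())

# With ownership taken: deleting the service removes everything it covers.
owned = ConfigStore(preexisting=["vrf/CUST-A"])
owned.create_service(service_paths)
owned.take_ownership(service_paths)
owned.delete_service(service_paths)
print(owned.paths())
```

In the first store the pre-existing VRF is still present after the service is deleted; in the second, nothing remains.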
The NSO Development Guide reference below explains how to deal with the reference counters and has a wider discussion about service discovery.
On-going Discovery and Reconciliation
So far, the assumption has been that you discover any service only once. This is typically done by CX during the deployment of NSO into the service provider network. Once a service instance is under management by NSO then all subsequent changes are done there. From then on, any further service instances of that type are provisioned using NSO. This is the recommended operating mode, and the cheapest.
However, some service providers have organisational and operational constraints that mean that this is not always possible. They necessitate out-of-band changes to service instances under NSO management and even new service instances to be created outside of NSO. This may require the process described above to be applied repeatedly, in order to discover new service instances.
If changes have been made to service instances under NSO management, e.g. changes to an attribute value, then these can easily be identified by a check-sync operation. The NSO service instance can then be modified accordingly, or alternatively the NSO service intent can be redeployed, overwriting the network configuration.
Reactive Fastmap, Stacked Services and Layered Services Architecture (LSA)
Architectures such as those based on Reactive Fastmap (RFM) or Nano-services add complexity. The normal dry-run does not work out of the box, since it will show only the configuration after the first iteration of RFM. There are ways of getting around this, however.
It is now common in NSO function packs/service applications to have multiple models on top of each other. While we previously talked about service models mapping to device models, there are now layers of Stacked Services, possibly in a Layered Services Architecture. Since the purpose of such model stacking is to abstract and hide details towards the bottom of the stack, it becomes harder to provide explicit resource allocation, as these parameters need to be propagated up the model stack to the top model.
Automated Service Discovery and Reconciliation
As we have discussed so far, there are a lot of considerations to be made regarding placing pre-existing services under NSO management. The question now is not only whether this can be automated, but also whether it should be.
As usual, the answer is “it depends”. If you have 10,000 instances of service A, 1,000 of service B, 100 of service C and 10 of service D, then it’s probably cost-effective to automate discovery of service A, but probably not for service D. This assumes, of course, that service type A is applied sufficiently uniformly, and that you are not dealing with a hundred variations of service A...
Do consider, however, that service discovery is usually a once-only operation, and it could be easier to outsource to an organisation that does this repeatedly, e.g. Cisco CX.
A framework such as the one described by Dan Sullivan in the video below can greatly reduce the cost of automated service reconciliation, but depending on your services, this can still be costly.