Qi Li
Cisco Employee
 
NSO is a powerful and sophisticated orchestration platform designed to simplify and optimize network operations. While it may seem complex at first glance and requires some effort to master, its capabilities bring significant long-term benefits for those who invest the time to learn it. During the learning process and in daily usage, there will be scenarios where you have questions about NSO that are hard to answer from the NSO Gitbook alone, and you will wish you had an assistant to point you to the right place. There are also use cases where a customer wants an AI controller on the northbound side to instruct NSO what to do next. For example, asking NSO to configure a query timeout requires obtaining the XML configuration for the query timeout. The customer can state this intent to the AI controller, which knows from the Gitbook how to configure the query timeout, and then use the resulting payload to modify ncs.conf. However, the majority of enterprise networks running NSO have isolated control planes. Giving customers the ability, and the API, to build their own NSO AI Assistant is therefore essential for taking a first step toward AI use cases, or simply an ask-anything bot for NSO.
 
In this article, we present the “NSO Gitbook to RAG Converter API” and walk through a few examples of how to use it to build your own NSO AI Assistant. The “NSO Gitbook to RAG Converter API” converts the NSO Gitbook https://nso-docs.cisco.com/ into a RAG database, so an LLM can consume the retrieved values as context and use that context to construct always up-to-date answers. To keep the content LLM-friendly, with less noise and less effort spent on data cleaning, we use the NSO Gitbook GitHub repository (NSO-developer/nso-gitbook: NSO GitBook GitHub sync) instead of HTML parsing. This gives more accurate search results in the RAG database and a better understanding by the LLM.
 

Repository 

The “NSO Gitbook to RAG Converter API” can be obtained from the following repository: NSO-developer/gitbook-to-rag-converter
 

Getting started

Start by constructing a project structure like the one below, then clone the “NSO Gitbook to RAG Converter API” repository under lib.
.
├── Makefile
├── lib
│   ├── __init__.py
│   └── gitbook-to-rag-converter <<<<< Clone here
├── config.json
├── logs
├── requirements.txt
└── main.py <<<< We will work on this file

In this article, we will modify main.py by importing the “NSO Gitbook to RAG Converter API” and using it to populate the RAG database and interact with the LLM. This guide walks through building a simple NSO AI Assistant. Starting from that simple assistant, one can further extend it into an agentic RAG by adding components such as context validation.
 

Usage of “NSO Gitbook to RAG Converter API”

The “NSO Gitbook to RAG Converter API” is simple to use. As preparation, clone the NSO Gitbook GitHub repository into resource/nso-gitbook with the command below.

 

git clone git@github.com:NSO-developer/nso-gitbook.git resource/nso-gitbook

 

Make sure to install all the dependencies with pip:

 

pip install -r requirements.txt

 

Set up the API by configuring config.json:

  • "vdb_path" - the location of the RAG database
  • "markdown_path" - the location of the Gitbook markdown (recommended to keep as-is)
  • "embedding_model" - which Hugging Face embedding model is used to populate the RAG database
  • "doc_vers" - which NSO versions of the Gitbook documentation to populate into the RAG database; "latest" means the main branch

 

{
    "vdb_path":"resource/vectordb",
    "markdown_path":"resource/nso-gitbook/",
    "embedding_model":"sentence-transformers/all-mpnet-base-v2",
    "doc_vers":["latest","6.4","6.3","6.2","6.1"]
}
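Before the first run, it can help to fail fast on a malformed config.json. The sketch below is an illustrative helper (load_config and REQUIRED_KEYS are not part of the converter API) that loads the file and verifies the four expected keys exist:

```python
import json

# Keys the converter's config.json is expected to contain (per the list above)
REQUIRED_KEYS = {"vdb_path", "markdown_path", "embedding_model", "doc_vers"}

def load_config(path="config.json"):
    """Load config.json and fail early if an expected key is missing."""
    with open(path) as f:
        config = json.load(f)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"config.json is missing keys: {sorted(missing)}")
    return config
```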

 

To populate the RAG database, call the function:

 

save_vdb()

 

 

This function populates the RAG database for every NSO version listed in "doc_vers".
To obtain information from the RAG database, use query_vdb. The mode parameter can be "max_marginal_relevance_search" or "similarity" (the default is "similarity"), query is the question from the end user, and top_result is how many entries to extract.
 
data=query_vdb(query,top_result=2)

 

Now that we know how the “NSO Gitbook to RAG Converter API” works, let's import it into main.py and start constructing a simple NSO AI Assistant.
 

Building a Simple NSO AI Assistant

In this chapter, we build a simple NSO AI Assistant with the “NSO Gitbook to RAG Converter API”. The main concept is shown in the diagram below: we use the context extracted from the converter's RAG database as background information to help the LLM answer the query more accurately.

[Figure: Simple scenario — RAG context feeding the LLM]

 
We start by importing the high-level library into main.py:

 

from lib.gitbook_to_rag_converter.langchain_markdown import save_vdb, query_vdb

 

When the system starts, populate the RAG database if it does not exist yet.

import os

persist_directory = "resource/vectordb"  # must match "vdb_path" in config.json
if not os.path.exists(persist_directory + "/chroma.sqlite3"):
    save_vdb()

 

 
Then we take input from the end user and use it to extract context from the RAG database.
print("What can I help you today:")
query=input()
rag_result=query_vdb(query,mode="max_marginal_relevance_search",top_result=2)

 

 
The output is a string with the "top_result" amount of context from the RAG database. Below is example output for the question "What is Operational Data in CDB in NSO 6.4.8?":
source: title: CDB and YANG - Key Features of the CDB, url: https://nso-docs.cisco.com/guides/development/introduction-to-automation/cdb-and-yang
result: The CDB is a dedicated built-in storage for data in NSO. It was built from the ground up to efficiently store and access network configuration data, such as device configurations, service parameters, and even configuration for NSO itself. Unlike traditional SQL databases that store data as rows in a table, the CDB is a hierarchical database, with a structure resembling a tree. You could think of it as somewhat like a big XML document that can store all kinds of data.  
There are a number of other features that make the CDB an excellent choice for a configuration store:  
* Fast lightweight database access through a well-defined API.
* Subscription (“push”) mechanism for change notification.
* Transaction support for ensuring data consistency.
* Rich and extensible schema based on YANG.
* Built-in support for schema and associated data upgrade.
* Close integration with NSO for low-maintenance operation.  
To speed up operations, CDB keeps a configurable amount of configuration data in RAM, in addition to persisting it to disk (see [CDB Persistence](../../administration/advanced-topics/cdb-persistence.md) for details). The CDB also stores transient operational data, such as alarms and traffic statistics. By default, this operational data is only kept in RAM and is reset during restarts, however, the CDB can be instructed to persist it if required.  
{% hint style="info" %}
The automatic schema update feature is useful not only when performing an actual upgrade of NSO itself, it also simplifies the development process. It allows individual developers to add and delete items in the configuration independently.  
Additionally, the schema for data in the CDB is defined with a standard modeling language called YANG. YANG (RFC 7950, [https://tools.ietf.org/html/rfc7950](https://tools.ietf.org/html/rfc7950)) describes constraints on the data and allows the CDB to store values more efficiently.
{% endhint %}

source: title: Tools - Insights, url: https://nso-docs.cisco.com/guides/operation-and-usage/webui/tools
result: The **Insights** view collects and displays the following types of operational information using the `/ncs:metrics` data model to present useful statistics:  
* Real-time data about transactions, commit queues, and northbound sessions.
* Sessions created and closed towards northbound interfaces since the last restart (CLI, JSON-RPC, NETCONF, RESTCONF, SNMP).
* Transactions since last restart (committed, aborted, and conflicting). You can select between the running and operational data stores.
* Devices and their sync statuses.
* CDB info about its size, compaction, etc.
The "query_vdb" function recognizes the NSO version specified in the query and uses it as a filter, so only information for that specific NSO version is extracted.
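Because the returned string interleaves "source:" headers and content, the assistant can also cite its sources alongside the answer. The helper below is illustrative only; the line format it parses is an assumption based on the sample output shown above:

```python
import re

def extract_sources(rag_result: str):
    """Pull (title, url) pairs out of the 'source:' lines of a query_vdb result."""
    pattern = r"source: title: (?P<title>.+?), url: (?P<url>\S+)"
    return [m.groupdict() for m in re.finditer(pattern, rag_result)]
```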
Next, we feed the extracted context to the LLM and ask it to answer the query using that context. In the example below, we use OpenAI as the LLM vendor; however, one can use any other LLM vendor supported by LangChain.
 
We start by setting the parameters that OpenAI requires: "OPENAI_API_VERSION", "AZURE_OPENAI_ENDPOINT", the app key, and the LLM model of choice provided by OpenAI. init_openai creates an AzureChatOpenAI object, and query_openai invokes that object to send a query to OpenAI.

 

import os

import openai
from langchain.chat_models import AzureChatOpenAI

os.environ["OPENAI_API_VERSION"] = "OPEN API VERSION"
os.environ["AZURE_OPENAI_ENDPOINT"] = "OPENAI ENDPOINT"
os.environ["OPENAI_API_KEY"] = "OPENAI API KEY"
app_key = "APP KEY"
config_model = "LLM MODEL NAME"


def init_openai(model=None):
    """
    Initialize the Azure OpenAI chat service.
    """
    models = model if model else config_model
    llm = AzureChatOpenAI(
        deployment_name=models,
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["OPENAI_API_KEY"],
        api_version=os.environ["OPENAI_API_VERSION"],
        model_kwargs=dict(
            user=f'{{"appkey": "{app_key}"}}'
        ),
    )
    return llm


def query_openai(query, llm):
    """
    Query the OpenAI service with a given query string.
    """
    try:
        response = llm.invoke(query)
    except openai.AuthenticationError:
        # update_access_token() is assumed to refresh credentials and
        # return a new AzureChatOpenAI object
        print("Authentication Error. Update access token")
        llm = update_access_token()
        response = llm.invoke(query)
    return {"content": response.content,
            "token": response.response_metadata["token_usage"]["total_tokens"]}

 
Inside the query, we construct a HumanMessage and a SystemMessage, where the SystemMessage includes the context we extracted from the RAG database. In the example below, we build the SystemMessage in systemPrompt, providing a short instruction and injecting the RAG context, then send the SystemMessage together with the query (as a HumanMessage) to the LLM to obtain the final result.
 
from langchain.messages import SystemMessage, HumanMessage

systemPrompt = f'''
You are an NSO expert that will answer questions based on the context provided below.

Here are the set of contexts:

<contexts>
{rag_result}
</contexts>
'''

messages = [
    SystemMessage(systemPrompt),
    HumanMessage(query)
]
llm = init_openai()
response = query_openai(messages, llm)
The response will contain the LLM's answer in its "content" field. At this point, you have a simple NSO AI Assistant.
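The whole flow can be condensed into one function. In this sketch the retrieval and LLM calls are passed in as plain callables, so you can plug in query_vdb and query_openai from the sections above (the function name answer and its wiring are illustrative, not part of the API):

```python
def answer(query, retrieve, ask, top_result=2):
    """Retrieve RAG context for the query, wrap it in a system prompt,
    and hand both to the LLM callable."""
    rag_result = retrieve(query, top_result=top_result)
    system_prompt = (
        "You are an NSO expert that will answer questions based on the "
        "context provided below.\n\n"
        f"<contexts>\n{rag_result}\n</contexts>"
    )
    return ask(system_prompt, query)
```

For example, with the helpers above one could call answer(query, query_vdb, lambda s, q: query_openai([SystemMessage(s), HumanMessage(q)], llm)).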

 

Recommended Next Step

As the next step, we recommend personalizing the AI Assistant by giving it memory, as described in the Memory overview in the LangChain documentation.
 
At the same time, you can make retrieval more accurate by converting the RAG into an agentic RAG. This lets the AI decide to rephrase the user's question into a better version, and to validate whether the RAG context is relevant.
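As a small first step toward agentic RAG, the rephrase stage can be sketched as an extra LLM round-trip before retrieval. The prompt wording and the build_rephrase_prompt name below are illustrative assumptions, not part of the converter API:

```python
def build_rephrase_prompt(query: str) -> str:
    """Build a prompt asking the LLM to rewrite the user's question
    before it is sent to the RAG database."""
    return (
        "Rephrase the following NSO question so it is precise and "
        "self-contained. Keep any NSO version numbers unchanged.\n\n"
        f"Question: {query}\n"
        "Rephrased question:"
    )

# In the assistant, the rephrased query would replace the original, e.g.:
#   better = query_openai(build_rephrase_prompt(query), llm)["content"]
#   rag_result = query_vdb(better, top_result=2)
```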

 

[Figure: Agentic RAG flow with query rephrasing and context validation]

Regarding how to rephrase the query and the effect of rephrasing, see the following article: PaperMadeEasy | Rephrase and Respond: Let LLM Ask Better Questions for Themselves, by Himanshu Bajpai (Medium).

 
