NSO CDB dump - encoding specifics for the sort-of-base64 text blobs?

atreyulovesyou
Level 1

hi,

I'm looking at CDB dumps as part of trying to create a linked knowledge graph of NSO, for things like comparing the total configuration space of devices against the part managed by services we've created in NSO.  This, along with what amounts to a 'sync-from dry-run', should let me see config coverage and things like device drift.

I know of no better way to get at the totality of the CDB, in a form I can convert to another data format (such as loading it into dataframes and then walking those to generate a network graph), than to make a CDB dump.  Querying the APIs exhaustively is always VERY slow for me (5.7.x) by comparison, and the CDB dump is suitable for offline processing on another system.

It's pretty nice as an information source, as you end up with entries like backpointer:"[ /ncs:services/my-thing:my-thing{xr-xyz-0} ]", which let you attribute which services are attached to which devices.  By the looks of things the dump is read top-to-bottom, meaning a backpointer points you at where a service instance implemented a set of settings, and 'pad' is for things like namespace entries.  By this I mean not every line is attributed with a backpointer.

I should be able to parse these out... but some of it is not quite intelligible to me.

There are encoded text bits: 'latest-commit-params', 'latest-u-info', and larger blocks called 'diff-set'.  Those 'diff-set' things might be useful.

Attempting to decode these diff-set entries: it appears to be base64 encoding, but it comes out as half English text (in our case) referencing various service, tailf, and NED names, in the form of URLs.

However, when you just try to convert this to text, there's a mash of what appear to be high-Unicode characters mixed in among it.

My guess is this is somewhat structured data, serialized as these 'diff-set' blocks into the cdb dump. 

Has anyone had success decoding the encoded content of a CDB dump or know about specifics of the dump export encoding?

1 Accepted Solution

atreyulovesyou
Level 1

thanks @radioman  I'll see what I can figure out.  When I decoded it as base64, escaped characters ended up intermingled with the readable decoded text.

apparently term · PyPI can decode Erlang 'binary_to_term/1' output - it's some kind of tty format apparently, who knows!

=====

ok, better, this is an improvement.  this is an object serialization format of some kind, like your example string above.

REF: elixir-py · PyPI

import elixir
import base64

# one base64 blob copied from the dump; the embedded \n is just line wrapping
i_am_so_testy = "g2wAAAABaAJkAAh0cmFjZV9pZG0AAAAkYjcxNjU0MDktYTVhNC00YWRhLThhZDMtNWI0OTgy\nNjJkOWNiag=="

# base64 text -> raw bytes (b64decode skips the newline by default)
i_am_not_an_arrow_i_am_not_a_pickle = base64.b64decode(i_am_so_testy)

# raw bytes -> Python objects
foo_d = elixir.binary_to_term(i_am_not_an_arrow_i_am_not_a_pickle)

# In [ipython]: foo_d
# Out[ipython]: [(Atom(b'trace_id'), b'b7165409-a5a4-4ada-8ad3-5b498262d9cb')]

so that thing is an encoded traceID.  
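
For anyone who would rather not pull in a library just to peek at these blobs, here is a minimal sketch of a decoder using only the Python standard library. It follows the tag layout in the Erlang external term format document linked in this thread, but it only covers a handful of tags (small integers, atoms, tuples, lists, strings, binaries, and the compressed wrapper); anything else raises NotImplementedError. The names Atom, decode_term and _decode are made up for this example, and nothing here is NSO-specific or an official tool.

```python
import base64
import struct
import zlib


class Atom(str):
    """Marker type so Erlang atoms stay distinguishable from plain strings."""

    def __repr__(self):
        return f"Atom({str(self)!r})"


def decode_term(data: bytes):
    """Decode a small subset of the Erlang external term format."""
    if not data or data[0] != 131:
        raise ValueError("expected version byte 131 at offset 0")
    if len(data) > 1 and data[1] == 80:
        # Compressed term: tag 80, 4-byte uncompressed size, then zlib data.
        data = b"\x83" + zlib.decompress(data[6:])
    value, end = _decode(data, 1)
    if end != len(data):
        raise ValueError("trailing bytes after term")
    return value


def _decode(buf: bytes, pos: int):
    """Decode one term starting at pos; return (value, next_pos)."""
    tag = buf[pos]
    pos += 1
    if tag == 97:  # SMALL_INTEGER_EXT: one unsigned byte
        return buf[pos], pos + 1
    if tag == 98:  # INTEGER_EXT: 32-bit signed, big-endian
        return struct.unpack_from(">i", buf, pos)[0], pos + 4
    if tag in (100, 118):  # ATOM_EXT / ATOM_UTF8_EXT: 2-byte length
        (n,) = struct.unpack_from(">H", buf, pos)
        enc = "latin-1" if tag == 100 else "utf-8"
        return Atom(buf[pos + 2:pos + 2 + n].decode(enc)), pos + 2 + n
    if tag in (115, 119):  # SMALL_ATOM_EXT / SMALL_ATOM_UTF8_EXT: 1-byte length
        n = buf[pos]
        enc = "latin-1" if tag == 115 else "utf-8"
        return Atom(buf[pos + 1:pos + 1 + n].decode(enc)), pos + 1 + n
    if tag in (104, 105):  # SMALL_TUPLE_EXT / LARGE_TUPLE_EXT
        if tag == 104:
            n = buf[pos]
            pos += 1
        else:
            (n,) = struct.unpack_from(">I", buf, pos)
            pos += 4
        items = []
        for _ in range(n):
            item, pos = _decode(buf, pos)
            items.append(item)
        return tuple(items), pos
    if tag == 106:  # NIL_EXT: the empty list
        return [], pos
    if tag == 107:  # STRING_EXT: 2-byte length, then that many bytes
        (n,) = struct.unpack_from(">H", buf, pos)
        return list(buf[pos + 2:pos + 2 + n]), pos + 2 + n
    if tag == 108:  # LIST_EXT: 4-byte count, the elements, then a tail term
        (n,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        items = []
        for _ in range(n):
            item, pos = _decode(buf, pos)
            items.append(item)
        tail, pos = _decode(buf, pos)
        if tail != []:
            items.append(tail)  # improper list: keep the tail as a last item
        return items, pos
    if tag == 109:  # BINARY_EXT: 4-byte length, then raw bytes
        (n,) = struct.unpack_from(">I", buf, pos)
        return buf[pos + 4:pos + 4 + n], pos + 4 + n
    raise NotImplementedError(f"unsupported tag {tag} at offset {pos - 1}")


SAMPLE = (
    "g2wAAAABaAJkAAh0cmFjZV9pZG0AAAAkYjcxNjU0MDktYTVhNC00YWRhLThhZDMtNWI0OTgy\n"
    "NjJkOWNiag=="
)
print(decode_term(base64.b64decode(SAMPLE)))
```

Run against the sample string, this should give back the same one-element list holding a trace_id atom and the UUID binary. Real diff-set blobs will likely hit tags this sketch does not handle, so treat it as a starting point for poking around.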


radioman
Spotlight

Hi,

I have found that the values of e.g. forward-diff-set, diff-set, latest-commit-params and latest-u-info contain data in this format: https://www.erlang.org/doc/apps/erts/erl_ext_dist.html

You could try something like this in the Erlang shell:

erlang:binary_to_term(base64:decode("g2wAAAABaAJkAAh0cmFjZV9pZG0AAAAkYjcxNjU0MDktYTVhNC00YWRhLThhZDMtNWI0OTgy\nNjJkOWNiag==")).

br.

Kristoffer Larsen

ramkraja
Cisco Employee

Hi,

Yes, as Kristoffer said above, and as you figured out, these are Erlang terms converted to binary and base64 encoded.

But please note that these are strictly *internal* NSO fields, and are critical for the correct operation of FASTMAP. These should really be treated as opaque (which is why they are normally not exposed to any interfaces). You should not ascribe any meaning to them or make decisions based on them. Their format and/or contents are subject to change without notice.

Thanks,

Ram

"You should not ascribe any meaning to them or make decisions based on them. Their format and/or contents are subject to change without notice." - OK

I'm just at the point of trying to turn the CDB dump into knowledge, and I made a little watcher that fires whenever a rollback file appears and triggers a CDB dump. The deltas between one CDB dump and the prior one (if you exclude the alarms stuff) seem, on cursory examination, quite interesting as an actual diff, albeit one captured separately from the rollbacks.
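
Comparing one dump against the prior one can be sketched with difflib from the standard library. This assumes both dumps have already been read into lists of text lines, and that the alarm entries can be recognised by a simple substring check; that filter is a placeholder, since the real dump layout for alarms is not shown here. dump_delta is a made-up helper name.

```python
import difflib


def dump_delta(old_lines, new_lines, skip=lambda line: "alarm" in line):
    """Return (removed, added) lines between two dumps, ignoring skipped lines.

    The default skip predicate is only a guess at how to drop alarm entries.
    """
    old = [line for line in old_lines if not skip(line)]
    new = [line for line in new_lines if not skip(line)]
    removed, added = [], []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op in ("replace", "delete"):
            removed.extend(old[i1:i2])
        if op in ("replace", "insert"):
            added.extend(new[j1:j2])
    return removed, added


# Made-up lines standing in for two consecutive dumps.
before = ["a 1", "alarms x", "b 2"]
after = ["a 1", "alarms y", "b 3", "c 4"]
removed, added = dump_delta(before, after)
print(removed, added)
```

Passing a different skip function is the place to plug in whatever actually identifies the alarm subtree in a given dump.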

I still need to understand 'backpointer' and 'refcounter', but it seems like a backpointer is an entry point into the lines that follow it in the dump file, marking a service instance association - in my case looking like `[ /ncs:services/my-service:my-service{device-as-target-its-name} ]`
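
A rough sketch of pulling the service path and instance key out of backpointer values like the ones quoted in this thread, so lines can be attributed to service instances. The regex assumes each path ends in a single {key} list entry, which matches the examples here but is only a guess at the general shape; parse_backpointers and group_by_service are hypothetical helper names.

```python
import re
from collections import defaultdict

# Path up to the first "{", then the key text inside the braces.
BACKPOINTER_RE = re.compile(r"(/[^\s\[\]{}]+)\{([^{}]*)\}")


def parse_backpointers(value):
    """Return (service_path, key) pairs found in one backpointer value."""
    return [(m.group(1), m.group(2)) for m in BACKPOINTER_RE.finditer(value)]


def group_by_service(values):
    """Map each service path to the set of instance keys seen for it."""
    grouped = defaultdict(set)
    for value in values:
        for path, key in parse_backpointers(value):
            grouped[path].add(key)
    return dict(grouped)


sample = 'backpointer:"[ /ncs:services/my-thing:my-thing{xr-xyz-0} ]"'
print(parse_backpointers(sample))
```

The grouped result is one possible starting point for the edges of a service-to-device graph, assuming the key really is the device name as in the examples.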

anyhoo, the lines themselves read nicely enough to be chopped into something that'd let me draw a tree.  I feel like I'm remapping to what is probably in YANG space, or something else that's arcane to me.

but right now, the goal is to turn it into knowledge, and to find a way to 'sync-from dry-run' or the like, to see things like config drift, manual applications of settings and whatnot.  Likewise, to map our coverage in terms of 'what is the total NED management space vs what we provide config mgmt for'.

yangster makes me mad, and (always) crashes my vscode on a remote nso server with java processes which keep starting, consuming memory, and spawning multiple instances of themselves.  entirely non-functional (for me).

I really do hope there's a good way to echo the CDB other than the dumps.  To me it seems like an approximation of a graph DB, in that you've an empty root tree and then tailf/other stuff coming off that empty root.  All I've seen by way of attempts is this: FULLTEXT01.pdf (diva-portal.org)

I know less than zero about networking, but I'd bet you could attach some of these fancy ml algorithms to it if the structure was intelligible.

The networking space is one of language, in configuration certainly, and it is a protocol- and rules-defined domain with respect to correctness of form, with guided tasks.  There's backing telemetry that can assert success, and I'm sure it's an exciting time for things like self-healing and seek-to-connect-with-feature-xyz kind of stuff for the likes of Cisco.

I just want to finger paint pictures

have a good Friday all.