
Efficient ways of collecting data on large vManage estate

Hi, 

I work for a SaaS business called Highlight (highlight.net), which provides a network service assurance platform for Service Provider managed services teams and large/complex enterprises. To do what we do, we need to collect data from the SD-WAN controller (in this case vManage, but we also do this with the Meraki dashboard) on a regular basis, normally every 1 to 5 minutes. Specifically for Catalyst SD-WAN, we want to collect state and stats on device interfaces, BFD sessions and App Route statistics.

There are some vManage API calls that help us greatly in this regard, and some that don't. Calls like /dataservice/statistics/interface and /dataservice/data/device/state/BFDSessions are 'Org'-wide calls, so we only need to poll them once per poll interval.

Unfortunately, other calls such as /dataservice/device/ipsec/outbound?deviceId={{deviceId}} and /dataservice/device/app-route/statistics?deviceId={{deviceId}} are per-device calls, and these don't scale well when you have 500+ devices.
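As a rough illustration of the scaling problem, here is a back-of-the-envelope sketch of what one poll cycle costs with per-device calls. The 500 devices and the two per-device endpoints come from the post; the function name, the 0.5 s per-call latency and the concurrency of 10 are assumptions picked only for the example.

```python
import math

def per_device_poll_cost(devices: int, calls_per_device: int,
                         seconds_per_call: float, concurrency: int) -> dict:
    """Estimate request count and wall-clock time for one poll cycle."""
    total_requests = devices * calls_per_device
    # With `concurrency` parallel workers, requests complete in waves.
    waves = math.ceil(total_requests / concurrency)
    return {
        "total_requests": total_requests,
        "wall_clock_seconds": waves * seconds_per_call,
    }

# 500 devices, 2 per-device endpoints (ipsec/outbound and app-route/statistics).
# Latency and concurrency are assumed values, not measurements.
cost = per_device_poll_cost(devices=500, calls_per_device=2,
                            seconds_per_call=0.5, concurrency=10)
print(cost)
```

Under those assumptions a single cycle is 1,000 requests, which is a large share of a 1-minute poll interval before any retries.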

'Ah but we have bulk state and stats calls!' I hear you say - yes, these calls exist, but from what I can tell there is a 20 minute delay on most of the calls in /dataservice/data/device/statistics/<data-type>. For example, the approutestatsstatistics data set is showing as 20 minutes delayed in the portal UI, so I suspect the API won't be any better.
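For reference, a minimal sketch of how a client might walk the pages of one of these bulk calls. It assumes the response carries a `data` list plus a `pageInfo` object with `hasMoreData` and `scrollId`, which is how I understand the bulk API to page results; check that against your release. `fetch` is a hypothetical callable standing in for the actual HTTP GET, so the sketch runs without a controller. It does nothing about the delay itself.

```python
from typing import Callable, Iterator, Optional

def iter_bulk_records(fetch: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every record from a paged bulk endpoint.

    `fetch(scroll_id)` is a hypothetical callable that performs the GET and
    returns the decoded JSON; it receives None for the first page.
    """
    scroll_id = None
    while True:
        page = fetch(scroll_id)
        yield from page.get("data", [])
        info = page.get("pageInfo", {})
        # Stop when the controller reports no more data, or gives no cursor.
        if not info.get("hasMoreData") or not info.get("scrollId"):
            break
        scroll_id = info["scrollId"]

# Canned pages in place of real responses (assumed shape, made-up values).
_pages = {
    None: {"data": [{"id": 1}, {"id": 2}],
           "pageInfo": {"hasMoreData": True, "scrollId": "abc"}},
    "abc": {"data": [{"id": 3}], "pageInfo": {"hasMoreData": False}},
}
records = list(iter_bulk_records(lambda sid: _pages[sid]))
print(len(records))
```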

So, is there any way of removing the 20 min delay on these bulk calls or are there other API calls we should be using? 

Thanks in advance.

Martin Saunders

4 Replies

Hi @MartinSHighlight, the APIs you mentioned are the only ones I know of for the information you are looking for. You could try the https://github.com/CiscoDevNet/SD-WAN-Reporting-Tool, as this will provide all the information in one go for all devices.

As far as I am aware, the fixed rate limit is per node and cannot be changed or lowered: https://developer.cisco.com/docs/sdwan/#!browsing-returned-results-sorting-results-filtering-results-and-rate-limits

If I recall correctly, the bulk API state endpoints return data that is updated every 30 seconds, and the statistics endpoints return data that is updated every 5 seconds. This means that the results of a call to a state endpoint may be up to 30 seconds old, and the results of a call to a statistics endpoint may be up to 5 seconds old.

Starting from Cisco SD-WAN Release 20.6.1, Cisco SD-WAN Manager supports the following API limits:

  • API Rate-limit: 100/second

  • Bulk API Rate-limit: 48/minute
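To stay under limits like these from the client side, one option is a small sliding-window limiter in front of each class of call. This is only a sketch: the class name is made up, the 100/second and 48/minute figures are the ones quoted above, and the controller's own accounting may differ from a client-side window.

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `max_calls` within any `period` seconds (client side)."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the limiter can be tested
        self.sleep = sleep
        self.calls = deque()    # timestamps of calls still inside the window

    def acquire(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self.clock()
            # Forget calls that have aged out of the window.
            while self.calls and now - self.calls[0] >= self.period:
                self.calls.popleft()
            if len(self.calls) < self.max_calls:
                self.calls.append(now)
                return
            # Window is full: wait until the oldest call leaves it.
            self.sleep(self.period - (now - self.calls[0]))

# One limiter per limit quoted above.
api_limiter = SlidingWindowLimiter(max_calls=100, period=1.0)
bulk_limiter = SlidingWindowLimiter(max_calls=48, period=60.0)
```

Calling `bulk_limiter.acquire()` before each bulk request would then pace the client to at most 48 bulk calls in any 60-second window.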

I have not yet seen anyone report whether Aggregated Queries would solve this: https://developer.cisco.com/docs/sdwan/#!query-format/aggregated-query
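For anyone wanting to experiment, a rough sketch of what an aggregated query body might look like, following my reading of the query format in the linked docs. The helper name is hypothetical, and the field and metric names (`entry_time`, `local_system_ip`, `loss_percentage`, `latency`) are assumptions to verify against your controller; I believe such a body is POSTed to a statistics aggregation endpoint, but check the exact path in the API docs.

```python
import json

def build_aggregation_query(last_n_hours: int, group_by: str, metrics: dict) -> dict:
    """Build an aggregated-query body: a time filter, one group-by field,
    and one aggregation per metric (e.g. {"latency": "avg"})."""
    return {
        "query": {
            "condition": "AND",
            "rules": [{
                "field": "entry_time",        # assumed timestamp field
                "type": "date",
                "operator": "last_n_hours",
                "value": [str(last_n_hours)],
            }],
        },
        "aggregation": {
            "field": [{"property": group_by, "sequence": 1}],
            "metrics": [{"property": name, "type": agg}
                        for name, agg in metrics.items()],
        },
    }

# Example: average loss and latency per device over the last hour.
body = build_aggregation_query(
    1, "local_system_ip", {"loss_percentage": "avg", "latency": "avg"})
print(json.dumps(body, indent=2))
```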

Hope this helps.

Please mark this as helpful or solution accepted to help others
Connect with me https://bigevilbeard.github.io

Thanks very much for your quick reply and your help. The SD-WAN reporting tool looks interesting - we'll certainly have a look at that.

Regarding the bulk stats query - one issue is that by default vManage is set to only collect stats every 30 mins:

[Screenshot: vManage statistics collection interval setting]

According to the Systems and Interfaces Configuration Guide, Cisco SD-WAN Releases 19.1, 19.2, and 19.3 (System and Interfaces Overview), that can be lowered to every 5 minutes. I'm not sure how well a big vManage system will cope with that, though.

 

Interesting, I was not aware that could be lowered in the UI. The concern/risk of lowering it might be a question for TAC.


This can certainly be lowered, as mentioned in the release notes:

The minimum time you can specify is 5 minutes and the maximum is 180 minutes.
However, collecting stats more frequently means more data on this vManage instance.
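To put a number on that, a quick sketch comparing collection cycles per day at the default 30 minutes and at the 5-minute minimum. The function name is made up, and this only counts collection cycles; whether stored volume grows in proportion depends on how devices batch their statistics, which this thread does not establish.

```python
def collections_per_day(interval_minutes: int) -> int:
    """Number of collection cycles in 24 hours for a given interval."""
    # Bounds quoted above: minimum 5 minutes, maximum 180 minutes.
    if not 5 <= interval_minutes <= 180:
        raise ValueError("interval must be between 5 and 180 minutes")
    return (24 * 60) // interval_minutes

default = collections_per_day(30)
fastest = collections_per_day(5)
print(default, fastest, fastest / default)
```

Going from 30 minutes to 5 minutes means six times as many collection cycles for the controller to handle each day.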