02-14-2019 07:00 PM
I checked with the command "/etc/init.d/ncs status" and found that the number of CDB clients has been increasing over the last few days, which eventually causes NSO to stop responding. Below is the CDB part of the status output:
cdb:
  cluster mode: master (synchronous replication)
  current transaction id: 1550-80281-219216@aptx1nso365.webex.com
  running:
    filename: /var/opt/ncs/cdb/A.cdb
    disk size: 155.3216 MB
    ram size: 568.4042 MB
    read locks: 0
    write lock: unset
  operational:
    cluster mode: master
    current transaction id: 0
    filename: /var/opt/ncs/cdb/O.cdb
    disk size: 1.10046 MB
    ram size: 6.10082 MB
    subscription lock: unset
    no pending subscription notifications
  registered cdb clients:
    client name: Cdb-ResourceManaged-1623-pool:404
      type: client
      db: operational
      subscription-lock: false
    client name: Cdb-ResourceManaged-1610-pool:403
      type: client
      db: operational
      subscription-lock: false
    client name: Cdb-ResourceManaged-1605-pool:402
      type: client
      db: operational
      subscription-lock: false
    client name: Cdb-ResourceManaged-1590-pool:401
      type: client
      db: operational
      subscription-lock: false
How can I clear the client connections? And how can I find out why the number of clients keeps increasing?
02-14-2019 10:06 PM
That the sequence number increases is normal, but if the number of active connections increases without bound, you have more of a problem.
If it grows rapidly, it depends on who is responsible for the connections: that software needs to make sure to close its sessions when it is done with them. In this case you list four active connections from the resource manager, which is quite normal.
For development purposes, a package reload will reset most of these for you.
02-14-2019 10:48 PM
02-14-2019 11:01 PM - edited 02-14-2019 11:01 PM
Okay, so then it is being immediately re-created. Do you actually have over 400 allocation pools being monitored, or what are the client names that make up the bulk of that list?
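A quick way to answer that from the shell is to group the client names with the numeric instance IDs stripped out (a sketch; the `sed` pattern and the init-script path are assumptions and may need adjusting for your install):

```shell
# Group registered CDB clients by name, collapsing numeric instance IDs,
# so the dominant client type stands out with a count in front.
/etc/init.d/ncs status \
  | grep -i "client name" \
  | sed -E 's/[0-9]+//g' \
  | sort | uniq -c | sort -rn | head
```

On the status output above, this would collapse all the Cdb-ResourceManaged-*-pool entries into a single row with their total count.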
02-14-2019 11:11 PM
02-15-2019 01:18 AM
A few different things. First of all, are you sure the slowdown is because of the connections? Generally, oper-data connections like these are pretty cheap.
Regarding tracking it, ncs --status (which is what hides behind the command you ran) is pretty good for seeing the current state. To see things as they happen (in particular for configuration sessions), devel.log is good. For a system install you might have to turn that on explicitly in ncs.conf.
If all the connections (or at least, several hundred of them) are from the ResourceManager you might want to file a ticket to get an explanation.
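For reference, on a system install the developer log is typically enabled under /ncs-config/logs in ncs.conf, roughly like this (a sketch; check the ncs.conf(5) man page for the exact elements, paths, and the log level you want — "trace" is very verbose):

```xml
<logs>
  <developer-log>
    <enabled>true</enabled>
    <file>
      <name>${NCS_LOG_DIR}/devel.log</name>
      <enabled>true</enabled>
    </file>
  </developer-log>
  <developer-log-level>trace</developer-log-level>
</logs>
```

NSO needs a restart (or reload of ncs.conf) for the change to take effect.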
02-20-2019 05:23 PM
We found that we have a daily sync-from job calling api/running/devices/device/%s/_operations/sync-from, and it causes the connections to increase sharply.
Below is a snapshot:
[root@apsj1nso001 ~]# /etc/init.d/ncs status | grep -i "client name" | wc -l
568
[root@apsj1nso001 ~]# curl -X POST -u apiuser:90V1rtua1 http://localhost:8080/api/running/devices/device/ORD10-WXBB-PE01/_operations/sync-from
<output xmlns='http://tail-f.com/ns/ncs'>
<result>true</result>
</output>
[root@apsj1nso001 ~]# /etc/init.d/ncs status | grep -i "client name" | wc -l
570
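To correlate the growth with the cron job, one option is to log the client count periodically (a sketch; the log path and interval are arbitrary choices, and the init-script path assumes a system install):

```shell
# Append a timestamped CDB client count to a log file once a minute;
# afterwards, compare the timestamps with the sync-from job's schedule.
while true; do
  count=$(/etc/init.d/ncs status | grep -ci 'client name')
  printf '%s %s\n' "$(date '+%F %T')" "$count" >> /tmp/cdb-clients.log
  sleep 60
done
```

A step change in the logged count at the job's start time would confirm the correlation.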
02-21-2019 12:43 AM - edited 02-21-2019 12:47 AM
That seems more likely. So, what do the sessions look like in the status output? Doing a lot of sync-from operations at the same time will of course slow your system down; does it still cause you a problem once the synchronization is done?
02-21-2019 05:20 PM
02-22-2019 02:41 AM
Hello,
I think these sessions are from a NED, not from the 'resource manager' package.
There were similar issues with file descriptor leaks in the cisco-ios NED, for instance, and they were fixed in version 6.0.13 (that was already 6 months ago, so I'm not sure whether you are still using the broken version).
What are the versions of the NEDs you are using?
02-22-2019 05:20 PM
02-25-2019 01:35 AM
Hi,
There is this note in the CHANGES for cisco-ios version 6.0.13:
- Properly clean up NSO resources when closing NED.
I have seen a couple of cases earlier where the file descriptor usage increased forever with versions of the cisco-ios NED prior to this fix. It is especially bad if you have a lot of devices.
Try with a NED version later than 6.0.13, and see if that solves the problem.
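To confirm the file-descriptor-leak theory independently of the NED version, you can watch the daemon's fd count over time (a sketch; the pgrep pattern for the NSO daemon, usually ncs.smp, is an assumption):

```shell
# Count open file descriptors of the NSO daemon via /proc; if this
# grows in step with the CDB client count, a descriptor leak is likely.
pid=$(pgrep -f ncs.smp | head -1)
ls "/proc/${pid}/fd" | wc -l
```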
/Ram
03-25-2019 10:52 PM
I have updated the cisco-ios NED to version 6.3, but it has had no effect. Below is the detailed information:
[root@apsj1nso001 ~]# /etc/init.d/ncs status | grep -i "client name" | wc -l
2749
ncsadmin@ncs# show packages package package-version
PACKAGE NAME           VERSION
--------------------------------
acl106                 1.3
bblinkservice          1.2.1
cisco-asa              6.0.4
cisco-fmc              1.0.4
cisco-ios              6.3
cisco-iosxr            6.6.2.1
cisco-nx               5.6
citrix-netscaler       3.0.23
serviceseconciliation  1.3.1
snmp-notif-recv        1.0
tailf-hcc              4.3.2
upservice              1.6