CSPC 2.7.4 won't collect.

spemiller
Level 1

Hi.

Our CSPC system stopped collecting a while ago.  Discovery and uploads to the SNTC system work fine so I didn't notice that data collection has been broken.

The message in the Job log is:

 
 
Initializing...
Discovering known devices...
Unable to send message to the Cisco Backend. Reason:No response to registration request from the connectivity module.
 
I re-ran a collection a couple times with tcpdump running and the CSPC host never sends any packets attempting to reach anything off our network.  Lots of SNMP to devices during discovery but nothing at the time of collection.  It looks like the problem existed prior to the last software upgrade or two.
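For anyone wanting to repeat that check, here is a minimal sketch of a capture filter that shows only packets leaving the local networks. The interface name `eth0` and the example prefixes are assumptions, not values from this collector; substitute your own.

```shell
# Build a tcpdump filter expression that excludes traffic destined
# for the given local networks, leaving only off-network packets.
# Usage: build_offnet_filter NET [NET...]
build_offnet_filter() {
  filter=""
  for net in "$@"; do
    if [ -n "$filter" ]; then
      filter="$filter and "
    fi
    filter="${filter}not dst net $net"
  done
  printf '%s\n' "$filter"
}

# Example (as root; eth0 and the prefixes are placeholders):
#   tcpdump -ni eth0 "$(build_offnet_filter 10.0.0.0/8 192.168.0.0/16)"
```

If that capture stays empty while the collection job runs, the host is not attempting to reach anything off-network.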
 
Anyone have an idea of what to do to get it to collect again?
 
Thanks.
 
8 Replies

brawall
Cisco Employee

Hello,

 

Can you access the collector via CLI, switch to the root user, and issue the command #service cspc restart?

 

Once complete, try a new collection and upload and update the post. 

 

Thank you,

Brandon

Thanks for the quick reply Brandon.

 

I did as you asked and it failed on collection with the exact same message.  I had restarted CSPC and even the entire OS / VM in the past with no change.

 

===================================================================

[collectorlogin@cspc-a ~]$ su - root
Password:
Warning: your password will expire in 70 days
[root@cspc-a ~]#
[root@cspc-a ~]#
[root@cspc-a ~]# service cspc restart
Stopping CSPC...                                           [  OK  ]
Shutting down MySQL..                                      [  OK  ]
Stopping Connectivity Tail-end Gateway:                    [  OK  ]
Checking If Services are Shutdown...                       [  OK  ]
Starting CSPC...                                           [  OK  ]
Starting MySQL..                                           [  OK  ]
Starting Base Collector and Add-Ons... Starting CSPC:
Starting Connectivity Tail-end Gateway:
nohup: redirecting stderr to stdout
                                                           [  OK  ]
Starting Tomcat...                                         [  OK  ]
[root@cspc-a ~]# exit
logout
[collectorlogin@cspc-a ~]$
================================================================

I logged into the WebUI as Admin123, selected Collect from the menu bar, selected collection profile SNTC and pressed Finish.

================================================================

 

To provide more accurate info, uploads haven't been happening the past few weeks either but I haven't looked at why yet.  Looking at the job logs now, there are not any failing upload jobs that I can see.  Just the failing collect jobs.  I checked the SNTC collection profile and it is set to repeat every 2 weeks on Monday with no end date.

 

Any other ideas?

 

Thanks,

Spencer

 

 

Spencer,

 

A few things to try. First, can you verify that the hostname you have set resolves locally? You can just do a #ping cspc-a

 

Second, please go to Settings -> Application Settings -> Export Settings. Toggle Upload via from Connectivity to another option and press OK, then change it back to Connectivity after and try the collection again.

 

Also, just to clear things up, in your collection profile do you have it set to upload to remote server in the collection? If so, this is why you don't see anything in upload jobs, since the upload actually occurs during the collection job.

Brandon,

 

I realized that I didn't try an upload and report on that so I did an upload job and it was successful and it shows as processing in SNTC.

 

From the CSPC cli, it resolves fine and a grep of the /etc/hosts file shows the fqdn as a second name to the 127.0.0.1 localhost line.
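A minimal sketch of that hosts-file check, for anyone repeating it. It prints the address a hosts-format file maps a name to; the function name `hosts_lookup` is made up for illustration, and only the hostname `cspc-a` comes from this thread.

```shell
# Print the address(es) that a hosts-format file maps NAME to.
# Usage: hosts_lookup NAME [FILE]   (FILE defaults to /etc/hosts)
hosts_lookup() {
  name=$1
  file=${2:-/etc/hosts}
  awk -v n="$name" '
    { sub(/#.*/, "") }   # strip comments before matching
    { for (i = 2; i <= NF; i++) if ($i == n) print $1 }
  ' "$file"
}

# Example:
#   hosts_lookup cspc-a
```

On a setup like the one described here, the short name and the FQDN would both be expected to print 127.0.0.1.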

 

I toggled upload, saved, went back in and toggled it back to connectivity and saved.  I'll try a collect after the SNTC processing finishes.  I had toggled this in the past on someone's advice and it didn't change things then.

 

The collection profile does have the "Upload to remote server" checked.

 

I did find the log info for the scheduled collection job and it has been running but failing with the same reason.  Just had to go to the next page in the WebUI to get past all the logs for recent test jobs. 

 

We use this to collect device inventory for SNTC but I don't think we do any SSH or telnet.  Just SNMP.

Do you know what is collected via SNMP in the discovery phase and what is collected via CLI to devices, and how it all affects what gets sent to SNTC?  It looks like some devices have recent timestamps so I wonder if chassis/modules/serial numbers etc. are updated in the discovery (SNMP) phase and collection is not even needed, or if this does need to be fixed.  I'm not against deploying a new VM and adding the seed files, certificate, and license info if it is likely to resolve this.

 

Thanks,

Spencer

 

Spencer,

 

When you get a chance, confirm if the upload works when the collection runs. If so, then this issue seems to be resolved. Can you verify a little more on what you mean by the timestamps and having to redeploy?

 

If you are referring to timestamps on the portal, I did notice that you are viewing your inventory in Comprehensive mode. This can be changed to view only the data in the most recent upload under Application Settings -> Report Preferences -> Latest View.

 

Lastly, the data we collect during discovery is basically what you see when you view the managed device list in the collector. The OIDs that we collect during the collection job can be viewed at Settings -> Manage Data Collection Profiles -> Select profile/Modify -> Select Datasets. We do require that the collection be run prior to upload. 

 

Thank you,

Brandon

Hi Brandon.

 

Uploads with the collection job did not work this morning and have not happened since May 14th from what I see.  For the timestamps, yes I was referring to the Collection date column of the Inventory > All devices view on SNTC with Report preferences set to Comprehensive view.  Some have yesterday's date and many have a date of May 14 or earlier.  Scrolling through I didn't notice any in between.  Sorry that I wasn't clear on that before.

 

Thanks for the pointer to what is being collected.  I'm wondering what is done in the device discovery phase vs collection phase in relation to getting inventory info to upload to SNTC.  tcpdump shows lots of SNMP traffic but I don't know if that was just device identity checking in device discovery phase or if some collection took place but the log incorrectly shows the job failing.  I suspect inventory info is in the collection phase and it is not happening.  I checked SNTC Inventory > All Equipment with Comprehensive view on for a switch that had a chassis replaced and the old serial number shows in SNTC.  A switch stack that had a unit added does not show the additional switch in SNTC.

 

My reference to redeploying is about reinstalling the CSPC software if we can't find the cause of collection failing with the message "Unable to send message to the Cisco Backend. Reason:No response to registration request from the connectivity module".

 

I think May 15th is around the time I realized that our collector hadn't been doing new device discovery and I ran a device discovery job.  It looks to me like collection discovers existing devices so any devices added to the network won't get added with the scheduled collection job.  Collection has been broken since.

 

If my info isn't relevant, please don't let it distract you.  Let me know what you think the next step is.

 

Thanks,

Spencer

 

 

 

Spencer,

 

The discrepancy you see in the portal with the switch stack being replaced is most likely because a fresh inventory has not been run and uploaded successfully, so you may be seeing stale data.

 

Could you share a screenshot of the error and send some log files via PM? The log files can be gathered into a zip with the command below:

 

#zip -r /home/collectorlogin/conn_logs.zip /opt/ConcsoTgw/tail-end-gateway-decoupled/bin/nohup.out /opt/ConcsoTgw/tail-end-gateway-decoupled/bin/CONN_TEG_LOGS/tail-end-gateway.log
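Before zipping, it may help to confirm that both log files are actually present. The following is only a sketch; the helper name `readable_files` is made up for illustration, and the paths in the example are the two from the command above.

```shell
# Print each argument that is a readable regular file;
# report anything else on stderr.
readable_files() {
  for f in "$@"; do
    if [ -f "$f" ] && [ -r "$f" ]; then
      printf '%s\n' "$f"
    else
      printf 'missing or unreadable: %s\n' "$f" >&2
    fi
  done
}

# Example (as root):
#   readable_files \
#     /opt/ConcsoTgw/tail-end-gateway-decoupled/bin/nohup.out \
#     /opt/ConcsoTgw/tail-end-gateway-decoupled/bin/CONN_TEG_LOGS/tail-end-gateway.log
```

A file reported as missing might mean the connectivity module has never written that log, which would be worth mentioning when sending the zip.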

 

To answer your question about discovery vs. collection: discovery is the collector learning that a device exists. During collection, bulk SNMP data is gathered from each device and aggregated into a zip file, which we call the transport file, and that file is then uploaded to the portal.

 

As far as redeploying goes, it may be a quicker resolution than continuing troubleshooting but I will leave that one up to you. I would like to figure out why this error is being encountered.

 

Thanks,

Brandon

Thanks for your help on this Brandon.

 

To anyone looking at this thread with a similar issue, we decided to replace the CSPC system with an OVA deployment of version 2.7.4 and plan to install the 2.8 update later in October.  It looks like there was too much stuff broken to bother trying to fix it.  With just a few items to enter for discovery and collection, it is fairly straightforward for us to redeploy.

 

Thanks.