Solved: Re: DCR internal error in communication channel

helexis · ‎10-12-2009

I have reported this error before...and have a TAC case open for it...but have found a workaround that I wanted to share that might shed some light on the issue.

The URL of Common Services > Device Management when I get the error contains the FQDN.

If I modify this URL by removing the domain suffix and attempt the same change in the DCR the change is successful.

Any ideas?

Joe Clarke · ‎11-02-2009

As I said before, there is no way this is client-related. All of this happens on the backend. The way it works is that you submit an HTTP request saying you want something to happen. A servlet (like a CGI) takes this request, and calls a remote method via our proprietary CSTM RPC system. CSTM then tries to resolve the desired request using the given Java class and method names. THIS is failing from time to time. Why that failure occurs is not clear. For some reason, the DCR JVM running the CSTM client code is unable to resolve the method name from the given class.
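To illustrate the kind of step that is failing, here is a minimal sketch of resolving a method from a class name and a method name at runtime. It is written in Python purely for illustration; CSTM itself is proprietary Java code and none of its internals appear in this thread. The names `resolve_method` and `MethodResolutionError` are hypothetical.

```python
import importlib


class MethodResolutionError(Exception):
    """Raised when the requested class or method cannot be found."""


def resolve_method(class_path, method_name):
    """Look up a callable from a dotted class path and a method name.

    Hypothetical helper: it only mimics the idea of name-based dispatch
    and is not how CSTM is actually implemented.
    """
    module_name, _, class_name = class_path.rpartition(".")
    try:
        module = importlib.import_module(module_name)
        cls = getattr(module, class_name)
        method = getattr(cls, method_name)
    except (ImportError, AttributeError, ValueError) as exc:
        # This is the analogue of "unable to resolve the method name
        # from the given class".
        raise MethodResolutionError(
            f"cannot resolve {class_path}.{method_name}"
        ) from exc
    if not callable(method):
        raise MethodResolutionError(f"{class_path}.{method_name} is not callable")
    return method


if __name__ == "__main__":
    # The caller supplies only strings; the lookup happens at call time.
    parse = resolve_method("datetime.date", "fromisoformat")
    print(parse("2009-11-04"))
```

The point of the sketch is that the caller never holds a direct reference to the method. Everything depends on the lookup succeeding at the moment of the call, so a lookup that fails intermittently shows up as an intermittent error to the user.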

helexis · ‎11-03-2009

I just sent the patched DCR.log to my engineer.

Looking forward to your diagnosis. :)

helexis · ‎11-03-2009

I was able to capture the logs for both an unsuccessful credential mod and a successful one.

The logs for each were sent to my engineer at separate times. Be sure you get both as needed. :)

Joe Clarke · ‎11-04-2009

I found the problem. As I predicted, it has nothing to do with the browser, the FQDN, or anything else on the client side. It is a transient issue that only affects Windows SMP systems. It tends to occur mostly on faster machines. A patch is on its way.

helexis · ‎11-04-2009

Is it probable that this issue was prevalent in LMS 2.6 and 3.1?

I am certain we experienced the "internal error in communication channel" error with all of our previous versions (2.6 and 3.1).

We just kept upgrading hoping it was resolved in the new releases.

Joe Clarke · ‎11-04-2009

Yes, it is present in all versions of LMS 2.5 and higher.

helexis · ‎11-07-2009

My dcr.log is interesting.

Full of FATAL messages.

Not sure if I have looked at it since the workaround patch.

I moved RME to the Master DCR and am re-adding DFM to the cluster on a separate server. The DCR isn't replicating the device yet, so I was checking the DCR logs.

helexis · ‎11-07-2009

WARNING: Certificate HostName [FQDN] and the URL Host Name [shortname] do not match

Where does it get this "URL Host Name"?

helexis · ‎11-07-2009

Figured it out.

If I am going to have certs with FQDN, I MUST use the FQDN when I configure the DCR master hostname.

Joe Clarke · ‎11-07-2009

All messages starting with "XXX" are my debugging messages, and are nothing to worry about. I sent your engineer a new patch which removes all debugging.

Joe Clarke · ‎11-07-2009

The URL hostname is the hostname used when making the HTTP request (i.e., the value of the Host: header). This is typically the hostname in the URL entered in the client browser, but for IPC calls, it can be the hostname as configured in PX_HOST or in regdaemon.xml.
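As a rough illustration of why a short name triggers the warning when the certificate carries the FQDN, here is a minimal sketch in Python. It assumes a plain case-insensitive string comparison between the certificate hostname and the Host: header value with any port removed. The actual check in the product is not shown in this thread, and the function names and hostnames below are hypothetical.

```python
def host_from_header(host_header):
    """Return the hostname part of a Host: header value, lowercased.

    A trailing ":port" is dropped only when the port is numeric.
    """
    host = host_header.strip().lower()
    name, sep, port = host.rpartition(":")
    if sep and port.isdigit():
        return name
    return host


def names_match(cert_hostname, host_header):
    """Assumed rule: the names match only if they are the same string,
    ignoring case. A short name never equals an FQDN under this rule."""
    return cert_hostname.strip().lower() == host_from_header(host_header)


if __name__ == "__main__":
    cert_name = "lms.example.com"  # hypothetical FQDN in the certificate
    for header in ("lms.example.com:443", "lms"):
        status = "match" if names_match(cert_name, header) else "do not match"
        print(f"Certificate HostName [{cert_name}] and URL Host Name "
              f"[{host_from_header(header)}]: {status}")
```

Under that assumption, whichever name the certificate carries is the name that has to be used everywhere a hostname is configured or typed.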

Joe Clarke · ‎11-07-2009

Yes, same for SSO.

Sven Hruza · ‎11-16-2009

Is there a bug ID for the issue?

Thanks!

Joe Clarke · ‎11-16-2009

CSCtd07131