Logger Service keeps shutting down and restarting

saharhanna
Level 1

Hello,

I have upgraded my lab UCCE call center to v9.0.

I have a simplexed (side A only) environment.

Ever since, the logger service keeps shutting down and restarting.

Recently, the node manager issued an error message and shut down the server

I tried to check some logs but I can't pinpoint the problem; there are errors in different processes.

Attached are the logs.

Accepted Solutions

Gergely Szabo
VIP Alumni

Hi,

if the NM tries to reload the server, it indicates a Problem (capital intended).

Indeed, there's something interesting going on:

08:29:11:822 la-rpl Trace: Starting Recovery Key for Admin table is 6717118506000.0 

08:29:11:822 la-rpl Trace: The largestkey = 7201232308043.0 >= startkey = 6717118506000.0  

08:29:11:823 la-rpl Trace: To correct this problem: Stop logger service. Use ICMDBA tool to sync configuration data from its partner logger database. Restart logger service. 

08:29:11:823 la-rpl Fail: Assertion failed: largestkey < startkey.  File: ICRDB.CPP.  Line 742

08:29:11:860 la-rpl Trace: CExceptionHandlerEx::GenerateMiniDump -- A Mini Dump File is available at logfiles\replication.exe_20130523082911824.mdmp 

08:29:12:074 la-rpl Unhandled Exception: Exception code: 80000003 BREAKPOINT
Fault address:  754A3219 01:00012219 C:\Windows\syswow64\KERNELBASE.dll
Registers:
EAX:00000000 EBX:00000000 ECX:00001890 EDX:E1043F00 ESI:015F8AB0 EDI:00000005
CS:EIP:0023:754A3219
SS:ESP:002B:003CE2D8  EBP:003CE2E0
DS:002B  ES:002B  FS:0053  GS:002B
Flags:00000246
Call stack:
Address   Frame
754A3219  003CE2E0  DebugBreak+2
6EB1459C  003CE2EC  EMSAbortProcess+C
6EB1ACD1  003CF7F8  EMSReportCommon+1A1
6EB1ADBB  003CF818  EMSFailMessage+2B
013BBE5A  003CF8A8  ICRDb::ICRDb+44A
013B2FE2  003CF9B8  main+582
015D96C2  003CF9FC  NtCurrentTeb+174
767333AA  003CFA08  BaseThreadInitThunk+12
77449EF2  003CFA48  RtlInitializeExceptionChain+63
77449EC5  003CFA60  RtlInitializeExceptionChain+36

The short version: configuration data is corrupt.

The longer version: the "Assertion failed" message above is the result of a sanity check. Each configuration change creates a new row in one of the tables holding the config info, and each row contains a RecoveryKey which is usually a large number incremented by the insertion. The check expects the largest key already in the table (= last key) to be lower than the key the process is about to start from. Here it is the other way round: largestkey (7201232308043) is higher than startkey (6717118506000), so new keys would start below the existing ones. Naturally, this is something to consider for a lonely philosopher, but the rigid world of Cisco ICM does not allow metaphysical phenomena. Lower numbers are supposed to be lower than higher numbers.
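To make the check concrete, here is a minimal Python sketch that pulls the two values out of the trace lines quoted above and applies the condition named in the assertion (largestkey < startkey). The function names and the regular expression are my own, purely for illustration; this is not how ICM implements it internally.

```python
import re

# Trace lines copied from the la-rpl log quoted above.
LOG_LINES = [
    "08:29:11:822 la-rpl Trace: Starting Recovery Key for Admin table is 6717118506000.0",
    "08:29:11:822 la-rpl Trace: The largestkey = 7201232308043.0 >= startkey = 6717118506000.0",
]

# Hypothetical pattern matching the "largestkey ... startkey" trace line.
KEY_PATTERN = re.compile(r"largestkey = ([\d.]+) >= startkey = ([\d.]+)")


def extract_keys(lines):
    """Return (largestkey, startkey) from the first matching line, or None."""
    for line in lines:
        match = KEY_PATTERN.search(line)
        if match:
            return float(match.group(1)), float(match.group(2))
    return None


def keys_are_sane(largestkey, startkey):
    """Mirror the asserted condition from the log: largestkey < startkey."""
    return largestkey < startkey


largestkey, startkey = extract_keys(LOG_LINES)
print("largestkey:", largestkey)
print("startkey:  ", startkey)
print("sane:      ", keys_are_sane(largestkey, startkey))
```

With the values from this log the check fails, which matches the "Assertion failed" line.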

This, of course, raises an exception and the Logger service restarts. If there are too many restarts, the Node Manager kicks in and restarts the machine - this is just a mechanism that prevents a larger extent of data corruption.

Now, if there's a Logger on the other side - fine, as the error message suggests, you can initiate manual replication (provided the other Logger database contains valid information).

Unfortunately, as you have written, this is a side A only environment. This may mean:

- accepting the situation, stopping ICM, throwing out the logger database, recreating it and reinstalling the Logger service,

- poking around in various tables to check what may be saved - this may mean the beginning of an adventure.

G.

Hi Gergely,

Thank you for the reply.

I guess the problem occurred because the clock on my server had been set to a future date. I then corrected it to the current date, and this is what messed up the database.

I did as you suggested: threw away the logger db, uninstalled the logger service, and reinstalled it.

This solved the problem.

Thank you,

Sahar

Hi,

yes, this happened to me in my lab, too. For some strange reason the clock was running ahead (probably related to the fact that it was a virtual machine), and it killed the configuration database.

Anyway, since then the first thing I usually do is install a reliable NTP client. A bit of an advertisement:
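A toy Python model of why a clock that runs ahead and is then set back could trip the check. It assumes the start key is derived from the wall clock, which this thread suggests but does not confirm; the scale factor, the clock values and the function names are all made up for illustration.

```python
def start_key_from_clock(clock_seconds, scale=1000):
    """ASSUMPTION: the start key grows with the wall clock (hypothetical formula)."""
    return clock_seconds * scale


def startup_check(largest_existing_key, clock_seconds):
    """Same condition as the assertion in the log: largestkey < startkey."""
    return largest_existing_key < start_key_from_clock(clock_seconds)


# Arbitrary illustrative clock readings, not real timestamps.
future_clock = 2_000_000   # clock wrongly set ahead
correct_clock = 1_000_000  # clock after being set back

# A few rows were written while the clock was ahead.
largest_key = start_key_from_clock(future_clock) + 43

# While the clock keeps running ahead, the check still passes.
print(startup_check(largest_key, future_clock + 1))

# After the clock is corrected, the start key drops below the largest key.
print(startup_check(largest_key, correct_clock))
```

In this model the database is only "wrong" relative to the corrected clock, which is why keeping the clock right from the beginning matters.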

http://www.meinbergglobal.com/english/sw/ntp.htm

Plus a highly customisable monitoring tool (sends alarms, draws graphs etc):

http://www.meinbergglobal.com/english/sw/ntp-server-monitor.htm

G.

Hi all,

I faced the same issue in my lab. I fixed it by synchronizing side A with itself and purging the message log through ICMDBA (UCCE 11.5).

David