03-04-2014 04:30 PM - edited 03-16-2019 10:00 PM
I have a growing number of CP7961G and 41G phones in one of my buildings that go into a state where they continuously reboot and my only recourse is to swap them out with a spare. My telecomm specialist is concerned that if this continues, we will run out of spares.
Particulars: We are running UCM v7.1.5, non-clustered, on a publisher and a subscriber. The phones are about 6 years old. The building is served by three Catalyst 3560-48PS which are not approaching their PoE limits. We have verified that the problem is not related to a particular switch, port, or patch cord. The phone will continue rebooting when we bring it back to the test bench. Wiping out its config in UCM and putting a different config on it has no effect.
It is something of a challenge to grab any logs off the phone between reboots, but I managed to retrieve the following messages right before it went down. Any advice?
2662: NOT 14:15:52.387021 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.settings.fullyregistered value=true
2663: NOT 14:15:52.389162 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - FULLY_REGISTERED - Resetting retry installer interval
2664: NOT 14:15:52.391383 JVM: Startup Module Loader|cip.midp.pushregistry.e:? - setAcceptConnections - ENABLED
2665: NOT 14:15:52.393446 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.settings.fullyregistered value=true
2666: NOT 14:15:52.395528 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - FULLY_REGISTERED - Resetting retry installer interval
2667: NOT 14:15:52.397532 JVM: Startup Module Loader|cip.midp.pushregistry.e:? - setAcceptConnections - ENABLED
2668: NOT 14:15:52.399608 JVM: Startup Module Loader|cip.midp.midletsuite.InstallerModule:? - propertyChanged - device.callagent.messages.0 value=0
2669: WRN 14:15:52.402999 JVM: Startup Module Loader|cip.mmgr.dt:? - [MediaMgrSM]: Unhandled Event, State = StateOnHook Event = EventServicesTxStop
2670: ERR 14:15:52.405170 JVM: Startup Module Loader|cip.midp.rms.RecordStoreImpl:midp.rms.IOExceptionInCompactRS - Exception while loading(262160) a RecordHeader() in compactRecords(): java.io.IOException: load invalid offset 262160
2671: ERR 14:15:52.419248 JVM: 2:15:52p|java.io.IOException: load invalid offset 262160
at cip.midp.rms.b.a(I)V(Unknown Source)
at cip.midp.rms.RecordStoreImpl.compactRecords()V(Unknown Source)
at cip.midp.rms.RecordStoreImpl.flush()V(Unknown Source)
at cip.props.a.b()V(Unknown Source)
at cip.props.f.a()V(Unknown Source)
at cip.props.PropertyManager.timerFired(Lcip/sys/k;)V(Unknown Source)
at cip.sys.ar.a(Lcip/sys/ab;Lcip/sys/x;)V(Unknown Source)
at cip.sys.ar.sinkMessage(Lcip/sys/x;)V(Unknown Source)
at cip.sys.n.sinkMessage(Lcip/sys/x;)V(Unknown Source)
at cip.sys.aa.a(Lcip/sys/ad;)Ljava/lang/Throwable;(Unknown Source)
at cip.sys.aa.e()V(Unknown Source)
at cip.sys.l.run()V(Unknown Source)
at java.lang.Thread.startup(Z)V(Unknown Source)
2672: ERR 14:15:52.423225 JVM: Startup Module Loader|PropertyManager:timerFired - Exception flushing RMS.javax.microedition.rms.RecordStoreException: Exception while loading(262160) a RecordHeader() in compactRecords(): java.io.IOException: load invalid offset 262160deleting primary RMS and requesting hardreset to load backup RMS.
2673: ERR 14:15:52.425558 JVM: Startup Module Loader|RmsPropertyStore:? - javax.microedition.rms.RecordStoreException: deleteRecordStore error: record store is still open
2674: NOT 14:15:52.472344 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - --->ConfigManager PropertyChanged: device.system.systemstate
2675: NOT 14:15:52.474498 JVM: Startup Module Loader|cip.cfg.ConfigManager:? - <---ConfigManager PropertyChanged: device.system.systemstate
2676: ERR 14:15:52.476791 JVM: SystemManager n/a SYSTEM_RESTARTING
2677: ERR 14:15:52.617343 JVM: SystemManager n/a *systemUnload -->*
2678: ERR 14:15:52.619287 JVM: SystemManager n/a unloadAllModules ->
2679: ERR 14:15:52.621685 JVM: SystemManager n/a Unload in this order:
2680: ERR 14:15:52.623620 JVM: SystemManager n/a SCCP Call Control
2681: ERR 14:15:52.625579 JVM: SystemManager n/a InstallerModule
2682: ERR 14:15:52.627558 JVM: SystemManager n/a NotificationModule
2683: ERR 14:15:52.629416 JVM: SystemManager n/a SubscriptionManager
2684: ERR 14:15:52.631503 JVM: SystemManager n/a Starts Automation Feature
2685: ERR 14:15:52.633381 JVM: SystemManager n/a Sidecar
2686: ERR 14:15:52.635212 JVM: SystemManager n/a Vieo
2687: ERR 14:15:52.637111 JVM: SystemManager n/a AppStatus
2688: ERR 14:15:52.638977 JVM: SystemManager n/a Help
2689: ERR 14:15:52.640997 JVM: SystemManager n/a Go4
2690: ERR 14:15:52.642971 JVM: SystemManager n/a Messages
2691: ERR 14:15:52.644974 JVM: SystemManager n/a Directories
2692: ERR 14:15:52.646820 JVM: SystemManager n/a Services
2693: ERR 14:15:52.648743 JVM: SystemManager n/a Settings
2694: ERR 14:15:52.650702 JVM: SystemManager n/a Push Service
2695: ERR 14:15:52.652569 JVM: SystemManager n/a HTTP Server/Microservlet Container
2696: ERR 14:15:52.654502 JVM: SystemManager n/a HTTP
2697: ERR 14:15:52.656352 JVM: SystemManager n/a Config Manager
2698: ERR 14:15:52.658176 JVM: SystemManager n/a Security
2699: ERR 14:15:52.660245 JVM: SystemManager n/a TFTP Client
2700: ERR 14:15:52.662092 JVM: SystemManager n/a Call UI
2701: ERR 14:15:52.663910 JVM: SystemManager n/a Media Manager
2702: ERR 14:15:52.665803 JVM: SystemManager n/a Call Agent
2703: ERR 14:15:52.667676 JVM: SystemManager n/a Application Manager
2704: ERR 14:15:52.669492 JVM: SystemManager n/a Display
2705: ERR 14:15:52.671540 JVM: SystemManager n/a Input Manager
2706: ERR 14:15:52.673500 JVM: SystemManager n/a Localization Manager
2707: ERR 14:15:52.675439 JVM: SystemManager n/a Reset Service
2708: ERR 14:15:52.677400 JVM: SystemManager n/a Property Manager
2709: ERR 14:15:52.679351 JVM: SystemManager n/a Timer Manager
2710: ERR 14:15:52.681306 JVM: SystemManager n/a Trace Service
2711: ERR 14:15:52.683319 JVM: SystemManager n/a Trace Manager
2712: ERR 14:15:52.685942 JVM: SystemManager n/a module "SCCP Call Control"/sccpccapi(31) is UNLOADING
2713: ERR 14:15:52.814693 JVM: SystemManager n/a module "SCCP Call Control"/sccpccapi(31) is UNLOADED
2714: ERR 14:15:52.819241 JVM: SystemManager n/a module "InstallerModule"/installer(30) is UNLOADING
2715: WRN 14:15:52.892189 JVM: Startup Module Loader|cip.sccp.cn:? - Read exception: java.net.SocketException: Error 0
Close(d) Connection ..., errno=135
2716: ERR 14:15:52.907053 JVM: 2:15:52p|java.net.SocketException: Error 0
at cip.io.SecureInputStream.socketRead([BII)I(Native Method)
at cip.io.SecureInputStream.read([BII)I(Unknown Source)
at java.io.BufferedInputStream.fill()V(Unknown Source)
at java.io.BufferedInputStream.read()I(Unknown Source)
at java.io.DataInputStream.readInt()I(Unknown Source)
at cip.io.i.readInt()I(Unknown Source)
at cip.sccp.ax.a()Lcip/sccp/bu;(Unknown Source)
at cip.sccp.cn.e()V(Unknown Source)
at cip.sys.l.run()V(Unknown Source)
at java.lang.Thread.startup(Z)V(Unknown Source)
2717: NOT 14:15:53.293053 JVM: Startup Module Loader|cip.midp.midletsuite.g:onRun - FinalizeObjectThread INTERRUPTED
2718: ERR 14:15:53.297661 JVM: SystemManager n/a module "InstallerModule"/installer(30) is UNLOADED
2719: ERR 14:15:53.300806 JVM: SystemManager n/a module "NotificationModule"/notification(29) is UNLOADING
2720: ERR 14:15:53.303783 JVM: SystemManager n/a module "NotificationModule"/notification(29) is UNLOADED
2721: ERR 14:15:53.306581 JVM: SystemManager n/a module "SubscriptionManager"/subscriptionman(28) is UNLOADING
2722: ERR 14:15:53.318673 JVM: SystemManager n/a module "SubscriptionManager"/subscriptionman(28) is UNLOADED
... ... ...
2819: NOT 14:16:00.563892 SYSMSG: pid 16 (/ubin/dsp) Normal Exit, status = 1
2820: INF 14:16:00.563965 runtime = 135.140 secs
2821: INF 14:16:00.564011 user cpu = 0.136747533 secs
2822: INF 14:16:00.564054 system cpu = 0.489695153 secs
2823: INF 14:16:00.564091 child user cpu = 0.000000000 secs
2824: INF 14:16:00.564125 child sys cpu = 0.000000000 secs
2825: INF 14:16:00.564173 sys interrupts = 1.283613038 secs for 20499 interrupts
2826: INF 14:16:00.564219 total cpu = 0.626692686 secs ( 0% utilization )
2827: NOT 14:16:01.250272 DHCP: SIGTERM received...DHCP is exiting
2828: NOT 14:16:01.500314 DNS: SIGTERM received...DNS is exiting
2829: ERR 14:16:01.750246 login: [3]:loginSIGTERM signo:23
2830: ERR 14:16:01.750682 login: [3]:tcgetattr failed errno:25
2831: NOT 14:16:02.021689 SECD: exitHandler: ** EXITING, Ver: 1.4(1.2) [May 11 2011, 08:03:23] **
2832: NOT 14:16:06.120490 init: Starting /sbin/inetd
2833: NOT 14:16:06.124411 init: /sbin/inetd started as pid=7
2834: NOT 14:16:06.175078 INETD: pid = 7
2835: NOT 14:16:06.175901 INETD: vvlan = 0 will be initiazed to 4096
2836: NOT 14:16:06.179804 INETD: sshAccessReadConf: sshAccess = 0
2837: NOT 14:16:06.505331 init: Starting /sbin/imgauthd
2838: NOT 14:16:06.509235 init: /sbin/imgauthd started as pid=20
2839: NOT 14:16:06.510234 init: Starting /bin/tftpClient
2840: NOT 14:16:06.514066 init: /bin/tftpClient started as pid=23
2841: NOT 14:16:06.514896 init: Starting /bin/secd
2842: NOT 14:16:06.518723 init: /bin/secd started as pid=14
2843: NOT 14:16:06.519551 init: Starting /bin/ntp
2844: NOT 14:16:06.523520 init: /bin/ntp started as pid=12
2845: NOT 14:16:06.524351 init: Starting /ubin/vieo
2846: NOT 14:16:06.528190 init: /ubin/vieo started as pid=10
2847: NOT 14:16:06.529020 init: Starting /ubin/dsp
2848: NOT 14:16:06.533004 init: /ubin/dsp started as pid=31
2849: NOT 14:16:06.533834 init: Starting /sbin/ewcl
2850: NOT 14:16:06.537658 init: /sbin/ewcl started as pid=25
2851: NOT 14:16:06.538484 init: Starting /bin/mkdir
2852: NOT 14:16:06.547443 init: /bin/mkdir started as pid=21
2853: NOT 14:16:06.565041 CDP-D: pid = 30
2854: NOT 14:16:06.582629 DHCP6: COLDBOOT - wait 4 seconds...
2855: NOT 14:16:06.589560 PAE: pid = 13
2856: NOT 14:16:06.592548 PAE: phone is not single port
2857: WRN 14:16:06.599864 PAE: paeReadPCStatus: empty file
2858: NOT 14:16:06.600744 PAE: link-0 is UP
2859: NOT 14:16:06.601574 PAE: starting supplicant
2860: NOT 14:16:06.602389 PAE: EV_DISABLED, ST_DISCONNECTED->ST_DISABLED
2861: NOT 14:16:06.603180 PAE: supplicant is disabled
2862: NOT 14:16:06.603987 PAE: paeSetLastSupStatus: LastSupStatus 0
2863: NOT 14:16:07.808304 ESP: send ADMIN, logging = 1, shell = 0, ipconfig = 1
2864: NOT 14:16:07.820307 IMG_AUTH: Version: 3.6(1.0): TNP REL
2865: NOT 14:16:07.821288 IMG_AUTH: model: CP-7961G, check for built-in key
2866: NOT 14:16:07.822133 IMG_AUTH: built-in platform RELEASE key is present
2867: NOT 14:16:07.822938 IMG_AUTH: built-in common RELEASE key is present
2868: NOT 14:16:07.824329 IMG_AUTH: TNP, built-in keys : Platform (curr), Common (curr)
2869: NOT 14:16:07.830481 tftpClient: pid = 23
2870: NOT 14:16:07.847867 SECD: main: Starting... Ver: 1.4(1.2) [May 11 2011, 08:03:23]
2871: NOT 14:16:07.853471 SECD: initSecMode: resuming sec mode NONE
2872: WRN 14:16:07.875848 SECD: WARN:loadCert: phone has no LSC cert
2873: NOT 14:16:07.876880 SECD: initResources: have cert+pkey: MIC only, no LSC
2874: NOT 14:16:07.877924 SECD: perform_selftests: FIPS RSA SHA1 Self-Tests Passed
2875: NOT 14:16:07.879697 SECD: perform_selftests: FIPS RSA HMAC-SHA1 Self-Tests Passed
2876: NOT 14:16:07.881400 SECD: perform_selftests: FIPS RSA AES Self-Tests Passed
2877: NOT 14:16:07.882752 SECD: perform_selftests: FIPS RSA 3DES Self-Tests Passed
2878: NOT 14:16:08.232243 SECD: perform_selftests: FIPS RSA Sign/Verify Self-Tests Passed
2879: NOT 14:16:08.233274 SECD: perform_selftests: FIPS RSA RNG Self-Tests Passed
2880: NOT 14:16:08.234162 SECD: perform_selftests: FIPS RSA Library Self-Tests Passed
2881: NOT 14:16:08.235031 SECD: runFipsPostTests: FIPS Security Information: RSA Self-Tests Passed
2882: NOT 14:16:08.236899 SECD: TL_initFromCTLFile: No TL file: /flash0/sec/ctl/CTLFile.tlv
2883: WRN 14:16:08.238165 SECD: WARN:initCTL: ** phone has NO CTL file **
2884: NOT 14:16:08.239709 SECD: TL_initFromITLFile: No TL file: /flash0/sec/ctl/ITLFile.tlv
2885: WRN 14:16:08.241073 SECD: WARN:initCTL: ** phone has NO ITL file**
2886: WRN 14:16:08.241978 SECD: WARN:resetSecMode: ** WARN ** resetting sec mode to NONE (non-secure)
2887: NOT 14:16:08.244938 SECD: clearCapfList: CAPF table cleared
2888: NOT 14:16:08.246183 SECD: initCapfClnt: CAPF clnt initialized
2889: ERR 14:16:08.251313 SECD: EROR:initEntropy: Calling R_rand_seed with bytes 128
... ... ...
03-04-2014 09:07 PM
If your switch is Cisco, can you please specify the exact model of the switch?
03-05-2014 02:04 PM
I believe you can find that in my original post, unless I misunderstand your question, in which case I apologize.
--Norm
03-05-2014 04:30 PM
Ok, you got 3560. Just want to make sure you have the switches with FastEthernet access ports, right?
Do you have any switches with copper GigabitEthernet?
03-05-2014 05:17 PM
Yes, these are FastE ports on the 3560s.
I have a 3750 with GigEthernet PoE ports in another building.
03-05-2014 05:38 PM
Too bad. I was hoping you had one where the faulty phone is so you could do a TDR.
03-05-2014 05:52 PM
If this is happening across multiple 3560s in a building, and with different patch cords, and when directly plugged into a test switch (a 3560-8PC), wouldn't you say most of the problems a TDR can detect would be ruled out? Unless perhaps some flaky office connections are corrupting TFTP transfers to the phones. In that case I would image the replacement phones eventually having problems as well.
I was wondering if the Record Store Exception messages in lines 2670-2672 might be some clue. I'm not familiar with what to expect in normal logs, but those strike me as suspicious.
--Norm
03-05-2014 08:14 PM
- Have you attempted the factory reset procedure on the phones? What's the result?
- Do you have 7941/61s elsewhere on the same cluster not displaying the issue?
- What firmware version are the phones running?
- What do the cucm traces look like leading up to the issue?
- What if anything is different about this site than another?
Regards,
Aaron
03-10-2014 04:19 PM
Aaron, thank you for the troubleshooting tips! I am more of a routing/switching guy rather than IP telephony.
My availability to troubleshoot the issue has dwindled temporarily for a few days due to an upcoming audit so I haven't had a chance to follow through on everything you listed, but I did find the factory reset procedure and gave that a whirl on the defunct phones and that successfully resucitated them.
What remains to be solved then is determining what put them into this state. I hope your additional suggestions will lead me to an answer.
Let me throw in here that upon examining my three switch logs again, I believe the switches may be removing power from the phones. I am getting a number of "%ILPOWER-5-IEEE_DISCONNECT: Interface Fa0/xx: PD removed " messages which I originally thought were triggered by the phone restarts but am now of the belief that they lead ahead of the restarts. So that would point to a PoE or power problem of some sort. Since this occurs across three different switches in the building, I don't believe I am looking at a malfunctioning switch. Though in considering what is different about this building from all the others I maintain, something recently occurred to me, this building has solar panels! So now I'm beginning to wonder if something about its power inverters is disagreeing with my PoE?
Discover and save your favorite ideas. Come back to expert answers, step-by-step guides, recent topics, and more.
New here? Get started with these tips. How to use Community New member guide