09-02-2013 01:30 AM - edited 03-21-2019 07:43 AM
We have a branch with about 180 phones, mostly SPA504G and a few SPA508G. All of them are configured identically except for the SIP account name and password and the button label (the device's phone number).
For some reason I triggered (using SIP NOTIFY) a cold restart of all of them. I did it in the middle of the night, when no phone was in use. Most phones restarted without a problem, but 48 did not: they got stuck in a never-ending reboot loop. For completeness I have attached the complete syslog from one of them at the bottom, but why it happened is not the question of the day.
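For readers unfamiliar with the mechanism, a restart request of this kind is just a small SIP NOTIFY message. The following is a minimal sketch, not the exact message used in this thread: the event name `reboot_now` is an assumption (it is the value commonly used for a cold restart on Linksys/SPA-family phones and should be checked against the provisioning guide), `build_notify` and all addresses are hypothetical, and authentication (phones usually challenge the request) is not handled.

```python
import uuid


def build_notify(user, phone_ip, local_ip, event="reboot_now", port=5060):
    """Build the bytes of a bare SIP NOTIFY request (no authentication).

    The default event name is an assumption, not taken from the thread.
    """
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]  # RFC 3261 magic-cookie prefix
    tag = uuid.uuid4().hex[:8]
    call_id = uuid.uuid4().hex
    lines = [
        f"NOTIFY sip:{user}@{phone_ip}:{port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:{port};branch={branch}",
        f"From: <sip:{user}@{local_ip}>;tag={tag}",
        f"To: <sip:{user}@{phone_ip}>",
        f"Call-ID: {call_id}@{local_ip}",
        "CSeq: 1 NOTIFY",
        "Max-Forwards: 70",
        f"Event: {event}",
        "Content-Length: 0",
    ]
    # SIP requires CRLF line endings and an empty line after the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

The resulting bytes would typically be sent over UDP to the phone's SIP port; how the request is authenticated depends on the phone's configuration.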
The phone can be recovered only by a reset to factory defaults. That is not a problem in itself, as we have a zero-touch environment, so even a phone in factory default configuration will configure itself within minutes.
The question of the day is: how can I order the phone to reset to factory defaults remotely? The phones are accessible for a short time between reboots (they even successfully register to the exchange). I hope it will not be necessary to walk from phone to phone in person. In addition, we have other branches in other cities and also several foreign branches. I need to cold restart the phones in those branches as well, but I hesitate because some of them may enter the reboot loop too.
Any advice will be appreciated.
Sep 1 23:24:15 gateway ip: 10.xx.yy.1
Sep 1 23:24:15 IDBG: LS, 270-4d8
Sep 1 23:24:15 IDBG: SOK
Sep 1 23:24:15 IDBG: st-0
Sep 1 23:24:15 Dict_D> Store pDictTftp->prev_dict_enable: 2
Sep 1 23:24:15 Dict_N> !!!OK EXIT TFTP main process
Sep 1 23:24:15 Resolving 10.xx.zz.1
Sep 1 23:24:15 [BKpic]Loading text background image
Sep 1 23:24:15 [BKpic]Loading text background image
Sep 1 23:24:15 [0]Reg Addr Change(0) 0:0->a....01:5060
Sep 1 23:24:15 [0]Reg Addr Change(0) 0:0->a....01:5060
Sep 1 23:24:15 [0]RegOK. NextReg in 600 (0)
Sep 1 23:24:15 Dict_N> DICT_loadFromFlash eng dictionary ok, paylen = 52754
Sep 1 23:24:16 Dict_N> DICT_loadFromFlash non-eng dictionary ok, paylen = 57154
Sep 1 23:24:16 Dict_N> DICT feature is enabled!
Sep 1 23:24:16 fu:0:0be63, 7.29 1
Sep 1 23:24:16 [0]SubOK. NextSub in 599 (1)
Sep 1 23:24:16 [0]SubOK. NextSub in 599 (1)
Sep 1 23:24:18 fs:042600:042752:131072
Sep 1 23:24:18 fls:fuuuuuuaff:13:3120:127248
Sep 1 23:24:18 fbr:0:3000:3000:0be12:000e:000d:7.5.4
Sep 1 23:24:18 fhs:01:0:0001:upg:app:M:7.4.3a
Sep 1 23:24:18 fhs:02:0:0002:upg:app:0:7.4.8a
Sep 1 23:24:18 fhs:03:0:0003:upg:app:1:7.4.8a
Sep 1 23:24:18 fhs:04:0:0004:upg:app:2:7.4.8a
Sep 1 23:24:18 fhs:05:0:0005:upg:app:0:7.4.9a
Sep 1 23:24:18 fhs:06:0:0006:upg:app:1:7.4.9a
Sep 1 23:24:18 fhs:07:0:0007:upg:app:2:7.4.9a
Sep 1 23:24:18 fhs:08:0:0008:upg:app:0:7.4.9c
Sep 1 23:24:18 fhs:09:0:0009:upg:app:1:7.4.9c
Sep 1 23:24:18 fhs:0a:0:000a:upg:app:2:7.4.9c
Sep 1 23:24:18 fhs:0b:0:000b:upg:app:0:7.5.2b
Sep 1 23:24:18 fhs:0c:0:000c:upg:app:1:7.5.2b
Sep 1 23:24:18 fhs:0d:0:000d:upg:app:2:7.5.2b
Sep 1 23:24:18 fhs:0e:0:000e:upg:app:0:7.5.4
Sep 1 23:24:18 fhs:0f:0:000f:upg:app:1:7.5.4
Sep 1 23:24:18 fhs:10:0:0010:upg:app:2:7.5.4
Sep 1 23:24:18 dhcp opt 66: ""
Sep 1 23:24:19 fu:0:0be7d, 5.1.1 1
Sep 1 23:24:23 resync rule: https://karlin-provisioning.xxxxx.cz/Cisco/Provisioning.php?MAC=c89c1d6d....;PSN=504G;Product=SPA504G;Serial=CBT150805..;SW=7.5.4;HW=1.0.2(0001);CERT=Installed;IP=10.xx.yy.147;EXTIP=;PRVST=0;EMS=;MUID=EMU;GPP_O=CZ;GPP_P=20130708T113333CEST
Sep 1 23:24:23 resync rule: https://karlin-provisioning.xxxxx.cz/Cisco/Provisioning.php?MAC=c89c1d6d....;PSN=504G;Product=SPA504G;Serial=CBT150805..;SW=7.5.4;HW=1.0.2(0001);CERT=Installed;IP=10.xx.yy.147;EXTIP=;PRVST=0;EMS=;MUID=EMU;GPP_O=CZ;GPP_P=20130708T113333CEST
Sep 1 23:24:23 ++ j=0 sip=c3....92
Sep 1 23:24:23 ++ j=0 sip=c3....92
Sep 1 23:24:23 fs:042600:042752:131072:198019807999
Sep 1 23:24:23 pbs 230912
Sep 1 23:24:23 SPA504G c8:9c:1d:6d:..:.. -- Requesting resync https://195.xx.yy.146:443/Cisco/Provisioning.php?MAC=c89c1d6d....;PSN=504G;Product=SPA504G;Serial=CBT150805..;SW=7.5.4;HW=1.0.2(0001);CERT=Installed;IP=10.xx.yy.147;EXTIP=;PRVST=0;EMS=;MUID=EMU;GPP_O=CZ;GPP_P=20130708T113333CEST
Sep 1 23:24:23 SPA504G c8:9c:1d:6d:..:.. -- Requesting resync https://195.xx.yy.146:443/Cisco/Provisioning.php?MAC=c89c1d6d....;PSN=504G;Product=SPA504G;Serial=CBT150805..;SW=7.5.4;HW=1.0.2(0001);CERT=Installed;IP=10.xx.yy.147;EXTIP=;PRVST=0;EMS=;MUID=EMU;GPP_O=CZ;GPP_P=20130708T113333CEST
Sep 1 23:24:23 FMM >>>> Requesting profile
Sep 1 23:24:23 request reboot type=4 reason=System 4(8000)
Sep 1 23:24:23 request reboot type=4 reason=System 4(8000)
Sep 1 23:24:23 [0]SubOK. NextSub in 1 (1)
Sep 1 23:24:23 [0]SubOK. NextSub in 1 (1)
Sep 1 23:24:23 [0]UnRegOK
Sep 1 23:24:24 [0]SUBS:TMO w/f NOTIFY 19
Sep 1 23:24:24 [0]SUBS:TMO w/f NOTIFY 19
Sep 1 23:24:32 fu:0:0bfda, 4.71 5.1.10 5.1.12 5.1.14 5.1.19 6.8 7.13 1
At Sep 1 23:25:20 the same sequence starts again, and it repeats ad nauseam.
Note that the log claims "Requesting profile", but the phone sent no packet (not even a TCP SYN), so the content of the profile cannot be the cause of the abend. I suspect the problem is related to the TCP stack or the OpenSSL library, but that is not the topic of this thread. First I need to find out how to recover those phones.
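To spot affected phones in collected syslog without reading every line, the reboot requests can be pulled out mechanically. This is a minimal sketch that assumes the line format seen in the excerpt above (`request reboot type=4 reason=System 4(8000)`); the function name `reboot_requests` is hypothetical. It skips consecutive identical lines, since the excerpt shows several entries twice.

```python
import re

# Matches the reboot request entries seen in the excerpt above.
REBOOT_RE = re.compile(r"request reboot type=(\d+) reason=(.+)$")


def reboot_requests(lines):
    """Return (type, reason) pairs, ignoring consecutive duplicate lines."""
    found = []
    previous = None
    for line in lines:
        line = line.rstrip("\n")
        if line == previous:
            continue  # same entry logged twice in a row
        previous = line
        match = REBOOT_RE.search(line)
        if match:
            found.append((int(match.group(1)), match.group(2).strip()))
    return found
```

A phone whose log yields such an entry roughly once a minute, as in the excerpt, is a likely candidate for the loop described here.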
09-05-2013 11:22 AM
There's no factory reset that can be done remotely. I inquired with dev and that reboot reason is a phone crash. Does the issue also happen with 7.5.5? Since 48 of the 180 phones are doing this, what's different in the network setup, if any?
09-05-2013 02:39 PM
There's no factory reset that can be done remotely
There used to be one, but it no longer works.
Of course, this change is not documented anywhere, nor have I received an answer to my question about it. See
Does the issue also happen with 7.5.5?
It's not easy to say. I can't trigger the problem in my lab, even with 7.5.4. I can reproduce it only on large networks, by triggering a reload of all devices. As some devices may become bricked and can't be recovered remotely, I can't test it much.
The issue I described here is new to me. More often I hit another restart-related problem: the device does not enter a never-ending loop but simply freezes. Fortunately, that can be solved by a power cycle (no problem with PoE everywhere).
I restarted (via SIP NOTIFY) about 780 phones in three other locations within the past two weeks, and about 100 failed to start and required a power cycle (but only 3 devices ended up in the never-ending loop). The "freezing" problem is known to us; we have hit it on every upgrade I remember (e.g. from 7.4.7). But it can be solved, so I consider it annoying but not severe.
It's the unsolvable boot loop that worries me. All four branches I restarted are located in Prague, so I can solve problems easily, and they are our own branches. But now I have to restart and upgrade branches of other companies in Poland, Germany, Italy, France, Hungary and Belgium ...
Since 48 of the 180 phones are doing this, what's different in the network setup, if any?
Good question. As far as I know, there is no difference in network setup or phone configuration across our branches. The only difference between this branch and the three others is that in the first case I restarted all 180 devices at once. At the other three branches I sent the command in smaller batches, about 30-40 devices at a time, then a pause of a few seconds, then the next batch.
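The batched approach described above can be sketched roughly as follows. The function names, the default batch size of 35 and the 5 second pause are illustrative assumptions, and `send_restart` stands for whatever actually delivers the restart command to one phone. Nothing in this thread proves that batching prevents the loop; it only reflects what was done at the three other branches.

```python
import time


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def restart_in_batches(phones, send_restart, batch_size=35, pause_s=5.0,
                       sleep=time.sleep):
    """Send the restart command batch by batch, pausing between batches.

    Returns the number of batches sent. `sleep` is injectable for testing.
    """
    batches = list(chunked(phones, batch_size))
    for index, batch in enumerate(batches):
        for phone in batch:
            send_restart(phone)
        if index < len(batches) - 1:
            sleep(pause_s)  # no pause needed after the final batch
    return len(batches)
```

With 180 phones and a batch size of 40, this sends five batches with four pauses in between.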
I assume it's some kind of race condition, so it will not be easy to debug ...
And even worse, it may be tied to a particular configuration (I assume so, as nobody else has complained about it here) ...
Thank you for your support.
09-13-2013 01:31 PM
Just for completeness: the CiscoIPPhoneExecute problem is caused by broken CRLF handling, and there is a workaround. In addition, the reason for the perpetual reboot loop seems to be the one described in - so it can be solved as well.
The only remaining problem is that the phone sometimes freezes instead of upgrading, but that can be solved by a power cycle, which is easy with PoE (and all in all it's better to initiate the upgrade with a power cycle than with the software method).
It can't be considered ideal, but every problem either has a workaround or can be solved remotely, so it can be considered acceptable...