SPA112 Reboot every 72 hours (3 days) exactly.

marcusjclifford · ‎09-09-2012

Hi,

I have a Cisco SPA112 which connects to a 3CX server. I also have additional phones, including Cisco SPA504G phones.

The issue I have is the SPA112 reboots every 3 days, exactly to the minute. This has happened 3 times now. The 504G is fine (40+ days uptime)

Nothing has changed in the provisioning of the ATA. When I do change things it reboots at 02:00 as expected, so I don't think this is related to reprovisioning.

I can provide more information e.g. logs if required, but thought I'd initially check if there is possibly a setting somewhere that I am unaware of that would cause this reboot every 72 hours exactly.

Thanks in advance,

Marcus

marcusjclifford · ‎09-09-2012

I should have included this information:

Product Information
Product Name:	SPA112	Serial Number:	CCQ162503NC
Software Version:	1.2.1(004)	Hardware Version:	1.0.0
MAC Address:	C40ACBBC7XXX	Client Certificate:	Installed
Customization:	Open

System Status
Current Time:	9/9/2012 22:13:41	Elapsed Time:	00:35:12
RTP Packets Sent:	0	RTP Bytes Sent:	0
RTP Packets Recv:	0	RTP Bytes Recv:	0
SIP Messages Sent:	24	SIP Bytes Sent:	20821
SIP Messages Recv:	23	SIP Bytes Recv:	10833

As you can see, the elapsed time is 35 minutes, which ties in with the reboot event.

Dan Lukes · ‎09-10-2012

Direct debug messages from phone to a syslog server on some computer. Debug messages may (or may not) reveal more to you.
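
For anyone without a syslog server to hand, the receiving side can be sketched in a few lines of Python. This is only a minimal sketch, not a replacement for a real syslog daemon: it assumes the device sends plain UDP syslog datagrams, and the names `parse_priority`, `SyslogHandler` and `serve` are made up for this example.

```python
import socketserver


def parse_priority(message: str):
    """Split a '<PRI>text' syslog line into (facility, severity, text)."""
    if message.startswith("<") and ">" in message[:6]:
        end = message.index(">")
        pri = message[1:end]
        if pri.isdigit():
            value = int(pri)
            # PRI encodes facility * 8 + severity.
            return value // 8, value % 8, message[end + 1:]
    # No recognisable priority prefix: hand the line back unchanged.
    return None, None, message


class SyslogHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For UDP servers, self.request is (datagram bytes, socket).
        data, _sock = self.request
        text = data.decode("utf-8", errors="replace").rstrip()
        facility, severity, body = parse_priority(text)
        print(f"{self.client_address[0]} fac={facility} sev={severity} {body}")


def serve(host: str = "0.0.0.0", port: int = 514):
    # 514 is the conventional syslog port; binding to it usually needs
    # elevated privileges, so a higher port may be more convenient.
    with socketserver.UDPServer((host, port), SyslogHandler) as server:
        server.serve_forever()
```

Calling `serve()` on a machine the device can reach, and pointing the device's syslog server setting at that machine, should print each message as it arrives.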

marcusjclifford · ‎09-13-2012

OK, so again as expected it rebooted right at the expected time.

The output from the syslog is: (Most recent first)

[ 0.000000] DMA zone: 62 pages used for memmap

Reboot here

syslog-ng version 1.6.12 going down

[17227986.824000] cordless: deinit

So it looks like the "cordless: deinit" is the first notable event, before that it was just regular standard messages.

Dan Lukes · ‎09-13-2012

Just to be sure, what's the Debug_Level? Did you set it to maximum?

By the way:

Product Name:	SPA112	Serial Number:
Software Version:	1.2.1(004)	Hardware Version:	1.0.0
MAC Address:		Client Certificate:	Installed
Customization:	Open

System Status
Current Time:	9/13/2012 15:26:14	Elapsed Time:	10 days and 14:07:53
RTP Packets Sent:	2471	RTP Bytes Sent:	395360
RTP Packets Recv:	2066	RTP Bytes Recv:	330560
SIP Messages Sent:	38608	SIP Bytes Sent:	18585856
SIP Messages Recv:	38589	SIP Bytes Recv:	19492207
External IP:

Line 1 Status
Hook State:	On	Registration State:	Registered
Last Registration At:	9/13/2012 15:19:41	Next Registration In:	200 s

marcusjclifford · ‎09-13-2012

Both logs (System and Kernel) are set to Debug level and output to the syslog server.

I don't doubt these devices can stay up as intended. The timing is too precise to be something like a memory leak, so I'm confident it is some setting that is causing this to happen.

Service pack 1 for 3CX V11 was released yesterday, so I'll update to that over the weekend and see if it makes any difference, but any advice in the meantime is more than welcome.

Dan Lukes · ‎09-13-2012

It seems you are speaking about the settings in the Log_Configuration container. The values in the Log_Configuration container seem to drive logging of the Linux kernel and other utilities.

But I asked about the Debug_Level tag at the root level of the configuration file; it configures the debug level of the SIP application itself. An abnormal termination of that application will cause a reboot of the system as well.

I agree it does not seem to be a memory leak, but it may be another kind of resource exhaustion. If it is a resource used only during a specific periodic request, then it may be exhausted after the same number of requests. Something like an MWI notification, a keep-alive packet or so.

If you can run a packet dumping utility, then do it. Save SIP packets (port 5060) only; that should be sufficient. If the same SIP event occurs just before each reboot, then you are on the right track: it's an issue related to a specific SIP packet.

Or it may be caused by a specific configuration. Just yesterday I spent several hours analyzing periodic reboots of an SPA508G; they were caused by a specific combination of Mobility-related settings and Provisioning_Rule. Try starting with the factory-default configuration with minimal changes (SIP server, name and password on Line 1 only). If the device becomes stable, then it's a configuration-specific issue.

Well, just try, and hope the Cybertan team will release a more stable firmware during our lives ...

marcusjclifford · ‎09-19-2012

OK, so I have found what I think is causing this.

Thanks to Dan I enabled the more significant logging and was able to see at the time it rebooted it was trying to do a firmware upgrade check.

I should have spotted this before: on the Voice->Provisioning page, under Firmware Upgrade, the value for Upgrade Error Retry Delay was set to 259200, which in seconds is 3 days.
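
As a quick check of that figure, the conversion can be done in Python. The variable names here are only illustrative; the value is the one read from the provisioning page.

```python
from datetime import timedelta

# Value seen in Upgrade Error Retry Delay, in seconds.
retry_delay_seconds = 259200

delay = timedelta(seconds=retry_delay_seconds)

# timedelta normalises into whole days plus leftover seconds.
print(delay.days, "days,", delay.seconds, "seconds left over")
```

259200 / 86400 is exactly 3, so the delay is 72 hours with nothing left over, which matches the to-the-minute reboot interval.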

What it seemed to be doing was every 3 days checking for firmware, and perhaps because the 3CX PBX did not have any firmware for the device it rebooted (the SPA112 may also reboot to check for the firmware??).

I have set the option for Upgrade Enable to No (it was yes) and I will see if this cures the problem.

Dan Lukes · ‎09-21-2012

A missing firmware file (e.g. a 404 Not Found reply) should not trigger a device restart. But I'm not sure what happens if something is returned that is not a valid firmware file.

marcusjclifford · ‎09-25-2012

OK, confirmed what was in my previous post.

The reboot was caused by the SPA112 trying to update its firmware from the PBX server.

Setting Upgrade Enable to No has stopped the device rebooting, and it has now been up for 4+ days.

I still think it is a bug that it reboots if it can't update because no firmware file exists, but at least I have a solution now.

I can advise that if you have any issues with your device, do use the syslog function outputting to an external syslog server. It made it much easier to diagnose what was going on.

Dan Lukes · ‎09-25-2012

I'm still not sure about the details related to the unsuccessful firmware update. A "Not found" reply doesn't trigger a reboot on my device. I suspect another scenario: your particular server may be redirecting the "not found" error to a generic non-error page (like an index page or so), which is misinterpreted as a firmware file by the device, so the upgrade started but failed (because verification of the downloaded firmware file failed). That may trigger a reboot.

If you can capture a packet dump of the HTTP request for the upgrade file, it may help to clarify the problem.
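
Short of a full packet dump, the server's reply to the firmware request can also be probed from a PC. The sketch below is only a rough aid for testing the "generic page instead of 404" suspicion: the classification in `describe_reply` is a hypothetical heuristic, not how the device decides anything, and the URL passed to `probe` is a placeholder you would have to take from your own upgrade rule.

```python
import urllib.error
import urllib.request


def describe_reply(status: int, content_type: str) -> str:
    """Rough, hypothetical classification of a reply to a firmware request."""
    if status == 404:
        return "not found: a plain 404, no file offered"
    if status == 200 and content_type.lower().startswith("text/"):
        return "200 with a text page: possibly a generic page served instead of a 404"
    if status == 200:
        return "200 with non-text content: may be a real file"
    return f"other status {status}"


def probe(url: str) -> str:
    """Fetch the given URL and describe the reply."""
    try:
        # urlopen follows redirects, so a redirect to an index page
        # shows up here as a 200 reply with a text content type.
        with urllib.request.urlopen(url, timeout=10) as resp:
            return describe_reply(resp.status, resp.headers.get("Content-Type", ""))
    except urllib.error.HTTPError as err:
        ctype = err.headers.get("Content-Type", "") if err.headers else ""
        return describe_reply(err.code, ctype)
```

If `probe(...)` on the firmware URL reports a 200 with a text page, that would be consistent with the server handing the device an HTML page where it expects a firmware image.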

By the way, mark the question as answered. It will help others to find the solution.

None of my responses can be claimed as the "correct answer". You answered the question. Unfortunately, Cisco doesn't expect that someone can answer their own question. So if you want to mark the question as answered, you have no other option than to wrongly give the credit to me, even though I'm not interested in it. It's up to you.

Dan Miley · ‎11-09-2012

usually if the firmware is upgrading you will see reboot reason 'firmware upgrade' or somesuch

if the reboot reason is provisioning, that's usually because the 3cx is changing the provisioning file.

your upgrade rule should have a conditional statement so it doesn't upgrade the firmware if it's already at the correct version. the telephony provisioning guide has the full syntax, but it looks something like this

in your provisioning file change:

( $SWVER != 1.2.1)? tftp://192.168.2.244/pathtotheV1.2.1file.bin

the conditional says: if the software version is not equal to 1.2.1, then execute (the ? in the command) the tftp download. but if it is 1.2.1, do nothing.

if you don't use the conditional it will continuously download the firmware once the upgrade error retry timer is out. Once it does the firmware update, it installs and reboots, then the timer starts again.
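
The difference between the two kinds of rule can be illustrated with a toy model in Python. This is a sketch of the logic described above, not the device's actual implementation: `should_fetch` is a made-up name, and the plain string comparison of versions is an assumption.

```python
def should_fetch(current_version: str, target_version: str, conditional: bool) -> bool:
    """Toy model: with the conditional, fetch only when the versions differ."""
    if not conditional:
        # Unconditional rule: every timer expiry triggers a download.
        return True
    return current_version != target_version


# Three timer expiries on a device that is already at the target version.
for rule_is_conditional in (False, True):
    fetches = sum(should_fetch("1.2.1", "1.2.1", rule_is_conditional) for _ in range(3))
    label = "conditional" if rule_is_conditional else "unconditional"
    print(label, "->", fetches, "downloads")
```

In this model the unconditional rule downloads on all three expiries, while the conditional rule never downloads once the versions match.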

reboot reason list

https://supportforums.cisco.com/docs/DOC-9889

here are the full provisioning guides.

https://supportforums.cisco.com/docs/DOC-9894

hope it helps.

Dan