SPA112 goes catatonic, crashes, no dialtone

sgtpanties
Level 1

Hi, I am a home user and ex-programmer. I'm using the Cisco SPA112 with the latest firmware, 1.2.1 (004). Every two or three days the ATA crashes. Here are the symptoms:

     a) no dial tone

     b) line 1 LED is ON (as normal)

     c) ATA is PINGable

     d) IVR is unresponsive

     e) web setup URL does not respond, so I cannot get to the setup page

     f) The reset button does not respond.
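The pattern in this list (the unit still answers ping while the web setup page is dead) can be told apart from an ordinary network outage with a small probe. The Python below is only a sketch under assumptions, not anything Cisco provides: `http_responds` and `classify` are hypothetical helper names, port 80 for the setup page is assumed, and the ping result is expected to come from whatever tool is already in use.

```python
import socket


def http_responds(host, port=80, timeout=5.0):
    """Return True if the host answers a plain HTTP request with a status line.

    Port 80 is an assumption about where the ATA serves its setup page.
    """
    request = b"GET / HTTP/1.0\r\nHost: " + host.encode("ascii") + b"\r\n\r\n"
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            conn.sendall(request)
            # Any reply that starts like an HTTP status line counts as alive.
            return conn.recv(16).startswith(b"HTTP/")
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def classify(ping_ok, web_ok):
    """Map the two probe results onto the states described in this thread."""
    if web_ok:
        return "healthy"
    if ping_ok:
        # Pingable but no web UI: the "catatonic" state from the symptom list.
        return "catatonic"
    return "unreachable"
```

Run from a scheduled job, `classify(ping_ok, http_responds("192.168.1.50"))` (the address is a placeholder) would flag the hung state without needing to pick up the handset every morning.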

My only recourse now is to check whether I have a dialtone every morning and, if not, pull and reinsert the power plug. The cold boot of course results in the local log being lost. Here is the log right after boot up.

Jan  1 00:00:04 SPA112 kern.warning [    0.000000] CPU: ARM926EJ-S [41069265] revision 5 (ARMv5TEJ), cr=00053177

Jan  1 00:00:04 SPA112 kern.warning [    0.000000] Machine: NXP PNX8181

Jan  1 00:00:04 SPA112 kern.warning [    0.000000] Memory policy: ECC disabled, Data cache writeback

Jan  1 00:00:04 SPA112 kern.warning [    0.000000] CPU0: D VIVT write-back cache

Jan  1 00:00:04 SPA112 kern.warning [    0.000000] CPU0: I cache: 32768 bytes, associativity 4, 32 byte lines, 256 sets

Jan  1 00:00:04 SPA112 kern.warning [    0.000000] CPU0: D cache: 32768 bytes, associativity 4, 32 byte lines, 256 sets

Jan  1 00:00:04 SPA112 kern.warning [17179569.184000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 7874

Jan  1 00:00:04 SPA112 kern.warning [17179569.184000] PID hash table entries: 128 (order: 7, 512 bytes)

Jan  1 00:00:04 SPA112 kern.warning [17179569.184000] Console: colour dummy device 80x30

Jan  1 00:00:04 SPA112 kern.warning [17179569.260000] Mount-cache hash table entries: 512

Jan  1 00:00:04 SPA112 kern.warning [17179569.264000] Board HW MODEL : 0x3

Jan  1 00:00:04 SPA112 kern.warning [17179569.336000] squashfs: LZMA suppport for slax.org by jro

Jan  1 00:00:04 SPA112 kern.warning [17179569.384000] [ip3912] : Bridge Mode...

Jan  1 00:00:04 SPA112 kern.err [17179569.408000] physmap-flash physmap-flash.0: map_probe failed

Jan  1 00:00:04 SPA112 kern.warning [17179569.416000] Using Full Image's RootFS

Jan  1 00:00:04 SPA112 kern.warning [17179569.420000] Using static partition definition

Jan  1 00:00:04 SPA112 kern.warning [17179569.424000] !!! do adler32 checksum !!!

Jan  1 00:00:04 SPA112 kern.warning [17179571.260000] File system image checksum OK

Jan  1 00:00:04 SPA112 kern.err [17179571.344000] ksz8873 0-005f: failed with status -1

Jan  1 00:00:04 SPA112 kern.warning [17179571.348000] ksz8873: probe of 0-005f failed with error -1

Jan  1 00:00:04 SPA112 kern.warning [17179571.352000] PNX8181 watchdog timer: timer margin 16 sec

Jan  1 00:00:04 SPA112 kern.warning [17179571.372000] GACT probability on

Jan  1 00:00:04 SPA112 kern.warning [17179571.376000] Mirror/redirect action on

Jan  1 00:00:04 SPA112 kern.warning [17179571.380000] u32 classifier

Jan  1 00:00:04 SPA112 kern.warning [17179571.384000]     Performance counters on

Jan  1 00:00:04 SPA112 kern.warning [17179571.388000]     input device check on

Jan  1 00:00:04 SPA112 kern.warning [17179571.392000]     Actions configured

Jan  1 00:00:04 SPA112 kern.warning [17179571.396000] Netfilter messages via NETLINK v0.30.

Jan  1 00:00:04 SPA112 kern.warning [17179571.400000] nf_conntrack version 0.5.0 (1024 buckets, 4096 max)

Jan  1 00:00:04 SPA112 kern.warning [17179571.412000] ipt_time loading

Jan  1 00:00:04 SPA112 kern.warning [17179571.464000] VFS: Mounted root (squashfs filesystem) readonly.

Jan  1 00:00:04 SPA112 kern.warning [17179576.144000] ***** LED_DRV init *****

Jan  1 00:00:04 SPA112 kern.warning [17179576.148000] ***** LED_DRV end *****

Jan  1 00:00:04 SPA112 kern.warning [17179576.176000] *** sys event driver initialized ***

Jan  1 00:00:04 SPA112 kern.err [17179579.184000] br0: Dropping NETIF_F_UFO since no NETIF_F_HW_CSUM feature.

Jan  1 00:00:04 SPA112 kern.warning [17179580.880000] Empty flash at 0x00069454 ends at 0x00069600

Jan  1 00:00:05 SPA112 daemon.err dnsmasq[136]: failed to load names from /etc/hosts: No such file or directory

Jan  1 00:00:23 SPA112 kern.err [17179600.844000] br0: Dropping NETIF_F_UFO since no NETIF_F_HW_CSUM feature.

Jan  1 00:00:27 SPA112 daemon.err system[1]: notify add wan1 interface(br0)

Jan  1 00:00:27 SPA112 daemon.err system[1]: start 0 vlan-id... ok

Jan  1 00:00:39 SPA112 daemon.err dnsmasq[136]: failed to load names from /etc/hosts: No such file or directory

Here are some things I have tried on my own, to no effect:

      a) Tried running with Provisioning/Firmware Upgrade/Upgrade set to false.

      b) Reloaded the firmware with IE.  Before, I did it with Firefox.

      c) Did a factory reset and re-entered the settings suggested for my ATA from Callcentric.

How should I proceed?


Thanks.

But I'll wait for Cisco's direction on this after they analyze my debug logs and packet capture files.

I just got the link from support to try the build you suggested.

Thanks.

Patrick, TAC tells me this issue is related to a memory leak and possibly an issue with CDP. What MIBs can I poll to find out the available memory, total memory and used memory?

George- I have been using just the generic Linux meminfo OID, which works fine.  One note, however: there was some indication that SNMP might be contributing to the leak.  Having said that, we are approaching 96 hours on my unit with the latest FW while running SNMP, and the memory looks pretty stable at 5% free.  This is the first time I have run with SNMP scanning for the last few builds. I am pretty optimistic since we have gone this far; usually it would die around now... good luck.
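For anyone who wants to turn the polled values into a leak estimate, here is a rough Python sketch. It assumes the "generic Linux meminfo OID" refers to the UCD-SNMP-MIB memory objects from net-snmp; whether a given SPA firmware exposes them is an assumption, and the function names are made up for illustration. The polling itself (for example with an SNMP client of your choice) is left out; the code only works on the numbers once you have them.

```python
# UCD-SNMP-MIB memory objects (net-snmp). Assumed, not confirmed, to be
# what the SPA agent exposes.
MEM_TOTAL_REAL_OID = "1.3.6.1.4.1.2021.4.5.0"  # memTotalReal, in kB
MEM_AVAIL_REAL_OID = "1.3.6.1.4.1.2021.4.6.0"  # memAvailReal, in kB


def percent_free(total_kb, avail_kb):
    """Free memory as a percentage of total memory."""
    if total_kb <= 0:
        raise ValueError("total_kb must be positive")
    return 100.0 * avail_kb / total_kb


def leak_rate_kb_per_hour(samples):
    """Least-squares slope of free memory over time.

    samples is a list of (hours_since_start, free_kb) pairs.
    A negative result means free memory is shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_f = sum(f for _, f in samples) / n
    spread = sum((t - mean_t) ** 2 for t, _ in samples)
    if spread == 0:
        raise ValueError("samples must span more than one point in time")
    return sum((t - mean_t) * (f - mean_f) for t, f in samples) / spread


def hours_until_exhausted(samples):
    """Extrapolate from the last sample; None if memory is not shrinking."""
    slope = leak_rate_kb_per_hour(samples)
    if slope >= 0:
        return None
    return samples[-1][1] / -slope
```

A flat line at 5% free, as reported here, would give a slope near zero and `None` from `hours_until_exhausted`, which is the interesting signal: the absolute percentage matters less than whether it keeps dropping.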

Thanks Dale, although 5% free doesn't sound very good to me.

On another note, I noticed our SPAs started crapping out not long after I had enabled SNMP on them. As Patrick said above, this may be a contributing factor.

How is everyone else doing on the new firmware? Is there anything specific that causes the SPA to become "stupid?"

I'm using 1.3.1(003) on 25 SPAs. No crashes since last firmware update, uptime 14 days and counting.

I didn't notice any difference with/without SNMP feature - older versions of firmware crashed anyway.

Almost 16 days and still at 5% free memory.  Looks really good. 

Does anyone have issues with Call-waiting caller-ID on the old 1.2.1-004 or new 1.3.1-003 version? The call-waiting beep comes in and then there is no CID displayed. I have issues where some brands of phones don't work with it and some do. (The non-working ones work fine with a regular MTA just not with the SPA). I've been provided with a "version-less" beta firmware that appears to have fixed my issue but it's not fixed in 1.3.1-003.

Just another data point:  all is well here, running 1.3.1 (003) Dec 17 2012 - thanks again!

(I'm actually afraid to touch it again - it ain't particularly broke right now, so I'll be careful when the next release goes GA...)

_KMP

George – I'm experiencing inconsistent behavior with Call Waiting Caller ID with both v1.2.1 (004) and v1.3.1 (003). The CW tone comes across, but then CID doesn't always show up. I'm using the same analog phone in all tests.

I also observed that the CW CID on my phone didn't go away when the CW call hung up.

Looks like we should split this issue off into another thread.

Here's the thread I started for CWCID issues

https://supportforums.cisco.com/thread/2193038

Dale, George & Patrick,

I just wanted to note that I am also using a SPA112, formerly on 1.1.0(011) and currently on 1.2.1(004).  I -had- both CDP and SNMP enabled.  Through troubleshooting, by disabling CDP & SNMP I have over 80 days of uptime on 1.2.1(004).  However, if I enable SNMP, I crash within a day or so. Catatonic (pingable, but no web, no IVR, no dialtone) etc, etc...

I'm encouraged that others are reaching my conclusions too.  (I was unsure if I should have added my findings here previously...)

Kudos to Patrick for releasing beta firmware to the masses...  Shame on Cisco for taking about a year to figure out memory leaks on system daemons.

Patrick - ETA for this firmware to become GA?  I sure would like to have a look at the release notes...

Cheers!

/Mark

Klayton Collier
Level 1

I've been running 1.3.1(003) on my SPA122 for a while now. Haven't had my unit go catatonic on me since, but I was having failed registration issues. I didn't report them as I thought they were a result of my network configuration. After disabling SNMP it became much more stable and is running without any obvious issues.

Good call guys, I certainly don't need SNMP for my setup so this is a good fix for me. Thanks!

Klayton & Everyone. At some point right before the SPA goes stupid I noticed registration failures.

Here's the flow and issue that I noticed:

SPA122-----register--->Softswitch

SPA122 <----401 unauth---Softswitch

SPA122 ----re-register w/o digest info--->Softswitch

SPA122 <----401 unauth---Softswitch

It looks like the SPA wasn't providing the challenge response digest w/ MD5 password, so the switch keeps failing the registration attempts and then the SPA goes super stupid.
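For reference, this is roughly what the second REGISTER should have carried. The sketch below computes the standard HTTP Digest response (RFC 2617, which SIP reuses) in Python; it is a generic illustration of the algorithm, not the SPA's actual code, and `digest_response` is a name invented here.

```python
import hashlib


def _md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce,
                    qop=None, nc=None, cnonce=None):
    """Compute the Digest 'response' value for an Authorization header.

    For SIP registration, method is "REGISTER" and uri is the request URI.
    Only no-qop and qop="auth" are handled; "auth-int" would also need a
    hash of the message body in HA2 and is left out of this sketch.
    """
    ha1 = _md5_hex(f"{username}:{realm}:{password}")
    ha2 = _md5_hex(f"{method}:{uri}")
    if qop is None:
        return _md5_hex(f"{ha1}:{nonce}:{ha2}")
    if qop != "auth":
        raise ValueError("only qop='auth' is supported in this sketch")
    if nc is None or cnonce is None:
        raise ValueError("qop requires nc and cnonce")
    return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
```

The realm and nonce come from the 401 challenge, so a re-REGISTER without this value (as in the flow above) can only be answered with another 401. Comparing a capture against a value computed this way is one way to confirm whether the ATA sent a wrong digest or none at all.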

George-

I noticed that same behavior and figured out why.  Just as it's running out of memory, it grinds to a near halt.  My ping times went from 5ms to 2000ms and the SIP qualify did the same thing.  It's as if there is one task that's sucking up 95% of the CPU. I actually had a debug build that had telnet access and it took 10 minutes to complete a login using telnet.  So I think this is actually a symptom of the failure mode rather than a root cause.  The good news is that 1.3.1 (003) has been running for over 16 days now even with SNMP polling it.
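The jump from about 5 ms to about 2000 ms is large enough to serve as an early warning before the unit locks up completely. A minimal Python sketch of that idea follows; the function name and the factor of 10 are arbitrary choices for illustration, and the round-trip times are assumed to come from whatever ping or SIP qualify output you already collect.

```python
from statistics import median


def is_degraded(baseline_ms, samples_ms, factor=10.0):
    """Flag a sustained latency jump relative to a known-good baseline.

    The median is used so that a single slow reply does not trigger an alarm.
    The default factor of 10 is an arbitrary threshold, not a measured one.
    """
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    if not samples_ms:
        raise ValueError("need at least one sample")
    return median(samples_ms) >= factor * baseline_ms
```

With a 5 ms baseline, a run of replies around 2000 ms would be flagged, while one stray slow reply among normal ones would not.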

Hope this is helpful. --Dale