WRP400 voice module freeze - Where are you, Cisco?

aemcomcco
Level 1

Hi Experts, I am writing about this issue because our WRP400s continue to freeze.
We are an Internet and telephony service provider, and we chose Cisco-Linksys CPEs expecting the same stability as well-known Cisco equipment.
Unfortunately, we have found that the Linksys devices are not Cisco devices.
We have been using WRP400s since firmware version 1.00.04c.

Over the years we have tested the following firmware versions:
1.00.04c
1.00.06
1.01.00
1.01.01
2.00.05
2.00.11
2.00.16
2.00.20
2.00.21
2.00.26 (currently in use)

We can say that there have been improvements but we are far from a stable device.

The reason for this post is a lockup common to the latest version 1 firmware and all version 2 firmware.
After several days of normal activity, the voice module of the WRP400 freezes.

This is the state of the WRP400 during the lockup:

- PING is OK
- remote access to the router web menu is OK
- Internet surfing from LAN is OK
- phone line is down
- remote access to the voice web module fails
- every operation from the router web menu is ineffective (change LAN IP, diagnostic PING or diagnostic LAN search, reboot button, reboot URL of version 2.00.26, etc.)

The only way to restore normal operation is to power the device off and on.
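For anyone monitoring a fleet of these devices, the symptom list above can be turned into a simple classifier. This is only a minimal sketch: the `ProbeResult` fields and the state names are hypothetical, and how each probe is actually performed (ICMP, HTTP request, test call) is left to the monitoring system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    """Outcome of four independent checks against one WRP400 (names are illustrative)."""
    ping_ok: bool
    router_web_ok: bool
    voice_web_ok: bool
    phone_line_ok: bool


def classify(p: ProbeResult) -> str:
    """Map probe results to a coarse device state."""
    if not p.ping_ok:
        return "unreachable"
    # Signature described in the post: router side answers, voice side is dead.
    if p.router_web_ok and not p.voice_web_ok and not p.phone_line_ok:
        return "voice-freeze"
    if p.router_web_ok and p.voice_web_ok and p.phone_line_ok:
        return "healthy"
    return "degraded"


if __name__ == "__main__":
    frozen = ProbeResult(ping_ok=True, router_web_ok=True,
                         voice_web_ok=False, phone_line_ok=False)
    print(classify(frozen))
```

Note that a device in the "voice-freeze" state still answers ping, so a ping-only monitor would report it as up.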

I am now trying to reproduce the cause of these lockups, so far without success.

I think the problem is in the router module, which in turn blocks the voice module.
I have a feeling that the DHCP module is causing the problem (for now this is only a hypothesis).

I suspect a memory leak, so my first questions are:
Is there a tool to monitor memory and CPU usage?
Can you provide a workaround, e.g. a configurable hardware reboot via cron?

I'll update this post with new information as soon as possible.

Thanks in advance.

Accepted Solutions

The case CSCub70060 is fixed in firmware version 2.00.32.

15 Replies

Hi,

Your problem has been known for years. Have a look at this thread:

https://supportforums.cisco.com/thread/2021695?tstart=0

In my case, enabling "restrict source ip" reduced the hangs from 10-15 per month to only 1-2 per month. Currently the only way to fix the issue completely is to buy a different device.

Regards, Tomasz

aemcomcco
Level 1

Hi experts, do you have news for us?

Regards.

We are waiting for feedback.


Just a clarification. Our 2000 WRP400s are in a private network protected by an ASA firewall and a TippingPoint IDS. The "restrict source ip" option is enabled and the SIP port used is not the standard 5060 port.

We observed that the voice module apparently freezes after the provisioning resync procedure. I've attached our config.
Can you investigate?

Our XML file for the WRP400 also contains a router configuration section, but the WRP400 doesn't load this part. Could this be a problem?

We look forward to your news.
Regards.

Where are you, Cisco?

Hello aemcomcco,

I see from your configuration file that you are running the latest firmware on your WRP400s. Was the configuration file created with the SPC tool for this firmware version?

If it was, have you set up a syslog server, set one of the WRP400s to debug log level, and captured any logging that could give us further information about what is happening?

I see you commented in another post, and I am assuming you have configured the devices to block inbound 5060. Is it locking up every time it registers, or at a certain time?

Were any of your devices purchased in the last year?

Cisco Small Business Support Center

Randy Manthey

CCNA, CCNA - Security

First of all, thank you for your interest.
My answers are below:

1) The XML file used to provision the WRP400 routers was generated with a previous version of the SPC tool. The exact version of SPC used was wrp400-1-01-00-spc-win32-i386.exe.
Over time, the XML file was manually updated with new XML tag options.
This is the same procedure we use with other Cisco Linksys devices (e.g. SPA2102, SPA8000, SPA30x, SRP520), without the problems encountered with the WRP400.

One remark: version 2.00.20 introduced a new XML tag:
Added two new configurable and provisionable parameters, “Soft IRQ polling
count” and “Voice Active,” to address an issue with voice cutoff. Users may
adjust these settings to fit their deployment.
URL to configure these parameters: http://192.168.15.1/VoiceDebug.asp
Use the following syntax for XML provisioning:
softirq_active_voice=0,softirq_polling_cnt=1

I updated my XML file, putting this new option under but the latest SPC tool, 2.00.26, reports this option under .

What is the correct position?

I ask because our WRPs do not load the profile.

2) We have more than 2000 WRPs. The freeze seems random. We tried to monitor some devices via syslog, but we had no luck.
We will try again and give you feedback.

3) The freeze happens on every WRP400, old and new.

Regards.

It's very difficult to "capture" the problem or replicate it.
When logs are on, we observe that the WRP400 sends this message every 2 seconds: ---- eval_prov_logic 1 ----

e.g.

Jan 16 14:40:23 10.95.102.96 [0]RegOK. NextReg in 296 (1)

Jan 16 14:40:24 10.95.102.96                 ---- eval_prov_logic 1 ----  42040 --    8409139

Jan 16 14:40:26 10.95.102.96                 ---- eval_prov_logic 1 ----  42041 --    8409337

Jan 16 14:40:28 10.95.102.96                 ---- eval_prov_logic 1 ----  42042 --    8409544

Jan 16 14:40:30 10.95.102.96                 ---- eval_prov_logic 1 ----  42043 --    8409742

Jan 16 14:40:32 10.95.102.96                 ---- eval_prov_logic 1 ----  42044 --    8409940

Jan 16 14:40:34 10.95.102.96                 ---- eval_prov_logic 1 ----  42045 --    8410138

Jan 16 14:40:36 10.95.102.96                 ---- eval_prov_logic 1 ----  42046 --    8410345

Jan 16 14:40:38 10.95.102.96                 ---- eval_prov_logic 1 ----  42047 --    8410543

Jan 16 14:40:40 10.95.102.96                 ---- eval_prov_logic 1 ----  42048 --    8410741

Jan 16 14:40:42 10.95.102.96                 ---- eval_prov_logic 1 ----  42049 --    8410939

Jan 16 14:40:44 10.95.102.96                 ---- eval_prov_logic 1 ----  42050 --    8411137

Jan 16 14:40:46 10.95.102.96                 ---- eval_prov_logic 1 ----  42051 --    8411344

Jan 16 14:40:48 10.95.102.96                 ---- eval_prov_logic 1 ----  42052 --    8411542

Jan 16 14:40:50 10.95.102.96                 ---- eval_prov_logic 1 ----  42053 --    8411740

Jan 16 14:40:52 10.95.102.96                 ---- eval_prov_logic 1 ----  42054 --    8411938

Jan 16 14:40:54 10.95.102.96                 ---- eval_prov_logic 1 ----  42055 --    8412145

Jan 16 14:40:56 10.95.102.96                 ---- eval_prov_logic 1 ----  42056 --    8412343

Jan 16 14:40:58 10.95.102.96                 ---- eval_prov_logic 1 ----  42057 --    8412541

Jan 16 14:41:00 10.95.102.96                 ---- eval_prov_logic 1 ----  42058 --    8412739

Jan 16 14:41:02 10.95.102.96                 ---- eval_prov_logic 1 ----  42059 --    8412937

Jan 16 14:41:04 10.95.102.96                 ---- eval_prov_logic 1 ----  42060 --    8413144

Jan 16 14:41:06 10.95.102.96                 ---- eval_prov_logic 1 ----  42061 --    8413342

etc.
Jan 16 15:28:34 10.95.102.96                 ---- eval_prov_logic 1 ----  43479 --    8696943
Jan 16 15:28:36 10.95.102.96                 ---- eval_prov_logic 1 ----  43480 --    8697141
Jan 16 15:28:38 10.95.102.96                 ---- eval_prov_logic 1 ----  43481 --    8697339
Jan 16 15:28:40 10.95.102.96                 ---- eval_prov_logic 1 ----  43482 --    8697537
Jan 16 15:28:42 10.95.102.96                 ---- eval_prov_logic 1 ----  43483 --    8697744
Jan 16 15:28:44 10.95.102.96                 ---- eval_prov_logic 1 ----  43484 --    8697942
Jan 16 15:28:46 10.95.102.96                 ---- eval_prov_logic 1 ----  43485 --    8698140
Jan 16 15:28:48 10.95.102.96                 ---- eval_prov_logic 1 ----  43486 --    8698338
Jan 16 15:28:50 10.95.102.96                 ---- eval_prov_logic 1 ----  43487 --    8698545
Jan 16 15:28:52 10.95.102.96                 ---- eval_prov_logic 1 ----  43488 --    8698743
Jan 16 15:28:54 10.95.102.96                 ---- eval_prov_logic 1 ----  43489 --    8698941
Jan 16 15:28:56 10.95.102.96                 ---- eval_prov_logic 1 ----  43490 --    8699139

When the voice module freezes, these ---- eval_prov_logic 1 ---- messages simply stop being sent.
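Since the freeze shows up as the disappearance of the periodic eval_prov_logic message, a syslog collector could watch for that silence. Below is a minimal sketch, assuming the line layout shown in the excerpts above; the function names, the 10-second threshold, and the `year` default are my own assumptions (classic syslog timestamps carry no year).

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# "Jan 16 14:40:24 10.95.102.96       ---- eval_prov_logic 1 ----  42040 --    8409139"
HEARTBEAT = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+)\s+-+ eval_prov_logic"
)


def heartbeat_times(lines, host, year=2012):
    """Return the timestamps of eval_prov_logic messages sent by one host."""
    times = []
    for line in lines:
        m = HEARTBEAT.match(line.strip())
        if m and m.group("host") == host:
            # The year is supplied by the caller because the log line has none.
            stamp = f"{year} {m.group('ts')}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


def find_gaps(times, max_gap=timedelta(seconds=10)):
    """Return (previous, next) pairs where the heartbeat paused longer than max_gap."""
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > max_gap]


def is_stalled(times, now, max_gap=timedelta(seconds=10)):
    """True if no heartbeat was seen, or the last one is older than max_gap."""
    return not times or now - times[-1] > max_gap
```

With a nominal 2-second interval, a threshold of a few missed messages avoids false alarms from lost UDP syslog packets. A device flagged by `is_stalled` is a candidate for the frozen state even though it still answers ping and still sends REGISTER.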

But logs do not show abnormal SIP activity.

When this happens, SIP REGISTER messages are correctly sent from the WRP to the SIP server, but incoming INVITEs to the WRP400 are not shown in the logs. It seems that the "Restrict Source IP" filter blocks all incoming INVITEs.
After a reboot, the WRP400 returns to normal operation.

Sometimes the whole web menu freezes, not only the voice web menu, simply during web GUI navigation.

e.g.
You run the Administrator-Diagnostics-Detect Active LAN Client(s) tool. After this, nothing happens: ping to the WRP400 is OK, but the web GUI becomes unreachable.

Another strange issue:
some logs contain this message: (root) CMD (/sbin/check_ps)
others do not!
But I have the same firmware loaded on all my WRPs.

Can you give me an explanation?

Regards.

Probably the best thing to do is to find a unit that is still under support, then open a case with the Cisco STAC and reference this post.

I reviewed your config, and it seems to be a template with no voice lines enabled.  It doesn't look like it was created with the current SPC tool.

You may have to download the SPC tool and create a sample config to use for provisioning.

The Cisco engineers will need the actual config from a router, and probably a full syslog from the failure.

Once the case is opened, it can be escalated to engineering for (potentially) a firmware fix/update.

Dan

I have new logs and new questions for you.
Can you describe what the "eval_prov_logic" process is?
As you can see in the file WRP400_eval_prov_logic.log, immediately after I enabled logs on this WRP, the "eval_prov_logic" process stopped.
In my experience, when this happens the WRP doesn't accept incoming calls even though SIP REGISTER messages are correctly processed.
Only after a reboot does "eval_prov_logic" start again.

What is the "check_ps" process? Why do some logs report it every 2 minutes and others not?

Other logs report "kernel: ++++++++  tdu_restart".

All our 2000 WRPs have the same software and hardware properties (Software Version: 2.00.26 Hardware Version: 1.00.01) and are configured with the same XML provisioning file, but the behaviour is not consistent; it differs far too much from device to device.

Can you forward this information to the developers to investigate the issue?

I know that there is a new beta firmware 2.00.27.
Do you have unofficial release notes?

Regards.

Here are the answers to your questions,

and a couple questions of my own ...

<>

The router has a Linux or Unix core; I'm not privy to the modules or what they do.

After we escalate your case, you will have direct attention of the 2nd level engineering, and developers from the business unit.

<>

Same here; those do sound like Unix system processes.

<<2000 WRPs have same software and hardware properties ... same  XML provisioning file but the behaviour is not linear, is different,  too much different.    Can you forward these info to developers to investigate the issue?>>

Yes, I can. Please send me the serial number of a unit that is still under support, and we can open a case.

My questions are :

Did you download the SPC tool for this firmware version and create the  template config to use for provisioning? 

Or...Is it coming from a service or script that creates it on the fly? 

I looked at your file wrpconfig_b.xml, and it looked good, but some things were missing (line enable) and some were filled with 'placeholders' (x.x.x.x), so we would need an actual config of the device.

The syslog files look good. After we create the case, the escalation engineers will probably ask you for more details (packet captures, topology, etc.).

The best way to get this moving forward is to:

get the serial number of a unit that is still under support,

open a case and ask it to be escalated.

Chat and other support numbers are here

http://www.cisco.com/en/US/support/tsd_cisco_small_business_support_center_contacts.html

dlm...

The XML file used to provision the WRP400 routers was generated with a previous version of the SPC tool. The exact version of SPC used was wrp400-1-01-00-spc-win32-i386.exe.

Over time, the XML file was manually updated with new XML tag options.

This is the same procedure we use with other Cisco Linksys devices without problems, e.g. SPA2102, SPA8000, SPA30x, SRP520.

The wrpconfig_b.xml file is actually used in our production environment.

Some parts are missing to allow manual changes (e.g. disable a line).

Some parts are replaced by x.x.x.x to hide the real value.

Here are some serial numbers:

Device Serial No:CR301J900723

Device Serial No:CR301J900458

Device Serial No:CR301J706514

Device Serial No:CR301K500386

Regards.

I'm back with new feedback.

We updated up to 2k WRP400s to the beta firmware 2.00.30 and activated the automatic reset under "administration-system-automatic maintenance", set to weekly on Saturday at 3 AM.

After a weekend, many WRP400s show the same problems:

    - voice module freeze with the web menu inaccessible; the browser responds with "Error 324 (net::ERR_EMPTY_RESPONSE): The server has closed the connection without send data"

    - the reboot button, the reboot URL string and other operations have no effect

    - after heavy use of the router web menu, the device blocks its web menu

    - customer surfing works and we get ping responses

    - only a hard reset restores correct operation

So we configured a daily restart. In this case most of the WRPs restarted as expected.

In our opinion the weekly programmable reboot doesn't work properly. We have some devices with this function enabled but with uptime greater than 2 weeks:

Product Name: WRP400
Serial Number: CR301J605494
Software Version: 2.00.30
Hardware Version: 1.00.01
Voice Module Version: 1.0.18(20101206a)
MAC Address: 0023697CE18D
Client Certificate: Installed
Customization: Open

System Status

Current Time: 6/13/2012 14:54:40
Elapsed Time: 13 days and 19:55:39
RTP Packets Sent: 826578
RTP Bytes Sent: 18994020
RTP Packets Recv: 827516
RTP Bytes Recv: 19014115
SIP Messages Sent: 9308
SIP Bytes Sent: 6861861
SIP Messages Recv: 16834
SIP Bytes Recv: 6404342

Probably the lockup occurs within the 7 days and stops the cron function.
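To find such devices across a fleet, the "Elapsed Time" field of the status page can be compared with the configured reboot interval. This is a rough sketch, not a Cisco tool: the function names and the one-hour slack are assumptions, and the form of the field for uptimes under one day (no "N days and" prefix) is a guess on my part.

```python
import re
from datetime import timedelta

# Matches e.g. "Elapsed Time: 13 days and 19:55:39".
# The variant without the day prefix is assumed, not taken from a real device.
ELAPSED = re.compile(
    r"Elapsed Time:\s*(?:(\d+) days? and )?(\d+):(\d{2}):(\d{2})"
)


def parse_elapsed(status_text):
    """Extract the uptime from the text of the status page."""
    m = ELAPSED.search(status_text)
    if m is None:
        raise ValueError("no Elapsed Time field found")
    days, hours, minutes, seconds = m.groups()
    return timedelta(days=int(days or 0), hours=int(hours),
                     minutes=int(minutes), seconds=int(seconds))


def missed_reboot(uptime, interval=timedelta(days=7), slack=timedelta(hours=1)):
    """True if the device has been up longer than the scheduled reboot interval allows."""
    return uptime > interval + slack


if __name__ == "__main__":
    text = "Current Time: 6/13/2012 14:54:40 Elapsed Time: 13 days and 19:55:39"
    uptime = parse_elapsed(text)
    print(uptime, missed_reboot(uptime))
```

For the device listed above, the uptime of 13 days and 19:55:39 exceeds the weekly interval, so it would be flagged as having missed at least one scheduled reboot.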

Another consideration: the voice module version in firmware 2.00.30 is 1.0.18(20101206a), while in firmware version 2.00.27 the voice module is more recent: 1.0.19(2011110Test).

Can you clarify why?

We have another important piece of feedback regarding firmware 2.00.30 for the WRP400.
The auto-restart function of the voice module works correctly, but it also starts when the line is in use, causing hung calls.

Can you fix this issue by implementing an idle reset of the voice module?

A WRP400 log is attached.

Regards.

Hi Cisco, I need some information about the WRP400 freezing problem.

First of all, the auto-reboot feature of the voice module introduced in firmware 2.00.30 has partially fixed the issue, but I still have many devices with the same problem.

In addition, this feature causes hung calls. The idle reset is a must.
The WRP must reset the voice module only if there are no active calls.
This is the request of CSCub70060. What is the severity level of this case? I hope level 3 or 2.

Is there a way to implement a daily idle reboot feature for both the router and voice modules? In my opinion this type of reboot (only when there are no active calls) should solve the problem definitively.
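To make the request concrete, the idle-reset rule can be sketched as follows. This only illustrates the behaviour being asked for, not how the firmware implements (or will implement) it; the function names and the sampled call counts are hypothetical.

```python
from datetime import datetime


def should_reset(now, due, active_calls):
    """Reset only when the maintenance time has passed AND no call is in progress."""
    return now >= due and active_calls == 0


def first_reset_time(due, samples):
    """Walk (time, active_calls) samples in order and return when the reset would fire.

    Returns None if the line never became idle after the due time.
    """
    for now, active_calls in samples:
        if should_reset(now, due, active_calls):
            return now
    return None


if __name__ == "__main__":
    due = datetime(2012, 6, 16, 3, 0)
    samples = [
        (datetime(2012, 6, 16, 2, 59), 0),  # idle, but not yet due
        (datetime(2012, 6, 16, 3, 0), 1),   # due, but a call is active: defer
        (datetime(2012, 6, 16, 3, 5), 1),   # still on the call: keep deferring
        (datetime(2012, 6, 16, 3, 10), 0),  # idle again: reset now
    ]
    print(first_reset_time(due, samples))
```

The reset is deferred past its scheduled time instead of cutting a call, which is the difference from the scheduled restart in 2.00.30 described above.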

I'll wait for your feedback.

Best Regards.

The case CSCub70060 is fixed in firmware version 2.00.32.