We support many 2811 gateways at customer sites, all handling T.37 fax (almost entirely onramp) and nothing else. Occasionally we see bursts of the following in the logs, showing that incoming fax calls are being rejected due to high CPU:
Sep 12 12:07:14.980 CDT: %IVR-3-LOW_CPU_RESOURCE: IVR: System experiencing high cpu utilization (96/100).
Call (callID=1365621) is rejected.
Sep 12 12:07:20.864 CDT: %IVR-3-LOW_CPU_RESOURCE: IVR: System experiencing high cpu utilization (96/100).
Call (callID=1365622) is rejected.
Sep 12 12:07:50.564 CDT: %IVR-3-LOW_CPU_RESOURCE: IVR: System experiencing high cpu utilization (98/100).
Call (callID=1365625) is rejected.
In the above example, the logs continued for about 7 minutes. A 'show processes cpu history' was done about half an hour after the event and showed this in the 'last 60 minutes' graph:
Most recent CPU is on the left, so this shows that the CPU was ramping up over a period of 20-25 minutes, remained at 100% for about 8 minutes, then abruptly dropped back to normal. Uptime was 30 weeks when this happened. Because it's so infrequent and we have no way to reproduce it, it's unlikely that we'll be able to grab diags while the CPU is at 100%.
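Since the event is hard to catch live, one option is to mine the collected syslog afterwards and see how often, and for how long, these rejection bursts occur on each gateway. The following is only a rough sketch under several assumptions: the logs are available as plain text in exactly the format shown above (no sequence numbers or leading `*`), the timestamps carry no year (so one is supplied by the caller), and rejections less than two minutes apart belong to the same burst. The function name `find_bursts` and its parameters are made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# Sep 12 12:07:14.980 CDT: %IVR-3-LOW_CPU_RESOURCE: IVR: System experiencing high cpu utilization (96/100).
LOW_CPU = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d+:\d+:\d+\.\d+)\s+\w+:\s+"
    r"%IVR-3-LOW_CPU_RESOURCE:.*\((?P<cpu>\d+)/100\)"
)


def find_bursts(lines, year=2000, max_gap=timedelta(minutes=2)):
    """Group LOW_CPU_RESOURCE messages into bursts.

    `year` is an assumption: the log timestamps do not include one.
    `max_gap` is an arbitrary threshold for "same burst".
    """
    bursts = []
    for line in lines:
        m = LOW_CPU.match(line.strip())
        if not m:
            continue  # e.g. the "Call (callID=...) is rejected." continuation line
        when = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S.%f")
        cpu = int(m["cpu"])
        if bursts and when - bursts[-1]["end"] <= max_gap:
            # Close enough to the previous message: extend the current burst.
            burst = bursts[-1]
            burst["end"] = when
            burst["rejected"] += 1
            burst["peak_cpu"] = max(burst["peak_cpu"], cpu)
        else:
            bursts.append(
                {"start": when, "end": when, "rejected": 1, "peak_cpu": cpu}
            )
    return bursts
```

Feeding it the lines of a saved log file (for example `find_bursts(open(path))`, where `path` is wherever the syslog is stored) returns one dictionary per burst with its start, end, number of rejected calls and peak CPU figure, which could then be compared against call records for the same window.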
The IOS version is 12.4(25f), running on a 2811 with the NM-HDV2-2T1/E1 board and the PVDM2-48 DSP card. The DSP firmware is overridden to 26.4.501 due to previous issues with out-of-spec tones and outbound fax not working. The router has 512MB of memory and 128MB of flash. I see there's a 12.4(25g) IOS version available, but the caveats list doesn't seem to contain anything relevant.
Any ideas as to what might be causing this? A rogue fax? Is IOS 15 likely to help? Some of the fax gateways we support only have 256MB memory so won't be able to go to IOS 15 easily.
Note that these calls have been up for over 3 hours. This particular example is on a 2811 running IOS 15.1(4)M7.
We cured the high CPU by killing the two long-lived calls:
fgw#clear call voice causecode 31 id 32FF
fgw#clear call voice causecode 31 id 3302
We worked with Cisco TAC on this a while back, and found that when this happens the CPU is consumed by the DocMSP process getting into a loop. However, TAC refused to go any further because our onramp TCL script is customized to pass through the RDNIS when present, so that we can use that as the fax target number. Not really a relevant change, but rules are rules.
Anyway, we had DocMSP debugs enabled in this case (debug fax dmsp all, with console logging disabled). They confirmed looping for the two calls; here's one of the loops: