
Ask Me Anything - CUCM Troubleshooting: Best Practices for Reading Trace Files

Cisco Moderador
Community Manager

This topic is a chance to discuss more about how to read Cisco Unified Communications trace files. In this session, Cisco Designated VIP Maren Mahoney will answer questions about how to examine, locate and extract information from CUCM trace files. While most questions about trace files are about call setup and troubleshooting, questions about other trace files (DirSync, ILS, CTI Manager, etc.) are welcome and encouraged.

In addition, Maren will share tips and tricks to help you understand and fix different issues, such as which trace file to pull, how to add router/CUBE debugs to a trace file for telephony troubleshooting, and how to follow a single call through multiple trace files. She will also cover the recommended tools for parsing a trace file.

To participate in this event, please use the "Join the Discussion: Cisco Ask the Expert" button below to ask your questions.

Ask questions from Monday 7th to Friday 19th of October, 2019

Featured expert
Maren Mahoney has been in the information system industry for more than 25 years with roles in employee development, technical support & helpdesk administration, network administration, management and engineering, networking courseware development and instruction. She is a Senior Technical Instructor at Sunset Learning Institute and teaches a range of technologies but specializes in Unified Communications. Before joining Sunset Learning Institute (SLI), Maren worked for Cisco Systems as a Network Consulting Engineer. She also worked for several Cisco Reseller Partners in engineering and technical instructor roles. Maren is an official Cisco Certified Systems Instructor (CCSI). She holds a bachelor’s degree in Mathematics and Russian studies and plans one day to gain a Master’s Degree in Mathematics. Maren holds different certifications in Routing, Switching, Data Center and a CCIE in Collaboration (#50569).

Maren was recognized as a Cisco Designated VIP in 2019 for her contributions to the Cisco Community in the IP Telephony category.

Maren might not be able to answer each question due to the volume expected during this event. Remember that you can continue the conversation on the Collaboration, Voice and Video Community

**Helpful votes encourage participation!**
Please be sure to rate the answers to questions.


ryanbrothers
Level 1

Hello,

I use the Cisco 3905, 7841, 7861, and 8851 models. When I access them via their web page, I have console logs, core dumps, and debugs. Where does one start troubleshooting a phone that is having issues? What are some items to look for in these logs that may hint that the phone is having issues?

 

Thanks,

Ryan

Ryan,

Before poking around in phone logs, if you have a phone that won't boot I would start with tracing the phone boot process, most of which can be verified on the phone itself:

  1. Power/POST
  2. Load an Operating System - Look for the load file in the Settings Menu under "Model Information" or Status, or elsewhere depending on the model. If you are not physically with the phone, you can issue "show cdp neighbors detail" on the connected switch to see the firmware on the phone.
  3. VLAN - The Network information in the phone will tell you which VLAN it believes it is in. Is it correct?
  4. DHCP - Like VLAN, is the information correct? The phone will tell you what information it's working with, especially the TFTP option issued by the DHCP pool.
  5. TFTP - In the Status messages on the phone, you should see the phone requesting its MAC-specific cnf.xml file. If you see it requesting the XMLDefault.cnf.xml file, then it is trying to auto-register, which means the CUCM does not "know" about the phone (which probably means the MAC is misconfigured in the CUCM).
  6. Also in the status messages: Look for "DHCP Timeout" or "TFTP Timeout".
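If you have copied the Status messages off the phone's web page, steps 5 and 6 can be checked with a few lines of Python. This is only a sketch: the `SEP<MAC>.cnf.xml` naming pattern is an assumption (the list above only says "MAC-specific cnf.xml file"), and the function names are made up for illustration.

```python
import re

# Assumption: MAC-specific config files are named SEP<12 hex digits>.cnf.xml.
MAC_CONFIG = re.compile(r"^SEP[0-9A-F]{12}\.cnf\.xml$", re.IGNORECASE)


def classify_tftp_request(filename):
    """Classify the config file name a phone requests from TFTP."""
    name = filename.strip()
    if name.lower() == "xmldefault.cnf.xml":
        # The phone is trying to auto-register (step 5 above).
        return "auto_register"
    if MAC_CONFIG.match(name):
        return "mac_specific"
    return "other"


def find_timeouts(status_messages):
    """Return the status messages that mention a DHCP or TFTP timeout (step 6)."""
    wanted = ("DHCP Timeout", "TFTP Timeout")
    return [msg for msg in status_messages if any(w in msg for w in wanted)]
```

A result of `auto_register` is the cue to go check how the phone's MAC address is entered in CUCM.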

But if you are asking about the phone logs, then I would imagine that you have already checked the items on that list. (But it's a good list for folks who are new.)

Things to check in the phone console log:

  • REGISTER - Your SIP phones will send a REGISTER message to CUCM. The phone's console log will have that message in it. Is it going to the right CUCM server? Are you receiving a TRYING message back? (Which means it was "heard" by the CUCM). Is it sending registration messages over and over again in around 2-minute intervals (which indicates a connectivity problem)?
  • Also, do you see the Line register after the phone begins the registration process? SIP phones must register at least one DN in order for the phone to fully register with CUCM. You want to look for Registration state change: SIP_REG_STATE_REGISTERING ---> SIP_REG_STATE_REGISTERED and also LINE 1: REGISTERED
  • ERR - These lines will either contain error messages, or will immediately precede/follow a set of messages with a fuller explanation of what might be going on. For instance, if the phone's ITL will not allow it to register you will see an ERR line with something like EROR:https_cert_vfy: HTTPS cert not in CTL. The two lines preceding that line will show the phone checking the CTL and the ITL. This EROR message indicates that neither "worked".
  • REASON - Search for "Reason:" or "Reason=" and look at the number that follows. Each number is specific to the reason the phone isn't registering. The link I had to the list of these codes is broken on Cisco's website, but I'll see if I can get a fresh link for you. But somewhere in the console log around the "ERR" and "REASON" lines will be a fuller explanation of what the reason code is.
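The four checks above can be run over a saved console log in one pass. The sketch below is a rough first filter, not a parser: the patterns are taken from the strings quoted in the list, real console logs vary by phone model and firmware, and the names `MARKERS` and `scan_console_log` are hypothetical.

```python
import re

# Patterns built from the strings quoted in the list above.
# Assumption: they appear in the log exactly as quoted.
MARKERS = {
    "register": re.compile(r"\bREGISTER\b"),
    "registered": re.compile(r"SIP_REG_STATE_REGISTERED|LINE \d+: REGISTERED"),
    "error": re.compile(r"\bERR\b|EROR"),
    "reason": re.compile(r"Reason[:=]\s*\d+"),
}


def scan_console_log(lines):
    """Return {marker name: [(line number, line text), ...]} for matching lines."""
    hits = {name: [] for name in MARKERS}
    for number, line in enumerate(lines, start=1):
        for name, pattern in MARKERS.items():
            if pattern.search(line):
                hits[name].append((number, line.rstrip()))
    return hits
```

If `register` has many hits and `registered` has none, that matches the "registering over and over" symptom described above. The line numbers under `error` and `reason` tell you where to start reading the surrounding lines.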

I hope this was helpful. I really like your question and I may do a Tech Talk on it later on.

Maren

Maren,

Thank you, this is very helpful!

Jayant Anand
Level 1

Hi Maren,

 

What are the best practices you generally recommend for tracing a call in the call manager logs and specific scenarios like call drops or media resources issues?

Also, in the case of multiple server hops, such as a call traversing multiple call managers, what is the best way to track a call between different clusters?

 

Regards

Jayant Anand

 

I know I'm not the host of this conversation, but I did just recently watch these two videos from a TAC engineer:

Same Node
https://www.youtube.com/watch?v=dwuZYD1Jlpw

Between Nodes
https://www.youtube.com/watch?v=d6nrUA2ckZY

Hi Jayant!

Let me take the second part of your question first. @Anthony Holloway posted links in his reply to two videos by Patrick Kinane that show how to track a call between two phones on the same node, and then between two phones on two different nodes in the same cluster. (And I'll include the links below as well.) Patrick's technique for tracking a call is the best I've ever seen and taught me a thing or two.

As for tracking a call between clusters, start by identifying the Call-ID and Session-ID in the outbound INVITE message from the first cluster. Use that information to locate the same INVITE Message as it arrives inbound on the second cluster. From there, you can use the techniques Patrick shows to track the call through the trace files in the second cluster.
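As a rough illustration of matching the INVITE across clusters, the sketch below pulls the Call-ID of every INVITE out of two blocks of trace text and reports the values they share. It assumes the SIP messages appear in the trace as plain text, with the request line and the `Call-ID:` header each starting a line. The function names are hypothetical, and the Session-ID could be collected the same way.

```python
import re

# Assumption: request lines and headers start at the beginning of a line.
INVITE = re.compile(r"^INVITE\s+\S+\s+SIP/2\.0", re.MULTILINE)
CALL_ID = re.compile(r"^Call-ID:\s*(\S+)", re.IGNORECASE | re.MULTILINE)


def invite_call_ids(trace_text):
    """Return the set of Call-ID values found on INVITE messages."""
    ids = set()
    for invite in INVITE.finditer(trace_text):
        # Take the first Call-ID header that follows this INVITE request line.
        header = CALL_ID.search(trace_text, invite.end())
        if header:
            ids.add(header.group(1))
    return ids


def shared_call_ids(first_cluster_text, second_cluster_text):
    """Call-IDs seen on INVITEs in both clusters' traces."""
    return invite_call_ids(first_cluster_text) & invite_call_ids(second_cluster_text)
```

Any Call-ID in the result marks a call that left the first cluster and arrived at the second, which gives you the starting point in the second cluster's trace.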

Of course in a production environment a single call can span more than one file. In the case where tracing a call spans more than one trace file, open both of those files (or, heaven forbid, all three files) and use the "Find All in All Open Documents" feature in Notepad++ to locate the relevant lines.
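If you would rather script that search than do it in an editor, a minimal equivalent looks like this. It is a plain substring search over whichever files you pass in; `find_all` is a made-up name, and the file names in the usage note are placeholders.

```python
from pathlib import Path


def find_all(paths, needle):
    """Yield (file name, line number, line) for every line containing needle."""
    for path in paths:
        # errors="replace" keeps the search going if a trace has odd bytes.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    yield Path(path).name, number, line.rstrip("\n")
```

For example, `list(find_all(["trace1.txt", "trace2.txt"], call_id))` returns the matching lines in file order, so a call that rolls over from one trace file to the next stays in sequence.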

Here are the links to the two videos:

Read CallManager Traces - Phone To Phone Same Node 

Read CallManager Traces - Phone To Phone Different Node 

Now for the first part of your question. For dropped calls, it will depend on the reason for the drop.

A 404 Not Found error means, of course, that there is either a digit manipulation issue or a permissions issue (PT/CSS in the case of CUCM). For those errors, I don't generally use trace files to troubleshoot. Rather, I use the Dialed Number Analyzer to see what CUCM is doing with the call internally, and the Dialed Number Analyzer for CUBE TAC tool to analyze my CUBE or router's config.

A 503 Service Unavailable error is the second most common thing you'll see in SIP calls. For those you'll occasionally have an IP address misconfigured in CUCM or a dial-peer, but most often this will indicate a codec negotiation problem. To identify (or at least get a first approximation of) the cause of the error, look for the Q.850 cause code in the error message. A "47" is "Resource unavailable – unspecified", but generally indicates a codec problem. To start troubleshooting that: If it is an internal call, look at the region settings. If that doesn't reveal the problem, dig into the trace. In the trace file, watch the SDP exchange during call setup to get an idea of what the devices' capabilities are and what they are requesting. Hopefully the answer will pop out. If not...
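To pull the Q.850 cause code out of a SIP message, a small sketch like the one below may help. It assumes the code is carried in a SIP `Reason` header of the form `Reason: Q.850;cause=47`. The lookup table is a short, deliberately incomplete subset of cause values, and the names are hypothetical.

```python
import re

# Assumption: the cause arrives in a header like "Reason: Q.850;cause=47".
REASON = re.compile(r"Reason:\s*Q\.850\s*;\s*cause\s*=\s*(\d+)", re.IGNORECASE)

# A few Q.850 cause values only; extend as needed.
Q850 = {
    1: "Unallocated (unassigned) number",
    16: "Normal call clearing",
    17: "User busy",
    34: "No circuit/channel available",
    47: "Resource unavailable, unspecified",
}


def q850_causes(sip_text):
    """Return [(cause code, description)] for each Q.850 Reason header found."""
    return [
        (int(code), Q850.get(int(code), "not in this table"))
        for code in REASON.findall(sip_text)
    ]
```

A result of `(47, ...)` on a 503 points you at region settings and the SDP exchange, as described above.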

I did a Tech Talk for Sunset Learning that walked through a trace file showing how CUCM processes a call that requires a media resource, specifically a conference bridge. It shows CUCM processing region/codec information, the process for allocating the media resource, and the process for having the call legs redirect through the media resource. Knowing what is "supposed" to happen will help you identify what isn't working. That video is currently in a student-only video library. I will get it moved to the public library and post it in another response to this message once it is available.

Please feel free to ask additional questions. And thanks for this one!

Maren

Jayant,

Here is a link to the video I did on Media Resources. It shows DSPs registering, and then follows a call allocating a transcoder. It will show you what to look for in a normal call. Comparing that to the call you are troubleshooting should help you identify where the media resource allocation is failing.

If you have more questions or would like a further discussion of this, let me know.

Using Trace Files to Troubleshoot Media Resource Allocation 

Maren

Anthony Holloway
Cisco Employee

Wow! As a fellow VIP I commend you for putting yourself out there like this.  Good luck!  I'll be sure to follow the discussion.

Thanks, Anthony. You are right that this does not feel like a small thing. Your good wishes are appreciated!

Maren

piyush aghera
Spotlight

Excellent initiative... 

 

I want to ask about the most painful issue an admin can face (at least for me): voice quality issues within the cluster and for PSTN calls. I know this question has been asked many times and has many different answers, but I would like to know which tools and approach you recommend for tracing logs for these issues and tackling them effectively.

I have faced this issue a couple of times, and each time it turned out to be network related, but it is quite challenging to prove that the issue needs to be looked at from a network perspective.

 

Thank you.

Piyush,

I'm sorry for the delay in answering your question, but I wanted to ponder how to respond. As you pointed out, a lot of community discussions, blog posts, and white papers have been written on this topic so I didn't know what I could add.

In general, there isn't much in the trace files or phone console logs that can identify call quality issues since those files are primarily internal-processing related rather than payload-delivery related. That said, Call Management Records (CMRs) can be analyzed for packet-loss and delay statistics. If you can show that a particular piece of gear or a single site is common to a bunch of poor-quality calls that can help you convince the network folks that the problem may be theirs.
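To show the idea of finding a common device across poor-quality calls, here is a minimal sketch that counts bad CMR rows per device. It assumes the CMRs have been exported to a clean CSV with a single header row, and that the columns are named `deviceName`, `numberPacketsLost`, and `jitter`. Check those names against your own export. The thresholds are arbitrary placeholders, not recommended values.

```python
import csv
from collections import Counter


def poor_calls_by_device(cmr_csv_lines, max_lost=50, max_jitter=30):
    """Count CMR rows per device that exceed the loss or jitter threshold."""
    counts = Counter()
    for row in csv.DictReader(cmr_csv_lines):
        # Treat empty fields as zero; column names are assumptions.
        lost = int(row["numberPacketsLost"] or 0)
        jitter = int(row["jitter"] or 0)
        if lost > max_lost or jitter > max_jitter:
            counts[row["deviceName"]] += 1
    # Devices with the most poor-quality records come first.
    return counts.most_common()
```

Pass it an open file handle or a list of lines. The same grouping works by site if you map device names to locations first.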

For phone/headset problems, Wireshark is my go-to tool for analyzing the traffic flowing in and out of the phone. (Bluetooth is a little harder, of course.) If you think the problem is QoS-related, an analysis tool like NetFlow can help prove that. But that usually takes buy-in from the network team in the first place.

In the end, I don't know that I have any special magic to share on this question, so I doubt my answer is very satisfying. Your question reminds me of discussions surrounding faxing over IP - everyone hates it, it's a pain to troubleshoot, and no two problems are the same.

Maren

How do I collect audit log files from the CUCM Publisher using the CLI?

Souley,

You can see a list of the Application Audit Logs using the CLI command:

file list activelog audit/AuditApp/*

Once you have the list of files, you can view one with:

file view activelog audit/AuditApp/<name-of-file>

You can also set up transfer of these files to another server with the information here:

Configure Remote AuditLog Transfer Protocol 

I hope this answers the question you are asking. Let me know if there are more answers I can provide.

Maren

Awesome, thanks Maren. I really appreciate it. I was able to see the audit logs.

Can I see the audit logs from a specific date and time? For instance, October 1st to October 3rd?

 

Thank you,

 

Souley