
IP Phone restart and reset issue

Dear All,

Can someone help me here?

Our IP phones are resetting and restarting frequently. Details are given below. It is not affecting our active calls.

9:38:38a 14: Name=SEPECC882B0AD77 Load= SCCP45.9-0-3S Last=UCM-closed-TCP
9:38:38a 18: Name=SEPECC882B0AD77 Load= SCCP45.9-0-3S Last=Failback
9:40:10a 10: Name=SEPECC882B0AD77 Load= SCCP45.9-0-3S Last=TCP-timeout
9:41:11a 14: Name=SEPECC882B0AD77 Load= SCCP45.9-0-3S Last=UCM-closed-TCP
9:41:11a 18: Name=SEPECC882B0AD77 Load= SCCP45.9-0-3S Last=Failback
10:09:49a 10: Name=SEPECC882B0AD77 Load= SCCP45.9-0-3S Last=TCP-timeout
10:09:51a 23: Name=SEPECC882B0AD77 Load= SCCP45.9-0-3S Last=Reset-Restart
10:28:00a 10: Name=SEPECC882B0AD77 Load= SCCP45.9-0-3S Last=TCP-timeout
10:28:10a 23: Name=SEPECC882B0AD77 Load= SCCP45.9-0-3S Last=Reset-Restart
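A quick way to see which disconnect reason dominates is to tally the `Last=` values from status lines like the ones above. This is a minimal sketch in Python; the regular expression is inferred from these sample lines only (it is not an official Cisco format description), and the function names are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the sample status lines in this thread (an assumption,
# not a documented format): time, numeric code, Name=, Load=, then the detail.
LINE_RE = re.compile(
    r"^(?P<time>\d{1,2}:\d{2}:\d{2}[ap])\s+(?P<code>\d+):\s+"
    r"Name=(?P<name>\S+)\s+Load=\s*(?P<load>\S+)\s+(?P<detail>.+)$"
)


def parse_status_line(line):
    """Parse one status line into a dict, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    detail = entry.pop("detail")
    # Strip the "Last=" prefix when present; keep other details verbatim.
    entry["reason"] = detail[len("Last="):] if detail.startswith("Last=") else detail
    return entry


def count_reasons(lines):
    """Count how often each disconnect reason appears in the given lines."""
    parsed = (parse_status_line(line) for line in lines)
    return Counter(entry["reason"] for entry in parsed if entry is not None)


if __name__ == "__main__":
    sample = [
        "9:38:38a 14: Name=SEPECC882B0AD77 Load= SCCP45.9-0-3S Last=UCM-closed-TCP",
        "9:40:10a 10: Name=SEPECC882B0AD77 Load= SCCP45.9-0-3S Last=TCP-timeout",
        "10:09:49a 10: Name=SEPECC882B0AD77 Load= SCCP45.9-0-3S Last=TCP-timeout",
    ]
    for reason, count in count_reasons(sample).most_common():
        print(reason, count)
```

If TCP-timeout entries dominate, that may point toward the network path rather than the CUCM application, although the tally alone does not prove either.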


App Load ID        jar45sccp.9-0-3TH1-22.sbn
Boot Load ID        tnp65.8-3-1-21a.bin
Version            SCCP45.9-0-3S
CUCM Version        7.1.5

Thanks in advance

20 Replies

Tommer Catlin
VIP Alumni

Looks like they are losing their connections to CUCM. Check the heartbeats to and from the IP phone and its primary and backup CUCM servers. It seems to be bouncing between primary and backup for some reason. Possible causes:

- WAN connection issue

- busy network

- connection issue at CUCM

- mismatched port speed at the phone/switch

Hi Tommer Catlin,


Thanks for your information!

The CUCM server is on the local LAN. I have checked the LAN performance and have not seen any packet drops between the phone and CUCM. Further, I have also changed the keepalive timer, but I am still facing the same problem.

From your desktop computer, if you ping the CUCM Publisher and Subscriber IP addresses, do you see any dropped packets or delays going on?
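Since an ICMP ping can look clean while a TCP session still times out, it may also be worth timing TCP connections to the SCCP port. The sketch below is only an illustration: it assumes the default SCCP port 2000, the address `192.0.2.10` is a placeholder for your CUCM server, and the function names are hypothetical.

```python
import socket
import time


def tcp_connect_time(host, port=2000, timeout=3.0):
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return None


def probe(host, port=2000, attempts=5, pause=1.0):
    """Run several connection attempts and return the list of results."""
    results = []
    for i in range(attempts):
        results.append(tcp_connect_time(host, port))
        if i < attempts - 1:
            time.sleep(pause)
    return results


if __name__ == "__main__":
    # Placeholder address (documentation range); replace with your CUCM server.
    for result in probe("192.0.2.10"):
        print("failed" if result is None else f"{result * 1000:.1f} ms")
```

Failed or slow attempts from the same VLAN as the phone would suggest looking at the path to CUCM; a clean result does not rule out a problem, because it only tests connection setup and not a long-lived session.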

No, I didn't see any drops or delay.

Did you guys ever find a solution to this? I am having this problem on a Gig LAN network from the office across the street, which is connected over fiber. I am pinging the devices constantly and there are no drops, but the user's 7942 phones received "CM Fallback Service Operating". There are no network drops and all other applications work without any issues. Other phones on the LAN also seem to work fine.

Thanks,

Raul

Hi all,

We had a similar issue with 7962 and 7965 phones. They worked perfectly in my office, but when moved to the end user, each phone restarted continuously, 5 to 30 seconds after registering. The office where the end user is located uses an HP switch (on the other hand, 78XX phones work there with no problem). I went through all possible configuration and debugging options and finally found some VLAN issues in the phone logs (old VLAN 4096, new VLAN 4095).

The issue was resolved by setting VLAN tagging on the Cisco phone and the HP switch; the default router, a Cisco, was behind the HP switch.

HTH,

Regards,

Aleš

Thanks for your solution, Ales. I actually resolved our problem by factory resetting the phones and, after gaining access to the OS, rebooting the CUCM servers, which had been up for nearly 2 years. It has not recurred since. I should note that all our equipment is on Gig Cisco 3850 switches with fiber connectivity, and the network does not appear to have been a problem.

 

Hope this helps others.

 

Regards,

Raul

Hi Guys,

This sounds very much like a TCP timeout issue, normally caused by some sort of stateful filtering between the CUCM and the end user's handset.

We had the same issue caused by Checkpoint firewalls. There is a known bug in SecureXL that impacts TCP packets if you use a PPPoE link.

This is still an active bug on Gaia 77.30.

The point is to make sure you do not have any filtering that would impact your TCP packets or force different timeout values on them.

7:36:55a 14: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S Last=UCM-closed-TCP
7:36:56a 6: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S File Not Found
7:37:27a 14: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S Last=UCM-closed-TCP
10:37:58a 14: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S Last=UCM-closed-TCP
10:37:59a 6: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S File Not Found
10:38:29a 14: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S Last=UCM-closed-TCP
2:25:38p 14: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S Last=UCM-closed-TCP
2:25:39p 6: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S File Not Found
2:26:09p 14: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S Last=UCM-closed-TCP
6:22:52p 10: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S Last=TCP-timeout
6:22:53p 6: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S File Not Found
6:23:24p 10: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S Last=TCP-timeout
10:20:22p 14: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S Last=UCM-closed-TCP
10:20:23p 6: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S File Not Found
10:20:54p 14: Name=SEP00270D3CEECB Load= SCCP42.8-4-2S Last=UCM-closed-TCP

Can anyone please help us with this issue?

dsobrinho
Level 9

Hi guys,

Does anybody know anything about this? I have the same problem with the Cisco IP Phone 7962 in a branch office in RJ.

I have checked the WAN, QoS, and LAN, and no problems were found.

Daniel Sobrinho

Ian Terry
Level 1

Did you ever get a resolution to the restart/reset issue?

Thanks

Hi Ian,

If you are using third-generation phones (7970, 79x1, 79x2, 79x5), then there is no fix for it.

This is the latest update I got.

There are a couple of things that need to be kept in mind:

1. Phones will unregister from the CUCM and register to the SRST GW.

2. The GW will tear down Q.931 backhauling from CUCM and will function as a standalone call agent.

The phones will lose either SCCP or TCP keepalives. TCP keepalives being missed generally trigger SRST much faster than SCCP keepalives being missed.

The phones will register to the GW even if Q.931 backhauling has not been torn down. Hence, there might be a brief period where the phones have registered to the GW but the PRI is still MGCP controlled. In such a case, calls will initially fail. After some time (and this is post the 30 second MGCP KA) the Q.931 backhauling will be torn down (isdn bind-l3 ccm-manager will be removed) and the PRI will function as if it has been configured on an H.323 GW. Here we need to understand that the phones showing "Registering" or "CM Fallback Service Operating" are not indicative of the GW going into SRST. The phones will go into SRST much faster than the Q.931 backhauling being torn down.

The behavior you are seeing is the mechanism for timing out a TCP connection and has nothing to do with the SCCP keepalive itself. Any time the phone sends a TCP packet to the server and does not receive a TCP ACK, the phone will retransmit the packet at decreasing intervals until the session is timed out (the phone sends a TCP RST), and at that point the phone will fail over to the next CCM server or SRST reference.

The SCCP keepalives are sent at regular intervals, based on a value presented to the phone during registration (30 seconds by default). If the phone gets a TCP ack for the keepalive, but no SCCP keepaliveAck from the server then you can get into the situation where the phone unregisters due to keepalive timeout (after 2 or 3 such missed keepaliveAcks).

The former is a network problem; the latter is an application problem, where the network layer of the CCM server acknowledges that the message was received but the CCM application does not respond.
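The distinction between the two failure modes can be written down as a small decision rule. This is only an illustrative sketch in Python with hypothetical function names; the timing estimate simply multiplies the keepalive interval by the number of missed keepaliveAcks mentioned above and is a rough figure, not a documented timer.

```python
def classify_keepalive_failure(tcp_ack_received, sccp_keepalive_ack_received):
    """Classify a keepalive failure from what a packet capture shows.

    No TCP ACK at all points at the network path; a TCP ACK without an
    SCCP keepaliveAck points at the CCM application.
    """
    if not tcp_ack_received:
        return "network"
    if not sccp_keepalive_ack_received:
        return "application"
    return "healthy"


def sccp_unregister_delay(keepalive_interval_s=30, missed_acks=3):
    """Rough time in seconds before the phone unregisters on missed keepaliveAcks."""
    return keepalive_interval_s * missed_acks


if __name__ == "__main__":
    print(classify_keepalive_failure(False, False))
    print(classify_keepalive_failure(True, False))
    print(sccp_unregister_delay(30, 2), sccp_unregister_delay(30, 3))
```

With the default 30-second interval and 2 or 3 missed keepaliveAcks, the application-side case takes on the order of a minute or more to unregister the phone, which is much slower than the TCP timeout path.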

You will note in your example that when the phone registers with the SRST router, the SCCP alarm message it sends will contain a string like "last=TCP Timeout" or similar.

The 3rd gen phones (7970, 79x1, 79x2, 79x5) are much more aggressive in timing out the TCP session than the 2nd gen phones. What took your 7960 26 seconds to unregister will take a 7965 about 8 seconds.

I once had a Juniper firewall between a remote site and CUCM, and the SCCP keepalives were being delayed. This caused issues with some phones, of course. Not sure if you have this or something similar. You may also adjust the SCCP keepalive trigger in CUCM to a higher value. This may help as well.

Hi Hariharan

Thanks for the information - it looks like we are suffering from the 3rd generation phone TCP mechanisms.

Still doesn't get away from the fact that our carrier is causing us to have retransmissions but at least we can identify some workarounds now.

Many thanks again.