Were you able to find a resolution for your issue? I am seeing the same issue in my environment too.
What is the exact model of your Cisco switch?
I started having this problem as well. At first, I thought it was the IOS because the 3560G switch it was happening on had pretty old software. I upgraded the switch and it seemed to have resolved it but I just had the phones on that switch go down again today. Interestingly, I also had a different switch do this same thing.
Here is something else that is interesting. We have had a CME with about 20 phones running for several years on a 3560G-24P with no problems. We recently upgraded ourselves and our parent company to a BE-6000 (they had a Nortel with TDM phones). Our CME was running in conjunction with the BE-6000 as we pre-deployed phones. As soon as we put the new 7945 phones out, that switch started having problems. Nothing had changed with that switch, our CME or our 7945 phones for years and that switch had absolutely no problems. The new phones we put out had newer firmware so I'm leaning towards the phone firmware and/or CUCM doing something odd. Also, the phones that were still working on our CME went down when the new phones went down, so it looks like something system wide on the switch, not localized to one port.
We have the issue of phones resetting randomly in a CMBE6000 system with only 48 phones connected. There are 4 locations, 3 connected via EPL and 1 location via point to point T1.
All locations, including the main site, are seeing sporadic phone resets. All the phones are either 7942 or 7962 with the SCCP42.9-1-1SR1S app load.
CCM version is 8.5
Over the past month I have initiated two TAC calls. The following has been done with no resolution to the problem:
The Cisco Unified IP Phone firmware 7.2(1) introduced a Geometric TCP mechanism to permit IP Phones to measure the round-trip delay between the IP Phone and Unified CM, then adapt the keepalive timeout value. This provided a very accurate failover mechanism when the network delay is consistent.
However, if the network delay is inconsistent, this mechanism may cause the IP Phones to inaccurately attempt failover. The Cisco Unified IP Phone firmware 8.4(2) introduces the ability for the Network Administrator to disable this behavior, if necessary, through the Detect Unified CM Connection Failure parameter defined on the IP Phone device configuration. The default value is Normal; this Geometric TCP mechanism can be disabled if the parameter is set to Delayed.
Has anyone resolved this issue?
One of my clients is facing similar issues. Their servers are across the WAN in a DC. In my case, one of the remote sites is losing phones intermittently, while the other remote sites keep working fine at the same time.
TAC and I have spent a lot of time troubleshooting this and finally pinpointed a bad WAN connection at that site, but the customer is confident that there are no issues in the WAN.
I've also done troubleshooting similar to what you mentioned above, except for disabling "Geometric TCP", which I did last night.
If the issue continues, my next step will be to run Wireshark at the remote site and at the DC. If I see packet loss or unexpectedly high delays, I can go back to the customer and prove that the issue lies somewhere in the network and not in the voice gear.
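For anyone planning the same two-point capture, here is a rough sketch of how the comparison could be done once the packets are exported from Wireshark at both ends. It is only an illustration: the function name `compare_captures`, the packet key, and the `spike_factor` threshold are my own assumptions, and it assumes the clocks on the two capture machines are synchronized (for example via NTP).

```python
from statistics import median


def compare_captures(remote, dc, spike_factor=3.0):
    """Compare packets seen at the remote site with packets seen at the DC.

    remote, dc: dicts mapping a packet key (hypothetical, e.g. IP ID or
    TCP sequence number exported from each capture) to the capture
    timestamp in seconds. Clocks are assumed to be synchronized.
    """
    # Packets that left the remote site but never showed up at the DC.
    lost = sorted(k for k in remote if k not in dc)
    # One-way delay for every packet seen at both capture points.
    delays = {k: dc[k] - remote[k] for k in remote if k in dc}

    loss_pct = 100.0 * len(lost) / len(remote) if remote else 0.0
    if not delays:
        return {"lost": lost, "loss_pct": loss_pct,
                "median_delay": None, "spikes": []}

    med = median(delays.values())
    # A "spike" is any delay well above the typical (median) delay.
    spikes = sorted(k for k, d in delays.items() if d > spike_factor * med)
    return {"lost": lost, "loss_pct": loss_pct,
            "median_delay": med, "spikes": spikes}


if __name__ == "__main__":
    remote = {1: 0.0, 2: 1.0, 3: 2.0, 4: 3.0, 5: 4.0}
    dc = {1: 0.02, 2: 1.02, 3: 2.2, 5: 4.02}
    print(compare_captures(remote, dc))
```

Lost packets and delay spikes that line up with the times the phones unregister would be the kind of evidence to take back to the customer.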
We have as well. We had another customer that was indeed having WAN issues, and once those were resolved the phones stopped resetting or bouncing between SRST fallback and Call Manager.
In this case, with the Point-to-Point T1, we have tested and worked with the carrier to eliminate the circuits as the potential problem.
I too have had several TAC cases and spent many hours on this with no resolution or next steps from TAC.
We are of the same opinion that it will take some Wireshark captures to truly see what is going on. I plan to put a system with Wireshark on the same switch as the Call Manager and use port mirroring to monitor the communication to the Call Manager for a 24-48 hour period.
If you do the same, I would recommend that you set filters in Wireshark to only look for the keepalive/SCCP ports, or you will be inundated with a lot of data.
I am curious as to what you will find.
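As a rough illustration of the filtering idea above: SCCP normally runs over TCP port 2000, so a capture filter of `tcp port 2000` should keep the capture small. Once the keepalive timestamps are exported per phone, a small script can flag where keepalives went missing. The function name, the 30-second interval, and the tolerance below are assumptions on my part, so check the keepalive interval actually configured on your system.

```python
# Capture filter (BPF syntax) for SCCP signalling; assumes the default port 2000.
SCCP_CAPTURE_FILTER = "tcp port 2000"


def find_keepalive_gaps(timestamps_by_phone, expected_interval=30.0, tolerance=2.0):
    """Return (phone, start_time, gap_seconds) for every suspicious gap.

    timestamps_by_phone: dict mapping a phone identifier (e.g. its IP
    address) to the list of times, in seconds, at which a keepalive from
    that phone was seen in the capture.
    expected_interval: assumed keepalive interval in seconds.
    tolerance: a gap longer than expected_interval * tolerance is reported.
    """
    limit = expected_interval * tolerance
    gaps = []
    for phone, stamps in sorted(timestamps_by_phone.items()):
        ordered = sorted(stamps)
        # Walk consecutive keepalives and measure the time between them.
        for earlier, later in zip(ordered, ordered[1:]):
            if later - earlier > limit:
                gaps.append((phone, earlier, later - earlier))
    return gaps


if __name__ == "__main__":
    sample = {
        "10.1.1.20": [0, 30, 60, 150, 180],  # missed keepalives after t=60
        "10.1.1.21": [0, 30, 60, 90],        # healthy phone
    }
    print(SCCP_CAPTURE_FILTER)
    print(find_keepalive_gaps(sample))
```

If several phones show gaps starting at the same time, that points at the network path or the switch rather than at an individual phone.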
Me too, I had the same issue. Fortunately, I called in the fiber testing guy.
When he tested, he found a problem on a fiber link at the point where it had been joined after a cut.
Surprisingly, the first test with general characteristics came back OK, but when we tested with specific characteristics we were able to identify the problem.