
Percentage of Phones won't register after CUCM 9.1.2 Sub and TFTP failover

apishko
Level 4

Hoping this has happened to someone else and they can shed some light on the subject. I have a CUCM 9.1.2 cluster running: a pub, 2 subs, and 2 TFTP servers. At the moment it's a small deployment, so I have all phones registered to sub 01 as primary and using tftp01 as the primary TFTP server. In order to have some VMWare maintenance done, I recently shut down the pub, primary sub and primary TFTP server, forcing the phones to use the other half of the cluster. Everything there worked as expected: phones failed over, etc.

The problem occurred after the VMWare maintenance was done. At that point I brought up the pub and the primary sub (maintenance wasn't yet completed on the host that contained the primary TFTP server). Once the primary sub was back online, the problem started. Random phones (all 8945 SIP, all running the same firmware version, SIP8941_8945.9-3-2-12) would just not register. Some of them would show as registered in CUCM but were non-functional: DNs weren't displayed, no dial tone, etc. Others would not register at all. To get a phone to register I had to perform a factory reset on the physical phone; after that it came back with no problem.

Later, after the maintenance was completed on the host that contained the primary TFTP server and I brought it back online, I still had issues with phones, but a power cycle was enough to get those phones registered. I opened a TAC case on this, but it ended up at packet captures, and at that point every phone I reset would register correctly, so I could never get an example of a phone that wouldn't register.

I'm convinced it's certificate related and having to do with the trust list, which is why I believe the factory reset on the phone would allow the phone to correctly register.  There just doesn't seem to be any clear indication as to why it would be certificate related.  Names weren't changed on any of the UC servers, VMWare version was upgraded from 5.0 to 5.1 but I can't see how that would have caused certificate issues.  

 

Any help would be greatly appreciated. If this has also happened to anyone, I'd love to hear what you did to resolve it, because for future maintenance windows I have to be confident that phones will be able to fail over between cluster nodes.

 

Thank you,

Alex

6 Replies

apishko
Level 4

Also to add, it wasn't all phones. Some would re-register just fine, and I can't find any correlation between the ones that did and the ones that didn't.

 

Thanks again.

I am having the same sort of random issues after the upgrade to 9.1.2.20000-28: I have to restart the TFTP service and physically shut and no shut the ports on the switch to get the phones to register again.

 

My biggest issue is with 7821 phones, which will register when you reset them or power them off and on, and then won't register again.

Best Regards

Brian Meade
Level 7

Did you happen to look at the status messages or console logs of one of the affected phones?  That should show if there was an ITL issue or help track down another potential issue.

 

CallManager traces from that time frame would also be helpful if it's not an ITL issue and rather some sort of registration issue.

I actually did capture the console log off one of the phones, see attached. I don't have the relevant CUCM traces. Looking through the console logs, it does look to me like the issue is certificate related, but the TAC engineer didn't seem to think the logs showed that.

So the phone is unable to authenticate the ITL it is getting or the new config file.  This could mean that the ITL is corrupt on one of the nodes or that TVS isn't working properly.

 

I would try running "show itl" on each node and check the bottom of the output to see if the ITL is verified successfully.
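If there are several nodes to compare, one option is to save the "show itl" output from each node to text and check the tail of each one with a small script. This is only a rough sketch: the marker text "verified successfully" is an assumption about how the last line of the output is worded on your version, and the function names are made up for illustration, so adjust both to match what your nodes actually print.

```python
from typing import Dict, List


def itl_verified(show_itl_output: str,
                 marker: str = "verified successfully",
                 tail_lines: int = 5) -> bool:
    """Return True if `marker` appears in the last few non-empty lines.

    `marker` is an assumed success phrase, not a guaranteed CLI string;
    pass whatever wording your own "show itl" output ends with.
    """
    lines = [ln.strip() for ln in show_itl_output.splitlines() if ln.strip()]
    tail = lines[-tail_lines:]
    return any(marker.lower() in ln.lower() for ln in tail)


def failing_nodes(outputs_by_node: Dict[str, str]) -> List[str]:
    """Return the sorted names of nodes whose saved output lacks the marker."""
    return sorted(node for node, text in outputs_by_node.items()
                  if not itl_verified(text))
```

Feed `failing_nodes` a dict of node name to saved output text; any node it returns is worth a closer look by hand.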

 

I would also make sure TVS is running on all nodes okay.
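The same approach works for the TVS check if you save the output of "utils service list" from each node. The sketch below assumes each service is printed on its own line as a name followed by its state in square brackets, and that TVS is listed as "Cisco Trust Verification Service"; both the line format and the function name are assumptions to verify against your own output.

```python
import re
from typing import Optional

# Assumed line shape: "<service name>[<STATE>]", optionally followed by notes.
_SERVICE_LINE = re.compile(r"^\s*(?P<name>.+?)\s*\[(?P<state>[A-Za-z ]+)\]")


def service_state(service_list_output: str, service_name: str) -> Optional[str]:
    """Return the upper-cased state for `service_name`, or None if not listed."""
    wanted = service_name.strip().lower()
    for line in service_list_output.splitlines():
        match = _SERVICE_LINE.match(line)
        if match and match.group("name").lower() == wanted:
            return match.group("state").strip().upper()
    return None
```

Anything other than a started state for TVS on a node, or no entry at all, would be a reason to look at that node first.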

 

You also should be able to just delete the ITL from the affected phones rather than doing a full factory reset in case this comes up again.

Gordon Ross
Level 9

I've not had your exact problems, but I've had issues with 78xx phones registering. The latest firmware seems to have helped, but there are plenty of bugs in the 78xx firmware. (I've got two TAC cases open about the 78xx phones and another one or two waiting in the wings)

 

GTG

Please rate all helpful posts.