03-04-2011 07:54 AM - edited 03-13-2019 07:20 PM
I'd like clarification on exactly how phones register to the secondary or tertiary server if the link to the primary fails.
I have a three-server CUCM cluster. One subscriber, which some phones register to, is separated from the other two.
When the link to that sub goes down, the phones should then register to the secondary server in their Cisco Unified Communications Manager Group configuration, correct?
If so, how long does it take to fail over to the secondary server?
SRST is not involved here; only the CUCM Group configuration applies. Can anyone help clarify how phones behave in this situation?
03-04-2011 09:55 AM
I found my own answer in the Cisco Press Troubleshooting Cisco IP Telephony book, page 155:
"The TFTP configuration file received by the IP phone has a list of up to three CallManagers
for registration purposes. The order of CallManagers indicates their priority. The first is the
highest-priority CallManager for the IP phone. It is the first place the IP phone attempts to
register. The highest-priority CallManager is also called the primary CallManager.
When the IP phone registers with its primary CallManager, it also establishes a standby
TCP connection to the next-highest–priority available CallManager, sometimes called the
standby CallManager, secondary CallManager, or backup CallManager.
The IP phone knows that there is an alternative CallManager for quick failover if it loses connectivity
with its primary CallManager.
At any given point in time, the CallManager to which the phone is registered is its active
connection. The CallManager node that has a standby TCP connection for the IP phone is
the standby connection.
Failover occurs under two conditions:
• If the TCP connection between the IP phone and the primary CallManager goes down.
Incidentally, stopping the CallManager service on a server causes all the TCP connections
to be closed and all the IP phones on that server to register to their standby.
• If CallManager has not responded to three consecutive KeepAlive messages sent from
the IP phone.
Any number of network-related issues could cause the TCP connection to go down. When
the IP phone registers with its standby CallManager, it registers with an alarm indicating
why it failed over.
Under the second condition, CallManager fails to respond to three consecutive KeepAlives.
The IP phone sends a KeepAlive message every 30 seconds by default. CallManager should
answer each KeepAlive the IP phone sends with an acknowledgment message. If
CallManager fails to respond to three consecutive KeepAlive messages, the IP phone marks
the connection as “bad.” The IP phone does not tear down the TCP connection, but it does
not attempt to re-register with the “bad” CallManager either. It continues sending
KeepAlive messages to the “bad” CallManager until CallManager tears down the TCP
connection. This delay gives CallManager time to respond if it recovers quickly. After 10
minutes, the IP phone removes the “bad” tag and again tries to establish communication
with CallManager using KeepAlive messages in the process just described. During this
time, the IP phone attempts to establish a connection with its secondary CallManager (or
its tertiary if the secondary is not available) and registers if possible.
If a call is in progress when an IP phone detects the loss of a TCP connection, the IP phone
does not fail over until the call is finished. If CallManager fails to respond to three consecutive
KeepAlive messages while an IP phone has an active call, the IP phone again waits
until the call is finished before registering to its standby CallManager."
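To put rough numbers on my original timing question, here is a small Python sketch of the behavior the passage describes. It is only an illustration under stated assumptions, not how the phone firmware is actually implemented: the server names and both function names are made up, and the 30-second interval and three-miss threshold are simply the defaults quoted above. Real failover time will also depend on when the link drops relative to the last KeepAlive and on how quickly the TCP connection itself is seen as down.

```python
from typing import Optional, Sequence, Set

# Defaults taken from the quoted passage; adjust if your cluster differs.
KEEPALIVE_INTERVAL_S = 30   # phone sends a KeepAlive every 30 seconds
MISSED_KEEPALIVES = 3       # consecutive unanswered KeepAlives before "bad"


def keepalive_detection_seconds(interval_s: int = KEEPALIVE_INTERVAL_S,
                                missed: int = MISSED_KEEPALIVES) -> int:
    """Rough estimate of how long the KeepAlive path takes to flag a server."""
    return interval_s * missed


def next_callmanager(group: Sequence[str],
                     reachable: Set[str],
                     call_in_progress: bool = False) -> Optional[str]:
    """Pick the highest-priority reachable server from the ordered group.

    Returns None when the phone should not register anywhere yet: either a
    call is still active (failover waits for it to finish) or no server in
    the group is reachable.
    """
    if call_in_progress:
        return None
    for server in group:          # list order is priority order
        if server in reachable:
            return server
    return None


if __name__ == "__main__":
    group = ["sub-remote", "sub-local", "publisher"]   # hypothetical names
    print(keepalive_detection_seconds())               # 3 * 30 seconds
    print(next_callmanager(group, {"sub-local", "publisher"}))
```

So with the defaults, the KeepAlive condition alone takes on the order of a minute and a half to trigger, while a TCP connection that is torn down outright can trigger failover sooner.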
Hope that helps someone else in the future.