
CUBE selecting next available server in server group

mumbles202

I'm working on a CUCM cluster with 4 nodes: 3 in Site A and 1 in Site B. Site C has a connection to each of those sites and has the server in Site B set up as the last option in the CM group in case the site loses its connection to Site A. Options ping is enabled on the trunk facing the CUBE at Site C. On the CUBE at Site C I have a configuration similar to this:

 

voice class sip-options-keepalive 1



voice class server-group 1
ipv4 1.1.1.1 preference 1
ipv4 1.1.1.2 preference 2
ipv4 1.1.1.3 preference 3
ipv4 1.1.2.3 preference 4



dial-peer voice 999 voip
description ---OUTBOUND Call to CUCM----
preference 1
destination-pattern 7....
session protocol sipv2
session server-group 1
voice-class codec 1 
voice-class sip options-keepalive profile 1
voice-class sip bind control source-interface GigabitEthernet0/0/0
voice-class sip bind media source-interface GigabitEthernet0/0/0
dtmf-relay rtp-nte digit-drop

 

This works fine for sending calls to Site A, but if the connection from A to C is severed I still see the CUBE trying to send calls to the nodes at Site A. And if I do a "show voice class sip-options-keepalive 1", all 4 nodes still show as Active.


TONY SMITH

What settings do you have for the Up and Down intervals, retries and timers? Is it possible it's just taking longer than you'd expect for the servers to show as Down? You should be able to see the polling in SIP debugs to confirm it's definitely trying and definitely not getting a response.
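For reference, the intervals and retry count asked about here live under the keepalive profile itself. The following is a minimal sketch, assuming the profile number 1 from the original post; the values are purely illustrative and are not a recommendation from anyone in this thread.

```
! Sketch only: interval and retry values below are illustrative assumptions
voice class sip-options-keepalive 1
 ! seconds between OPTIONS pings while the target is considered up
 up-interval 30
 ! seconds between OPTIONS pings while the target is considered down
 down-interval 30
 ! unanswered OPTIONS attempts before the target is marked down
 retry 2
```

The values in effect appear in the "Interval(seconds) Up / Down" and "Retry" lines of "show voice class sip-options-keepalive 1", so that output is the quickest way to check what the CUBE is really using.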

By the way I think it's worth tweaking the Invite retries and Trying timer so that while you're waiting for the options polling to decide the node is down, it doesn't take forever trying and retrying a dead server.  Defaults would have it trying for something like 32 seconds.
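As a rough sketch of that tweak, the INVITE retry count and Trying timer can be set globally under sip-ua. The values below are assumptions chosen for illustration, not figures given in this thread; they apply to every SIP call on the gateway, so test before using them in production.

```
! Sketch only: values are illustrative assumptions
sip-ua
 ! number of INVITE attempts before giving up on a target
 retry invite 2
 ! milliseconds to wait for a 100 Trying before retransmitting the INVITE
 timers trying 500
```

The "something like 32 seconds" figure is consistent with 6 INVITE attempts starting at 500 ms and doubling each time (0.5 + 1 + 2 + 4 + 8 + 16 = 31.5 seconds), assuming that backoff behaviour applies on your release. Under the same assumption, 2 attempts would give up on a dead server after roughly 1.5 seconds.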

Thanks for the post.  I was using the default timers.  I can tweak them, but at the time of the testing the link to Site A had been down for some time. 

Nuno Melo


this works fine in sending calls to Site A, but if the connection from A to C is severed I still see the CUBE trying to send calls to the nodes at Site A.  And if I do a "show voice class sip-options-keepalive 1" all 4 nodes still show as Active.


The SIP server group only determines the order in which the CUCM servers are used.

 

Since you have a single server group containing all the CUCM nodes, options ping will not busy out the dial-peer unless IP connectivity breaks between your CUBE and all of the CUCM nodes. That is the intended behaviour.

 

So it would be better to organize the CUCM nodes into groups according to location and path, and create 2 or more dial-peers, each with its own server group. In that case SIP options ping will busy out a dial-peer when its CUCM nodes are not reachable, and the dial-peer matching logic will choose the next best available dial-peer.
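A minimal sketch of that layout, based on the addresses in the original post. The server-group numbers 10 and 20 and the dial-peer tag 998 are hypothetical, and the codec, bind and dtmf-relay lines from the original dial-peer are omitted for brevity and would need to be carried over.

```
! Sketch only: group numbers 10/20 and dial-peer tag 998 are illustrative
voice class server-group 10
 ipv4 1.1.1.1 preference 1
 ipv4 1.1.1.2 preference 2
 ipv4 1.1.1.3 preference 3
!
voice class server-group 20
 ipv4 1.1.2.3 preference 1
!
dial-peer voice 999 voip
 description ---OUTBOUND Call to CUCM Site A---
 preference 1
 destination-pattern 7....
 session protocol sipv2
 session server-group 10
 voice-class sip options-keepalive profile 1
!
dial-peer voice 998 voip
 description ---OUTBOUND Call to CUCM Site B---
 preference 2
 destination-pattern 7....
 session protocol sipv2
 session server-group 20
 voice-class sip options-keepalive profile 1
```

With this split, losing the path to Site A should busy out dial-peer 999 once all three of its servers stop answering OPTIONS, leaving dial-peer 998 as the only match for 7.... calls.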

 

Although the dial-peer will stay up, individual servers should show as "Busy" if unreachable. For example, after deliberately adding an unreachable server to this group I see the following. Note that the session IDs are not sequential because this is a test gateway and the server group has been changed around a few times ...

 

show voice class sip-options-keepalive 20
Voice class sip-options-keepalive: 20            AdminStat: Up
 Description:
 Transport: system               Sip Profiles: 0
 Interval(seconds) Up: 60                Down: 120
 Retry: 2

  Peer Tag      Server Group    OOD SessID      OOD Stat        IfIndex
  --------      ------------    ----------      --------        -------
  2600          20                              Active          26

  Server Group: 20               OOD Stat: Active
   OOD SessID   OOD Stat
   ----------   --------
   4            Active
   2            Active
   5            Busy

What's not shown in that output is the preference. I added the fake server (IP 1.2.3.4) as the first preference. When I tested with an inbound call right away, the server was still seen as "Active", so the call tried it first, resulting in a 2.5 second delay. After a minute or so, when the server status showed as "Busy", the inbound call did not try that server but instead went straight to the second preference.

Thanks for the information Tony.  Do you have any recommended tweaks/timer settings for the keepalive to have the failover occur a bit quicker?

I usually leave the polling timers at the defaults.  But I try to reduce the Trying timer and Invite retries so that there's not an unacceptable delay if one of the servers is unreachable, during the time that the polling hasn't yet taken it out of service.

Do you have yours showing as "Busy" yet?

If not then what happens if you add a fictitious server into your server group?

Thanks Tony. I had to bring up the link to Site A before I could fully test. I'll see if I can add a non-existent server to the group and see if that busies out.