
SIP Failover and Load Balancing


 

A- Load balancing of calls outbound from Unified CVP

Four options are possible

1- Local Static SIP Routes

  • Recommended for lab use only
  • If you have multiple routes, CVP tries only the first one, even if that destination is down or has failed
  • Not scalable
  • Changing a single route means touching every CVP Call Server

2- Local CVP SRV Records

  • This is the most scalable and efficient method when coupled with SIP Proxy Servers, as in the following example.
  • If there are multiple SIP Proxy Servers (or any other SIP entity, such as a VXML-GW) in the network, calls can be distributed round-robin by giving the records equal weight and equal priority.
  • Within the CUP SIP Proxy itself you can configure static routes with priority and weight.

Below is an example local SRV configuration file. It must be saved as a text file named srv.xml and manually placed in the c:\cisco\cvp\conf directory of the Call Server where the SIP Service is running.

 
<?xml version="1.0" encoding="UTF-8" ?>
<locater>
 
<host name="cups.cisco.com"> 
<record weight="30" priority="1" destination="10.86.129.1" port="5060"/> 
<record weight="30" priority="2" destination="10.86.129.2" port="5060"/>
</host>
 
 
<host name="vxml-gateway.cisco.com"> 
<record weight="30" priority="1" destination="10.86.129.23" port="5060"/> 
<record weight="30" priority="2" destination="10.86.129.109" port="5060"/>
</host>
 
 
<host name="vgw.cisco.com"> 
<record weight="30" priority="1" destination="10.86.129.100" port="5060"/> 
<record weight="30" priority="1" destination="10.86.129.101" port="5060"/>
</host>
 
</locater>
 
  • In the first example above (host cups.cisco.com), SIP requests always go to 10.86.129.1; 10.86.129.2 is used only if 10.86.129.1 has failed
  • This is because the priorities differ (when priorities differ, the weight is ignored)
  • CVP waits until the SIP timer expires before sending the request to 10.86.129.2

 

  • In the second example above (host vxml-gateway.cisco.com), the first record has the top priority
  • So all calls are always routed to 10.86.129.23
  • Only when that host goes down is a call sent to 10.86.129.109
  • But CVP has no way of learning dynamically that 10.86.129.23 has gone down
  • It sends the call to 10.86.129.109 only after the SIP retry timer expires
  • So every call has to wait for the timer to expire before it is sent to 10.86.129.109

 

  • In the third example above (host vgw.cisco.com), the priority and weight of both records are the same
  • That means calls are distributed round-robin between the two hosts
  • Let's assume the first host is down: every second call then has to wait until the SIP timer expires before it switches to 10.86.129.101
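
The selection behaviour described for the three hosts can be modelled in a few lines. The Python sketch below is illustrative only and is not CVP code: the function names `parse_srv` and `attempt_order` are hypothetical, it assumes the usual SRV convention that lower priority values are tried first, and it uses a weighted random draw among equal-priority records as a stand-in for the round-robin distribution described above.

```python
import random
import xml.etree.ElementTree as ET

# Two of the hosts from the srv.xml example (XML declaration omitted for brevity).
SRV_XML = """<locater>
  <host name="cups.cisco.com">
    <record weight="30" priority="1" destination="10.86.129.1" port="5060"/>
    <record weight="30" priority="2" destination="10.86.129.2" port="5060"/>
  </host>
  <host name="vgw.cisco.com">
    <record weight="30" priority="1" destination="10.86.129.100" port="5060"/>
    <record weight="30" priority="1" destination="10.86.129.101" port="5060"/>
  </host>
</locater>"""


def parse_srv(xml_text):
    """Return {host name: [(priority, weight, destination), ...]}."""
    root = ET.fromstring(xml_text)
    table = {}
    for host in root.findall("host"):
        table[host.get("name")] = [
            (int(r.get("priority")), int(r.get("weight")), r.get("destination"))
            for r in host.findall("record")
        ]
    return table


def attempt_order(records, rng=random):
    """Order in which destinations would be tried for one call (simplified model).

    Assumption: lower priority value first; weight only matters among
    records that share a priority. Weights are assumed to be positive.
    """
    order = []
    for prio in sorted({p for p, _, _ in records}):
        group = [(w, d) for p, w, d in records if p == prio]
        # Weighted draw without replacement among equal-priority records.
        while group:
            weights = [w for w, _ in group]
            pick = rng.choices(range(len(group)), weights=weights)[0]
            order.append(group.pop(pick)[1])
    return order


if __name__ == "__main__":
    table = parse_srv(SRV_XML)
    for name, records in table.items():
        print(name, attempt_order(records))
```

For cups.cisco.com the order is always 10.86.129.1 followed by 10.86.129.2, because the priorities differ. For vgw.cisco.com either destination may come first, which is why a dead first host delays only a share of the calls.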

 

3- DNS SRV Records

Essentially the same as above, except that the SRV functionality is now handled by the DNS server. In CVP, configure a static route pointing towards the DNS SRV server.

4- SIP Proxy Server

You can use a SIP Proxy Server for load balancing and failover. The CUPS SIP Proxy supports priority and weight options.

 

B- Load balancing of calls coming into Unified CVP from the GW (either VGW or VXML-GW)

1- The GW can have static routes pointing towards multiple CVP Servers

In the dial-peers you can configure a preference value. If CVP1 fails, for example, the call can go to CVP2 after the first attempt times out.

2- The GW can be configured with GW-based local SRV records

Configure local SRV records on the gateway. They can point to a SIP Proxy Server or to a CVP Server.

 
 
GW-Local-SRV---->CVP1
         |------>CVP2
 
 
OR
 
 
GW-Local-SRV---->CUP1-------CVP1
         |        |--------CVP2
         |
         |----->CUP2-------CVP1
                 |--------CVP2
 
 

3- The GW can be configured to use DNS-based SRV records

The GW can point towards DNS-based SRV records.

 
GW--->DNS-SRV----CVP1
         |------CVP2
 
 
OR
 
 
GW-->DNS-SRV----CUP1-------CVP1
         |       |--------CVP2
         |
         |-----CUP2-------CVP1
                 |--------CVP2
 
 

4- The GW can be configured to use SIP Proxy Server

 
GW-->CUP---->CVP1
      |----->CVP2
 
 

 

CVP Local SRV SIP failover behavior on various SIP error messages

Failover Scenario:

Consider the following failover scenario (load balancing behaves the same way):

 
<host name="vxml-gw.cisco.com"> 
<record weight="30" priority="1" destination="10.86.129.1" port="5060"/> 
<record weight="30" priority="2" destination="10.86.129.2" port="5060"/>
</host>
 
  • With local SRV, how does CVP behave if the SIP endpoint sends:
    • Temporary failure
    • Busy
    • Server error
    • Any other standard SIP error message


Question:
Will CVP automatically retry the next destination (the priority=2 route) if it receives one of the above SIP error messages? And would that happen immediately or after some timeout?


Answer:
If the other SIP endpoint responds with a 4XX response such as 480 or 486, this indicates to CVP that the final endpoint was reached and that this is the final disposition, so failover does not occur. The CVP local SRV implementation rolls over to the next element in the SRV table only if it receives a 503 Service Unavailable rejection or a timeout occurs. CVP does not roll over on a 500 rejection either. Unlike the proxy servers (Unified SIP Server SIP Proxy and Unified Proxy Server), CVP local SRV has no configuration option for specifying rollover codes other than 503.

 

SRV rollover applies only to initial INVITE requests. It does not apply to mid-call requests such as re-INVITEs.
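
The rollover rules in the answer can be summarised as a small decision function. This is a minimal Python sketch of the behaviour as described here, not CVP's implementation; the function name `should_roll_over` and its parameters are hypothetical, and a timeout is represented by `None`.

```python
def should_roll_over(response_code, is_initial_invite=True):
    """Return True if this model would try the next SRV record.

    response_code: final SIP status code as an int, or None when the
    request timed out without a response (illustrative convention).
    """
    if not is_initial_invite:
        # Mid-call requests such as re-INVITEs never roll over.
        return False
    if response_code is None:
        # No response before the SIP timer expired.
        return True
    # Only 503 Service Unavailable triggers rollover; 4XX and 500 do not.
    return response_code == 503


if __name__ == "__main__":
    for code in (480, 486, 500, 503, None):
        print(code, should_roll_over(code))
```

Under this model 480, 486 and 500 are treated as final, while 503 and a timeout move the call to the next record.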

Comments
pantinor
Beginner

On the Question above, the last sentence is incorrect.  The sentence that reads "This is the behavior even for a mid-call sip failures as well." should be removed.

The correction should read "SRV rollover is only applicable to initial INVITE requests. It does not apply to mid-call requests such as re-INVITEs."

Syed Shahzad Ali
Beginner

Thanks, Paul, for clarifying and reviewing it. I have made the changes based on your comment.

Courtland Holder
Beginner

I would like to share the following.

Configuring DNS SRV for failover and redundancy on a SIP call flow.

In this scenario we configured two SRV records on the ingress gateway, one for CVPA and one for CVPB.

Test Scenario 1:

1. Shut down the Call Server process on CVPA, which brings down the CVPA PIM

2. Place 10 calls in queue

3. All 10 calls routed successfully to CVPB.

The test was repeated by shutting down CVPB and bringing CVPA back online, with the same results as above.

Test Scenario 2:

1. Shut down the switch port connected to CVPA, which brings down the CVPA PIM

2. Place 10 calls in queue

3. 50% of calls routed successfully to CVPB after 2 seconds, and 50% routed successfully to CVPB after a delay of approximately 63 seconds.

The test was repeated by shutting down the switch port connected to CVPB and bringing CVPA back online, with the same results as above.

The difference between the two scenarios is that in scenario 1 the PIM was inactive but the IP address was still pingable, while in scenario 2 the IP address was not pingable.

SOLUTION:

Set the following sip-ua parameters:

sip-ua
 retry invite 2         -- default 6
 timers trying 100      -- default 500

After these values were set, 100% of calls were routed within 2 seconds.

Syed Shahzad Ali
Beginner

Hi Courtland,

Thanks for sharing your experience and taking time to document it here.

With CVP 8.0, SIP OPTIONS is implemented on both the CVP Call Server and the gateway side, so the gateway and the CVP server know the availability of the SIP destination in advance. Calls are therefore not routed to a failed or down server, and you no longer have the issue of long delays before reaching a working Call Server.

Shahzad