Solved: CUCM 10.X SIP Trunk Dial-Peer Failover

mzajeski · ‎10-28-2015

We are having a really strange issue with our 4331 gateways when testing SRST.

Information

4331 Configured in CallManager via SIP

4331 has an Inbound PRI

Config Snippet

dial-peer voice 1000 voip
preference 1
destination-pattern 51754[13]....$
session protocol sipv2
session target ipv4:10.100.2.23
session transport tcp
voice-class codec 1
voice-class sip options-ping 60
dtmf-relay rtp-nte sip-notify
no vad
!
dial-peer voice 1001 voip
preference 2
destination-pattern 51754[13]....$
session protocol sipv2
session target ipv4:10.100.3.23
session transport tcp
voice-class codec 1
voice-class sip options-ping 60
dtmf-relay rtp-nte sip-notify
no vad
!
dial-peer voice 1002 voip
preference 3
destination-pattern 51754[13]....$
session protocol sipv2
session target ipv4:10.37.2.5
session transport tcp
voice-class codec 1
voice-class sip options-ping 60
dtmf-relay rtp-nte sip-notify
no vad
!

The desired result for SRST is for inbound calls to fail over past dial-peers 1000 and 1001 during a WAN outage and end up on the CUC server inside the 4331. What actually happens right now is that the call only hits dial-peer 1000 and fails.

Do we need voice-class sip keepalive timers on the dial-peers to get it to roll from 1000 to 1001 and 1002?

Thanks in advance!!!!

devils_advocate · ‎10-28-2015

So the inbound calls currently hit Dial Peer 1000 and fail?

Or they hit DP 1000 and when this is unavailable for some reason, they are not failing over to use DP 1001?

mzajeski · ‎10-28-2015

The inbound calls hit Dial Peer 1000 and fail. When the session target in dial-peer 1000 is not available, the gateway isn't trying dial-peers 1001 or 1002. That is my issue.

devils_advocate · ‎10-28-2015

It looks like you are using ping to tell the gateway whether the dial peer is working. Presumably during your tests you made sure the gateway was unable to ping 10.100.2.23?

Can you post the output of:

#show dialplan number 5175415555

It should show DP1000 as the first match but if you scroll down, you should see DP1001 as the next match.

Can you also provide the output of:

#show dial-peer voice summary

It would be helpful to show this output when DP1000 is unavailable.

Thanks

mzajeski · ‎10-28-2015

I attached both command results.

devils_advocate · ‎10-28-2015

If you look at the results of #show dial-peer voice summary, you will see that DP1000 is marked as 'Active' in the KEEPALIVE column. If the DP is marked as active, it could be that the Gateway still thinks it can route calls there, and therefore the calls are not failing over to DP1001.

The results of #show dialplan number show that DP1001 is next in the call routing list, which suggests that the Gateway still thinks DP1000 can route calls, even if it actually can't.

Remove the keepalive from dial-peer 1000 and test again.

Thanks

mzajeski · ‎10-28-2015

Remove the keepalive? What command do you suggest?

mzajeski · ‎10-28-2015

Again, the important issue with this centers around rolling over to a secondary dial-peer and tertiary if it exists. This is huge during normal operations as well when the WAN is up to the secondary CallManager but the primary is down.

devils_advocate · ‎10-28-2015

Are you testing this with an actual call or via csim start xxxx?

mzajeski · ‎10-28-2015

In the field we dialed the direct number that came into the PRI, but for testing I will lab the scenario and use CSIM. Our current configuration does not have a SIP-UA section configured at all and that might be the root cause of no rolling.

We will be testing with using a SIP-UA section with timers as well as setting "voice-class sip options-keepalive" in the dial-peers. I will post the results. Thanks!

devils_advocate · ‎10-28-2015

I am fairly convinced the issue is to do with the SIP timers. If you search Google for 'sip dial peers not failing over', there are several articles which explain that with the default SIP timers it takes over 60 seconds before the Gateway will route the call to the next Dial Peer.

mzajeski · ‎10-29-2015

Worked this morning with a mock lab setup. It wasn't exact with regards to inbound stuff, but we were able to determine which commands were the "true" fix for dial-peers headed toward CallManager.

The "VOICE-CLASS SIP OPTIONS-KEEPALIVE" command was key to immediate rolling from dial-peer to dial-peer (CallManager endpoints). We set the up-interval and down-interval to 5 with a retry of 2. After waiting 15 seconds or so, calls processed on the gateway immediately bypassed the dial-peers where the CallManagers were unreachable.
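
For anyone following along, a sketch of what that looks like on dial-peer 1000. This is not the exact final config: it assumes the keepalive line takes the place of the original "voice-class sip options-ping 60" and that the same line is repeated on 1001 and 1002, with everything else left as in the first post.

dial-peer voice 1000 voip
preference 1
destination-pattern 51754[13]....$
session protocol sipv2
session target ipv4:10.100.2.23
session transport tcp
voice-class codec 1
! OPTIONS keepalive: probe every 5 s while up or down, 2 retries before marking the peer down
voice-class sip options-keepalive up-interval 5 down-interval 5 retry 2
dtmf-relay rtp-nte sip-notify
no vad
!

Once the target stops answering, #show dial-peer voice summary should no longer list that dial-peer as active in the KEEPALIVE column, and the gateway should skip it for the next preference.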

Adding the sip-ua section with "retry invite 2" and "timers trying 250" did nothing for our testing. These settings are however used with almost all of our configurations to SIP carriers and other SIP integrations. I think these settings are important for failing over of a dial-peer when certain SIP responses are received. Since they didn't affect our test, that probably means dial-peers to CallManager 10 don't get the same type of responses (yet) that would cause rolling.
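
The sip-ua section we tested was roughly the following. This is a sketch containing only the two commands named above; any other sip-ua settings are left out, and the trying timer value is in milliseconds.

sip-ua
! retry an unanswered INVITE twice before giving up on that target
retry invite 2
! wait 250 ms for a 100 Trying before retransmitting
timers trying 250
!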

Bottom line with my issue - our voice gateways receive PRI and POTS lines coming in and are configured as SIP Trunks to the CallManager. Most SIP Trunks configured to CallManager have some sort of SIP endpoints that they extend to. In those cases, the sip-ua section becomes vital for rolling (inbound and outbound). Thanks for taking the time to research this with me and provide insight!!!!!

devils_advocate · ‎10-28-2015

Looking at this post, it could be to do with the SIP timers.

https://supportforums.cisco.com/discussion/11516996/sip-dial-peer-failover-not-working

Perhaps try adjusting the timers as stated in the post above and see if it works.

Thanks

devils_advocate · ‎10-28-2015

Remove the keepalive line (voice-class sip options-ping 60) from this dial-peer:

dial-peer voice 1000 voip
preference 1
destination-pattern 51754[13]....$
session protocol sipv2
session target ipv4:10.100.2.23
session transport tcp
voice-class codec 1
voice-class sip options-ping 60
dtmf-relay rtp-nte sip-notify
no vad
!

Obviously in order for the gateway to pick DP1001, it needs not to be able to route calls to DP1000.