Direct Routing for Microsoft Phone System with Cisco Unified Border Element

takahashi-ta
Level 1

I am using vCUBE (on a CSR1000v) to build Direct Routing for the Microsoft Teams Phone System.
The ITSP is Twilio.
In this environment, outbound calls from the Microsoft Teams client towards Twilio work, but signaling in the Twilio-to-Teams-client direction does not.
I used TranslatorX to draw a SIP ladder diagram from the output of debug ccsip messages and found that the CUBE receives no response to the INVITE it sends to the Microsoft Teams Phone System.
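
For reference, I collected the trace roughly like this (a sketch, assuming exec access on the CUBE; the prompt is a placeholder):

cube# debug ccsip messages
cube# terminal monitor
(place the failing call from Twilio, then)
cube# show logging | include INVITE
cube# undebug all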

The TLS connection is normally working as shown below.

Example:

Remote-Agent:52.114.76.76, Connections-Count:3
Remote-Port Conn-Id Conn-State  WriteQ-Size Local-Address TLS-Version Cipher                         Curve
=========== ======= =========== =========== ============= =========== ============================== =====
5061        140     Established 0           x.x.x.x       TLSv1.2     ECDHE-RSA-AES256-GCM-SHA384    P-384
3521        142     Established 0           x.x.x.x       TLSv1.2     ECDHE-RSA-AES256-GCM-SHA384    P-256
3520        141     Established 0           x.x.x.x       TLSv1.2     ECDHE-RSA-AES256-GCM-SHA384    P-256
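
(If I remember right, that table is from show sip-ua connections tcp tls detail.)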

However, the moment I place a call from Twilio towards the Microsoft Teams client, I get the following message.

%SIP-3-INTERNAL: Connection failed for addr=52.114.76.76, port=5061, connId=140

For some reason, the TCP/TLS connection seems to fail momentarily, and I suspect this is why the signaling is not working properly.

Is there any configuration I should review?

By the way, I am referring to the following document for setting up the CUBE.
https://www.cisco.com/c/dam/en/us/solutions/collateral/enterprise/interoperability-portal/direct-routing-with-cube.pdf

 

46 Replies

Thanks for reaching out. Please contact your Cisco account team and ask them to have their dev team talk to ours to resolve this connection closure.

Thank you, Greg, for your suggested solution. It looks like we will have to go down that road to make Cisco and Microsoft talk to each other.

To support us in doing so, could you please let us know why exactly Microsoft considers a TCP connection idle (after 2 minutes), although SIP OPTIONS are received over that connection at a regular interval from Cisco's SBC (60 s by default)?

Thanks to your link, we do understand Microsoft's ruleset, but could you please explain how this causes a problem in combination with Cisco's SBC? Does Cisco not adhere to certain Microsoft recommendations (timers, RFCs, …), and are SBCs from other vendors not affected by these TCP timeouts?
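
For context, the keepalive on our CUBE follows the Direct Routing guide, roughly like this (a sketch; the profile tag 200 and dial-peer number are placeholders):

voice class sip-options-keepalive 200
 up-interval 60
 transport tcp tls
!
dial-peer voice 200 voip
 voice-class sip options-keepalive profile 200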

Please reach out to your account team and have them engage the correct Cisco teams with the Teams engineering team.
https://learn.microsoft.com/en-us/microsoftteams/direct-routing-protocols#timeout-policies

"SIP proxy TCP idle timeout is two minutes. The timer resets when a SIP transaction occurs over the connection."
"After two minutes of idling, (FIN, ACK) is transmitted to the supplier SBC by SIP Proxy within approximately 10 to 20 seconds."

MARTIN STREULE
Spotlight

Does anyone have an update on this?

mrvoipstuff
Level 1

Was there ever a fix for this? I've got a CUBE Direct Routing setup with Teams and am experiencing a similar problem where transfers fail intermittently.

What is the exact log line?

This: %SIP-3-INTERNAL: Connection failed for addr=

Or that: %SIP-5-TLS_CONNECTION: Connection failed for addr=

 

Regarding this one, "SIP-5-TLS_CONNECTION":

I never experience any issues with these lines; they're just irritating.

Maybe Cisco did as well and changed the wording, as of IOS 17.9.2a, to:

%SIP-5-TLS_CONNECTION: Connection closed due to SOCKET_REMOTE_CLOSURE for addr=

 

1.) Note

https://learn.microsoft.com/en-us/microsoftteams/direct-routing-protocols#timeout-policies

"SIP proxy TCP idle timeout is two minutes. The timer resets when a SIP transaction occurs over the connection."
"After two minutes of idling, (FIN, ACK) is transmitted to the supplier SBC by SIP Proxy within approximately 10 to 20 seconds."
On the Cisco side of the house, you can see these remote closures increment using the following command:

sbccube1#show sip-ua connections tcp tls brief
Total active connections : 11
No. of send failures : 0
No. of remote closures : 0
No. of conn. failures : 0
No. of inactive conn. ageouts : 0
TLS client handshake failures : 0
TLS server handshake failures : 0

 

2.) You may check the following if you experience issues:

https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvz80171
DNS resolution to secondary does not happen if OOD SIP Options Ping configured using profile
 
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvx92872
SIP call fails egress dial-peer uses "session server-group" and "sip options-keepalive"
 
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwa81136
CUBE cannot create mid-call transaction based on the Record-Route header
 

 

It's %SIP-5-TLS_CONNECTION messages I get on the console:

008051: Jun 8 11:25:32.040: %SIP-5-TLS_CONNECTION: Connection failed for addr=52.114.16.74, port=13388, connId=2594
008052: Jun 8 11:25:32.460: %SIP-5-TLS_CONNECTION: Connection successful for addr=52.114.16.74, port=5061, connId=2603
008053: Jun 8 11:26:25.461: %SIP-5-TLS_CONNECTION: Connection failed for addr=52.114.76.76, port=64449, connId=2595
008054: Jun 8 11:26:32.135: %SIP-5-TLS_CONNECTION: Connection failed for addr=52.114.16.74, port=13384, connId=2596

 

My issue is a consultative call transfer that fails intermittently. I think it's most likely related to the third bug you mention:

https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwa81136
CUBE cannot create mid-call transaction based on the Record-Route header

Following the SIP REFER, MSFT sends the INVITE to CUBE, but it doesn't show up in the CUBE CCSIP trace until about 10 seconds later. We got the MSFT network team to confirm via PCAP, and it shows the INVITE sent once, as normal and on time. On our side, PCAPs on the SBC, firewall, and Nexus show several TCP retransmissions for that packet (TCP 5061). There is no delay/QoS/firewall misconfiguration on the path that we can see. The only thing that comes to mind is that MSFT tore down the TLS session on which the initial call came in, and when the mid-call event (transfer) occurs it somehow tries to use the previous TLS session, as only that would explain the constant TCP retransmissions. I've tried "no conn-reuse" in CUBE's SIP tenant for MSFT, but same result. My sip-ua still has conn-reuse configured, but I don't think it would get used; it should look at the one under the tenant.

From the current guide, Direct Routing for Microsoft Phone System with Cisco Unified Border Element (CUBE):

you should not use conn-reuse.

On each category (tenant, sip-ua, dial-peer), set it back to the default (default conn-reuse),

then disable it globally under voice service voip, as in the sketch below.

Try again and double-check the guide.
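
A sketch of that sequence (tenant tag 100 is a placeholder for the MSFT tenant; default it on any dial-peer where it was set, too):

voice class tenant 100
 default conn-reuse
!
sip-ua
 default conn-reuse
!
voice service voip
 sip
  no conn-reuse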

I have now disabled it, i.e. no conn-reuse at the tenant, sip-ua & voice service voip levels. I will have to make a few test calls, as it's intermittent (sometimes it would work for 100 transfers before failing). Will post here.

oeschger01
Level 1

As I understand it, we need to distinguish between two things: flapping dial-peer status and TLS remote closures.

Flapping dial-peer status
- Caused by bug CSCwd33038, to be fixed by an IOS update

TLS closures
- Caused by Microsoft's DNS load-balancing design.
- The URL sip.pstnhub.microsoft.com gets alternately and arbitrarily resolved to the IP addresses of different Microsoft servers (let's say x.x.x.x and y.y.y.y).
- A TLS session may therefore get established with server x.x.x.x, but subsequent SIP OPTIONS keepalive messages may get sent to server y.y.y.y (if the DNS response has changed in the meantime).
- If this is the case, the TLS session on server x.x.x.x no longer receives SIP OPTIONS, becomes idle, and gets closed by MS server x.x.x.x after about 2 minutes.
- The same happens with the connection to server y.y.y.y after a while (once the DNS response has changed back to x.x.x.x): the connection to server y.y.y.y becomes idle and gets disconnected.
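
One way to watch this from the CUBE side (a sketch; show hosts lists the router's cached DNS answers, and the counters command appeared earlier in this thread):

cube# show hosts
cube# show sip-ua connections tcp tls brief | include remote closures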

I opened a case with Microsoft, asking for their statement, but have not received any useful feedback yet.
I think it would be helpful if Cisco could put a respective note in their Direct Routing guide, explaining the cause and relevance of these error messages.

In general, to avoid DNS issues when using more than one name server, ensure the name servers are from the same provider.

E.g.:

- the Cisco name servers from the guide:
  ip name-server 208.67.222.222 208.67.220.220

- two internal name servers with the same relay

- two or three name servers from the PSTN provider

Also ensure the DNS query is performed from the correct WAN breakout so that geolocation works properly, as in the sketch below.
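
On IOS-XE, that breakout can be pinned with a DNS lookup source interface (a sketch; the interface name is a placeholder):

ip name-server 208.67.222.222 208.67.220.220
ip domain lookup source-interface GigabitEthernet1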

Thanks for the detail. My issue appears to be the TLS closures. But does it actually impact any mid-call function like transfer, hold, or resume in your environment? In my case I have intermittent failures of the consultative transfer function, and going through CCSIP it shows TLS teardown messages around the time it fails.

If your call flow is similar to this:

a-party incoming PSTN SIP > CUBE SIP > CUCM SIP > mobility / SNR > CUBE SIP > MSFT SIP > MS client b-party

b-party transfer > MSFT SIP > CUBE SIP > PSTN SIP > any c-party

resulting in audio issues or a drop after the A and C parties are connected, enable [X] MTP Required on the CUCM SIP trunk towards CUBE.

Thanks, but there is no CUCM in my call flow, so it's just ITSP => CUBE => Teams Phone System.

So far it seems to me that the TLS closures are mainly a cosmetic issue (though that is still to be confirmed).

I experienced some mid-call issues in the past, but got them solved by adding the two rules below to the tenant's outbound sip-profile 200 (not sure though whether that helps in your case).

voice class sip-profiles 200
 rule 420 request ANY sip-header Allow-Header modify ", REFER" ""
 rule 430 response ANY sip-header Allow-Header modify ", REFER" ""
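
As I read them, these rules strip REFER from the Allow header that CUBE sends towards the peer, so the far end never attempts a REFER-based transfer against CUBE and handles the transfer itself instead (my interpretation, not an official statement).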