We are building an app that gets the audio stream (both live and recorded) from a VoIP call and passes it to a speech-to-text engine. The speech-to-text engine accepts input over a WebSocket. Is there any way that we can get the real-time audio/recorded audio...