Capture audio stream for external processing? Amazon Lex integration

Quigath · 03-29-2018

I am researching whether it would be possible to use an external ASR service for all of the caller's spoken responses. I know it would require a connection written in Java; that's not a problem. I'm wondering if there's a way to capture the audio stream before it goes to Nuance for ASR processing.
Would this be technically possible?

Example

prompt1 says - "Hello, thanks for calling the ACME company. Would you like to make an order or check shipping status?"

caller says - "I want to check on my shipment"

The spoken phrase would then be sent off to Amazon Lex for natural language processing. Lex returns a CheckShipment intent which my callflow would branch on to find the next prompt.
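The branching step could be sketched roughly like this in Java. This is only an illustration: it assumes the intent name has already come back from Lex as a plain string, and `IntentRouter`, the prompt names, and the `MakeOrder` intent are hypothetical names made up for the example (only `CheckShipment` comes from the scenario above).

```java
import java.util.Map;

// Hypothetical router: maps an intent name returned by the NLU service
// to the name of the next prompt in the call flow.
final class IntentRouter {
    // Illustrative names only; "CheckShipment" is the one intent taken from the example.
    private static final Map<String, String> NEXT_PROMPT = Map.of(
            "CheckShipment", "shipping_status_prompt",
            "MakeOrder", "order_prompt");

    // Prompt to play when no known intent was recognized.
    static final String FALLBACK_PROMPT = "main_menu_reprompt";

    static String nextPrompt(String intentName) {
        // Map.of rejects null keys, so guard before the lookup.
        if (intentName == null) {
            return FALLBACK_PROMPT;
        }
        return NEXT_PROMPT.getOrDefault(intentName, FALLBACK_PROMPT);
    }
}
```

Anything Lex does not recognize falls back to a reprompt, so the caller is never left without a next step.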

david.macias · 03-30-2018

Ahh, you and a hundred of us have thought about this every time one of the big tech companies releases something that would be awesome for the Cisco contact center. So, it's possible, but it's not pretty. First, to do it correctly you would need some sort of MRCP connection back to Amazon, which is not possible. Second, you could record the caller's response to the prompt in CVP and make a REST call to Amazon to process it and return a result. However, this can be problematic because Amazon can't call back directly to CVP, so every request has to be answered within the same element.
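To make the "answered within the same element" point concrete, here is a rough Java sketch of a blocking call with a timeout. It is not CVP or Lex API code: `SynchronousRecognition`, the `recognizer` function, and the `NoMatch` value are all hypothetical stand-ins, and the actual REST call to Amazon (including request signing) is assumed to live behind `recognizer`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical helper: sends recorded audio to an external recognizer and
// waits for the answer, so the element can return a result in one pass.
final class SynchronousRecognition {
    // Returned when the service fails or does not answer in time.
    static final String NO_MATCH = "NoMatch";

    // recognizer stands in for the blocking REST call to the external service.
    static String recognize(byte[] recordedAudio,
                            Function<byte[], String> recognizer,
                            long timeoutMillis) {
        CompletableFuture<String> call =
                CompletableFuture.supplyAsync(() -> recognizer.apply(recordedAudio));
        try {
            // Block here: there is no later callback to rely on.
            return call.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Stop waiting; the background call itself may still run to completion.
            call.cancel(true);
            return NO_MATCH;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            call.cancel(true);
            return NO_MATCH;
        }
    }
}
```

The timeout matters because the caller is sitting in silence while this runs; a slow or failed request should fall through to a no-match path instead of hanging the call.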

While I don't know your use case beyond your example, I would think long and hard about introducing speech. I work for a company that does speech and loves putting speech on anything and everything, but doing it right is a fair bit of work. Why don't you proactively tell the customer the status of their shipment before they even ask? That way there is no selection for the customer to make.

david
