How to capture utterance audio (using Nuance), save it, and check whether it is longer than 5 seconds per call

smeegada1 · 09-27-2018

How can we capture utterance audio (using Nuance), save it, check whether it is longer than 5 seconds, and make a web service request? We can't use the Record element, since it works like voicemail and has no ability to integrate with Nuance ASR/TTS.

Pat Legate · 09-28-2018

Just got this one figured out (with a lot of help) as of about a week ago. Were you waiting on me? This is not a simple process; it will take some custom code.

First of all, Nuance will create wav files for all utterances. On our Nuance server the location looks like this:

c:\ProgramData\Nuance\Enterprise\Nuance\callLogs\MyApp\2018\09September\27\15 (where 27 is the day and 15 is the hour)
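As a rough sketch of how that folder could be located for a given timestamp (Python here just for illustration): the root path and the year / month / day / hour layout are copied from the example above, while the function name and the zero-padding of day and hour are my assumptions.

```python
from datetime import datetime
from pathlib import PureWindowsPath

# Root copied from the example path above; it may differ on your install.
CALL_LOG_ROOT = PureWindowsPath(r"c:\ProgramData\Nuance\Enterprise\Nuance\callLogs")

# Spelled out explicitly so the result does not depend on the machine's locale.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def call_log_dir(app_name: str, when: datetime) -> PureWindowsPath:
    """Hypothetical helper: the hourly call-log folder for one application."""
    month_folder = f"{when.month:02d}{MONTHS[when.month - 1]}"  # e.g. 09September
    return (CALL_LOG_ROOT / app_name / str(when.year) / month_folder
            / f"{when.day:02d}"     # assumed zero-padded
            / f"{when.hour:02d}")   # assumed zero-padded, 24-hour clock
```

The timestamp should be in the Nuance server's local time, since that is what the folder names appear to follow.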

The files will have unique names like:

NUAN-09-41-NUANCESERVER-FLKNDJAFAAAGFEJKAAAAAAAA-LOG
NUAN-09-41-NUANCESERVER-FLKNDJAFAAAGFEJKAAAAAAAA-utt001-POSTEP.wav
NUAN-09-41-NUANCESERVER-FLKNDJAFAAAGFEJKAAAAAAAA-utt002-POSTEP.wav
NUAN-09-41-NUANCESERVER-FLKNDJAFAAAGFEJKAAAAAAAA-utt003-POSTEP.wav

To match the utterance to a particular call/element you will need to examine the value returned in nbestInterpretation1 of the form of interest. It will return something like:

+SWI_meaning:{affirmation:Yes}+SWI_literal:Accept+affirmation:Yes+SWI_grammarName:session:field979@field.grammar

Parse out the fieldxxx (field979 in this case) and go look for that value in the -LOG file.
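A minimal sketch of that parsing step, assuming the value always carries the SWI_grammarName:session:fieldNNN@ pattern shown above (the function name is made up for illustration):

```python
import re
from typing import Optional

# Matches the "fieldNNN" token between "session:" and "@" in SWI_grammarName.
FIELD_RE = re.compile(r"SWI_grammarName:session:(field\d+)@")


def extract_field_id(nbest_interpretation: str) -> Optional[str]:
    """Return e.g. 'field979', or None if the pattern is not present."""
    match = FIELD_RE.search(nbest_interpretation)
    return match.group(1) if match else None
```

Returning None rather than raising lets the caller skip elements that produced no recognition result.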

In that log file you will find a line with the wav file name along with a bunch of other stuff.

WVNM=NUAN-09-41-NUANCESERVER-FLKNDJAFAAAGFEJKAAAAAAAA-utt001-POSTEP.wav

In that same line there is a field like DURS=1117, which I think denotes the duration in milliseconds (1.117 seconds here). Play around with it.
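Here is a hedged sketch of pulling WVNM and DURS out of the log text and applying the 5 second check. It assumes DURS is in milliseconds (as guessed above) and that a value ends at whitespace or a "|" character; the exact line layout is not shown in this thread, so verify against a real -LOG file. Matching each line back to a fieldxxx is left out for the same reason.

```python
import re
from typing import List, Tuple

# Assumption: a value runs until whitespace or a '|' separator.
WVNM_RE = re.compile(r"WVNM=([^|\s]+)")
DURS_RE = re.compile(r"DURS=(\d+)")


def long_utterances(log_text: str, min_ms: int = 5000) -> List[Tuple[str, int]]:
    """Return (wav file name, duration in ms) for utterances longer than min_ms."""
    found = []
    for line in log_text.splitlines():
        wav = WVNM_RE.search(line)
        durs = DURS_RE.search(line)
        if wav is None or durs is None:
            continue  # not a line that describes a recorded utterance
        duration_ms = int(durs.group(1))
        if duration_ms > min_ms:
            found.append((wav.group(1), duration_ms))
    return found
```

Anything returned here is a candidate for the web service request.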

A couple of things to note.

The -LOG file is written when the license is released.

The license is released when the call terminates.

A call transferred to an agent is not terminated.

To release the license as explained to me: "In your ICM script after the CVP application, have a label node going to “NOLABEL”, with the redirect box checked and the “failure” path off of that node continuing the rest of your call flow."

Notice that multiple utterances may be represented in a single log file.

You can process this at the end of the call in the On Call End class, but your process will need to sleep for about 10 seconds.
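Instead of one fixed sleep, the wait could also be a short polling loop with a timeout. This is only a sketch of the idea in Python; the function name and the timeout values are my own, and the same loop would have to be ported to whatever language your On Call End class is written in.

```python
import time
from pathlib import Path


def wait_for_file(path: Path, timeout_s: float = 15.0, poll_s: float = 0.5) -> bool:
    """Poll until the file exists and is non-empty, or until the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if path.is_file() and path.stat().st_size > 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A non-empty file is not proof that the write has finished, so a short extra pause before reading may still be needed.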

If you have multiple elements, then you will have to keep track of multiple fieldxxx values.

If you figure out a way to release the license upon exiting an element or after each utterance let me know.

Pat

smeegada1 · 09-28-2018

I really appreciate all your help. I also started looking at the Nuance documents and noticed that audio files are created for each FORM/MENU. In the documentation I did see a way to capture all utterances in a single audio file (by implementing session.xml files), which I have not tested yet.

The problem for us is that we have more than 5 Nuance servers (load balanced), so we don't know which server the active call is on. We would have to iterate over all servers, pull the correct/active session wav file, and RTP the user utterances to the web service. All of this has to happen while the call is active in the IVR, but if we disconnect the port with NOLABEL in the middle, the call will route to agents since the connection to MRCP is lost, which we don't want.

Your info is very helpful and gave me an idea to verify a few things. I have completed the first 2 points and have to look at the logs you mentioned. If you come across anything, please let me know.

Pat Legate · 09-28-2018

Yes, the "which server" problem. We have only two servers and I have a good
idea which one to look on. If all servers have their clocks synced (and I
suppose are in the same time zone), you will notice the Nuance file name begins
with NUAN-09-41, which indicates the minute and second when the element
begins the speech request. That may help narrow down which -LOG file to
look in.
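A small sketch of that narrowing step, assuming you already have a directory listing from each server and know roughly when the element started. The function name and the slack_s tolerance for clock skew are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def candidate_logs(names: Iterable[str], started: datetime, slack_s: int = 2) -> List[str]:
    """Keep -LOG files whose NUAN-MM-SS prefix is within slack_s seconds of 'started'."""
    prefixes = {
        (started + timedelta(seconds=offset)).strftime("NUAN-%M-%S-")
        for offset in range(-slack_s, slack_s + 1)
    }
    return [
        name for name in names
        if name.endswith("-LOG") and any(name.startswith(p) for p in prefixes)
    ]
```

Note that a request starting right at the top of the hour may land in the neighbouring hour folder, so both folders might need to be listed.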

Also ProgramData is a hidden folder on our system if you can't seem to
find it.

Pat