Tips and Tricks for Building Voice Assistants

Teneo is a Conversational AI platform well suited to both voice and text-based assistants. However, there are a few key differences to keep in mind when working with voice, and the experience can also differ somewhat depending on the developer’s abilities. This article explains the main differences between working with voice and working with text. We will go over various topics to keep in mind when building flows in Teneo Studio, along with general tips on writing answers that suit voice users. At the end of your journey you will be able to test your solution with either Teneo Web Chat or one of the available Channel connectors.

Differences between voice and text

Written and spoken language are very similar. However, there are a few key differences to keep in mind when working with speech. Speech tends to vary even more than text input because of different accents, regional dialects, tone of voice, timbre, volume, and more. The language and phrasing used in speech often differ from those used in text as well; speech might be less formal, for example. A voice assistant must also handle timing differently from a text-based bot. A text-based assistant can give an output and then wait any amount of time for the user to read and reply, whereas a voice assistant may need to repeat an output if the user did not hear it. To take all these factors into account, we have put together the following list of best practices for designing a voice assistant.

These tips are aimed mainly at voice-focused assistants. Please see Best Practices for more general tips on building bots in Teneo.

Flows

  • Avoid infinite loops
    • An infinite loop prevents the user from leaving the flow if they change their mind.
  • Add an option to list the use cases
    • One example is a flow that triggers on the input ‘What can you help me with?’. The reply should include only things the assistant can actually help the user with.
  • Suggest topics
    • A common practice when creating bots is to show buttons as suggested replies for the user. With voice, this might not be feasible for every bot, so it is valuable to give the user an option to hear the suggested topics instead.
  • Combine Match Requirements
    • People do not all speak the same way; they have different accents and dialects. This can lead to transcription mistakes, which may also vary depending on the chosen Speech-to-Text (STT) provider. Teneo includes several features that help smooth over these variations. Make sure to use Teneo’s Match Requirements for Intent Recognition, which combine our Linguistic Modelling Language with LUIS to capture the user’s intention.
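
To make the idea of tolerating transcription variation more concrete, here is a minimal Python sketch. It is not Teneo's Match Requirements, the Linguistic Modelling Language, or LUIS; it only illustrates, with Python's standard `difflib` module, how a slightly mis-transcribed input can still be mapped to the intended use case. The use-case names, example phrases, and the 0.75 threshold are all assumptions made up for this illustration.

```python
from difflib import SequenceMatcher

# Hypothetical use cases and example phrasings (not from a real Teneo solution).
USE_CASES = {
    "order_coffee": ["i want to order a coffee", "can i get a coffee"],
    "opening_hours": ["when are you open", "what are your opening hours"],
    "list_use_cases": ["what can you help me with", "what can you do"],
}


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def match_intent(transcript, threshold=0.75):
    """Return the closest use case, or None if nothing is similar enough.

    The threshold is an assumed value; a real solution would tune it.
    """
    heard = normalize(transcript)
    best_name, best_score = None, 0.0
    for name, phrases in USE_CASES.items():
        for phrase in phrases:
            score = SequenceMatcher(None, heard, phrase).ratio()
            if score > best_score:
                best_name, best_score = name, score
    return best_name if best_score >= threshold else None


if __name__ == "__main__":
    # A transcript with typical STT-style errors still finds the use case.
    print(match_intent("wat can you help me wif"))
    # An unrelated request matches nothing and can be routed to a fallback.
    print(match_intent("play some jazz music"))
```

Returning `None` for unmatched input, rather than guessing, leaves room for a fallback that lists the available use cases or suggests topics, in line with the tips above.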


Outputs

  • Confirm the request
    • If someone orders a cup of coffee, confirm in the reply that the request was understood by acknowledging it, e.g. ‘What type of coffee would you like to order?’. This is vital as users often do not have access to the transcript and this lets them know they are being understood.
  • Avoid long replies
    • Keep it short and to the point.
  • Option to repeat
    • Give the user time to think. If no reply is given after that time, repeat the same output in case the user did not hear everything.
  • Option to abort
    • If the wrong flow is triggered, allow the user to go back and ask the question again. This can be done with a single-word command, a script, or something else.
  • Repeat the topic
    • Repeat information when appropriate, keeping in mind that the user does not have access to the dialog transcript.
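
The repeat and abort tips above can be sketched as a small turn handler. This is a hedged Python illustration, not Teneo code: the function name `next_output`, the abort words, the cap of two repeats, and the reply wording are all assumptions. Silence (no input before a timeout) is modelled as `None`.

```python
# Hypothetical abort commands and repeat limit, chosen only for illustration.
ABORT_WORDS = {"cancel", "stop", "go back"}
MAX_REPEATS = 2


def next_output(prompt, user_input, repeats_so_far):
    """Decide what the assistant says next for a single prompt.

    user_input is None when the user stayed silent past the timeout.
    Returns (output_text, new_repeat_count, finished).
    """
    if user_input is None:
        if repeats_so_far < MAX_REPEATS:
            # Repeat the same output in case the user did not hear it.
            return prompt, repeats_so_far + 1, False
        # Stop repeating after the cap so the dialog cannot loop forever.
        return "I did not hear anything, so I will stop here.", repeats_so_far, True

    text = user_input.strip().lower()
    if text in ABORT_WORDS:
        # Let the user leave the flow and ask again.
        return "Okay, cancelled. What would you like to do instead?", 0, True

    # Confirm the request by repeating back what was understood.
    return f"Okay, {text}. I have noted that.", 0, True


if __name__ == "__main__":
    prompt = "What type of coffee would you like to order?"
    print(next_output(prompt, None, 0))       # silence: repeat the prompt
    print(next_output(prompt, "cancel", 1))   # abort command
    print(next_output(prompt, "latte", 0))    # normal answer, confirmed
```

Capping the number of repeats keeps the "option to repeat" from turning into the kind of infinite loop the Flows section warns against.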


Building a Voice Assistant

Please see the following pages for more information on how to get started with your voice assistant:

voice
