Pekka Murto // August 05 2020
“Hey voice user interfaces, are we there yet?”
“Wait a minute, go right, stop. Enhance 57 to 19. Track 45 left. Stop. Enhance 15 to 23. Give me a hard copy right there.”
Science fiction books and movies are full of visions of technology. In the cult classic science fiction movie Blade Runner cited above, there is a scene where Harrison Ford’s character, detective Rick Deckard, inserts a photo into some sort of device or computer to discover clues about a crime he is trying to solve. For the most part, the technology in the scene is outdated, crude or clumsy by today’s standards: the input and output photos are paper ones and the machine itself looks like a small, bulky television set with a poor screen.
However, the way Deckard uses the device – with a voice user interface – is something that still has considerable limitations today, and these limitations reduce the appeal and user experience of voice interfaces. So, the question is not so much about a lack of market penetration as about functionality. In this blog post, we discuss some of the ins and outs of voice user interfaces and share some of our own experiences of working with voice in the world of digital products and services.
Voice interfaces are common and widely used
Voice user interfaces have become increasingly common in recent years. In fact, the widespread diffusion of smartphones and smart speakers has made voice user interfaces, and voice assistants in particular, one of the most common applications of artificial intelligence that people use frequently. WebGlobalIndex reports that up to 46% of internet users have used a voice search tool within the last month1. Similarly, in an AI study by OMD, approximately half of respondents mention using voice assistants2, such as Apple Siri, Google Assistant or Amazon Alexa.
Voice user interface usage is particularly high in large language areas, such as English and Chinese. Likewise, large technology companies – Google, Apple, Amazon – dominate the market as voice assistant and voice interface providers. Even so, useful applications of voice interfaces exist for smaller language groups as well. For example, Finnish consumers can dictate their grocery shopping lists in Finnish with a surprisingly robust application provided by the retail giant S-Group, and the Finnish Broadcasting Company uses dictation in creating Finnish subtitles for its programs.
What are voice user interfaces best suited for?
When it comes to usability and use, voice user interfaces are good for relatively simple tasks and actions. Voice assistant use often revolves around information search and support (e.g. directions and weather), automation (e.g. launching apps and setting alarms) and entertainment (e.g. listening to music)3. Similarly, in the context of buying, voice interfaces are recommended for repeated, simple, pre-defined and/or cheap purchases. For anything more complex, the user is likely to resort to user interfaces where they can more easily grasp both the big picture and the nuances of what they are doing.
Especially for smaller languages (such as Finnish), limiting the context of voice commands and interfaces improves usability and the success of speech detection4. For a product or service provider, this means in practice tying voice commands to the products and services they sell and provide.
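To make the idea of a limited context a bit more concrete, here is a minimal Python sketch. It assumes that speech has already been turned into text by some recognizer, and it only shows the step after that: mapping a possibly imperfect transcript onto a small, pre-defined set of commands. The command phrases, the action names, the match_command function and the 0.6 similarity cutoff are all made up for illustration and do not describe any particular product.

```python
from difflib import get_close_matches

# Hypothetical, deliberately small command vocabulary for a
# grocery-list service. Phrases and action names are illustrative only.
COMMANDS = {
    "add milk": "ADD_ITEM:milk",
    "remove milk": "REMOVE_ITEM:milk",
    "show list": "SHOW_LIST",
    "clear list": "CLEAR_LIST",
}


def match_command(transcript, cutoff=0.6):
    """Map a transcript to a known action, or return None.

    The transcript is assumed to come from a separate speech
    recognizer; this function only does fuzzy text matching.
    """
    # Normalize case and whitespace before comparing.
    normalized = " ".join(transcript.lower().split())
    # Pick the single closest known phrase, if it is similar enough.
    matches = get_close_matches(normalized, list(COMMANDS), n=1, cutoff=cutoff)
    return COMMANDS[matches[0]] if matches else None


if __name__ == "__main__":
    print(match_command("Add milk"))
    print(match_command("ad milk"))
    print(match_command("what is the meaning of life"))
```

Because the vocabulary is small, a slightly misheard phrase such as "ad milk" can still be resolved to the intended command, while input that is clearly outside the service's context is rejected instead of guessed at.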
Voice interfaces are also important for reasons of accessibility as they enable technology use for the visually impaired. As such, voice interfaces are an excellent opportunity for service providers to build an accessible multi-channel user experience and increase the usability of their services. For example, in one of our client cases, a voice interface was considered to provide a more accessible, quick and complementary means to use the service when the user is not capable of using the service in other ways (due to disability, being in a hurry or being occupied with tasks of higher priority such as driving or cooking). Finally, voice can provide a means to add a human element to a digital service or brand and facilitate the creation of more personalized experiences.
When it comes to shortcomings, there are at least four issues that hold voice user interfaces back. First, voice user interfaces often lack user instructions and feedback: it may be difficult for users to learn what they can do with voice, and the speed, amount and potential lack of feedback may cause usability difficulties (studies have found this to be the case especially for elderly users)5. Second, users may find it awkward and embarrassing to use voice interfaces in public6. Third, users often need to adjust the way they speak to improve the functionality of voice user interfaces2. As we found in one of our client projects, such adjustments to speech caused “cognitive load” for users as they had to think too much about what to say. Fourth, voice user interfaces bring to the fore questions of privacy. Fully-fledged, voice-activated user interfaces must listen to their users to be functional, which raises concerns about data protection and usage2.
A good first step in voice user interface projects is to start small by building a low-fidelity proof of concept and testing it with customers and users. At Digitalist, we believe that such experimentation is key when dealing with emerging technologies that have a lot of wrinkles to iron out before they can transform from technological visions to reality.
Interested in how you can use experimentation and prototyping to define where your business should be headed? Feel free to give us a shout: https://digitalist.global/contacts/helsinki/