Typing on a mobile phone is not fun. It works, but it's not very fast or accurate, particularly if you're thumb-typing or on the move.
Talking works better. Phones were designed for talking, after all. But many of us still don't feel comfortable addressing our phones directly. A 2014 Northstar Research study commissioned by Google surveyed 1,400 Americans and found that "only one-quarter of adults speak to their phones when in the company of others." It's as if we're ashamed to be caught talking with our imaginary phone friend.
Among teens, there's less stigma. Fifty-seven percent of them will query their phones when among other people. So, expect voice interaction to become more common as the population ages and people become accustomed to the idea.
Speech recognition is approaching its idealized depiction in science fiction. In 2011, Microsoft researchers considered an error rate of 18.5% "astonishing." At Google I/O last month, Sundar Pichai, SVP of product at Google, said the speech recognition error rate for Google's software had reached 8%, down from 23% in 2013.
The technology is certainly usable today, even if we tend to use it for specific types of queries, such as initiating calls and asking for directions. For example, only 9% of adults use voice search to check movie times, according to the Northstar Research study.
Ongoing technological advancements suggest voice interaction with software-based assistants will become even easier and more useful as companies create code that not only recognizes speech, but can also interpret complex questions.
SoundHound recently invited beta testers to try Hound, an app currently available on Android and coming soon to iOS that attempts to turn speech recognition into meaning recognition. As with the Google app and Siri, Hound can recognize when a query refers to a previous query, as when the word "there" in the query, "Is there a seafood restaurant near there?" refers to a prior question about a location.
Hound appears to be particularly adept at figuring out how to interpret queries with multiple parameters, like this one: "Show me four- or five-star hotels in Seattle for three nights starting on Friday between $150 and $200 a night."
Or this one: "What is the mortgage on a $600,000 home using an interest rate of 4% over a 30-year period, with a down payment of $120,000?"
Google can recognize these words and point you to a list of mortgage sites, but Hound will present detailed mortgage calculation figures as an answer. For the kinds of data and queries that Hound understands, Hound offers something approaching voice-based programming -- you're supplying data to a known function.
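To make the "known function" idea concrete, here is a minimal Python sketch of the kind of calculation the mortgage query maps onto. It is not Hound's implementation, which SoundHound has not described here. The function name `monthly_payment` and its parameters are hypothetical, and the sketch assumes a fixed-rate loan with monthly compounding, covering principal and interest only (no taxes, insurance, or fees).

```python
def monthly_payment(home_price, down_payment, annual_rate_pct, years):
    """Monthly principal-and-interest payment on a fixed-rate loan.

    Hypothetical helper for illustration; assumes monthly compounding
    and ignores taxes, insurance, and fees.
    """
    principal = home_price - down_payment   # amount actually borrowed
    n = years * 12                          # total number of monthly payments
    r = annual_rate_pct / 100 / 12          # monthly interest rate as a fraction
    if r == 0:
        return principal / n                # zero-interest edge case
    # Standard amortization formula: P * r / (1 - (1 + r)^-n)
    return principal * r / (1 - (1 + r) ** -n)


# The spoken query supplies the four arguments:
# a $600,000 home, $120,000 down, 4% interest, 30 years.
payment = monthly_payment(600_000, 120_000, 4, 30)
print(f"Monthly principal and interest: ${payment:,.2f}")
```

Seen this way, the hard part of the voice query is not the arithmetic but extracting the four values from a spoken sentence and matching each one to the right parameter.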
In the years to come, with speech recognition nearly a solved problem, the focus will be on improving the interpretation of meaning. And, once your devices understand your intention as well as your words, the possibilities of voice-based interaction become much more interesting.
You don't have to wait that long. Here are 10 voice apps worth trying out now.