Commentary

Alexander Wolfe
 

Is 'Voice-Over-Google' The Next Search Paradigm?

Everybody's wondering what'll be the next big thing in search-engine technology. From the looks of a patent awarded to Google, it could be speech-driven searches.

The patent in question, number 7,027,987, was awarded to Google on April 11, 2006, for a Voice Interface For A Search Engine. It's described in the patent as "a system [which] provides search results from a voice search query."

If it seems at first glance that such a patent simply takes an existing idea -- speech recognition -- and applies it to another existing technology -- search -- Google has a rejoinder. According to its patent application:


"Current speech recognition technology has high word error rates for large vocabulary sizes. … Current voice interfaces to search engines address the problems by limiting the scope of the voice queries to a very narrow range. At every turn, the user is prompted to select from a small number of choices. For example, at the initial menu, the user might be able to choose from "news," "stocks," "weather," or "sports." After the user chooses one category, the system offers another small set of choices. By limiting the number of possible utterances at every turn, the difficulty of the speech recognition task is reduced to a level where high accuracy can be achieved.

This approach results in an interactive voice system that has a number of severe deficiencies. It is slow to use, since the user must navigate through many levels of voice menus. If the user's information need does not match a predefined category, then it becomes very difficult or impossible to find the information desired. Moreover, it is often frustrating to use, since the user must adapt his/her interactions to the rigid, mechanical structure of the system.

Therefore, there exists a need for a voice interface that is effective for search engines."

Interestingly, Google appears to have already conducted some small-scale testing of this approach, in the form of its Google Voice Search Demo. The technique promises to let users "search on Google by voice with a simple telephone call." Unfortunately, the demo is currently unavailable. The status of the project is unclear; Google suggests users check back at some unspecified future time. There are some message threads on the technology, but all show little activity, suggesting not much is going on right now.

Another effort of note is apparently ongoing at Microsoft Research, where a project dubbed Speech Technology (Asia) is looking into voice-driven search for Chinese-speaking users. This is a particularly apt area for speech recognition, given the logographic complexity of the Chinese character set.

Way off the beaten path, a search engine called Midomi purports to let you search for music files by humming or singing part of the song into your computer. (I'll try it after I close the door to my office.)

Obviously, efforts such as those of Google and Microsoft are flying below the radar (after all, the patent I'm so excited about above was awarded nearly a year ago). But the appeal of voice-driven search is intuitively obvious and merits, if nothing else, a big shout-out, at least from this blogger.

