Speech Recognition's Next Iteration

"Plus delve arrest her!" That's how a call center equipped with speech-recognition technology might interpret a customer's request to "please deliver a red sweater." Such systems can understand only precise, clear syntax that bears little resemblance to the way most people speak.

IBM Research is trying to change that. Later this year, it plans to launch the Super Human Speech Recognition Project, which aims to solve common speech-recognition problems and deliver systems capable of not just linguistic comprehension but contextual understanding.


Among the numerous approaches the company is taking is software that uses a language model to predict which words are most likely to follow other words. IBM Research is also using an acoustic model, in which software predicts all the ways a particular word might sound given various pronunciations, cadences, or background interference.
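
To make the language-model idea concrete, here is a minimal sketch of a bigram model, one simple way to estimate which words tend to follow other words. It is not IBM's implementation: the three-sentence corpus, the function names, and the add-one smoothing are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict
import math

# Tiny made-up corpus, for illustration only (an assumption, not real data).
CORPUS = [
    "please deliver a red sweater",
    "please deliver a blue sweater",
    "please send a red scarf",
]


def train_bigrams(sentences):
    """Count how often each word follows another word."""
    follows = defaultdict(Counter)
    vocab = set()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()  # "<s>" marks the sentence start
        vocab.update(words)
        for prev, cur in zip(words, words[1:]):
            follows[prev][cur] += 1
    return follows, vocab


def log_prob(sentence, follows, vocab):
    """Add-one smoothed log-probability of a word sequence."""
    words = ["<s>"] + sentence.split()
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        count = follows[prev][cur]
        denom = sum(follows[prev].values()) + len(vocab)
        total += math.log((count + 1) / denom)
    return total


def most_likely_next(word, follows):
    """Return the word seen most often after `word`, or None if unseen."""
    if not follows[word]:
        return None
    return follows[word].most_common(1)[0][0]


follows, vocab = train_bigrams(CORPUS)
likely = log_prob("please deliver a red sweater", follows, vocab)
garbled = log_prob("plus delve arrest her", follows, vocab)
```

With this toy corpus, the model scores "please deliver a red sweater" higher than "plus delve arrest her," which is how a recognizer could favor the more plausible of two similar-sounding transcriptions.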

David Nahamoo, a department manager at IBM Research in Yorktown Heights, N.Y., says commercial software applications available today, including IBM's ViaVoice, are starting to incorporate these techniques. As a result, companies in a number of industries are beginning to use voice automation to deal with routine inquiries.

The real challenge, Nahamoo says, is building systems that can understand multifaceted conversations or respond to open-ended questions. To that end, IBM is working on an approach called domain-specific interpretation. Systems designed for use in, say, a travel agency would be programmed to minimize the relevance of conversational elements not related to travel to generate the best response.
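
The article gives no detail on how domain-specific interpretation is built, so the following is only a rough sketch of the general idea: words outside a domain vocabulary get a lower weight before a response is chosen. The travel vocabulary, the function names, and the 0.1 weight are hypothetical.

```python
# Hypothetical travel vocabulary; a real system would use a far larger one.
TRAVEL_TERMS = {"flight", "hotel", "ticket", "book", "boston", "friday"}


def weight_words(utterance, domain_terms, off_domain_weight=0.1):
    """Give in-domain words full weight and everything else a small one."""
    weights = {}
    for raw in utterance.lower().split():
        word = raw.strip(".,!?")
        if not word:
            continue
        weights[word] = 1.0 if word in domain_terms else off_domain_weight
    return weights


def key_terms(utterance, domain_terms):
    """Return the words the system would focus on, in spoken order."""
    weights = weight_words(utterance, domain_terms)
    return [word for word, weight in weights.items() if weight == 1.0]


request = "Well, the weather is awful, so book me a flight to Boston on Friday."
weights = weight_words(request, TRAVEL_TERMS)
focus = key_terms(request, TRAVEL_TERMS)
```

Here the remark about the weather is kept but down-weighted, while "book," "flight," "boston," and "friday" carry the request.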

But don't expect a talking machine to help you do a better job choosing a Christmas present for Uncle Bob anytime soon. Says Nahamoo, "That's getting into the realm of artificial intelligence, and I don't have a crystal ball for that."

