IBM Watson Cloud Gains Eyes, Ears, And A Voice

IBM's Watson Developer Cloud adds speech-to-text, text-to-speech, visual recognition, and decision services. Will businesses build their own Jeopardy apps?

Doug Henschen, Executive Editor, Enterprise Apps

February 6, 2015

IBM Watson tantalized the world when it beat two grandmaster champions at the game of Jeopardy in 2011, but the commercial applications spun off from the technology since then have lacked the same anthropomorphic sex appeal. On Thursday, IBM announced new Watson Developer Cloud services that promise more of Jeopardy Watson's human-like power to hear, speak, see, and make decisions.

The Watson Developer Cloud already offered eight services that could be described as human-like or even superhuman, such as the ability to identify the language of written input; the ability to answer written questions, drawing on deep knowledge repositories; and the ability to learn user preferences. With the five new services, IBM said in a statement, it's "allowing people from diverse industries and disciplines to easily tap into the power of cognitive computing."

The five new services include:

Speech-to-Text. This "low-latency" service converts speech into text to power voice-controlled mobile applications, transcription services, and, along with other services, speech-to-speech translation. Transcriptions are sent back to the client and retroactively corrected as the system receives more speech input and context. The more speech it hears, the more the system learns.

Text-to-Speech. Who wants to read responses? Most of the delight in using services such as Siri, Cortana, and Amazon Alexa is having a "conversation" with a computer. Watson's new Text-to-Speech service lets developers choose from among three English and Spanish voices, including the American voice used by Watson in the 2011 Jeopardy match.

Visual Recognition. This service analyzes images or video frames and interprets what's happening in the scene. The Visual Recognition service includes prebuilt classifiers, more than 2,000 trained labels, and taxonomies for different domains. A sports taxonomy, for example, recognizes more than 150 sports and can tag images or footage with a confidence level indicating whether they show, say, soccer or baseball. Use-cases include organizing large collections of imagery and understanding consumers' shopping preferences based on the images they're viewing.

Concept Insights. When it comes to search, keywords are limited. Concept Insights lets users provide documents, then searches for related documents based on a graph of concepts established in Wikipedia. The service provides explicit links to content that directly mentions related concepts, as well as implicit links to content that may be relevant even though it doesn't directly mention the concepts in the user's document. Use-cases include improving search queries and locating expertise in large organizations.

Tradeoff Analytics. This service uses Pareto filtering techniques to weigh multiple decision alternatives against multiple, possibly conflicting criteria. Tradeoff Analytics identifies the best-possible choices considering decision goals and the benefits and drawbacks of the various alternatives. Use-cases include enabling retailers or manufacturers to determine their product mix. A consumer-facing service could help shoppers compare products or services. IBM has used this service in Watson applications to help physicians select optimal treatment options.
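For readers unfamiliar with the term, Pareto filtering means discarding any option that another option beats or ties on every criterion and strictly beats on at least one. What remains is the set of genuine tradeoffs. The Python sketch below illustrates the general idea only; it is not IBM's implementation or the Tradeoff Analytics API, and the product names, prices, ratings, and function names are all invented for illustration.

```python
from typing import Dict, List

# Hypothetical product options; every value here is made up for illustration.
OPTIONS: Dict[str, Dict[str, float]] = {
    "A": {"price": 300, "rating": 4.5},
    "B": {"price": 250, "rating": 4.0},
    "C": {"price": 320, "rating": 4.2},  # costs more than A and is rated lower
    "D": {"price": 250, "rating": 3.5},  # same price as B but rated lower
}


def dominates(a: Dict[str, float], b: Dict[str, float],
              minimize: List[str], maximize: List[str]) -> bool:
    """Return True if option a is no worse than b on every criterion
    and strictly better on at least one."""
    no_worse = (all(a[k] <= b[k] for k in minimize)
                and all(a[k] >= b[k] for k in maximize))
    strictly_better = (any(a[k] < b[k] for k in minimize)
                       or any(a[k] > b[k] for k in maximize))
    return no_worse and strictly_better


def pareto_front(options: Dict[str, Dict[str, float]],
                 minimize: List[str], maximize: List[str]) -> List[str]:
    """Keep only the options that no other option dominates."""
    return [
        name for name, crit in options.items()
        if not any(dominates(other, crit, minimize, maximize)
                   for other_name, other in options.items()
                   if other_name != name)
    ]


if __name__ == "__main__":
    # A and B survive: A is better rated, B is cheaper, so neither beats the other.
    print(pareto_front(OPTIONS, minimize=["price"], maximize=["rating"]))
```

In this toy example, the filter leaves two candidates and says nothing about which one to pick; choosing between the survivors still depends on how the decision maker weighs price against rating.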

These five new services are available immediately, along with eight other services, on the Watson Developer Cloud. Since its launch in October 2013, the Watson Developer Cloud has attracted more than 5,000 partners that have built some 6,000 apps to date, according to IBM.

About the Author

Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of Transform Magazine, and Executive Editor at DM News. He has covered IT and data-driven marketing for more than 15 years.
