Doug Henschen

What's At Stake In IBM's Jeopardy Challenge?

Advances in assisted Q&A could find their way into medicine, legal, tech support and compliance applications.

Serious Science

From a computing perspective, this grand challenge is very different from Deep Blue's chess matches. Watson must be able to decipher English-language clues it has never seen before. It must quickly search for possible answers in a fixed knowledge store and then apply myriad analytics to determine its certainty in an answer -- posed, of course, in the form of a question.

Language is often ambiguous and highly contextual, open to countless meanings and full of idioms, slang and regional dialects. The domains of knowledge are completely unpredictable, with topics and questions known only to the producers of Jeopardy. The clues might be about history, literature, geography, politics, arts, science, or pop culture.

Watson can't just look things up on the Internet. That would be unfair to the humans. Watson is a stand-alone computer. Its memory is a fixed, 15-terabyte store containing the equivalent of 200 million pages or one million books on diverse topics (okay, maybe that's not so fair).

Rather than using a database, Watson relies on a content store based on the Unstructured Information Management Architecture (UIMA), which IBM developed and has since contributed to the open source community. Watson "learned" all the information it stores in advance, meaning the content passed through an analysis/training stage in which it was marked up with metadata tags denoting entities such as people, places, things, dates and concepts.
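
To make the idea of marked-up content concrete, here is a minimal sketch in Python of tagging entities in a passage by character offsets. It is an illustration only, not UIMA's API and not Watson's actual pipeline: the `Annotation` class, the `GAZETTEER` lookup table and the `annotate` function are all hypothetical names, and a real system would use far more sophisticated analysis than a dictionary lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A tag over a span of text, stored as offsets rather than inline markup."""
    begin: int
    end: int
    type: str


# Hypothetical toy dictionary mapping known phrases to entity types.
GAZETTEER = {
    "Garry Kasparov": "Person",
    "Deep Blue": "Thing",
    "New York": "Place",
    "1997": "Date",
}


def annotate(text):
    """Return annotations for every gazetteer phrase found in the text."""
    annotations = []
    for phrase, entity_type in GAZETTEER.items():
        start = text.find(phrase)
        while start != -1:
            annotations.append(Annotation(start, start + len(phrase), entity_type))
            start = text.find(phrase, start + 1)
    return sorted(annotations, key=lambda a: a.begin)


def covered_text(text, annotation):
    """Recover the text a given annotation points at."""
    return text[annotation.begin:annotation.end]


doc = "Garry Kasparov played Deep Blue in New York in 1997."
index = annotate(doc)
for ann in index:
    print(ann.type, "->", covered_text(doc, ann))
```

Because the tags live alongside the text instead of inside it, the original content stays untouched and later analysis steps can add their own tag types without interfering with earlier ones.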

Slideshow: Inside Watson, IBM's Jeopardy Computer

Search technology retrieves contextually appropriate information quickly, but the real secret sauce behind Watson is its battery of analytic scoring algorithms. That's what IBM Research spent years refining, and it's the key to a computer's ability to decipher the language used in the clue and score its confidence in having the right answer.

"The hard part for the computer is finding and justifying the correct answer," said Dr. David Ferrucci, the "Principal Investigator" behind the Watson project. "For each of thousands of plausible answers, Watson gathers evidence and uses thousands of algorithms to understand what's most likely to be the right answer."
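
The general shape of what Ferrucci describes, gathering evidence for each candidate answer and combining many scores into a single confidence, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not IBM's algorithms: the two scorers, the `WEIGHTS` and `BIAS` values, the `BUZZ_THRESHOLD` and the sample passages are all invented for the example, and in a real system the weights would be learned from training data rather than chosen by hand.

```python
import math


def keyword_overlap(clue, passage):
    """Fraction of the clue's words that also appear in a passage."""
    clue_words = set(clue.lower().split())
    passage_words = set(passage.lower().split())
    return len(clue_words & passage_words) / len(clue_words) if clue_words else 0.0


def best_overlap(clue, passages):
    """Evidence scorer 1: the strongest single supporting passage."""
    return max((keyword_overlap(clue, p) for p in passages), default=0.0)


def support_count(passages, cap=3):
    """Evidence scorer 2: how many passages back the candidate, capped at 1.0."""
    return min(len(passages), cap) / cap


# Hypothetical hand-picked weights; assumed values for illustration only.
WEIGHTS = {"overlap": 4.0, "support": 2.0}
BIAS = -3.0
BUZZ_THRESHOLD = 0.5


def confidence(clue, passages):
    """Combine the scorer outputs into one value between 0 and 1."""
    z = (BIAS
         + WEIGHTS["overlap"] * best_overlap(clue, passages)
         + WEIGHTS["support"] * support_count(passages))
    return 1.0 / (1.0 + math.exp(-z))


def rank(clue, candidates):
    """Return (answer, confidence) pairs, most confident first."""
    scored = [(answer, confidence(clue, passages))
              for answer, passages in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


clue = "This computer defeated a world chess champion"
candidates = {
    "Deep Blue": ["Deep Blue defeated the world chess champion",
                  "Deep Blue was a chess computer"],
    "Watson": ["Watson is a computer that plays Jeopardy"],
}
ranked = rank(clue, candidates)
top_answer, top_confidence = ranked[0]
should_buzz = top_confidence >= BUZZ_THRESHOLD
```

The point of the final threshold is that a player should only buzz in when the best candidate's confidence is high enough to justify the risk of a wrong answer.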

Developing the knowledge set and analytic functionality was the major hurdle, but when that technology runs on a single processor, it takes as long as two hours to answer a single question with confidence. The next step was putting it all on steroids with a massively parallel processing (MPP) architecture.

Watson is essentially a workload-optimized system purpose-built to play Jeopardy. It runs on 10 racks of IBM Power 750 servers -- standard commercial hardware that's readily available. The total machine operates at 80 teraflops, or about 80 trillion operations per second. With that much processing power under the hood, Watson's response time went from two hours on a single node down to less than three seconds -- the threshold required to match wits with Jeopardy champions.
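
The figures above imply the size of the speedup: two hours is 7,200 seconds, so getting under three seconds means running roughly 2,400 times faster. The Python sketch below works through that arithmetic and shows the basic fan-out pattern that makes it possible, since each candidate answer can be scored independently of the others. It is a sketch under stated assumptions: `score_candidate` is a hypothetical stand-in for real evidence scoring, and a thread pool on one machine only illustrates the pattern, not the performance of a multi-rack MPP cluster.

```python
from concurrent.futures import ThreadPoolExecutor

SINGLE_NODE_SECONDS = 2 * 60 * 60  # two hours, per the figures in the article
TARGET_SECONDS = 3                 # response time needed to compete


def required_speedup(serial_seconds, target_seconds):
    """How many times faster the parallel system must be than one node."""
    return serial_seconds / target_seconds


def score_candidate(candidate):
    """Hypothetical stand-in for expensive, independent evidence scoring."""
    return candidate, sum(ord(ch) for ch in candidate) % 100


def score_all(candidates, workers=4):
    """Fan the candidates out to a pool of workers and collect the scores."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(score_candidate, candidates))


speedup = required_speedup(SINGLE_NODE_SECONDS, TARGET_SECONDS)
print(f"Required speedup: {speedup:.0f}x")

candidates = ["Deep Blue", "Watson", "Power 750"]
parallel_scores = score_all(candidates)
serial_scores = dict(score_candidate(c) for c in candidates)
```

Because no candidate's score depends on another's, the parallel and serial runs produce the same results; only the elapsed time changes as more workers are added.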
