Commentary
2/23/2011
07:09 PM
Charles Babcock

Watson's Jeopardy Win A Victory For Mankind

The IBM supercomputer not only beat the quiz show's most successful human contestants, but also showed what skilled researchers will be able to do as they tackle big problems in the future.

That means Watson, when he had a question in hand, could seek an answer by using up to 90,360 virtual machines simultaneously, not to mention accessing 15 TB of random access memory. Given that Jeopardy questions are sometimes nuanced, or imply a direction other than the one that yields a correct answer, Watson had his work cut out for him, but he had plenty of horsepower with which to cut the problem down to size.

For example, each question had to be decomposed into subject, verb, and object, with relationships determined among them. This front-end digestion was accomplished through IBM's Unstructured Information Management Architecture (UIMA), said Anjul Bhambhri, IBM's VP of product development for big data, in an interview.

This phase yielded text and keyword leads that were fed into Hadoop. As a distributed system, Hadoop could map the leads to processors, or more likely, dozens of virtual machines, each located close to the relevant data in the server cluster, and get results back quickly. IBM has built a system on top of Hadoop, InfoSphere BigInsights, for processing its many results down into a set of weighted answers.
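To make the map-and-combine idea concrete, here is a minimal sketch in plain Python. It does not use Hadoop or BigInsights, and it is not IBM's code. The `SHARDS` data, the keyword-overlap scoring, and the function names are all assumptions made up for illustration. Each worker scores the passages in its own shard (the map step), and the per-candidate scores are then summed and normalized into weights (the reduce step).

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data shards: each maps a candidate answer to a text passage.
SHARDS = [
    {"keratin": "hedgehog quills are hollow hairs made of keratin"},
    {"fur": "a hedgehog has fur on its belly",
     "keratin": "keratin forms hair, nails and quills"},
]

def map_shard(shard, keywords):
    """Map step: count keyword hits for every passage in one shard."""
    pairs = []
    for candidate, passage in shard.items():
        words = set(passage.replace(",", " ").split())
        hits = len(keywords & words)
        if hits:
            pairs.append((candidate, hits))
    return pairs

def reduce_scores(mapped):
    """Reduce step: sum hits per candidate and normalize into weights."""
    totals = defaultdict(int)
    for pairs in mapped:
        for candidate, hits in pairs:
            totals[candidate] += hits
    grand_total = sum(totals.values())
    if not grand_total:
        return {}
    return {candidate: hits / grand_total for candidate, hits in totals.items()}

def weighted_answers(keywords, shards=SHARDS):
    """Score each shard in its own worker, then combine the results."""
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(lambda shard: map_shard(shard, keywords), shards))
    return reduce_scores(mapped)
```

Calling `weighted_answers({"hedgehog", "quills", "made"})` on this toy data gives keratin the largest weight. A real cluster would move the map work to the machines holding each shard rather than to local threads.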

The most relevant results were then processed into a logical set of possible answers, using finer- and finer-grained algorithms. Three candidate answers were reached, with Watson rating each and picking the top one. It's a little eerie that when he came up with the wrong answer, Toronto as a U.S. city, he also took the unusual step of placing five question marks after it, as if severely second-guessing his own logic. It's also telling that, when asked what material the quills of a hedgehog are made of, he came up with keratin, porcupine_2, and fur as possible answers and chose keratin, the correct one.
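The final narrowing step can be pictured as a simple ranking, sketched below in Python. This is an illustration only, not Watson's method. The confidence numbers, the `threshold` parameter, and the function names are invented, and the idea of flagging a low-confidence pick is an assumption suggested by the question marks Watson attached to its Toronto answer.

```python
def top_candidates(scores, k=3):
    """Return the k highest-scoring (answer, confidence) pairs, best first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]

def choose_answer(scores, threshold=0.5):
    """Pick the top candidate and report whether it clears the threshold.

    Returns (best_answer, is_confident, ranked_top_three).
    """
    ranked = top_candidates(scores)
    if not ranked:
        return None, False, []
    best, confidence = ranked[0]
    return best, confidence >= threshold, ranked

# Made-up confidences for illustration only; not Watson's actual figures.
hedgehog = {"keratin": 0.7, "porcupine_2": 0.2, "fur": 0.08, "bone": 0.02}
best, is_confident, ranked = choose_answer(hedgehog)
```

With these invented scores, `best` is keratin and `ranked` holds keratin, porcupine_2, and fur in that order, with the fourth candidate dropped. A top score below `threshold` would still be returned, but with `is_confident` set to `False`.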

The whole process was called DeepQA, and it ended with one result finally being selected as what would prove, in most cases, to be Watson's human-beating answer.

Indeed, this DeepQA process was based on software whose major parts are now publicly available. Hadoop is an Apache open source project. UIMA, a framework for analysis of unstructured content, is also open source code, donated by IBM and available under an Apache license. Getting them to work together to yield a correct answer on Jeopardy, I suspect, is still a stretch for the uninitiated.

IBM has taken some of the things it's learned from data warehousing and analytics based on DB2 and applied them to BigInsights, Bhambhri said.

"I was not surprised at the outcome," said Bhambhri. "Our research had indicated that [the Watson team] had built something that could withstand the test of a Jeopardy contest."

In this sense, Watson didn't think like a human. He did something that a human couldn't do -- process terabytes of information across thousands of virtual machines in milliseconds, sorting results down to one answer. Humans, after hearing a question, often need a second or two to let old associations reactivate old memories and information, in this fashion often working their way to marvelously remote, correct answers. The process probably would look untidy next to a diagram of how Watson narrowed down his candidate results.

What I think Watson represents better than anything else is not a machine surpassing human intelligence so much as how humans will use computers to attack all sorts of problems, backed by the power to process masses of recently acquired, unstructured information. Big data, parallel processing, and the human mind are together engaged in a new era of data exploration.

It's not the human quiz show contestant who's in jeopardy. Rather, the target most likely to yield to this new power is the timeless problem that has resisted solution by hiding in an amount of data formerly too large to grasp.


Charles Babcock is an editor-at-large for InformationWeek.
