The IBM supercomputer not only beat the quiz show's most successful human contestants, but also showed what skilled researchers will be able to do as they tackle big problems in the future.
Watson, the quiz-show-winning computer, doesn't think like his namesake, IBM founder Thomas J. Watson, or like his two vanquished Jeopardy competitors.
Rather, Watson sorts and separates relevant information by parallel processing. It is a most un-human proceeding, unless you, too, know how to address a question using 2,880 brains.
Since his victory, Watson's been described as a computer that thinks like a human. But I think Watson's middle name should be "Hadoop," the underlying piece of cloud-based open source code that enabled Watson to frequently come up with answers to complex, nuanced questions in 10 milliseconds, or one hundredth of a second. Brad Rutter and Ken Jennings barely had time to put a finger to the forehead before Watson had sounded the buzzer.
That's because Watson had about 200 million pages of information loaded into 15 TB of random access memory, where it could be read at memory speed rather than disk speed. Hadoop normally draws information from distributed server disks in 64-MB chunks. Watson drew it out of RAM in 500-GB gulps. It could then feed that data to nearby CPUs, either in the same server or at least in a server in the same rack, avoiding backbone traffic.
Hadoop is a powerful finder, sorter, and organizer of masses of information because it functions both as a distributed file system and as a mapper of known data to the nearest processor in the cluster. Behind its non-emotive face, Watson was just an ordinary server cluster, occupying a 10-rack unit that looked, one observer said, like a stack of library shelves. Actually, I thought it looked more like an Adirondack mountain hut with no door.
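The "nearest processor" idea can be sketched in a few lines. This is only an illustration of locality-aware placement, not Hadoop's actual scheduler or anything taken from Watson; the `Node` class, the cost values, and the server and rack names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Node:
    """A server in the cluster (hypothetical model, not a Hadoop class)."""
    name: str
    rack: str


def locality_cost(data_node: Node, worker: Node) -> int:
    """0 = same server, 1 = same rack, 2 = another rack (crosses the backbone)."""
    if data_node.name == worker.name:
        return 0
    if data_node.rack == worker.rack:
        return 1
    return 2


def pick_worker(data_node: Node, idle_workers: Sequence[Node]) -> Node:
    """Choose the idle worker closest to the server that holds the data."""
    return min(idle_workers, key=lambda w: locality_cost(data_node, w))


# The data lives on server-07 in rack-1; two other servers are idle.
holder = Node("server-07", "rack-1")
idle = [Node("server-42", "rack-5"), Node("server-09", "rack-1")]
chosen = pick_worker(holder, idle)
print(chosen.name)  # server-09: same rack, so no backbone traffic
```

The work goes to the data, not the other way around: a same-rack server wins over one that would have to pull the data across the backbone.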
Inside the cluster were 90 IBM Power 750 servers, each with four CPUs; each CPU contained eight Power 7 cores, for a total of 32 cores per server and 2,880 in the cluster. Once the question was typed into Watson's memory, it was decomposed, analyzed, and processed by those 2,880 cores in parallel.
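The core count is easy to check. The figures below are the ones given in this article; the variable names are just for illustration.

```python
# Hardware figures as stated in the article.
SERVERS = 90
SOCKETS_PER_SERVER = 4
CORES_PER_SOCKET = 8

cores_per_server = SOCKETS_PER_SERVER * CORES_PER_SOCKET
total_cores = SERVERS * cores_per_server

print(cores_per_server, total_cores)  # 32 2880
```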
But the number of threads -- individual search processes, for example, each tapping a 500-GB section of memory for references to come up with, say, Toronto is a U.S. city (heh, heh) -- was much higher than 2,880. A little-known fact is that Watson's cores were virtualized under Red Hat's Kernel-based Virtual Machine (KVM).
So exactly how much parallel processing was going on inside Watson? Each core of the IBM Power 7 chip is capable of running multiple threads; further, it's been engineered for ease of virtualized operation. IBM's Sept. 21, 2009, Clipper white paper says one Power 7 "system" is capable of running up to 1,000 virtual machines. Since each server is a four-socket host with eight cores per socket, or 32 cores in all, that probably means each Power 7 core hosts about 32 VMs (1,000 divided by 32 is roughly 31), which works out to 256 per socket and 1,024 per server at the rounded figure. Those are extremely high numbers, even in intensely virtualized data center settings.
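The back-of-envelope VM math works out as follows. It assumes, as this article does, that the white paper's "system" means one four-socket server, and it rounds the per-core figure up to a whole number; neither assumption comes from IBM.

```python
import math

# Figure quoted from the white paper: up to 1,000 VMs per "system".
VMS_PER_SYSTEM = 1000
# Assumption made in the article: a "system" is one four-socket, 32-core server.
SOCKETS_PER_SERVER = 4
CORES_PER_SOCKET = 8

cores_per_server = SOCKETS_PER_SERVER * CORES_PER_SOCKET
exact_vms_per_core = VMS_PER_SYSTEM / cores_per_server   # 31.25
vms_per_core = math.ceil(exact_vms_per_core)             # rounded up to 32
vms_per_socket = vms_per_core * CORES_PER_SOCKET
vms_per_server = vms_per_socket * SOCKETS_PER_SERVER

print(exact_vms_per_core, vms_per_core, vms_per_socket, vms_per_server)
# 31.25 32 256 1024
```

Rounding up to 32 per core slightly overshoots the quoted 1,000, which is why the per-server figure comes out at 1,024.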