Big Data. Big Decisions
InformationWeek
Special Coverage Series

Commentary

Paul Cerrato

IBM Watson Finally Graduates Medical School

Partnership with Memorial Sloan-Kettering Cancer Center suggests IBM's supercomputer is ready to help oncologists manage their most challenging cases.

It's been more than a year since IBM's Watson computer appeared on Jeopardy and defeated two of the game show's top champions. Since then the supercomputer has been furiously "studying" the healthcare literature in the hope that it can beat a far more hideous enemy: the 400-plus biomolecular puzzles we collectively refer to as cancer.

A recent presentation by Martin Kohn, IBM's chief medical scientist, and Pat Skarulis, CIO at New York's Memorial Sloan-Kettering Cancer Center, suggests Watson's up to the challenge.

During last week's Digital Health Conference, sponsored by the New York eHealth Collaborative, Kohn and Skarulis described an impressive initiative their two organizations have embarked upon to use Watson "in the trenches" to treat oncology patients at Sloan-Kettering. Kohn began with the basics of the project.

He was quick to point out that the supercomputer isn't just a "search engine on steroids," or even a massive database. It relies on parallel probabilistic algorithms to analyze millions of pages of unstructured text in patient records and the medical literature to locate the most relevant answers to diagnostic and treatment-related questions.

Kohn explained that 90% of the world's data has been created in the last two years, and 80% of that data is unstructured. As any clinician with a pile of unread medical journals knows, that massive collection of information includes far too many papers for any one human to read. Watson reads it for them. At the time of the Jeopardy competition, for instance, it was capable of reading 65 million pages of text per second.

With the help of natural language processing (NLP), the computer not only pulls out relevant terms to match the search terms in a clinician's query, but also understands the idioms and other idiosyncratic gibberish we call English. That means Watson can make sense of the fact that Americans park in driveways and drive on parkways, or the fact that noses run and feet smell.

Put another way, Watson does much more than just locate relevant keywords in its database. With the help of temporal, statistical paraphrasing, and geospatial algorithms, it finds meaningful relationships between the clinician's question and its massive collection of medical facts and theories.

Armed with this skill, the supercomputer works through several logical steps to help physicians through their decision-making process. Once it understands the nature of the request, Watson generates a long list of hypotheses in response to the clinician's question. Then it assigns priority ratings to those hypothetical answers based on its analysis of millions of pages of stored data. Next, it generates a confidence level for each of the likely answers so that it can help physicians make an evidence-based decision.
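
The sequence Kohn described (generate hypotheses, rank them, attach a confidence level to each) can be illustrated with a small Python sketch. This is not IBM's implementation: the `Hypothesis` class, the `rank_hypotheses` function, the raw evidence scores, and the use of a softmax to turn scores into confidences are all assumptions made purely for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Hypothesis:
    answer: str            # candidate answer to the clinician's question
    evidence_score: float  # hypothetical raw score; higher means more support
    confidence: float = 0.0


def rank_hypotheses(raw_scores: dict[str, float]) -> list[Hypothesis]:
    """Convert raw evidence scores into hypotheses ranked by confidence.

    Assumption: confidences come from a softmax over the raw scores, so
    they are positive and sum to 1. Watson's real scoring is not public
    in this article and is certainly more elaborate.
    """
    if not raw_scores:
        return []
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(raw_scores.values())
    weights = {answer: math.exp(score - top) for answer, score in raw_scores.items()}
    total = sum(weights.values())
    ranked = [
        Hypothesis(answer, raw_scores[answer], weights[answer] / total)
        for answer in raw_scores
    ]
    # Highest-confidence hypothesis first.
    ranked.sort(key=lambda h: h.confidence, reverse=True)
    return ranked


if __name__ == "__main__":
    # Made-up scores for three placeholder options.
    scores = {"option A": 2.0, "option B": 1.0, "option C": 0.0}
    for h in rank_hypotheses(scores):
        print(f"{h.answer}: confidence {h.confidence:.2f}")
```

The point of the sketch is the shape of the output: a ranked list in which every candidate carries its own confidence figure, leaving the final choice to the clinician.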

In the final analysis, however, it's the clinician who must review the best solution and choose a course of action. He or she also has the option to ask Watson to supply all of the supporting literature upon which the computer based its answers. Similarly, Watson may ask for additional data, suggesting specific lab tests be done to improve the probability of arriving at a correct diagnosis or treatment regimen.

Of course, all this impressive technology would be only an exercise in IBM bravado if there were no real patients and doctors to put Watson to the test. Enter Sloan-Kettering.

As Skarulis pointed out during her presentation, the medical center has about 2,000 order sets it can pull from when choosing a cancer treatment. Finding the best fit for each patient is no easy task. To help, Sloan-Kettering can tap its own massive database, called Darwin, which includes everything that has happened to all of its 1.2 million inpatients and outpatients over 20-plus years. In essence, that database embodies "the thinking patterns of all our experts," she explained.

The medical center decided to collaborate with IBM to "build an intelligence engine to provide specific diagnostic test and treatment recommendations," Skarulis said. The two organizations are now combining Darwin's intelligence with Watson's NLP capabilities. IBM is drawing on the medical center's structured patient data and applying its own NLP tools to convert the center's free-text consult notes into usable data.

The team will first use this approach to tackle non-small-cell lung cancer. It has brought in Mark Kris, MD, one of MSKCC's top lung cancer experts, to help develop training cases for Watson to work on, focusing on 14 to 20 data elements, including the size and location of a patient's tumor, the presence of any genetic mutations (Sloan-Kettering does a full genomic analysis on all of its lung and colon cancer patients), and whether the tumor has spread to other tissues.
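
To give a feel for what converting a free-text consult note into a handful of data elements involves, here is a deliberately simple, rule-based Python sketch. It is not Watson's NLP, which the presenters described as far more sophisticated than keyword matching. The `TumorFacts` fields, the `LOBES` vocabulary, the `extract_facts` function, and the sample notes are all hypothetical and cover only three of the kinds of elements mentioned above (size, location, spread).

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class TumorFacts:
    size_cm: Optional[float]    # largest dimension mentioned, in centimeters
    location: Optional[str]     # lung lobe named in the note, if any
    metastatic: Optional[bool]  # None when the note says nothing about spread


# Hypothetical vocabulary of locations; a real system would need far more.
LOBES = (
    "right upper lobe", "right middle lobe", "right lower lobe",
    "left upper lobe", "left lower lobe",
)


def extract_facts(note: str) -> TumorFacts:
    """Pull a few structured fields out of a free-text note with simple rules."""
    text = note.lower()

    # First number followed by "cm", e.g. "3.2 cm" or "4cm".
    size = re.search(r"(\d+(?:\.\d+)?)\s*cm\b", text)

    # First known lobe name that appears anywhere in the note.
    location = next((lobe for lobe in LOBES if lobe in text), None)

    # Crude negation handling: "no metastasis" / "no evidence of metastases".
    if re.search(r"\bno (?:evidence of )?metasta", text):
        metastatic = False
    elif "metasta" in text:
        metastatic = True
    else:
        metastatic = None

    return TumorFacts(
        size_cm=float(size.group(1)) if size else None,
        location=location,
        metastatic=metastatic,
    )


if __name__ == "__main__":
    # Invented sample note, not real patient data.
    print(extract_facts("3.2 cm mass in the right upper lobe. No evidence of metastasis."))
```

Even this toy shows why negation and phrasing matter: the same word, "metastasis," has to yield opposite values depending on the words around it.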

Watson's task is to follow the protocol that Kohn outlined above and come back with a list of diagnostic and treatment options for physicians to choose from, with a confidence rating for each option. Ideally, a treatment regimen to which Watson assigns a 95% confidence rating, for example, would help oncologists choose from the 28 different chemotherapy cocktails they have at their disposal.

Watson's training has prepared it for its role as a clinical decision support system. But now that it has graduated from medical school, it's time for a real-world residency. Skarulis hopes to launch a pilot program by the end of this year that will allow the supercomputer to work on real cases. It's hard to imagine an attending oncologist who would not want such a resident assisting him or her at the bedside.


