Big Data Disease Breakthroughs - InformationWeek



Big Data Disease Breakthroughs

Researchers spotted new ways to treat cancer when the National Institutes of Health enabled semantic searches of its huge Medline database of published medical articles.


The point of big data is to be able to extract usable information -- knowledge -- from large volumes of data that do not have any immediately apparent relationships. Even with advances in computing power, the task of searching to find correlations can be daunting and even impractical if the datasets are large enough.

The National Institutes of Health has enabled semantic searches of the data in its Medline database, allowing researchers to find correlations in published medical data between therapies and outcomes that had not been noticed before. In one case, cancer researchers using graph analysis were able to see that in some types of cancer, immunotherapy produced better results than chemotherapy.

"It's a real discovery," said Brand Niemann, founder of the Federal Big Data Working Group and former senior enterprise architect and data scientist at the Environmental Protection Agency. "It's like finding a needle in the haystack of medical literature."


The haystack is Medline, the bibliographical database of the National Library of Medicine, which contains more than 21 million references to medical journal articles dating back to 1946. The database holds an embarrassment of riches: in 2013, 2,000 to 4,000 new references were added daily, five days a week. These entries have been enhanced with 65 million semantic predications -- entries using semantic markup standards -- resulting in 2.2 billion Resource Description Framework statements.
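
To make the idea of a semantic predication more concrete, the following is a minimal Python sketch that models each predication as a subject-predicate-object triple, the same basic shape a Resource Description Framework statement takes. Everything in it is an assumption for illustration: the `Predication` type, the `match` helper, the predicate labels, and the placeholder entities such as `therapy_A` and `cancer_X` are hypothetical and are not drawn from Medline.

```python
from collections import namedtuple

# A semantic predication modeled as a subject-predicate-object triple.
# The type and field names are illustrative, not a Medline schema.
Predication = namedtuple("Predication", ["subject", "predicate", "obj"])

# Invented toy data; the entities are placeholders, not real findings.
PREDICATIONS = [
    Predication("therapy_A", "TREATS", "cancer_X"),
    Predication("therapy_B", "TREATS", "cancer_X"),
    Predication("therapy_A", "TREATS", "cancer_Y"),
    Predication("gene_G", "ASSOCIATED_WITH", "cancer_X"),
]


def match(predications, subject=None, predicate=None, obj=None):
    """Return the predications that agree with every field that is not None."""
    return [
        p for p in predications
        if (subject is None or p.subject == subject)
        and (predicate is None or p.predicate == predicate)
        and (obj is None or p.obj == obj)
    ]


# Which subjects are linked to cancer_X by a TREATS predicate?
treats_x = match(PREDICATIONS, predicate="TREATS", obj="cancer_X")
print(sorted(p.subject for p in treats_x))
```

Leaving a field as `None` acts as a wildcard, which is roughly how a pattern-based query over triples lets a searcher ask an open question instead of looking up one known record.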


To make the search practical, researchers used the Urika graph analytics appliance from YarcData. Urika works with existing data warehouses to handle graph workloads, which allow relationships within the data to be plotted graphically. All resources to be searched are stored in the appliance's shared memory, so the data does not first have to be partitioned or fitted to data models. The team was able to identify connections between outcomes of therapies for different types of cancers from the 10 million semantic predications.
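
As a rough illustration of the kind of question such a graph search can answer, the Python sketch below scans the edge list of a tiny labeled graph and reports, for each cancer type, the therapy with the highest share of "improved" outcomes. It is a simplified stand-in, not the Urika appliance's interface, which this article does not describe. The edge data, the outcome labels, and the function names are all hypothetical.

```python
from collections import defaultdict

# Invented edges of a small labeled graph: (therapy, cancer type, outcome).
# These are placeholders for illustration, not results from Medline.
EDGES = [
    ("therapy_A", "cancer_X", "improved"),
    ("therapy_A", "cancer_X", "improved"),
    ("therapy_A", "cancer_X", "no_change"),
    ("therapy_B", "cancer_X", "improved"),
    ("therapy_B", "cancer_X", "no_change"),
    ("therapy_B", "cancer_X", "no_change"),
    ("therapy_B", "cancer_Y", "improved"),
]


def improvement_rates(edges):
    """Map each (cancer, therapy) pair to its fraction of 'improved' edges."""
    totals = defaultdict(int)
    improved = defaultdict(int)
    for therapy, cancer, outcome in edges:
        totals[(cancer, therapy)] += 1
        if outcome == "improved":
            improved[(cancer, therapy)] += 1
    return {key: improved[key] / totals[key] for key in totals}


def best_therapy_per_cancer(edges):
    """For every cancer type, pick the therapy with the highest rate."""
    best = {}
    for (cancer, therapy), rate in improvement_rates(edges).items():
        if cancer not in best or rate > best[cancer][1]:
            best[cancer] = (therapy, rate)
    return best


RESULT = best_therapy_per_cancer(EDGES)
for cancer, (therapy, rate) in sorted(RESULT.items()):
    print(f"{cancer}: {therapy} ({rate:.2f})")
```

The scan covers every therapy and cancer pair present in the data rather than testing one pair chosen in advance, which is the sense in which a graph search can surface a relationship nobody had thought to look for.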

By creating a practical way to extract visual relationships from the data, the researchers were able to find the correlations quickly and without first developing a hypothesis about them. Making the data semantically searchable enables analysis that can make better use of existing data to drive future research, Niemann said.


William Jackson is a writer with the Tech Writers Bureau, with more than 35 years' experience reporting for daily, business and technical publications, including two decades covering information ...

User Rank: Ninja
10/27/2014 | 5:03:28 PM
Perfect example
I love when these instances garner attention because it's attention to the potential that will ultimately fuel the investments needed to help organizations grow in their data capabilities. As a recent SAS poll demonstrated, organizations need to mature in how they make the step from analysis to action. Understandably, this step takes skill development, and development requires investment. It's a never-ending cycle, but successes show the potential.


Peter Fretty
Li Tan
User Rank: Ninja
10/4/2014 | 12:38:09 AM
Not a surprise
It's a breakthrough in the healthcare area but not a surprise to me. Previously we just had data, which we could process and transform into information. But with big data analytics, we gain knowledge from it and can help doctors make more precise diagnoses. In the near future we should be able to turn it into wisdom.