In its effort to cure pediatric cancer, the research hospital faced challenges associated with sharing terabytes of data with researchers. Here’s what they did.

Jessica Davis, Senior Editor

April 30, 2019

Sharing data in a way that is useful is a key goal of organizations today. Whether you are expanding access to your analytics to a broader base of business users, giving your partners access to data insights, or even enabling your customers to see and interact with data, sharing data can generate a lot of value.

At St. Jude Children's Research Hospital, the effort to share data with other researchers has an even loftier and nobler goal: to help cure pediatric cancers and other diseases. Actor and entertainer Danny Thomas, supported by his wife, Rose Marie, founded the hospital and research organization to cure pediatric cancers and other catastrophic diseases. He built it to fulfill a pledge he made in prayers to St. Jude: "Help me find my way in life, and I will build you a shrine." You may remember the television commercials that featured Thomas and, later, his daughter, Marlo Thomas (also an actor). Since the hospital opened in 1962, St. Jude's work has helped improve the survival rates for childhood cancers from 20% to 80% today, according to the organization.

That mission continues today, and a new data sharing effort takes it to the next level. The organization launched the effort, St. Jude Cloud, in April 2018 to open its data sets to other cancer researchers in a useful way. It is the largest public repository of pediatric cancer and genomics data, according to the organization.

Just how do you put together such an effort? What are the technologies that can enable such a system? Edward Suh, the managing director of Research Information Services at St. Jude and the architect of the project, will explain it all during the session "St. Jude Uses Cloud Data Sharing Portals to Advance Cancer Research" at Interop 2019 in Las Vegas on May 22. He recently spoke with InformationWeek about the session, providing an advance look at what attendees can expect to learn.

Suh said that among the data that St. Jude has worked to share is the Pediatric Cancer Genome Project, developed from the cancer research the organization has conducted over many years.

"Sharing the data was not that easy," Suh told InformationWeek. "It would take a long time to download the data set. It was very large. And there was a manual approval process. Researchers weren't readily able to access the data."


To alleviate these issues, St. Jude decided to put this data -- hundreds of terabytes -- into the cloud. The challenges were threefold: the huge volume of data, the network bandwidth required, and the tools that researchers would need to gain an understanding of the available data. St. Jude partnered with Microsoft Azure for its cloud capabilities and with DNAnexus, a specialist in biomedical informatics and data management. St. Jude also enabled data discovery for outside researchers by offering data mining, analysis, and visualization tools, all accessible in the cloud via the web without the need to download data.

The challenges were not just technological. St. Jude also faced a challenge familiar to many data and analytics organizations -- finding and recruiting the right combination of talent to work on the project. For St. Jude's initiatives, that meant finding people with knowledge of both the compute side and the specialized world of pediatric cancer medicine. These professionals helped create the system that allows researchers to explore St. Jude's pediatric cancer data. During his Interop session, Suh expects to share some of his tactics for identifying and recruiting the teams necessary to build these data cloud tools.

In the year since its introduction, the St. Jude Cloud has been a success, with 854 researchers from 500 different institutions exploring the data. In addition, 35,000 unique users have visited the cloud website.

While St. Jude's tools have enabled exploration of the data, researchers who want to take their work further can also request access to a specific cohort data set for analysis. Suh said that 95 official data requests have gone to the hospital's committee for evaluation so far.

For more on the St. Jude Cloud project and other real-world IT projects explained by the people who executed them, register for Interop 2019.

About the Author(s)

Jessica Davis is a Senior Editor at InformationWeek. She covers enterprise IT leadership, careers, artificial intelligence, data and analytics, and enterprise software. She has spent a career covering the intersection of business and technology. Follow her on Twitter: @jessicadavis.
