What's Your Flu Risk? Big Data Knows - InformationWeek

What's Your Flu Risk? Big Data Knows

TwitterHealth mines Twitter posts for hints of the spread of seasonal flu and delivers real-time predictions of who will get the flu this year.

Want to know if--and when--you'll catch the flu bug this season? TwitterHealth, a research project at the University of Rochester in New York, can predict with uncanny accuracy which Twitter users will become ill--simply by studying their tweets.

The project, designed to show how researchers can use data mining and machine learning to build knowledge systems, was a topic of discussion at the Rochester Big Data Forum, a three-day conference of computer scientists held Oct. 4-6. The forum was the first event of the university's newly launched RocData, or Rochester Big Data Initiative, which is designed to inspire collaboration between data scientists and researchers in other fields, such as medicine and education.

TwitterHealth began as a research project to explore various ways of analyzing geographic information, such as GPS data from cellphones, noted Henry Kautz, chair of the University of Rochester's computer science department and organizer of the Big Data Forum.

"We realized that more social media sites are including geographic data automatically in the posts you make," said Kautz in a phone interview with InformationWeek. "So when people post to Twitter from their cellphones, by and large you get the location, and you can download that data."


Kautz's students set up a network of computers that could download tweets from major metropolitan areas. They then had to determine what actionable information this large volume of data, which included Twitter users' geographic locations, contained.

"One thing we realized was that people often tweet about the state of their health," said Kautz. "They'll report they have a running nose. They have a cold. They're not feeling well. We said, 'Can we use this to track seasonal flu?'"

The group began training a series of machine-learning algorithms, starting with a few hundred tweets that were "hand-labeled examples, as in 'these are tweets about feeling sick,'" Kautz said.
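The training step Kautz describes can be sketched in a few lines of Python. The article does not say which algorithms the group used, so the naive Bayes classifier below is an assumed stand-in, and the six labeled tweets, the `tokenize`, `train`, and `classify` helpers, and the label names are all invented for illustration rather than taken from the project.

```python
import math
import re
from collections import Counter

# Hypothetical hand-labeled examples (invented for illustration, not project data).
LABELED_TWEETS = [
    ("i have a fever and a runny nose", "sick"),
    ("stuck in bed with the flu all day", "sick"),
    ("feeling sick, sore throat and a cough", "sick"),
    ("great coffee with friends this morning", "other"),
    ("watching the game downtown tonight", "other"),
    ("so sick of this traffic", "other"),
]

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(examples):
    """Count words per label; these counts are the whole naive Bayes model."""
    word_counts = {}
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab

def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # prior from label frequency
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # skip words never seen in training
                # Add-one smoothing keeps unseen word/label pairs from zeroing the score.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(LABELED_TWEETS)
print(classify("home with a fever and cough", model))   # sick
print(classify("coffee downtown with friends", model))  # other
```

A toy model like this relies on surrounding words to separate "feeling sick" from "so sick of this traffic"; a few hundred labeled tweets, as in the project, would give it far more context to work with.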

The resulting system was able to determine with 99% accuracy whether a given Twitter user was reporting a flu-like illness. In fact, the automated, real-time model was nearly as accurate as humans who analyzed the text, and it produced results faster than the Centers for Disease Control and Prevention (CDC).

"From this data, we can track the spread of seasonal flu, and do so with very good accuracy--comparable accuracy that you get with the CDC data," Kautz said.

The success of TwitterHealth has led some students who work on the project to launch a startup company, which has licensed the technology from the university. Their goal is to take the same algorithmic approach to track other types of trends.

"There are commercial applications. Instead of health reports, it might track people's interest in fashion... and how ideas about popular culture spread from place to place," said Kautz.

But Kautz is particularly intrigued by the technology's potential in healthcare. "Gathering health data by surveys is very slow and expensive," he said. TwitterHealth also shows promise as a way to combat depression and suicide, and as a health alert system for cities.

"From analyzing these data sets, we've discovered that if you are on certain streets, and spend time in certain restaurants and at certain places, that greatly increases your chance of getting the flu," said Kautz.
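One simple way to put a number on a location effect like this is a relative risk: the share of visitors to a place who later report being sick, divided by the share among everyone else. The article does not describe how the researchers measured it, so the Python sketch below is only an assumed illustration; the `VISITS` records, the venue names, and the `relative_risk` helper are hypothetical, and each user is counted once.

```python
# Hypothetical records, invented for illustration:
# (user, venue visited, whether the user later tweeted about being sick).
VISITS = [
    ("u1", "cafe_a", True),
    ("u2", "cafe_a", True),
    ("u3", "cafe_a", False),
    ("u4", "park_b", False),
    ("u5", "park_b", False),
    ("u6", "park_b", True),
    ("u7", "park_b", False),
]

def relative_risk(visits, venue):
    """Sick rate among visitors to `venue` divided by the rate among everyone else."""
    inside = [sick for _user, v, sick in visits if v == venue]
    outside = [sick for _user, v, sick in visits if v != venue]
    if not inside or not outside or sum(outside) == 0:
        return None  # not enough data to form a ratio
    return (sum(inside) / len(inside)) / (sum(outside) / len(outside))

print(relative_risk(VISITS, "cafe_a"))  # (2/3) / (1/4), roughly 2.67
print(relative_risk(VISITS, "park_b"))  # (1/4) / (2/3) = 0.375
```

A ratio above 1 means visitors to that place reported illness more often than others did. It shows an association only, not proof that the place caused the illness.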

The public nature of Twitter posts makes them ideal for big data analytics, but Facebook's more private approach to social networking poses a problem. One option is to convince Facebook users to sign up for a TwitterHealth-style service; another is to convince Facebook to provide access to its members' private posts.

"One of our students has had some conversations with Facebook, but there's nothing that's come to fruition yet--maybe in the future," said Kautz.

