Twitter data mining could provide governments with an alternative means of measuring unemployment, researchers say.

Thomas Claburn, Editor at Large, Enterprise Mobility

November 20, 2014

Researchers at universities in Australia, Spain, and the US, in conjunction with UNICEF, have found that Twitter posts can be mined to measure unemployment.

In a study published through Cornell's arXiv.org, researchers Alejandro Llorente, Manuel Garcia-Herranz, Manuel Cebrian, and Esteban Moro "demonstrate that behavioral features related to unemployment can be recovered from the digital exhaust left by the microblogging network Twitter."

"Digital exhaust" is a curious choice of words, because it implies that social media data is a worthless byproduct of online interaction, something to be cast aside. Yet the researchers' findings suggest the very opposite: Social media exhaust is the primary product. It's gold, rather than garbage, and social media users don't realize the value they're throwing away. The social media realm is a charity to benefit businesses.

Three decades ago, the technologist and writer Stewart Brand famously observed the tension inherent in how we value information.

"Information wants to be free," Brand said in one of his several variations on this theme. "Information also wants to be expensive. Information wants to be free because it has become so cheap to distribute, copy, and recombine -- too cheap to meter. It wants to be expensive because it can be immeasurably valuable to the recipient. That tension will not go away."

Facebook, Google, Twitter, and the rest of the social media and advertising industry want people to believe that their online work -- their posts, pictures, and associated data -- isn't worth anything, in order to capture the full value of this free bounty of insight. Pollute the world with your digital exhaust; we'll clean up, all the way to the bank.

Llorente and his colleagues used a data set of 19.6 million geolocated Twitter messages in Spain from Nov. 29, 2012, to June 30, 2013, and a data set detailing unemployment across various regions of the country to uncover a relationship between economic metrics and social behavior. Interestingly, they noted a correlation between misspellings in tweets, which they take as a proxy for education level, and unemployment.
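
The kind of analysis described above can be sketched roughly as follows. This is a minimal illustration, not the researchers' actual method: the word list, the region labels, the sample tweets, and the unemployment figures are all made-up placeholders (in English rather than Spanish), and the function names `misspelling_rate`, `rates_by_region`, and `pearson` are hypothetical. It groups geotagged messages by region, estimates a misspelling rate for each region, and computes a Pearson correlation against regional unemployment.

```python
import math
import re
from collections import defaultdict

# Hypothetical mini-lexicon standing in for a real spell-check dictionary.
KNOWN_WORDS = {"i", "need", "a", "job", "today", "work", "is", "good", "looking", "for"}


def misspelling_rate(tweets):
    """Return the fraction of alphabetic tokens not found in KNOWN_WORDS."""
    tokens = [tok for text in tweets for tok in re.findall(r"[a-z]+", text.lower())]
    if not tokens:
        return 0.0
    unknown = sum(1 for tok in tokens if tok not in KNOWN_WORDS)
    return unknown / len(tokens)


def rates_by_region(geotagged_tweets):
    """Group (region, text) pairs by region and compute each region's rate."""
    grouped = defaultdict(list)
    for region, text in geotagged_tweets:
        grouped[region].append(text)
    return {region: misspelling_rate(texts) for region, texts in grouped.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy inputs for illustration only; none of these values come from the study.
toy_tweets = [
    ("A", "i need a job"), ("A", "work is good"),
    ("B", "i nede a jobb"), ("B", "work is good"),
    ("C", "lokin for wrk todai"), ("C", "i need a job"),
]
toy_unemployment = {"A": 10.0, "B": 20.0, "C": 30.0}  # invented percentages

rates = rates_by_region(toy_tweets)
regions = sorted(rates)
r = pearson([rates[reg] for reg in regions],
            [toy_unemployment[reg] for reg in regions])
print(f"misspelling rates: {rates}")
print(f"correlation with unemployment: {r:.2f}")
```

A correlation computed this way says nothing about causation, and a real study would need a proper dictionary for the language in question, far more data per region, and controls for who uses Twitter in each area.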

The researchers consider their success in using Twitter posts to assess employment status to be "a proof of concept for how a wide range of behavioral features linked to socioeconomic behavior can be inferred from the digital traces that are left by publicly available social media." And they argue that governments might be able to adapt social media surveillance as an alternative to more costly traditional data gathering methods related to public policy.

"The immediacy of social media may also allow governments to better measure and understand the effect of policies, social changes, natural or man-made disasters in the economical status of cities in almost real-time," the researchers state.

Others have already reached this conclusion and are monitoring social media for threats to the powers that be, at home and abroad. Intelligence agencies have been wise to the value of social media exhaust for years, both for the overtly expressed sentiment and for the web of personal relationships exposed through social network links.

In fact, data mining to assess aspects of society has become commonplace. Google began using search queries to assess flu infections in 2008. This year, researchers have demonstrated that Twitter posts can be used to infer whether the person posting is ill. And web pages are laden with tracking scripts to gather data.

Interpreting that data correctly, however, remains a challenge. Google's flu tracking system turned out to be inaccurate. Worse, it would not be difficult to construct a social media study to support a predetermined conclusion for political or economic gain. So the use of Twitter as a way to measure unemployment should be assessed with caution.

Twitter has long been wise to the value of its data. In its IPO filing, it disclosed that it had made $47.5 million selling data to other companies in 2012, up from $28.6 million the year before.

Companies want information to be free, so they can sell it at great cost.

About the Author(s)

Thomas Claburn

Editor at Large, Enterprise Mobility

Thomas Claburn has been writing about business and technology since 1996, for publications such as New Architect, PC Computing, InformationWeek, Salon, Wired, and Ziff Davis Smart Business. Before that, he worked in film and television, having earned a not particularly useful master's degree in film production. He wrote the original treatment for 3DO's Killing Time, a short story that appeared in On Spec, and the screenplay for an independent film called The Hanged Man, which he would later direct. He's the author of a science fiction novel, Reflecting Fires, and a sadly neglected blog, Lot 49. His iPhone game, Blocfall, is available through the iTunes App Store. His wife is a talented jazz singer; he does not sing, which is for the best.
