Data Scientists Want Big Data Ethics Standards - InformationWeek

Data Scientists Want Big Data Ethics Standards

Nearly half of data scientists surveyed last month say Facebook's controversial "mood manipulation study" was unethical, and many support ethics guidelines for big data research.

The vast majority of statisticians and data scientists believe that consumers should worry about privacy issues related to data being collected on them, and most have qualms about the questionable ethics behind Facebook's undisclosed psychological experiment on its users in 2012.

Those are just two of the findings from a Revolution Analytics survey of 144 data scientists at JSM (Joint Statistical Meetings) 2014, an annual gathering of statisticians, to gauge their thoughts on big data ethics. The Boston conference ran Aug. 2-7.

The survey results show data scientists are largely a principled bunch concerned over the lack of ethical guidelines for big data research, at least in some industries.

The Facebook study is a case in point. In January 2012, the social network placed positive or negative posts and images in nearly 700,000 of its users' news feeds to gauge whether the information would sway people's emotions. The Facebook users were unaware they were subjects in the study.

The JSM survey found that 47% of respondents considered the Facebook study unethical; another 40% said they "don't know" whether the mood manipulation study was ethical.

Big data researchers can glean an important lesson from the Facebook study and the criticism it received, said David Smith, chief community officer at Revolution Analytics, one of the leading commercial providers of software and services based on the open-source R programming language. Smith is responsible for developing relationships with the community of statisticians and data scientists that uses and develops R.

In a phone interview with InformationWeek, Smith said data scientists and statisticians working in the scientific and health science fields already have "a lot of regulation around how data is collected and analyzed."

One example involves medical research conducted for the US Department of Health and Human Services' National Institutes of Health (NIH). "If you want to run a study, say, a psychological study through the NIH with actual patients or human subjects, you need to go through an ethics review before you go ahead and do that," said Smith.

In the tech industry, however, big data ethical guidelines are far more opaque.

"I think what's interesting about the Facebook [study] is that there's this whole new Wild West, if you like, of data coming from Internet applications, Internet services, the Internet of Things, where these practices and procedures aren't really in place yet," said Smith.

When asked if there should be an ethical framework for collecting and using data, 42% of JSM survey respondents agreed that an industry standard should be in place, while 43% said that ethics already plays "a big part" in their research.

If people feel there isn't an ethical standard in place for data collection and analysis, "then naturally they should worry about privacy issues associated with that data," Smith said. "Statisticians and data scientists have an important role to play in the practices and standards around handling and analyzing data in the world at large. I think the Facebook example should teach us a lesson, and my hope is that web and technology companies will involve data scientists more in analyzing the data that they collect."

Jeff Bertolucci is a technology journalist in Los Angeles who writes mostly for Kiplinger's Personal Finance, The Saturday Evening Post, and InformationWeek.

9/29/2014 | 2:58:33 AM
Nice try, but I don't buy the soul searching from data scientists
Ethical standards and laws are two very different things; no one ever went to jail for violating ethics.
9/22/2014 | 12:37:18 PM
Ethics should be part of research
Not surprised that this debate has come up. Big Data, the Internet of Things and so on have created new sources for significant research. Even in other areas of research where there are standards and regulations, some researchers bend the rules. In this situation, a body of standards needs to exist to be a baseline for appropriate usage of new sources of vast data.