9/16/2013
11:05 AM

Data Scientists Talk Privacy Worries

88% of data scientists surveyed say yes, you should worry about all that personal data being collected. They also strongly support ethics codes for data use.

A sizable majority of data scientists believe consumers should worry about the privacy implications of big data, the personal information collected on them, and how this data is used, according to a recent survey of statisticians attending the Joint Statistical Meetings (JSM) conference in Montreal.

Revolution Analytics, a Palo Alto, Calif.-based software developer and major proponent of the open source R programming language, conducted the poll of conference attendees in early August. Some 865 respondents offered their views on privacy and ethics in data collection and on the statistical software used to analyze this information.

The survey's findings show that data scientists are clearly concerned about the impact of data collection on personal privacy. Overall, 88% of respondents said consumers should worry about privacy issues in the big data era, as more organizations stockpile personal -- and often sensitive -- information on all of us.

David Smith, VP of corporate marketing for Revolution Analytics, summarized the survey results in a recent blog post. Earlier this year, Forbes selected Smith as one of the top 20 influencers in the big data sector.

"It's always important for consumers to understand what information companies have about them, and how it's being used," Smith told InformationWeek in a phone interview. "Speaking as a data scientist, statisticians and data scientists are uniquely positioned to understand the privacy implications of data, especially the implications of combining lots of different data sources together, which is happening a lot today."

Four of five respondents said there should be an ethical framework for collecting and using data. In fact, some business sectors already have such frameworks in place. More than half of respondents agreed that ethics play a significant role in their data research.

"For example, in the pharmaceutical industry collecting data around clinical trials is very well regulated, and there are ethical frameworks in place for how data is collected and used," said Smith. "But there's not an industry-wide framework for doing so."

Data scientists working in the healthcare and life science fields showed the greatest support (92%) for a code of ethics. "Out of all the industries, life science and healthcare is the one, I think, that is most advanced in setting up frameworks for using data in an ethical way," Smith said.

Overall, the survey results didn't surprise Smith. "It's really a confirmation of what we would have expected to see," he noted. "These are people who have their hands working with data in and out every day. They understand the power of data and the importance of it being used in an appropriate fashion."

The results suggest that most data scientists take ethics seriously. Just 10% of survey respondents said there should not be an ethical framework for data research, and only 1% agreed that ethics should not play a role in data science.

"You can put data to very powerful and good uses, and you can put data to nefarious uses," said Smith. "And that's something that statisticians and data scientists recognize."

He added that it's important for consumers to understand what data companies (and government agencies) are collecting on them, and how it's being used. But as the ongoing NSA controversies show, there's plenty of room for improvement here.

Another example: Facebook's new privacy policies are being criticized by privacy groups, which claim the social network will be able to use personal data in ads without compensating its members.
