6 Causes Of Big Data Discrepancies - InformationWeek

6/4/2015
07:06 AM
Lisa Morgan

6 Causes Of Big Data Discrepancies

The same data can yield wildly different results. Here are some of the reasons for these fascinating, frustrating, or even dangerous discrepancies.

(Image: Geralt via Pixabay)

As the universe of big data continues to explode, organizations struggle to identify and leverage the data that matters most. As companies continue to add more data sources to the mix, the number of potential data sets grows, as do the opportunities for new and different insights. Because data can be combined in so many different ways, there are many possible outcomes.

"People have different choices from the same data set," said Kirk Borne, principal data scientist at Booz Allen Hamilton, in an interview. "People will choose different things they think are informative, so the search is to find the most important variables."

As a result, the same data can result in very different interpretations.

"If I run a supernova simulation where the resolution is too low and two supernova scientists analyze that, if one knows the simulation is not a sufficient resolution and the other doesn't, they would come to very different conclusions," said Tony Mezzacappa, chair of theoretical and computational astrophysics at the University of Tennessee, in an interview. "Data completeness is part of data quality. People should understand what the dangers are in extracting conclusions based on such data."

Whether data is complete enough may not be obvious until later. For example, a polarization signal in the cosmic microwave background radiation appeared to confirm cosmic inflation, a key element of Big Bang theory, at least until the European Space Agency discovered that dust in the universe emits microwaves of its own that can introduce the same kind of polarization.

"That was a very big deal when it was announced. If [cosmic inflation] had been confirmed, it would have spoken volumes about the nature of our universe, its beginning, its evolution, and its end," said Mezzacappa. "Further analysis, and likely further astronomical data collection, will be required to include the effect of dust."

There are inherent uncertainties in algorithms, models, outcomes, and sometimes the data itself that can impact conclusions. Human nature also plays a part. Here, we explain six of the many reasons the same data can result in different interpretations.

Lisa Morgan is a freelance writer who covers big data and BI for InformationWeek. She has contributed articles, reports, and other types of content to various publications and sites ranging from SD Times to the Economist Intelligence Unit. Frequent areas of coverage include ...

Comments
shamika,
User Rank: Ninja
6/6/2015 | 11:20:26 PM
Re: Small changes make big differences
"Some people will take the same data and not realize they have to do some data transformation before they can make sense of it." Very true. Sometimes the data is presented as is, without any changes: just the data dump.
LisaMorgan,
User Rank: Moderator
6/4/2015 | 5:44:10 PM
Re: Small changes make big differences
Yep.  Excellent point.
kstaron,
User Rank: Ninja
6/4/2015 | 5:31:21 PM
Small changes make big differences
One of the issues with using so much data is that no matter how carefully you choose your variables, no matter how you select for randomness or representativeness, small changes in variables, or leaving out a variable that wasn't a concern before, can lead to serious changes in the outcome. All you have to do to see that is adjust the interest rate earned over 40 years in a retirement planning algorithm. A mere half percent can tip the scale from balanced to broke or to that round-the-world trip you always wanted to take. The best thing to remember with big data is that even though it's big, it's only as good as the people using it and their assumptions.
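
The retirement example in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `future_value`, the fixed yearly contribution of 10,000, and the 5.0% and 5.5% rates are all hypothetical choices, not figures from the comment, which specifies only a half-percent change over 40 years.

```python
def future_value(annual_contribution, annual_rate, years):
    """Balance after `years` of end-of-year contributions, compounded annually.

    Hypothetical helper for illustration; real retirement models also
    account for fees, inflation, taxes, and variable returns.
    """
    balance = 0.0
    for _ in range(years):
        # Grow last year's balance, then add this year's contribution.
        balance = balance * (1 + annual_rate) + annual_contribution
    return balance


# Assumed inputs: same contribution and horizon, rates half a percent apart.
base = future_value(10_000, 0.050, 40)
higher = future_value(10_000, 0.055, 40)

print(f"5.0% for 40 years: {base:,.0f}")
print(f"5.5% for 40 years: {higher:,.0f}")
print(f"Relative difference: {higher / base - 1:.1%}")
```

Under these assumptions, the half-point change in a single input moves the final balance by more than 10%, which is the kind of sensitivity the comment describes.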