7 Common Biases That Skew Big Data Results - InformationWeek

7/9/2015
08:06 AM
Lisa Morgan

Flawed data analysis leads to faulty conclusions and bad business outcomes. Beware of these seven types of bias that commonly challenge organizations' ability to make smart decisions.

Non-Normality: The Bell Does Not Toll

Some statistical tests, such as the t-test, assume that the data follow a normal distribution (a bell curve). When that assumption does not hold, the results may be biased and misleading.

When ClearerThinking.org's Greenberg examines people's moods following the completion of a training program, the assumption of a bell curve proves to be highly inaccurate. The mood data are not symmetric around their average. They are significantly skewed, so a bell curve force-fitted to them would describe them poorly.
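
One rough way to see how far a sample departs from a bell curve is to compute its skewness, which is near zero for symmetric data and grows in magnitude as one tail gets longer. The sketch below is only an illustration: the function name `sample_skewness` is hypothetical, and the two samples are synthetic stand-ins, not Greenberg's mood data.

```python
import random
import statistics


def sample_skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Synthetic samples (assumptions for illustration only).
rng = random.Random(0)
symmetric = [rng.gauss(5.0, 1.0) for _ in range(2000)]  # bell-shaped
skewed = [rng.expovariate(1.0) for _ in range(2000)]    # long right tail

print("symmetric sample skewness:", round(sample_skewness(symmetric), 2))
print("skewed sample skewness:   ", round(sample_skewness(skewed), 2))
```

A skewness far from zero is a hint, not proof, that a test assuming normality may be a poor fit for the data.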

"A t-test is a statistical examination of two population means. A two-sample test examines whether two samples are different, and it is commonly used when the variances of two normal distributions are unknown, and when an experiment uses a small sample size," he said. "[F]or one of our interventions, we got p=0.03 using the t-test. On the other hand, we get a p=0.06 when we do a non-parametric analysis that doesn't assume that the data is normal." At the conventional 0.05 significance threshold, the first result would be called statistically significant and the second would not, so the choice of test alone changes the conclusion.
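
The article does not say which non-parametric analysis Greenberg used. As one possible illustration, the sketch below implements a permutation test, a common non-parametric technique that makes no normality assumption: it repeatedly shuffles the group labels and counts how often a mean difference at least as large as the observed one appears by chance. The function name `permutation_test`, its parameters, and the synthetic "mood score" samples are all assumptions made for this example.

```python
import random
import statistics


def permutation_test(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns an estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate from being exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Synthetic, right-skewed "mood scores" (hypothetical data, not from the article).
data_rng = random.Random(1)
control = [data_rng.expovariate(1.0) for _ in range(30)]
treated = [data_rng.expovariate(0.7) for _ in range(30)]

print("permutation p-value:", permutation_test(control, treated))
```

To reproduce the kind of comparison described in the quote, the same two samples could also be passed to a parametric test such as SciPy's `scipy.stats.ttest_ind`, or to a rank-based test such as `scipy.stats.mannwhitneyu`, and the resulting p-values set side by side.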

(Image: Geralt via Pixabay)
