7 Common Biases That Skew Big Data Results - InformationWeek

7/9/2015
08:06 AM
Lisa Morgan

Flawed data analysis leads to faulty conclusions and bad business outcomes. Beware of these seven types of bias that commonly challenge organizations' ability to make smart decisions.
Simpson's Paradox

A trend that appears within separate groups of data can reverse when the groups are combined. This is called Simpson's Paradox. It is one reason medical findings and other types of research report one result and later report the opposite. It is also one reason why seemingly successful marketing campaigns prove not to be successful after all.
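
To make the reversal concrete, here is a minimal Python sketch. The campaign names, customer segments, and conversion counts are invented purely for illustration and do not come from Kotorov or any real campaign. Campaign A converts better within each segment, yet campaign B looks better once the segments are pooled, because B was sent mostly to the segment that converts well anyway.

```python
# Hypothetical counts, invented for illustration only:
# (conversions, customers contacted) per campaign and customer segment.
results = {
    "A": {"returning": (90, 100), "new": (300, 1000)},
    "B": {"returning": (800, 1000), "new": (20, 100)},
}


def rate(conversions, contacted):
    """Conversion rate as a fraction between 0 and 1."""
    return conversions / contacted


def segment_rates(campaign):
    """Conversion rate of one campaign within each segment."""
    return {seg: rate(*counts) for seg, counts in results[campaign].items()}


def overall_rate(campaign):
    """Conversion rate of one campaign with all segments pooled."""
    conversions = sum(c for c, _ in results[campaign].values())
    contacted = sum(n for _, n in results[campaign].values())
    return rate(conversions, contacted)


for campaign in results:
    print(campaign, segment_rates(campaign), round(overall_rate(campaign), 3))
```

With these made-up numbers, A wins among returning customers (0.90 versus 0.80) and among new customers (0.30 versus 0.20), but the pooled rates are roughly 0.35 for A and 0.75 for B. The segment sizes, not the campaigns, drive the combined figure.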

"The most common bias in data analysis is called the Simpson's Paradox," said Rado Kotorov, chief innovation officer at business intelligence and analytics provider Information Builders. "It's important to realize with big data, using descriptive statistics or just data visualization can lead to bias and wrong decisions. The data analysts need to know when to evaluate the trends statistically to determine that the trend is real and also that the factors that contribute to the trend are significant and not random."
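
One common way to check whether a difference between two rates is likely real rather than random is a two-proportion z-test. The sketch below is an assumption about what such a check could look like, not the method Kotorov or Information Builders used. The function name and the example counts are hypothetical, and it uses only the Python standard library.

```python
from math import erfc, sqrt


def two_proportion_z_test(successes_1, n_1, successes_2, n_2):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value). Uses the pooled-proportion standard error,
    which assumes reasonably large samples in both groups.
    """
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    if std_err == 0:
        raise ValueError("no variation in the pooled sample; test undefined")
    z = (p_1 - p_2) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical example: 90 of 100 conversions versus 50 of 100.
z, p_value = two_proportion_z_test(90, 100, 50, 100)
print(f"z = {z:.2f}, p = {p_value:.2g}")
```

A small p-value suggests the gap is unlikely to be chance alone. It does not rule out Simpson's Paradox, so the same test is worth running within each segment as well as on the pooled data.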

A sales and marketing campaign may result in little or no ROI when the customer incentives are based on faulty conclusions. Kotorov tried warning a former employer that more analysis was necessary to validate a trend upon which a marketing campaign was based, but the warning was ignored.

"Instead of driving sales up and increasing the profit, [the campaign] drove margin down and increased the loss. I couldn't convince anyone to take the time to investigate the trend and validate whether it was correct or wrong. When the testing control group measurements came, that's when we had to look at it and see what happened, and we found out that the trend was the wrong trend."

Today's marketers use marketing analytics tools together with multivariate testing to slice and dice data, which produces results at different levels of aggregation. However, averages computed at any one level can be misleading.

"The typical fallacy is if you do things at a fine level of aggregation, but do not find contradictions immediately, then you'll follow your instinct that the trend is a valid trend," Kotorov said.

(Image: Geralt via Pixabay)
