7 Common Biases That Skew Big Data Results - InformationWeek


Lisa Morgan  |  Slideshows  |  7/9/2015 08:06 AM

Flawed data analysis leads to faulty conclusions and bad business outcomes. Beware of these seven types of bias that commonly challenge organizations' ability to make smart decisions.
7 of 8

Confounding Variables

Sometimes a perceived relationship between two variables proves partially or entirely false because a confounding variable has been omitted, often simply because it was overlooked.

"It could be that different populations are collected or reported differently or by different people, a causal variable that affects the behavior of each population, or an inherent quality that leads to autocorrelation," said Metrocosm's Max Galka.

Schleicher once worked on a survey that asked respondents which credit card brands they would consider. Over a three-year period, the data indicated that the consideration numbers for one credit card company nearly doubled, while those of several other companies remained flat. The obvious conclusion turned out to be the wrong conclusion.

"A confounding variable is that cardholders have higher consideration for their current credit card companies than people who are not customers," said CenturyLink's Schleicher. "The company had gone through several M&As, and their portfolio had grown enormously over that three-year period. They hadn't improved their consideration, or customer experience, or how their customers valued them. They just acquired more customers through portfolio mergers.

"If you exploded it out by customers and non-customers, you'd see that non-customer consideration was flat and the only thing that had changed was their market share. It's easy to look at a couple of variables together. You're looking at correlations, you plot things, you see a pattern that looks promising, and you have to ask yourself whether it's the relationship or something else explaining it."

(Image: ChadoNihi via Pixabay)
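The segment split Schleicher describes can be sketched in a few lines of Python. The numbers below are hypothetical, chosen only to reproduce the pattern in the anecdote: per-segment consideration rates stay constant, yet the blended rate nearly doubles because the customer share (the confounder) grows through acquisitions.

```python
# Hypothetical illustration of the confounding described above: the
# aggregate "consideration" rate nearly doubles between surveys, even
# though nothing changes within either segment -- only the share of
# respondents who are already customers (the confounder) grows.

def consideration(customers, non_customers, cust_rate=0.60, non_cust_rate=0.10):
    """Blended consideration rate across a survey population."""
    total = customers + non_customers
    return (customers * cust_rate + non_customers * non_cust_rate) / total

# Year 1: 10% of respondents are cardholders.
year1 = consideration(customers=1_000, non_customers=9_000)

# Year 3: portfolio mergers have more than tripled the customer base;
# the per-segment rates (60% vs. 10%) are completely unchanged.
year3 = consideration(customers=3_500, non_customers=6_500)

print(f"Year 1 blended rate: {year1:.1%}")  # 15.0%
print(f"Year 3 blended rate: {year3:.1%}")  # 27.5%
```

Plotting or tabulating the rate per segment before comparing years, as Schleicher suggests, exposes the mix shift immediately: both segment rates are flat, and only their weights have moved.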

Comments
LisaMorgan (Moderator), 7/25/2015 11:23:02 AM
Re: Selection Bias
I thought about including that but it is a subtype of confirmation bias.  Also, the cognitive biases are far more familiar to the general population than the others.

Selection bias is an issue as you say.
shamika (Ninja), 7/24/2015 11:17:51 PM
Selection Bias
In my opinion, selection bias should have been included. It is especially important when working with an entire population rather than a sample, though accuracy remains a concern.
LisaMorgan (Moderator), 7/19/2015 1:27:05 PM
Re: Confirmation bias is the big one
When I've written for business audiences only, I've avoided the term "confirmation bias" and instead endeavored to get them to understand the difference between assumptions and hypotheses. If you assume, you've baked in what you believe to be true without proof. A hypothesis is tested - proven or disproven. In other words, be prepared to be wrong; embrace that and learn from it. However, it is very common to cherry-pick questions, engineer survey questions, and dismiss anything that does not support what one set out to prove.

Confirmation bias isn't always intentional, however. It's sometimes done unconsciously. And that's what people who are genuinely concerned about confirmation bias fear in their own work. Working collaboratively with others, exchanging ideas, and comparing results in an open environment is good - ideally when everyone is not out to prove the same point, such as that Brand X cola with a zillion grams of sugar is a viable form of health food. :-)

 
jries921 (Ninja), 7/19/2015 1:15:47 AM
Re: Confirmation bias is the big one
As Mark Twain pointed out long ago, we all have axes to grind. That being the case, the wise and honest thing to do is to recognize one's own biases and try to correct for them. Free and open discussion tends to promote this end, which is why those most wedded to their own ideas or the orthodoxies of their ingroups will often try to squelch it (unless, of course, they're committed to the concept), or simply withdraw to their own comfortable little groups (caucuses?) so they're not subject to the discomfort of cognitive dissonance. And if partisan or other factional politics start playing a prominent role, it can be very difficult to reach any sort of reasonable consensus (witness what has happened to macroeconomics since the 1980s, or constitutional law since forever).

Another issue is that those most wedded to their ideas are often the very people most motivated to see where they lead. This can actually be a good thing, as the necessary research will often be very hard work that the "impartial inquirer after Truth" (unless he is particularly diligent) might well seek to avoid. Would Charles Darwin have chosen evolutionary biology as his field of research if his grandfather had not been an early proponent of the concept? I suspect not. If such people can resist the urge to cheat, or perhaps can be persuaded to collaborate with those much more skeptical so they can "keep each other honest," then they can do very good work indeed.

It is natural for scientific journals and their readers to be much more interested in successes than failures (I don't expect that to ever change), and pride makes it difficult to let go of one's life's work when it reaches a dead end (perhaps this is part of why so many breakthroughs are made by younger researchers); but even experimental failures will often lead to discovery if one is willing to consider the implications (thus, failure can lead to success). A classic example was the "failed" 1887 Michelson–Morley experiment, which was a large part of the experimental basis for Einstein's Special Theory of Relativity.
LisaMorgan (Moderator), 7/18/2015 7:14:16 PM
Re: Confirmation bias is the big one
Confirmation bias is everywhere. Sometimes it's deliberate, sometimes it isn't. A data scientist and a machine learning specialist have been trying to get me to talk about the issue in terms of scientific journals because, they say, scientific journals only publish positive results. When it comes to healthcare, confirmation bias can be very dangerous.
jries921 (Ninja), 7/18/2015 3:50:07 PM
Confirmation bias is the big one
I've long thought that the most effective propaganda is that which tells people what they already think; which is why partisan talk radio excels at radicalizing those who already believe, but does little or nothing to persuade people who don't (if anything, the "shock jock" commentary turns them off).  Confirmation bias is also a very good reason to be skeptical of any research sponsored by a for-profit corporation; as even if the researchers and sponsors are working in good faith (a dubious proposition in an era in which profit maximization is widely thought to trump all other considerations), vendors will tend to disbelieve any research that makes their products look bad (few people want to believe they're selling junk).

 