Data Outliers: 10 Ways To Prevent Big Data Damage - InformationWeek

4/18/2016
07:06 AM
Lisa Morgan

Most business decision-makers aren't trained to understand data outliers, but they can learn the basics. Executives, managers, and employees without math degrees can ask smarter questions about analyses they're basing crucial judgments on. Here are some things to know.

An Anomaly Should Be Investigated 

If the outlier is an anomaly (i.e., an unlikely but genuine piece of data rather than a mistake), removing it as a matter of course may be unwise. Sometimes anomalies indicate the beginning of a trend or something else that should be investigated. For example, some diseases are rare or nonexistent in some parts of the world, but an isolated case or a cluster of cases may nevertheless appear as the result of a single person's exposure.

"You shouldn't blindly assume that outliers are errors. Sometimes they are what you are looking for. For example, in fraud detection or cyber-security applications, outliers, or anomalies, might signal undesirable activity, and are themselves of interest," said Vadim Bichutskiy, director of data science at Innovizo, a data analytics and technology solutions consulting company, in an interview.

When an outlier is an anomaly, rather than the result of a mistake, it should be investigated.
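
For readers who work with analysts, the idea of flagging outliers for review instead of deleting them can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not name a detection method, so the interquartile-range rule (with the conventional multiplier of 1.5) is used here as one common choice, and the function name `flag_outliers` and the transaction figures are hypothetical.

```python
from statistics import quantiles


def flag_outliers(values, k=1.5):
    """Return (index, value) pairs outside the IQR fences.

    The input list is left untouched: flagged points are reported
    for investigation, not removed. The multiplier k=1.5 is a common
    convention, assumed here rather than taken from the article.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]


# Hypothetical daily transaction counts with one unusual day.
daily_transactions = [102, 98, 105, 97, 101, 99, 103, 100, 940, 104]

for index, value in flag_outliers(daily_transactions):
    print(f"Review record {index}: value {value} is outside the expected range")
```

Whether a flagged value turns out to be an error or a genuine anomaly is still a judgment for someone who knows the data; the sketch only makes sure the question gets asked.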
