Data Outliers: 10 Ways To Prevent Big Data Damage - InformationWeek

4/18/2016
07:06 AM
Lisa Morgan

Most business decision-makers aren't trained to understand data outliers, but they can learn the basics. Executives, managers, and employees without math degrees can ask smarter questions about analyses they're basing crucial judgments on. Here are some things to know.
An Anomaly Should Be Investigated

If the outlier is an anomaly (i.e., an unlikely but genuine data point rather than a mistake), removing it as a matter of course may be unwise. Anomalies sometimes mark the start of a trend or point to something else that deserves investigation. For example, a disease that is rare or non-existent in one part of the world may still appear there as an isolated case or a cluster of cases after a single person's exposure.

"You shouldn’t blindly assume that outliers are errors. Sometimes they are what you are looking for. For example, in fraud detection or cyber-security applications, outliers, or anomalies, might signal undesirable activity, and are themselves of interest," said Vadim Bichutskiy, director of data science at Innovizo, a data analytics and technology solutions consulting company, in an interview.

When an outlier is an anomaly, rather than the result of a mistake, it should be investigated.
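
As a rough illustration of flagging outliers for review instead of deleting them, here is a minimal Python sketch. The article does not prescribe a method; the interquartile-range (Tukey fence) rule, the function name flag_outliers, the multiplier k, and the sample transaction amounts are all assumptions made for this example.

```python
from statistics import quantiles


def flag_outliers(values, k=1.5):
    """Pair each value with True/False: is it outside the IQR fences?

    Nothing is removed; every input value appears in the result.
    """
    # Quartiles of the data (Python 3.8+); the middle cut point is unused.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(v, v < low or v > high) for v in values]


# Hypothetical daily transaction amounts, invented for illustration only.
amounts = [42.0, 38.5, 45.0, 40.0, 41.5, 39.0, 980.0, 43.0]

flagged = flag_outliers(amounts)

# Values to hand to an analyst for investigation, not to discard.
to_review = [v for v, is_outlier in flagged if is_outlier]
print(to_review)
```

The flagged values stay in the data set alongside everything else, so an analyst can decide whether each one is a mistake or a genuine anomaly worth following up.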
