11/27/2015
11:05 AM

Big Data & The Law Of Diminishing Returns

Right now, big data looks like it holds the answer to any question a person or company might have. In reality, big data eventually has to succumb to the law of diminishing returns. Here's what several experts see as the gap between promise and reality.

At the heart of big data is the search for "insight" -- some correlation or finding that eludes the seeker until he or she adds another terabyte or 10 of data, just in case it is lurking there.

At a certain point, the law of diminishing returns has to kick in. Adding another 100TB becomes redundant.
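
As an illustration that does not come from the article or the experts quoted in it, consider the simplest possible analysis: estimating an average from independent observations. Under that assumption, the standard error of the estimate falls as 1/sqrt(n), so each doubling of the data buys less precision than the previous one. The function names and the value of `sigma` below are hypothetical, chosen only for this sketch.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of a sample mean of n independent observations."""
    return sigma / math.sqrt(n)


def gain_from_doubling(sigma: float, n: int) -> float:
    """Reduction in standard error obtained by doubling n."""
    return standard_error(sigma, n) - standard_error(sigma, 2 * n)


if __name__ == "__main__":
    sigma = 1.0  # assumed spread of a single observation (illustrative)
    for n in (1_000, 1_000_000, 1_000_000_000):
        # The gain from doubling shrinks as the data set grows.
        print(n, standard_error(sigma, n), gain_from_doubling(sigma, n))
```

Real analyses are rarely this clean, but the shape of the curve is the point: the same amount of extra data helps a small data set far more than a large one.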

Vendors exercise their right to remain silent when asked, "How much data is too much?"

Big data skeptics don't have a precise answer, either. But they are more likely to speak of the limitations of big data than to shout out its promise. They act as the agnostics questioning the IT theology of the Hadoop evangelists.

Know What You Don't Know

For Cathy O'Neil, who holds a doctorate in mathematics from Harvard University and has worked in academia and the private sector, the issue is less about the law of diminishing returns and more about people not understanding the data.

The technology is "encouraging people to use algorithms they don't understand," O'Neil said in an interview. "You don't need a lot of data [to not] know what you are doing."

O'Neil's skepticism is well grounded in her experiences, since her career wound its way through academia, to Wall Street, and then to the New York City startup scene. She has seen the plain gap between the technologists who craft the algorithms and the business people who rely on them.

Data is just a way of codifying information, O'Neil explained. Any data gathered should be relevant to the problem at hand; otherwise, useless data clouds the results of a query.

(Image: Shivendu Jauhari/iStockphoto)

"If there are too many degrees of freedom, you are begging for a spurious correlation," O'Neil said.

"A good data scientist is a data skeptic and is pushing against group think," O'Neil continued. "Know what you don't know. It's hard." The business side "wants to come out with positive news," she continued. "What if you are wrong? Do we have a backup plan? Can we test against ground truth?"

O'Neil is in good company as she tries to temper expectations for big data.

Big Data Is Relative

"My joke is that the biggest innovation [in big data] was when Excel moved from 64,000 to 1 million rows," quipped Caribou Honig, a founding partner of venture capital firm QED Group, which is based in Alexandria, Va.

There are uses for big data in fields like genomics, Honig noted. But there are "tons of high impacts that companies can drive from small data techniques," he said. "Big data methods are substituting for actually thinking through the problem."

"I'd rather have five orthogonal modest data sets than one ginormous data set along a single axis," Honig added. That is where the law of diminishing returns kicks in.

Like any buzzword, big data passes through the stages of the Gartner hype cycle: Promise, excitement, oversell, trough of disillusionment, then some discovery of practical usage, Honig said. People are filtering out the promises, and "using big data to make a difference, not because we can."

But how big is big data?

It depends on whom you ask. IDC reported on November 24 that business analytics spending would reach $58.6 billion by the end of the year and grow to $101.9 billion by 2019.

However, in Honig's view, the big data of five years ago is not big data today. New tools and techniques are making it possible to analyze big data sets that were simply too big five years ago. "The goal is constantly moving," he said.

Speak Softly and Know Your Stuff

There are three constraints to big data: It is hard to use, it is hard to find, and it is hard to find people who have the skills and judgment to use it, observed Andrew Horne, practice leader at CEB, a "best practice" insight and technology company.

About 62% of all people who work on big data solutions lack the skills and judgment to use the data, Horne said. It's like giving an unlicensed driver a powerful sports car -- the person doesn't know how to drive, and the extra horsepower is wasted.

CEB surveyed about 5,000 employees at 30 companies several years ago to arrive at that finding, Horne said. Since then, any gain in knowledge by the pool of users has been offset by the increasing complexity of the tools being developed to wrangle big data.

"There needs to be something in between," Horne said. There has to be enough confidence in the data, but also the ability to step back and use judgment. It is also difficult to find such people, since the search is a task that falls between the departmental cracks of the typical corporation, he continued.

Adding to the problem is the gap between the trainers and the trainees.

"When you bring in a new big data tool, you need to bring in the people on the data as well," Horne said.

Data scientists should bridge the gap between the vendors and the users, because they know their way around the data. How the data is accessed determines how much added value, and how good a result, comes out of it, Horne continued.

Data is not always in one place and may not be labeled consistently.

The challenge is to understand the quality of the data and determine what can be done with it. "You are helping people find data," Horne said.

William Terdoslavich is an experienced writer with a working understanding of business, information technology, airlines, politics, government, and history, having worked at Mobile Computing & Communications, Computer Reseller News, Tour and Travel News, and Computer Systems ...

Comments
pfretty494,
User Rank: Moderator
12/7/2015 | 2:03:36 PM
Returns don't have to be fleeting
As long as your use of big data continues to grow with your maturity level there is no reason why the returns would have to be fleeting. Even in many of the touted case studies, most businesses are only using a fraction of the potential within analytics. It really depends on organizations being open to the potential. Peter Fretty, IDG blogger for SAS
kstaron,
User Rank: Ninja
11/30/2015 | 9:03:14 AM
3 lies of the world
A professor once told me there are 3 types of lies in this world: white lies, damnable lies, and statistics. Poor use of data can lead to a stack of lies your company ends up relying on. You might find correlations, but until you've used proper judgment you don't know if those are causal or both resulting from some unmeasured metric. Be careful with what you do based on collected data until you really know what you do know (and what you don't).
Broadway0474,
User Rank: Ninja
11/29/2015 | 9:32:36 PM
Re: Like anything else that is new
Gary, any higher ed institution trying to find its purpose in the post-MOOC world ought to really consider what you said. They can't crank out enough PhDs. But they could create specialty MBA or master's of science programs to fill the void of data-driven decision makers.
Gary_EL,
User Rank: Ninja
11/29/2015 | 6:58:30 PM
Re: Like anything else that is new
I'm certainly no data scientist myself, but I do some work researching general Internet of Things (IoT) topics. Businesses are now or soon will be deluged with a VAST amount of information from many, many sources. There probably aren't enough PhD mathematicians in the world to do that data justice. It is almost inevitable that poorly qualified individuals will end up trying to make sense out of that data, and do a poor job at it.
Broadway0474,
User Rank: Ninja
11/28/2015 | 9:59:29 PM
Re: Like anything else that is new
Wow, a lot of good nuggets of information in this article. Well done. For the 62 percent of people working on big data who don't know what they're doing ... are these people with the title "data scientist," or are they the managers getting the reports from the big data algorithms? If it's the former, that's even scarier.
Gary_EL,
User Rank: Ninja
11/27/2015 | 9:02:18 PM
Like anything else that is new
This doesn't surprise. Business people can be like children with a new toy - they just want to open the box and play. Never mind about reading the instructions, or allowing someone else to interpret them and read them to you. The good news is that unlike that new toy, big data won't be destroyed. But a lot of time will be wasted and much frustration will ensue.