11/19/2013
08:06 AM

5 Big Data Myths, Busted

Be right, not fast. Think business, not IT. Don't worry about dirty data. A big data guru shares contrarian advice about the worst whoppers.

Forget about the cautions and clichés, the easy generalizations and the dark warnings. The big data domain is rife with persistent myths that block progress -- or worse, get people headed in the wrong direction.

Taking on some of the big data realm's worst whoppers, Arnab Gupta, CEO of analytics platform provider Opera Solutions, insists that successful big data projects aren't boil-the-ocean, years-long IT infrastructure projects in the mold of data warehouse deployments. Rather, they must be focused business initiatives. Here's his take on five leading myths.

Myth 1: Big data is the next paradigm, and if you don't make a change right away you'll get left behind.
Not so fast, says Gupta. It's this kind of thinking that gets people deploying Hadoop clusters and stockpiling data before they have any idea what they want to do with the information.

"The problem with first vs. last thinking is that you assume that if you're first, you're going to get a competitive advantage, but that won't be the case if you don't focus on business results that will give you a business advantage," Gupta explains.


Many big data initiatives seem to be experiments because people just aren't used to working with big volumes and varieties of data. By starting with a specific known problem, you'll reduce the amount of change management and pioneering required to get to a big data breakthrough.

Arnab Gupta, CEO, Opera Solutions

Myth 2: Big data is an IT problem.
Closely related to Myth 1, this kind of thinking can get you in trouble. The danger in starting with IT experimentation is ending up with boil-the-ocean IT infrastructure projects. Avoid the trap of "build it and they will come" thinking.

"Most of the investments in big data projects have gone into information management infrastructure. If you start with the business use case, you may still be investing in infrastructure, but it will be for precisely the tools you need to solve a specific business need," Gupta says.

Myth 3: Our data is so messed up we can't possibly master big data.
There's no doubt that enterprise data is often flawed, but data quality, master data management, and data governance tools have made it easier to clean up the mess. "The huge investments companies have made in data management are now paying massive dividends," Gupta says.

Where companies used to have to invent tools and come up with data management, data analysis, and data visualization systems on their own, they can now turn to packaged applications on all fronts. These tools have made it far easier to capture, clean, manage, and analyze information. So don't let fear of bad data become a mental stumbling block.

Comments
D. Henschen
User Rank: Author
11/19/2013 | 9:45:09 AM
Use the data you've already captured
Arnab Gupta makes the point that many successful big data deployments are taking advantage of data companies have already captured but aren't using. CRM comment fields about customers, for example, are easier to crack than the fire hoses of comments out there on social networks.
J_Brandt
User Rank: Ninja
11/19/2013 | 12:28:19 PM
Bite the Bullet
There is data that companies have "forgotten" they are capturing. With a little work it can be related to other data or analyzed in ways that make it truly valuable. The bad/dirty data issue is just one that people have to bite the bullet on. It's not going to get any better (or smaller) by waiting.
HM
User Rank: Strategist
11/19/2013 | 11:12:33 PM
Big Data.
Doug, great insight into these myths. We are seeing an increase in businesses seeking specialized skills to help address challenges that arose with the era of big data. The HPCC Systems platform from LexisNexis helps to fill this gap by allowing data analysts themselves to own the complete data lifecycle. Designed by data scientists, ECL is a declarative programming language used to express data algorithms across the entire HPCC platform. Its built-in analytics libraries for Machine Learning and BI integration provide a complete integrated solution from data ingestion and data processing to data delivery. HPCC Systems provides proven solutions to handle what are now called Big Data problems, and has been doing so for more than a decade. More at http://hpccsystems.com