6/24/2013
10:56 AM

Hadoop: From Experiment To Leading Big Data Platform

The sixth annual Hadoop Summit, held this week in Silicon Valley, will highlight Hadoop's evolution from backroom science project to mainstream big data manager.

Apache Hadoop has come a long way since the first Hadoop Summit took place in 2007. From its humble origins as a promising open-source framework for managing data-intensive distributed applications, Hadoop has mushroomed into the leading big data platform, one doing real work at Fortune 500 corporations.

This year's Hadoop Summit, co-sponsored by Yahoo and Hortonworks, takes place June 26-27 in San Jose, Calif. The 2-day event is expected to draw 2,500 to 3,000 attendees and will feature more than 90 breakout sessions on all things Hadoop, according to John Kreisa, vice president of strategic marketing for Hortonworks.

"I've been working with the technology for three or four years now, and over that time Hadoop has gone from the experimental, 'We've got a test cluster set up,' to 'OK, here's what we're going to do with it,'" Kreisa told InformationWeek.


The theme of this year's conference is Hadoop's "maturation," spotlighting the platform as a key component of the next generation of data architectures. "Effectively, Hadoop has matured now as a technology such that mainstream enterprises are using it for a wide variety of workloads," Kreisa said. Summit attendees will hear presentations from major corporations, including Cardinal Health, Home Depot, and Kohl's, that are using Hadoop for real workloads.

Despite Hadoop's growing popularity in the enterprise, however, it has its shortcomings, most notably a reputation for being difficult to use. There's also the problem of what to do with all that big data once you've collected it.

As InformationWeek's Doug Henschen writes, "In contrast to NoSQL, Hadoop seems to be getting all the credit it deserves and then some. By many accounts, it's the be-all and end-all of big data, despite the fact that the lion's share of deployments today are little more than digital landfills."

Kreisa counters that "digital landfill" is an interesting analogy, but not one that represents what he's seeing in the enterprise. "The term that we hear companies using, large financial services and telecommunications (firms), is 'data lake' or 'data reservoir,'" he said, adding that these organizations are able to "spin out" new analytic applications based on the data they're collecting.

Kreisa does acknowledge, however, that Hadoop has "a few rough edges that need to be sanded off," particularly in the areas of deployment and manageability. "These things continue to evolve," he said. "Hadoop is a large distributed system with lots of moving parts. A modern Hadoop platform will have 10 or 12 open-source projects as subcomponents."

Hadoop is arguably the best-known and most widely used big data management platform, but it certainly isn't the only option for enterprises. Should its proponents be worried?

"I don't see any serious competitors to Hadoop," Kreisa said. "There are lots of other technologies that fill different workload components, and part of it comes down to the underlying file system."

He continued, "Generally speaking, HDFS, the Hadoop Distributed File System, has almost really won the battle. If you look at other architectures, where people may try to replace the query engine on top of it … HDFS is still the underlying place where that data is coming to rest."

There's still a significant need for Hadoop training, Kreisa added, which in part is what this week's Summit is all about. "There needs to be growth in skills, because again, it's a complex distributed storage system that's not like the other things that people are using today."

