Hadoop: From Experiment To Leading Big Data Platform - InformationWeek

6th annual Hadoop Summit, held this week in Silicon Valley, will highlight Hadoop's evolution from backroom science project to mainstream big data manager.

Apache Hadoop has come a long way since the first Hadoop Summit took place in 2007. From its humble origins as a promising open-source framework for managing data-intensive distributed applications, Hadoop has mushroomed into the leading big data platform, one doing real work at Fortune 500 corporations.

This year's Hadoop Summit, co-sponsored by Yahoo and Hortonworks, takes place June 26-27 in San Jose, Calif. The 2-day event is expected to draw 2,500 to 3,000 attendees and will feature more than 90 breakout sessions on all things Hadoop, according to John Kreisa, vice president of strategic marketing for Hortonworks.

"I've been working with the technology for three or four years now, and over that time Hadoop has gone from the experimental, 'We've got a test cluster set up,' to 'OK, here's what we're going to do with it,'" Kreisa told InformationWeek.

The theme of this year's conference is Hadoop's "maturation," spotlighting the platform as a key component of the next generation of data architectures. "Effectively, Hadoop has matured now as a technology such that mainstream enterprises are using it for a wide variety of workloads," Kreisa said. Summit attendees will hear presentations from major corporations, including Cardinal Health, Home Depot, and Kohl's, that are using Hadoop for real workloads.

Despite its growing popularity in the enterprise, Hadoop has its shortcomings, most notably a reputation for being difficult to use. There's also the problem of what to do with all that big data once you've collected it.

As InformationWeek's Doug Henschen writes, "In contrast to NoSQL, Hadoop seems to be getting all the credit it deserves and then some. By many accounts, it's the be-all and end-all of big data, despite the fact that the lion's share of deployments today are little more than digital landfills."

Kreisa counters that "digital landfill" is an interesting analogy, but not one that represents what he's seeing in the enterprise. "The term that we hear companies using, large financial services and telecommunications (firms), is 'data lake' or 'data reservoir,'" he said, adding that these organizations are able to "spin out" new analytic applications based on the data they're collecting.

Kreisa does acknowledge, however, that Hadoop has "a few rough edges that need to be sanded off," particularly in the areas of deployment and manageability. "These things continue to evolve," he said. "Hadoop is a large distributed system with lots of moving parts. A modern Hadoop platform will have 10 or 12 open-source projects as subcomponents."

Hadoop is arguably the best-known and most widely used big data management platform, but it certainly isn't the only option for enterprises. Should its proponents be worried?

"I don't see any serious competitors to Hadoop," Kreisa said. "There are lots of other technologies that fill different workload components, and part of it comes down to the underlying file system."

He continued, "Generally speaking, HDFS, the Hadoop Distributed File System, has almost really won the battle. If you look at other architectures, where people may try to replace the query engine on top of it … HDFS is still the underlying place where that data is coming to rest."

There's still a significant need for Hadoop training, Kreisa added, which in part is what this week's Summit is all about. "There needs to be growth in skills, because again, it's a complex distributed storage system that's not like the other things that people are using today."

