Hadoop: Pros And Cons For Enterprise Users
There's been plenty of buzz about how Hadoop can store, process, and analyze huge files and large volumes of files like no other technology before it. But is this open source distributed big data system really suited for enterprises? We look at the pros and cons.
![](https://eu-images.contentstack.com/v3/assets/blt69509c9116440be8/bltbbdec6ef1b6304bf/64cb4343128cb449628b5b21/Slide1_Intro_Hadoop-Pros-and-Cons_Sergey-Nivens_iStock_44347928_LARGE.png?width=700&auto=webp&quality=80&disable=upscale)
Hadoop celebrated a big birthday this year. The technology that was incubated inside Yahoo to handle and analyze large volumes of data in an economical, distributed way is now 10 years old, and an entire ecosystem of complementary technologies has grown around Hadoop, enabling faster processing, real-time data, and more.
Hadoop emerged as a way to fill a real need that arose in organizations -- to collect, process, and analyze large volumes of unstructured data. But enterprises are always rightfully cautious in their adoption of new technologies, and Hadoop adoption is no exception to this rule.
A 2015 survey by Gartner showed that enterprise investment in Hadoop remains tentative as organizations grapple with the business value of the technology and the lack of skills available in the market.
A more recent research report, Hadoop in Transition: From Proof-of-Concept to Production, released by DecisionWorx in July 2016, along with an accompanying webinar, cites some early enterprise success stories with Hadoop, but notes that "we are still in the early stages of Hadoop adoption."
Yet that adoption may prove to be a competitive edge for enterprises that embrace Hadoop. The Forrester Wave: Big Data Hadoop report, released in Jan. 2015, states that Hadoop "thoroughly disrupts the economics of data, analytics, and data driven applications. Enterprise adoption is mandatory for firms that wish to double-down on advanced analytics and create insights-driven applications to help them succeed in the age of the customer."
Those are strong words.
But the DecisionWorx survey of approximately 260 individuals, including IT pros, consultants, analysts, and engineers, revealed that only 140 of their organizations had some level of involvement with Hadoop. More than one-third of respondents said that their organizations did not use Hadoop and had no plans to do so.
For enterprises that may be evaluating a pilot of Hadoop, what are some of the pros and cons of the technology? We take a look at some important ones to consider.
Apache Hadoop is an open source technology, which means that organizations can download and implement the software without having to pay for licenses. That makes it easier, cost-wise, for firms to experiment with this tech without making the kind of commitment required when investing in expensive enterprise software. This technology can also run on cost-effective commodity hardware, or in the cloud. For production work, enterprises may want to work with a Hadoop distribution company, such as Cloudera, Hortonworks, or MapR, that can provide support for the technology and offer additional tools, too.
Hadoop was created to handle large volumes of data, so naturally that is one of the benefits of this technology. Hadoop enables the storage of files bigger than what can be stored on a particular node or server. And Hadoop enables the storage of many, many files, according to Mike Gualtieri, a principal analyst at Forrester Research.
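To make the "bigger than any one node" point concrete, here is a minimal Python sketch of the general idea: a file is cut into fixed-size blocks and the blocks are spread across machines. It is an illustration under stated assumptions, not Hadoop's actual code. The function name `plan_blocks` and the round-robin placement are hypothetical simplifications, and replication is ignored entirely. The 128 MB figure is the default HDFS block size in Hadoop 2.x, which administrators can change.

```python
BLOCK_SIZE_MB = 128  # default HDFS block size in Hadoop 2.x; configurable


def plan_blocks(file_size_mb, node_free_mb, block_size_mb=BLOCK_SIZE_MB):
    """Assign the blocks of one file to nodes, round-robin (hypothetical helper).

    file_size_mb  -- size of the file to store, in MB
    node_free_mb  -- free space on each node, in MB
    Returns a dict mapping node index -> list of block sizes (MB) placed there.
    Raises ValueError when no node has room for the next block.
    """
    free = list(node_free_mb)  # copy so the caller's list is not modified
    placement = {i: [] for i in range(len(free))}
    remaining = file_size_mb
    node = 0
    while remaining > 0:
        # Every block is full-size except possibly the last one.
        block = min(block_size_mb, remaining)
        # Look at each node at most once for one with enough free space.
        for _ in range(len(free)):
            if free[node] >= block:
                break
            node = (node + 1) % len(free)
        else:
            raise ValueError("no node has room for the next block")
        placement[node].append(block)
        free[node] -= block
        remaining -= block
        node = (node + 1) % len(free)  # move on so blocks spread out
    return placement


# A 1,000 MB file on three nodes that each have only 400 MB free:
# no single node could hold it, but the cluster as a whole can.
placement = plan_blocks(1000, [400, 400, 400])
for node_id, blocks in placement.items():
    print(f"node {node_id}: {len(blocks)} blocks, {sum(blocks)} MB")
```

In a real cluster, HDFS also keeps multiple copies of each block and decides placement centrally, so treat the sketch only as a picture of why file size is not capped by the capacity of a single server.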
Much of the data that has been stored by enterprise organizations in the decades before now has been structured data. But the Internet of Things (IoT), social media data streams, and other new sources of data have created a need to accommodate large volumes of unstructured data. The relational database management systems of the past just weren't built to accommodate this kind of data. Hadoop was built with this new kind of data in mind.
Hadoop and the big data ecosystem that has grown up around it are made up of a set of continuously evolving technologies, according to David Loshin, principal consultant with Knowledge Integrity and an author of the DecisionWorx survey and report. "You can turn around and there are parts of the ecosystem that weren't there three weeks ago," he told InformationWeek in an interview.
"What are they? Do they improve things, or is it just another someone else trying to get a foot in the door?" he said in reference to the open source project. It's tough for an enterprise to determine the answer to these questions. But some of this negative aspect of Hadoop has been resolved by distribution companies that offer stacks of tools that are tested together. These distribution vendors are in a better position to ensure that one new version of a certain tool doesn't break the functionality of all the other tools in the stack.
The technology is only 10 years old, and it's not that easy to use, so it's no big surprise that it's tough to find people skilled in Hadoop and big data technologies. Loshin notes that Hadoop requires highly skilled people to operate it. "There aren't a lot of people out there who have that level of experience." So while you may save up front when it comes to software license fees and hardware investments, finding the right personnel to make it work remains a challenge.
While each Hadoop distribution vendor offers security in its solution, not all of the security systems work with each other. Big data security lacks standardization across the various tools and stacks, according to Forrester VP and principal analyst Brian Hopkins. Cloudera and Hortonworks are going in different directions on their security developments, he told InformationWeek in a recent interview. For enterprises that require stable, strong security for governance and compliance, that could be a problem.