Hadoop Players Question Forrester's Take On Leaders
Forrester's first-ever Hadoop market assessment draws mixed reactions, both for its leader rankings and for the players who were left out.
Forrester published its first-ever Wave report for the emerging Hadoop market late last week and the results were surprising, both in terms of the "leaders wave" assessment and the absence of big vendors, including Microsoft, Oracle, and Teradata.
The first surprise in "The Forrester Wave: Enterprise Hadoop Solutions, Q1 2012" report is that it includes such a wide variety of vendors. In the "leaders" wave are, in rank order, cloud-services provider Amazon Web Services, broad information-management vendors IBM and EMC, and Hadoop distribution and services vendors MapR, Cloudera, and Hortonworks. (You can download the report from EMC or IBM at no charge, though registration is required.)
The lone vendor in the "strong performers" wave is business intelligence and data-integration vendor Pentaho. In the "contender" category are DataStax, best known as the leading support provider for the open source Cassandra NoSQL database, analytics-focused Datameer, cluster-management players Platform Computing (recently acquired by IBM) and Zettaset, and indexing and search apps vendor Outerthought. The one vendor classified in the "risky bet" wave is middleware vendor HStreaming.
It's an "apples and oranges" collection of vendors, admitted analyst James Kobielus, the report's principal author, in an interview with InformationWeek. But that's because it's an "emerging markets" assessment, Kobielus explained, noting that it's very different from Wave reports that focus on more mature markets.
"In an immature market there may be no clear leader and the vendors might be differentiated quite widely into different segments," Kobielus said. What's more, "the term 'solutions' is intended to make it clear that it's more than distributions and more than software; it's vendors with cloud, vendors with appliances and vendors that might have all of that in their portfolios."
The second surprise revolved around Forrester's "leaders" wave. Why, for example, are Amazon Web Services (AWS), IBM and EMC the three strongest players on Forrester's horizontal "strategy" axis, so far ahead of Cloudera, MapR and Hortonworks, all vendors closely associated with developing and supporting Hadoop? Cloudera, in particular, has the longest track record and largest customer base of any Hadoop software and services provider. But in focusing on enterprise needs, Kobielus said the analysis valued breadth of available software and services, noting that Hadoop is only part of what it takes to address big data.
"The vendors that have an enterprise data warehousing offering plus a Hadoop offering and possibly a big-data complex-event processing offering scored higher on strategy because they can address more applications and use cases," Kobielus said.
Amazon, for instance, offers a Relational Database Service and has strong support for real-time and low-latency applications, Kobielus said. (Creating an even broader portfolio, AWS last month introduced its DynamoDB NoSQL database service, though that was not considered in Forrester's assessment.)
IBM offers both the DB2 and Netezza relational platforms as well as Hadoop-integrated InfoSphere Streams complex event processing technology. With the latter IBM can deliver dashboards blending historical data from its BigInsights Hadoop platform as well as real-time information from Streams, Kobielus said.
EMC offers the Greenplum relational database and a modular data computing appliance that supports both structured data analysis and Hadoop MapReduce processing.
Forrester's analysis and the weightings of the rankings are detailed in a scorecard on 15 criteria. The emphasis on considering more than just Hadoop made sense to Mike Maxey, senior director of product management at EMC's Greenplum division, but he questioned EMC's low software packaging score given that it offers two Hadoop distributions: Greenplum HD, based entirely on open source Apache Hadoop software, and Greenplum MR, based on MapR's partially proprietary distribution.
EMC was also surprised that two services-heavy vendors led the rankings, Maxey said. "Amazon doesn't even offer a [Hadoop] distribution... and IBM tends to push for a full, professional-services-led engagement," Maxey told InformationWeek. "There is value in services, but we find that many enterprises don't want to have all their data locked into a services engagement."
Kirk Dunn, Cloudera's Chief Operating Officer, said Forrester's strategy scoring, which counted for 50% of the vendor rankings, seemed to discount the importance of the vendor's Cloudera Manager admin and management software, which supports system deployment, ongoing monitoring, job and configuration management, and system optimization.
"Nobody has a product anything like [Cloudera Manager]," Dunn said. "We think that's highly strategic and one of the most important things that organizations are looking for when they try to make dependable, predictable use of Hadoop."
Dunn also quibbled with Forrester's placement of competitor MapR above Cloudera on the vertical "current offering" axis, questioning MapR's security capabilities and the fit and portability of the vendor's proprietary components in the "democratic environment" of open source Apache Hadoop.
A third surprise in the Forrester Wave report was the absence of Microsoft, Oracle, and Teradata, a deficiency noted in numerous Twitter comments since the report was released. Two of those absences are easily explained, as both Microsoft and Oracle failed to meet Forrester's August-September 2011 research deadline. Both companies announced their plans for Hadoop in early October.
Oracle recently partnered with Cloudera to bundle Cloudera's Hadoop software distribution with Oracle's Big Data Appliance. Microsoft has introduced a cloud-based Hadoop service on its Azure platform, but release dates have not been set for the Windows-compatible Hadoop software distribution it plans to release in partnership with Hortonworks.
Teradata supports MapReduce processing by way of last year's acquisition of Aster Data. But that vendor's proprietary SQL-MapReduce approach is built on the vendor's relational database and did not fit Kobielus' definition because the mapping and reducing functions aren't performed on the Apache Hadoop code base.
Though they took exception to a few of the finer points of the ranking, both Dunn of Cloudera and Maxey of EMC gave Kobielus credit for his analysis of a very fragmented vendor set. "It's a snapshot in time of a fast-moving market, but [the report] has definitely created a fertile discussion," Dunn said. He predicted the report will lead to clearer vendor segmentations and much-needed definition around the term "big data."
Maxey of EMC advised researchers to look beyond the report's Wave ranking chart. "There's a lot of good information in there, so people should make a point of reading the entire report," he said.