Forrester published its first-ever Wave report for the emerging Hadoop market late last week and the results were surprising, both in terms of the "leaders wave" assessment and the absence of big vendors, including Microsoft, Oracle, and Teradata.
The first surprise in "The Forrester Wave: Enterprise Hadoop Solutions, Q1 2012" report is that it includes such a wide variety of vendors. In the "leaders" wave are, in rank order, cloud-services provider Amazon Web Services, broad information-management vendors IBM and EMC, and Hadoop distribution and services vendors MapR, Cloudera, and Hortonworks. (You can download the report from EMC or IBM at no charge, though registration is required.)
The lone vendor in the "strong performers" wave is business intelligence and data-integration vendor Pentaho. In the "contender" category are DataStax, best known as the leading support provider for the open source Cassandra NoSQL database, analytics-focused Datameer, cluster-management players Platform Computing (recently acquired by IBM) and Zettaset, and indexing and search apps vendor Outerthought. The one vendor classified in the "risky bet" wave is middleware vendor HStreaming.
It's an "apples and oranges" collection of vendors, admitted analyst James Kobielus, the report's principal author, in an interview with InformationWeek. But that's because it's an "emerging markets" assessment, Kobielus explained, noting that it's very different from Wave reports that focus on more mature markets.
"In an immature market there may be no clear leader and the vendors might be differentiated quite widely into different segments," Kobielus said. What's more, "the term 'solutions' is intended to make it clear that it's more than distributions and more than software; it's vendors with cloud, vendors with appliances and vendors that might have all of that in their portfolios."
The second surprise concerns Forrester's "leaders" wave. Why, for example, are Amazon Web Services (AWS), IBM and EMC the three strongest players on Forrester's horizontal "strategy" axis, so far ahead of Cloudera, MapR and Hortonworks, all vendors closely associated with developing and supporting Hadoop? Cloudera, in particular, has the longest track record and largest customer base of any Hadoop software and services provider. But Kobielus said the analysis focused on enterprise needs and therefore valued breadth of available software and services, noting that Hadoop is only part of what it takes to address big data.
"The vendors that have an enterprise data warehousing offering plus a Hadoop offering and possibly a big-data complex-event processing offering scored higher on strategy because they can address more applications and use cases," Kobielus said.
Amazon, for instance, offers a Relational Database Service and has strong support for real-time and low-latency applications, Kobielus said. (Creating an even broader portfolio, AWS last month introduced its DynamoDB NoSQL database service, though that was not considered in Forrester's assessment.)
IBM offers both the DB2 and Netezza relational platforms as well as Hadoop-integrated InfoSphere Streams complex event processing technology. With the latter, IBM can deliver dashboards that blend historical data from its BigInsights Hadoop platform with real-time information from Streams, Kobielus said.
EMC offers the Greenplum relational database and a modular data computing appliance that supports both structured data analysis and Hadoop MapReduce processing.