MapR's Hadoop Platform Hits Amazon

New multi-tenancy feature in 2.0 version of Hadoop distribution lets Amazon offer MapR as part of its Elastic MapReduce (EMR) service.
MapR has announced a 2.0 version of its Hadoop software distribution that will incorporate a handful of important new features. But one key upgrade announced on Wednesday, support for multi-tenancy, has made it possible for Amazon to offer MapR as part of its Elastic MapReduce (EMR) service.

Of course, anybody can run software in Amazon's cloud, but in MapR's case, its M3 and M5 editions are available as optional services that can be selected from an EMR drop-down menu. The services also have been integrated with other Amazon services, including S3 storage and the DynamoDB NoSQL database. The M3-based service is available at no extra charge--beyond the standard Amazon EMR service fees. M5, which adds snapshotting, mirroring, and other high-availability (HA) features, involves what MapR described as "nominal" upcharges.

In both cases "the billing goes through Amazon and the level-one support is Amazon, so it's really an Amazon service," Jack Norris, MapR's VP of marketing, told InformationWeek. "We're pretty excited about this and think it's a testament to our product differentiation."

You won't find Cloudera or Hortonworks distributions integrated with EMR--even if they're available on AWS--so the affiliation with Amazon is a feather in MapR's cap. There are other EMR-integrated, Hadoop-related services on Amazon Web Services, such as Karmasphere Analytics, but no direct competitors to MapR.

Since the launch of its first Hadoop distribution in June 2011, MapR has set itself apart as a high-performance Hadoop platform offering enterprise-oriented HA features. It has delivered these features by replacing the Hadoop Distributed File System (HDFS), known to be vulnerable to outages, with a derivative of Unix-based NFS (network file system).

The Apache Hadoop community has addressed HDFS weaknesses in a 2.0 release that has yet to reach general availability. Nonetheless, Cloudera last week introduced a new distribution based on the 2.0 software and its new HA features. Hortonworks took another tack with the long-awaited distribution it introduced at this week's Hadoop Summit. Hortonworks is sticking with the Hadoop 1.0 codeline for now, but has added HA features supported by VMware virtualization software.

MapR's 2.0 release won't reach general availability until the third quarter. Amazon is currently running the 1.28 versions of M3 and M5, but that software has been upgraded with the multi-tenancy capabilities developed for MapR 2.0. Multi-tenancy enables physical Hadoop clusters to be logically partitioned to provide separate systems administration, data placement, and job management.

Other significant upgrades to MapR 2.0 include a bevy of new administrative features within the MapR Control System management software. New job monitoring and management features provide a graphical view of the time and resources used by various jobs and tasks. Line charts and histograms give administrators a fine-grained understanding of system performance.

Central logging features introduced in 2.0 are designed to make it easier to diagnose problems such as failed jobs without aggregating logs and doing node-by-node investigation. MapR claims it is alone in offering visibility into cluster performance down to individual disk drives.

"We know when a drive is about to fail so you can replace it from the UI," said Tomer Shiran, MapR's director of product management. "Our competitors can't provide that lower level of management because HDFS is a Java application on top of a local Linux file system, so there is no visibility into the drives."

With Cloudera, Hortonworks, and MapR all announcing new distributions within the last week, it's clear that competition is hot and heavy in the emerging world of Hadoop. Cloudera is looking to build on its market-share lead, which stems from its early entry into the market in 2008 and its many partnerships. Hortonworks is toeing a conservative line, looking for big enterprise deals and banking on powerful partnerships with the likes of IBM, Microsoft, and Teradata. MapR is appealing to performance-minded Hadoop veterans who are looking to eke out more performance and scale.

Norris said that MapR is growing quickly and will be the market leader in terms of licenses "within the quarter," but he wouldn't say how many customers are paying for support. Perhaps the next sign of maturity in the Hadoop market will see these three vendors (and others, like EMC and IBM) getting together with Gartner, Forrester, and/or IDG to release authoritative figures on Hadoop deployments by distribution and perhaps even average cluster sizes.
