02:21 PM

Amazon Web Services Launches MapReduce Beta

Open-source Hadoop clusters running on Amazon EC2 promise scalable services for data-intensive computing.

Bringing the Hadoop MapReduce framework to its Elastic Compute Cloud (EC2) environment, Amazon today released a beta version of Amazon Elastic MapReduce. The Web service is aimed at giving analysts and researchers a way to process vast amounts of data more cost-effectively.

MapReduce is a framework for using a large number of computer nodes, organized into clusters, to tackle data-intensive analyses. In the "Map" step, the master node breaks the query up into smaller sub-analyses and distributes these to the worker nodes. In the "Reduce" step, the master node collects the answers to the sub-analyses and combines them to yield the result. The advantage is faster, massively parallel processing. Amazon's chosen flavor of MapReduce is Apache Hadoop, an open-source, Java-based framework.
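To make the two steps concrete, here is a minimal single-machine sketch of the MapReduce pattern in Python, using the classic word-count example. It is an illustration only, not Hadoop or Amazon code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real cluster would run the map and reduce calls on many nodes in parallel rather than in a loop.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: turn one chunk of input text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group intermediate values by key so each key is reduced once."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce the results.

    On a real cluster each chunk would go to a different node; here the
    chunks are processed one after another (an assumption for simplicity).
    """
    mapped = []
    for chunk in chunks:
        mapped.extend(map_phase(chunk))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    chunks = ["the cloud is elastic", "the cluster is in the cloud"]
    print(word_count(chunks))
```

Because each chunk is mapped independently and each key is reduced independently, both steps can be spread across as many nodes as are available, which is where the parallel speedup comes from.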

Amazon says customers can quickly provision as much or as little Elastic MapReduce capacity as required for data-intensive operations such as data mining, financial analysis, scientific simulation, machine learning, log file analysis or bioinformatic research. Amazon EC2 customers including Netflix and eHarmony were quoted in Amazon's press release on Elastic MapReduce.

"MapReduce is a key component of our matching infrastructure," stated eHarmony Vice President of Technology Joseph Essas. "Amazon Elastic MapReduce cuts down on configuration and management time, making the entire process much more efficient."

In conventional deployments, whether running on Hadoop or other MapReduce-based clusters, time-consuming setup, management and tuning are required, according to Amazon. "Some researchers and developers already run Hadoop on Amazon EC2, and many of them have asked for even simpler tools for large-scale data analysis," stated Adam Selipsky, vice president of product management and developer relations for Amazon Web Services. "Amazon Elastic MapReduce makes crunching in the cloud much easier because it dramatically reduces the time, effort, complexity and cost of performing data-intensive tasks."

The service automatically launches and configures the number and type of Amazon EC2 instances specified by customers. To assist customers in executing data-intensive applications, Amazon Web Services is providing a number of MapReduce application samples and tutorials.

Amazon Elastic MapReduce service fees of 1.5 cents to 12 cents per hour are added to the standard Amazon EC2 charges of 10 cents to 80 cents per hour, depending on data volumes. Reserved-instance pricing is also available.
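
As a rough illustration of how the two charges add up, the sketch below estimates the bill for a job. It rests on assumptions not stated above: that both rates are billed per instance per hour, and that the lowest service fee pairs with the lowest EC2 rate (and the highest with the highest). The function name `estimate_cost` and the 10-instance, 5-hour job are hypothetical.

```python
from decimal import Decimal


def estimate_cost(instances, hours, ec2_rate, emr_fee):
    """Return the estimated job cost in dollars.

    Assumes both the EC2 rate and the Elastic MapReduce fee are charged
    per instance per hour (an assumption, not confirmed by the article).
    Rates are passed as strings so Decimal keeps them exact.
    """
    per_instance_hour = Decimal(ec2_rate) + Decimal(emr_fee)
    return per_instance_hour * instances * hours


if __name__ == "__main__":
    # Hypothetical job: 10 instances running for 5 hours.
    low = estimate_cost(10, 5, "0.10", "0.015")   # lowest quoted rates
    high = estimate_cost(10, 5, "0.80", "0.12")   # highest quoted rates
    print(f"low end:  ${low:.2f}")
    print(f"high end: ${high:.2f}")
```

Under those assumptions, the same hypothetical job would cost between $5.75 and $46.00, before any data transfer or storage charges.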
