4/2/2009
02:21 PM

Amazon Web Services Launches MapReduce Beta

Open-source Hadoop clusters running on Amazon EC2 promise scalable services for data-intensive computing.

Bringing the Hadoop MapReduce framework to its Elastic Compute Cloud (EC2) environment, Amazon today released a beta version of Amazon Elastic MapReduce. The Web service is aimed at giving analysts and researchers a more cost-effective way to process vast amounts of data.

MapReduce is a framework for using a large number of computer nodes, organized into a cluster, to tackle data-intensive analyses. In the "Map" step, the master node breaks the query up into smaller sub-analyses and distributes these to the many nodes. In the "Reduce" step, the master node consolidates the answers to the sub-analyses and combines them to yield the result. The advantage is faster, massively parallel processing. Amazon's chosen flavor of MapReduce is Apache Hadoop, an open-source, Java-based framework.
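The Map and Reduce steps described above can be illustrated with a minimal single-process sketch in Python. It is not Hadoop code and not Amazon's API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and word counting is simply an assumed example workload. On a real cluster each chunk would be handled by a different node.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group the emitted values by key before they are reduced."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all partial counts for one word."""
    return key, sum(values)


def word_count(chunks):
    """Run both steps in one process (hypothetical stand-in for a cluster)."""
    emitted = []
    for chunk in chunks:  # on a cluster, each chunk would go to a separate node
        emitted.extend(map_phase(chunk))
    return dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())


counts = word_count(["the cloud", "the cluster in the cloud"])
print(counts)
```

Because each chunk is mapped independently, the map work could in principle be spread across as many nodes as there are chunks, which is where the parallel speedup comes from.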

Amazon says customers can quickly provision as much or as little Elastic MapReduce capacity as required for data-intensive operations such as data mining, financial analysis, scientific simulation, machine learning, log file analysis or bioinformatic research. Amazon EC2 customers including Netflix and eHarmony were quoted in Amazon's press release on Elastic MapReduce.

"MapReduce is a key component of our matching infrastructure," stated eHarmony Vice President of Technology Joseph Essas. "Amazon Elastic MapReduce cuts down on configuration and management time, making the entire process much more efficient."

In conventional deployments, whether running on Hadoop or other MapReduce-based clusters, time-consuming setup, management and tuning are required, according to Amazon. "Some researchers and developers already run Hadoop on Amazon EC2, and many of them have asked for even simpler tools for large-scale data analysis," stated Adam Selipsky, vice president of product management and developer relations for Amazon Web Services. "Amazon Elastic MapReduce makes crunching in the cloud much easier because it dramatically reduces the time, effort, complexity and cost of performing data-intensive tasks."

The service automatically launches and configures the number and type of Amazon EC2 instances specified by customers. To assist customers in executing data-intensive applications, Amazon Web Services is providing a number of MapReduce application samples and tutorials.

Amazon Elastic MapReduce service fees of 1.5 cents to 12 cents per hour are added to the standard Amazon EC2 charges of 10 cents to 80 cents per hour, depending on data volumes. Reserved-instance pricing is also available.
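As a rough illustration of how the two charges add up, the sketch below computes an hourly bill for a small cluster. It assumes, beyond what the article states, that both rates apply per instance per hour and that the lowest fee pairs with the lowest EC2 rate and the highest with the highest. The function name hourly_cost_cents and the 10-instance cluster size are hypothetical.

```python
from decimal import Decimal


def hourly_cost_cents(instances, ec2_cents, emr_fee_cents):
    """Hourly cost in cents: (EC2 rate + Elastic MapReduce fee) per instance.

    Assumption: both rates are charged per instance per hour.
    """
    return instances * (Decimal(ec2_cents) + Decimal(emr_fee_cents))


# Hypothetical 10-instance cluster at the low and high ends of the quoted ranges.
low = hourly_cost_cents(10, "10", "1.5")   # 10 * (10 + 1.5) cents
high = hourly_cost_cents(10, "80", "12")   # 10 * (80 + 12) cents

print(f"low end:  ${low / 100:.2f} per hour")
print(f"high end: ${high / 100:.2f} per hour")
```

Decimal is used rather than float so that the 1.5-cent fee is represented exactly.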
