11/28/2012
02:30 PM

Amazon Debuts Low-Cost, Big Data Warehousing

Amazon Redshift service promises ten times faster query performance than conventional on-premises data warehouses, at one-tenth the price.

Amazon Web Services (AWS) on Wednesday announced Amazon Redshift, a cloud-based data warehouse service that it says will deliver better scalability and performance than conventional on-premises data warehouses at dramatically lower costs.

"We did the math and found that it generally costs between $19,000 and $25,000 per terabyte per year, at list prices, to build and run a good-sized data warehouse on your own," stated AWS Evangelist Jeff Barr in a blog post about the announcement. "Amazon Redshift, all-in, will cost you less than $1,000 per terabyte per year."
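
To put Barr's figures side by side, here is a minimal sketch of the arithmetic in Python. It uses only the dollar amounts quoted above; the constant and function names are illustrative, and because Redshift is quoted at "less than" $1,000, the ratios it prints are lower bounds rather than exact savings.

```python
# Per-terabyte, per-year figures as quoted by AWS (list prices).
ON_PREM_LOW = 19_000      # low end of the on-premises estimate, in dollars
ON_PREM_HIGH = 25_000     # high end of the on-premises estimate, in dollars
REDSHIFT_CEILING = 1_000  # Redshift is quoted as costing less than this


def cost_ratio(on_prem_cost, cloud_cost):
    """Return how many times more the on-premises option costs."""
    return on_prem_cost / cloud_cost


low_ratio = cost_ratio(ON_PREM_LOW, REDSHIFT_CEILING)
high_ratio = cost_ratio(ON_PREM_HIGH, REDSHIFT_CEILING)

print(f"On-premises costs at least {low_ratio:.0f}x to {high_ratio:.0f}x as much per terabyte per year")
```

By these quoted numbers, the gap is wider than the "one-tenth the price" headline claim, though the comparison rests entirely on Amazon's own list-price estimate.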

Promising more than a cost advantage, Amazon said its managed service approach also liberates data warehouse administrators from the tasks of monitoring, tuning, doing backups, patching software and recovering from faults. Users launch and manage Redshift nodes and clusters from the AWS Management Console, and Amazon said they can start with a few hundred gigabytes and scale up to more than a petabyte.

Redshift is based on relational database technology, so it uses SQL as its query language and is compatible with existing BI tools. It's pretty clear that the database in question is ParAccel, as Amazon is an investor in that company and statements about Redshift acknowledge licensing key technology from the company.


ParAccel's database includes advanced features such as columnar data storage and advanced compression, but these are also offered by competitors including EMC Greenplum, HP Vertica and Teradata, and they are promised in the next release of Oracle Database. Despite Amazon's "ten times faster" claim, performance will clearly vary depending on the workload and the "conventional database" point of comparison.

The distinction between the previously available Amazon Relational Database Service (RDS) and Redshift is that the latter is exclusively for warehousing and analytics (as opposed to transactional database uses) and is capable of big-data scale. "RDS is based on Microsoft SQL Server, Oracle and MySQL, and those aren't systems that are designed to do petabyte-scale data warehousing," said Jaspersoft's Karl Van den Bergh, VP of product and alliances. Jaspersoft is one of two initial business intelligence partners on Redshift, along with MicroStrategy, though Amazon said that other BI partners will soon follow.

Despite the potential for big data analysis, Amazon seemed intent on highlighting how small and midsize companies can get into data warehousing at a very low cost. Customers can choose between two node types, which hold either 2 terabytes or 16 terabytes of compressed customer data per node. Pricing starts at $0.85 per hour for a 2-terabyte data warehouse. Reserved-instance pricing lowers the price to $0.228 per hour, or under $1,000 per terabyte per year, according to Amazon.
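
The hourly rates can be converted to the per-terabyte, per-year terms Amazon uses elsewhere. The sketch below is a rough check, not Amazon's own calculation: it assumes a single 2-terabyte node running around the clock for a 365-day year, and it assumes the $0.228 reserved rate applies to that same node. The function name is illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; assumes continuous use, ignores leap years


def annual_cost_per_tb(hourly_rate, terabytes):
    """Convert an hourly node price into dollars per terabyte per year."""
    return hourly_rate * HOURS_PER_YEAR / terabytes


# Rates quoted in the article; applying both to a 2 TB node is an assumption.
on_demand = annual_cost_per_tb(0.85, 2)
reserved = annual_cost_per_tb(0.228, 2)

print(f"Starting price: ${on_demand:,.2f} per TB per year")
print(f"Reserved price: ${reserved:,.2f} per TB per year")
```

Under those assumptions the starting price works out to roughly $3,700 per terabyte per year, and the reserved rate lands just under the $1,000 mark that Amazon cites.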

"Like anything that Amazon does, they're disrupting the market and offering something that nobody else has been able to offer from a cost-value perspective," said Van den Bergh. "This is a big deal for the data warehousing space, so it will be interesting to see how much uptake it gets."

One thing Amazon doesn't address in detail on its Redshift site is just how companies large and small will upload and synchronize their data with Redshift. Uploading data from one source isn't complicated, but the delays and complexities of data movement multiply as the number of sources increases. Presumably, BI systems will also have to operate in the cloud in order to avoid the potentially time-consuming step of moving data back and forth between on-premises systems and the cloud.

Amazon representatives were not available for comment at press time, but InformationWeek will follow up with deeper analysis of Redshift capabilities and how it might impact the data warehousing industry.


Comments
Mike Lamble,
11/30/2012 | 10:32:17 PM
re: Amazon Debuts Low-Cost, Big Data Warehousing
Amazon's Redshift announcement validates that enterprises are ready for cloud-based big data warehousing solutions. XtremeData, also available on Amazon as well as other clouds, is targeted for organizations that need a massively scalable DBMS solution for mixed read and write workloads, for example, with serious ELT. Redshift (a column-store licensed from ParAccel) is well-suited for read-only data marts of all sizes. The market is rapidly moving to a tipping point where the specialized solutions available on premise are becoming available on the cloud, Amazon and others.