Teradata Scales Up R Analytics On Aster - InformationWeek

Teradata Scales Up R Analytics On Aster

Teradata Aster already supports MapReduce, graph, and SQL analysis. The new Aster R feature adds faster, more scalable R-based analytics.

Teradata announced last week that it's bringing support for open-source, R-based analytics to its data-discovery-focused Aster database with a new feature called Aster R.  

Aster R is aimed at speeding and scaling R-based analytics on a massively parallel processing platform. R is the primary tool used by 24% of data miners, and 70% of data miners are using R in some way, according to the Rexer 2013 Data Miner Survey. Teradata is responding to that demand while addressing the fact that standard R tools are not terribly scalable.

"Users often can't handle big-data analyses with R because they can't bring all that data down to a desktop or server, and they run out of memory or processing power," explained Chris Twogood, Teradata's VP of product and services marketing, in an interview.

Teradata has taken R's most popular algorithms and parallelized them to optimize their performance on Aster. This ensures that R programmers can spread analyses across nodes without a lot of programming, according to Twogood. In addition, Teradata has prepackaged more than 20 R functions into the Aster R library, and it also includes a Parallel Constructor utility that lets users run their favorite R functions and algorithms on Aster.
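
The general idea of spreading an analysis across nodes can be sketched in a few lines. The Python below is only an illustration of the split-apply-combine pattern, not Aster R's API: the function names (`partition`, `partial_stats`, `parallel_mean`) are hypothetical, and a local thread pool stands in for the worker nodes of an MPP cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple


def partition(rows: Sequence[float], n_parts: int) -> List[Sequence[float]]:
    """Split rows into at most n_parts slices (stand-ins for data held on worker nodes)."""
    size = max(1, -(-len(rows) // n_parts))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def partial_stats(chunk: Sequence[float]) -> Tuple[float, int]:
    """Work done independently on one partition: a partial sum and a row count."""
    return sum(chunk), len(chunk)


def parallel_mean(rows: Sequence[float], n_parts: int = 4) -> float:
    """Run partial_stats on every partition concurrently, then combine the partial results."""
    if not rows:
        raise ValueError("rows must not be empty")
    chunks = partition(rows, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(partial_stats, chunks))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


if __name__ == "__main__":
    data = [float(x) for x in range(1, 101)]
    print(parallel_mean(data))  # 50.5
```

No single worker ever needs the full data set in memory. Each one returns a small summary, and only those summaries are combined at the end.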

Given R's rising popularity, Teradata isn't the only company jumping on the bandwagon. Information Builders, Oracle, SAP, and even SAS and Tibco Spotfire have added support for R. Several vendors, including Alpine Data Labs and Revolution Analytics, have also addressed scale-out analysis on clustered servers. In fact, Teradata has a deal with Revolution whereby Revolution's R analytics can run in-database inside Teradata.

Teradata's advantage with Aster R is that it has a broader data-discovery platform that addresses more than just R. Teradata acquired Aster Data in 2011 in large part because it offers SQL-based MapReduce processing, so it can handle techniques such as text, time-series, and statistical analysis as well as standard SQL analysis. Teradata added support for graph analysis to Aster last fall. With Aster R, the platform becomes even more comprehensive, yet all these engines are invoked with familiar SQL programming.

"You could have a query that invokes the Aster SQL engine, MapReduce engine, graph engine, and R, and the Aster optimizer will synchronize resources and ensure that all those engines share data and talk to each other," says Twogood. In a customer-churn analysis scenario, for example, companies might want to combine R-based logistic regression, MapReduce-based time-series analysis, SQL-based customer segmentation, and graph-based network-influencer analysis to spot influential, high-value customers likely to bolt.

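To make the churn scenario concrete, here is a minimal Python sketch of how scores from different kinds of analysis might be combined into one filter. It does not show Aster syntax. The coefficients, customer records, call graph, and thresholds are all invented for illustration, the time-series step is omitted for brevity, and a simple connection count stands in for graph-based influencer analysis.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical logistic-regression coefficients; not taken from the article.
COEFFS = {"intercept": -2.0, "support_calls": 0.8, "months_inactive": 0.5}

# Hypothetical customer features and a hypothetical "who contacts whom" graph.
customers: Dict[str, Dict[str, float]] = {
    "ann": {"support_calls": 4, "months_inactive": 2, "annual_spend": 1200.0},
    "bob": {"support_calls": 0, "months_inactive": 0, "annual_spend": 300.0},
    "cy": {"support_calls": 3, "months_inactive": 1, "annual_spend": 150.0},
}
calls: List[Tuple[str, str]] = [("ann", "bob"), ("ann", "cy"), ("ann", "dee"), ("bob", "cy")]


def churn_probability(features: Dict[str, float]) -> float:
    """Logistic-regression score: sigmoid of a weighted sum of the features."""
    z = COEFFS["intercept"] + sum(
        COEFFS[name] * features[name] for name in ("support_calls", "months_inactive")
    )
    return 1.0 / (1.0 + math.exp(-z))


def degree(edges: List[Tuple[str, str]]) -> Counter:
    """Number of connections per customer, a crude proxy for network influence."""
    counts: Counter = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts


def at_risk_influencers(
    people: Dict[str, Dict[str, float]],
    edges: List[Tuple[str, str]],
    min_prob: float = 0.5,
    min_spend: float = 1000.0,
    min_degree: int = 3,
) -> List[str]:
    """Customers who are likely to churn, high-value, and well connected."""
    deg = degree(edges)
    return sorted(
        name
        for name, f in people.items()
        if churn_probability(f) >= min_prob
        and f["annual_spend"] >= min_spend
        and deg[name] >= min_degree
    )


if __name__ == "__main__":
    print(at_risk_influencers(customers, calls))  # ['ann']
```

In this toy data, only "ann" clears all three thresholds: a high churn score, high spend, and three connections.
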
As we pointed out in this week's extensive coverage of Apache Spark, Teradata seems to share the same strategy as this four-year-old open-source platform, albeit using Aster, a commercial database. Teradata's new R support in Aster is expected to become a beta offering later this month, and general release is expected in the fourth quarter.

Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of ...

Doug Henschen, User Rank: Moderator
7/3/2014 | 1:01:18 PM
I've had my doubts about Aster, but you have to give Teradata credit
I've gone from excited to disillusioned and back to somewhat interested in Teradata's strategy for Aster. When Teradata first acquired Aster Data in 2011, I thought it was a good move that would broaden Teradata's capabilities to include MapReduce-based analysis, including time-series, text, and other approaches, all expressed in SQL. When Hadoop really took off, I started thinking, why would anybody want to run Teradata and Aster and Hadoop? That's three MPP environments, and it's not trivial or inexpensive to run any one of them.

Given the latest advancements in Aster -- adding graph and R-based analyses to the MapReduce and SQL options already available -- I now see Aster as an option for Teradata shops that want all those analysis options sooner, rather than later. The key draw is that all of these analysis approaches are handled in SQL. So Aster users can count on SQL expertise instead of hiring the types of experts that would be needed to master each of the engines available within, say, Apache Spark. That's the billing, anyway. Anybody have insight on the cost of Aster vs. running Spark or the ease of mastering all the analysis options in Aster vs. those available on Spark? The devil is in the details.