Commentary
4/5/2011 07:44 PM
Doug Henschen

SAS Teams With EMC And Teradata -- And Not IBM -- On High-Performance Analytics

New approach supports high-performance modeling and scoring against complete data sets, promising greater accuracy in predictive risk-analysis, fraud-detection and optimization scenarios.

Analytic software giant SAS on Tuesday announced what it describes as a breakthrough approach in high-performance big-data analysis.

And in an apparent realignment in the wake of last year's big-data consolidation, SAS is partnering most closely with EMC and Teradata. EMC last year acquired Greenplum, while Teradata recently acquired Aster Data. EMC and Teradata are independents that do not compete head-on with SAS, as does IBM, which in recent years has acquired SAS competitors Cognos and SPSS. Late last year IBM also acquired Netezza, a big-data appliance vendor (and SAS partner) that goes head to head with EMC Greenplum and Teradata.

In related news, EMC announced new appliances yesterday that will step up competition with IBM and Teradata, but more on that in a moment.

SAS's new approach, called High-Performance Analytics (HPA), is an advance over the vendor's prior in-database processing approach in that it supports modeling as well as scoring procedures. Modeling and scoring are foundations of analytics, and in-database processing moves the scoring step into powerful databases, rather than spending hours or even days moving huge data volumes into and out of the separate, lower-powered servers used to run analytics software.
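The in-database scoring idea can be sketched with a toy example. Here sqlite3 stands in for a massively parallel database, and the model coefficients are invented for illustration (they are not SAS output); the point is that the scoring formula travels to the data as SQL, so only small results leave the database.

```python
import math
import sqlite3

# Hypothetical coefficients from a logistic model trained elsewhere
# (illustrative values only, not real SAS output).
INTERCEPT, COEF = -2.0, 0.05

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, balance REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, 10.0), (2, 60.0), (3, 120.0)])

# In-database scoring: the linear score is computed where the data
# lives, instead of exporting every row to an analytics server.
rows = conn.execute(
    "SELECT id, ? + ? * balance AS linear_score FROM customers",
    (INTERCEPT, COEF),
).fetchall()

# Only the compact score set comes back; apply the logistic link here.
probs = {cid: 1.0 / (1.0 + math.exp(-s)) for cid, s in rows}
```

In a real deployment the SQL translation of the model would be generated by the analytics tool and run across all the database's nodes; the sketch only shows the division of labor.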

What's more, HPA supports modeling against complete data sets, eliminating the need for sampling and ensuring more accurate predictive models for risk analysis, fraud detection, optimization and other needs. It does so by relying on EMC and Teradata's powerful parallel-processing environments, which spread the analysis work across hundreds if not thousands of independent compute nodes.
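How modeling against a complete data set can be parallelized is illustrated below with a deliberately tiny sketch: each data partition (standing in for a shard on one compute node) returns its partial gradient for a logistic regression, and the partials are summed before each update. The partition layout, learning rate, and thread pool are all assumptions for illustration, not SAS's implementation.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Toy (x, y) pairs split into partitions, standing in for the shards
# a parallel database spreads across independent compute nodes.
partitions = [
    [(0.0, 0), (1.0, 0)],
    [(2.0, 1), (3.0, 1)],
]
N = sum(len(p) for p in partitions)

def partial_gradient(shard, w, b):
    """Gradient contribution of one shard for logistic regression."""
    gw = gb = 0.0
    for x, y in shard:
        p = 1.0 / (1.0 + math.exp(-(w * x + b)))
        gw += (p - y) * x
        gb += (p - y)
    return gw, gb

w = b = 0.0
lr = 0.5
with ThreadPoolExecutor() as pool:
    for _ in range(200):
        # Each pass: shards are processed in parallel, then reduced.
        n = len(partitions)
        parts = pool.map(partial_gradient, partitions, [w] * n, [b] * n)
        gw = gb = 0.0
        for pgw, pgb in parts:
            gw += pgw
            gb += pgb
        w -= lr * gw / N
        b -= lr * gb / N
```

Because the full data set is used, no sampling or variable pre-selection is needed; the work simply fans out across nodes and the partial sums fan back in.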

"This gives statisticians an ad-hoc model-building environment where they can use all their data with all their variables," said Michelle Wilkie, high performance computing product manager. "Instead of making educated guesses about how to sample and do variable selection, they can develop the most accurate [prediction] of the truth across all their data and the variables they select."

The push from the business side these days is for predictive, analytic insight rather than after-the-fact reporting, but that requires deep analysis of the underlying data. With terabyte-scale data volumes now the rule in so many business domains -- banking, insurance, retail and marketing, to name a few -- analytic professionals have been struggling to model and score the data.

In recent years SAS has partnered with Teradata, Netezza, IBM and Aster Data to address part of the problem by moving scoring to the processing platform with the in-database approach. SAS HPA builds on, rather than replaces, in-database processing, adding techniques to support modeling as well as scoring.

SAS described the advance in a statement as "going beyond in-memory databases or in-memory OLAP," a clear swipe at developments such as SAP's HANA (High-Performance Analytic Appliance) -- with SAP being another player that's competing head-on with SAS by way of BusinessObjects.

Working with EMC and Teradata, SAS said it will enable customers to solve "big analytics" problems that formerly took hours or days in a matter of seconds.

In a lab-based test involving a single logistic regression, for example, a modeling procedure that took more than a day to run previously was processed in about 80 seconds, according to Wilkie. For statisticians who are constantly tuning and tweaking models, "this completely changes their world," she said.

As promising as the development sounds, SAS HPA won't be released until the fourth quarter, and not all observers are convinced it will be a breakthrough.

"SAS has been trying for years to deliver in-database parallelized modeling," said independent analyst Curt Monash of Monash Research. "Maybe SAS will have more success this time, but we need to wait and see."
