SAS Teams With EMC And Teradata -- And Not IBM -- On High-Performance Analytics
New approach supports high-performance modeling and scoring against complete data sets, promising greater accuracy in predictive risk-analysis, fraud-detection and optimization scenarios.
Analytic software giant SAS on Tuesday announced what it describes as a breakthrough approach in high-performance big-data analysis.
And in an apparent realignment in the wake of last year's big-data consolidation, SAS is partnering most closely with EMC and Teradata. EMC last year acquired Greenplum while Teradata recently acquired Aster Data. EMC and Teradata are independents that do not compete head-on with SAS as does IBM, which in recent years has acquired SAS competitors Cognos and SPSS. Late last year IBM also acquired Netezza, a big-data appliance vendor (and SAS partner) that goes head to head with EMC Greenplum and Teradata.
In related news, EMC announced new appliances yesterday that will step up competition with IBM and Teradata, but more on that in a moment.
SAS's new approach, called High-Performance Analytics (HPA), is an advance over the vendor's prior in-database processing approach in that it supports modeling as well as scoring procedures. Modeling and scoring are foundations of analytics, and in-database processing moves the scoring step to powerful databases rather than taking hours or even days to move huge data volumes into and out of the separate, low-powered servers used to run analytics software.
What's more, HPA supports modeling against complete data sets, eliminating the need for sampling and ensuring more accurate predictive models for risk analysis, fraud detection, optimization and other needs. It does so by relying on EMC and Teradata's powerful parallel-processing environments, which spread the analysis work across hundreds if not thousands of independent compute nodes.
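To make the idea concrete, here is a minimal sketch of data-parallel modeling on a toy scale. It is not SAS's implementation, which the announcement does not detail. It assumes a plain logistic regression fitted by gradient descent, with the full data set split into partitions that stand in for compute nodes; every name, the sample rows, and the settings are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def partial_gradient(rows, weights):
    """Sum of log-loss gradients over the rows held by one (hypothetical) node."""
    grad = [0.0] * len(weights)
    for features, label in rows:
        error = sigmoid(sum(w * x for w, x in zip(weights, features))) - label
        for i, x in enumerate(features):
            grad[i] += error * x
    return grad

def fit(partitions, n_features, steps=2000, lr=0.5):
    """Modeling: gradient descent that combines per-partition gradient sums."""
    weights = [0.0] * n_features
    n_rows = sum(len(p) for p in partitions)
    for _ in range(steps):
        # In a real cluster each partial sum would be computed on its own node;
        # here the "nodes" are simply visited one after another.
        partials = [partial_gradient(p, weights) for p in partitions]
        total = [sum(column) for column in zip(*partials)]
        weights = [w - lr * g / n_rows for w, g in zip(weights, total)]
    return weights

def score(weights, features):
    """Scoring: apply the fitted model to a single record."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))

# Toy data: each row is ([bias term, one input variable], label).
rows = [([1.0, 0.5], 0), ([1.0, 1.0], 0), ([1.0, 1.5], 1),
        ([1.0, 2.0], 0), ([1.0, 2.5], 1), ([1.0, 3.0], 1)]
partitions = [rows[:2], rows[2:4], rows[4:]]  # three stand-in "nodes"

weights_split = fit(partitions, n_features=2)
weights_full = fit([rows], n_features=2)
```

Because the gradient is a sum over rows, combining the per-partition sums gives the same model (up to floating-point rounding) as fitting on the whole data set in one place, which is why no sampling is needed in this sketch.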
"This gives statisticians an ad-hoc model-building environment where they can use all their data with all their variables," said Michelle Wilkie, high performance computing product manager. "Instead of making educated guesses about how to sample and do variable selection, they can develop the most accurate [prediction] of the truth across all their data and the variables they select."
The push from the business side these days is for predictive, analytic insight rather than after-the-fact reporting, but that requires analysis of the data. With terabyte-scale data volumes now being the rule in so many business domains -- banking, insurance, retail and marketing to name a few -- analytic professionals have been struggling to model and score the data.
In recent years SAS has partnered with Teradata, Netezza, IBM and Aster Data to address part of the problem by moving scoring to the processing platform with the in-database approach. SAS HPA builds on, rather than replaces, in-database processing, but it adds techniques to support modeling as well as scoring.
SAS described the advance in a statement as "going beyond in-memory databases or in-memory OLAP," a clear swipe at developments such as SAP's HANA (High-Performance Analytic Appliance) -- with SAP being another player that's competing head-on with SAS by way of BusinessObjects.
Working with EMC and Teradata, SAS said it will enable customers to solve "big analytics" problems that formerly took hours or days in a matter of seconds.
In a lab-based test involving a single logistic regression, for example, a modeling procedure that took more than a day to run previously was processed in about 80 seconds, according to Wilkie. For statisticians who are constantly tuning and tweaking models, "this completely changes their world," she said.
As promising as the development sounds, SAS HPA won't be released until the fourth quarter, and not all observers are convinced it will be a breakthrough.
"SAS has been trying for years to deliver in-database parallelized modeling," said independent analyst Curt Monash of Monash Research. "Maybe SAS will have more success this time, but we need to wait and see."