SAS Gets Hip To Hadoop For Big Data

SAS High-Performance Analytic Server heads for Hadoop, bringing SAS data mining, text mining, optimization, and forecasting capabilities beyond the relational database world.

Doug Henschen, Executive Editor, Enterprise Apps

October 14, 2012

SAS didn't divulge the cost of its software or discuss whether licensing models would change to adapt to the open source, big data world, but the vendor clearly expects to do more on the Hadoop platform. A SAS High-Performance Marketing Optimization application announced last week will run on Hadoop with a release expected next April. For now that application, which is built on the HPA Server, runs on EMC Greenplum and Teradata.

High-Performance Marketing Optimization is designed to speed marketing campaign analysis when organizations are dealing with thousands of campaigns and millions or even billions of customer records. At this scale it can take eight to 12 hours or more to run optimizations -- either to identify the most successful marketing offer or the most responsive target audience -- but with the application running on EMC Greenplum or Teradata, that analysis might take just a few minutes.

Once the marketing app can run on Hadoop it will presumably be able to handle campaigns with even bigger and more varied data sets, and users will be able to quickly analyze new data sets without the delays associated with transforming data to meet a rigid, predefined data model as required in relational environments.

Among the other announcements made by SAS last week were a new stream processing engine and updates of the SAS Text Analytics module and SAS Model Manager. SAS DataFlux Event Stream Processing (ESP) is designed to bring the vendor's fraud and risk-analysis chops to the world of real-time data processing.

Stream processing is mostly practiced by financial institutions for their trading-floor applications. These firms usually have real-time infrastructure in place, but ESP is designed to deliver a callable, embeddable service that these operations can bring into their operational environment to handle tasks such as trade surveillance. Other possible applications include anti-money-laundering surveillance for banks, just-in-time inventory and purchasing for retailers, customer intelligence in e-commerce scenarios and customer-churn analysis for telcos.

The SAS Text Mining update incorporates automated learning and rule-generation features that will back up domain expertise with machine-learning algorithms. For example, a domain expert might have some idea what to expect from a batch of warranty claims, insurance claims, customer surveys, or a stream of social comments.

Machine analysis offers a second opinion that can uncover human content-categorization errors, oversights, and hidden relationships between, say, a particular model of car and warranty claims, accidents, and customer dissatisfaction. Once machine learning has uncovered latent patterns, domain experts can work with and customize system-generated classification rules to develop more accurate text-mining results and applications, according to SAS.

In keeping with SAS' push into big data, the SAS Model Manager updates are partly aimed at embracing non-SAS predictive models, including those written in PMML (Predictive Model Markup Language) and R. Where the model manager previously focused on SAS models, this administrative tool can now be used to import, validate, publish, deploy, and monitor the ongoing performance of third-party models. The upgrade also delivers new statistical measures for probability of default and loss, tied to Basel II compliance for financial institutions.

If you review the bulk of SAS' many announcements last week, you can see two clear patterns. First, the company is delivering a steady stream of updates for customers using conventional tools and tried-and-true approaches to do analysis at less-than-big-data scale. Second, at the cutting edge of big data and high-performance computing, SAS is acknowledging that the same old tools and years-long product-delivery timelines just won't cut it with this new class of customers.

About the Author

Doug Henschen

Executive Editor, Enterprise Apps

Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of Transform Magazine, and Executive Editor at DM News. He has covered IT and data-driven marketing for more than 15 years.
