In-Database Analytics: A Passing Lane for Complex Analysis - InformationWeek

What once took one company three to four weeks now takes four to eight hours thanks to in-database computation. Here's what Netezza, Teradata, Greenplum and Aster Data Systems are doing to make it happen.

In-Database is Nothing New

The in-database initiatives build on capabilities first commercialized in mid-'90s object-relational (OR) database systems from IBM, Illustra/Informix (now IBM), and Oracle. The OR systems let users create custom data types, index methods, and functions for geospatial, textual, and time-series data, for instance. The vendors provided packaged DBMS extension modules, but custom coding by users, which entailed SQL or C-language programming, never caught on.
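To make the idea of a user-defined function concrete, here is a minimal sketch in Python. It uses SQLite purely as a stand-in, since it ships with Python; none of the vendors named in this article is shown, and their extension APIs differ. The function name `haversine_km`, the `stores` table, and the sample coordinates are all assumptions made for illustration.

```python
import math
import sqlite3

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

conn = sqlite3.connect(":memory:")

# Register the Python function so SQL statements can call it by name.
conn.create_function("haversine_km", 4, haversine_km)

# Hypothetical sample data.
conn.execute("CREATE TABLE stores (name TEXT, lat REAL, lon REAL)")
conn.executemany("INSERT INTO stores VALUES (?, ?, ?)", [
    ("London", 51.5074, -0.1278),
    ("Paris", 48.8566, 2.3522),
    ("New York", 40.7128, -74.0060),
])

# The geospatial filter runs inside the query; only matching rows come back.
rows = conn.execute(
    "SELECT name FROM stores "
    "WHERE haversine_km(lat, lon, 51.5074, -0.1278) < 500 "
    "ORDER BY name"
).fetchall()
nearby = [row[0] for row in rows]
print(nearby)
```

The point of the sketch is the division of labor: the custom logic is registered once, and from then on any query can apply it row by row without first exporting the table to an outside program.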

The recent releases are designed to be easier to program than earlier OR implementations, and the parallelization greatly speeds code execution. Netezza, for example, has come up with an object-oriented development environment "that allows developers to concentrate on getting the algorithms right," says Netezza Vice President of Marketing Phil Francisco. The vendor furnishes an applications test bed, and it provides development facilities and technology access to the customers, partners, and academics who belong to the Netezza Developer Network (NDN), which the company launched in September 2007. As of October 2008, Netezza counted more than 100 NDN members and 250 individuals trained to develop on the Netezza platform.

Teradata was founded in 1979, 21 years before Netezza's launch, and Teradata Chief Development Officer Scott Gnau boasts that the company has provided a framework for C-language database extensions since 1993. He points to high-performance, database-embedded encryption/decryption capabilities implemented by Teradata partner Protegrity as an illustration of the speed that can be realized with database-embedded processing. Protegrity claims in-database performance of more than 6 million decryptions and more than 9 million encryptions per second.

Teradata is partnering with SAS to implement in-database analytics. The partnership, announced a year ago, is due to bear fruit in 2009: major parts of the SAS 9.2 release are expected in the first quarter, and general availability of the Teradata 13 platform in the first half. The releases will include a number of SAS algorithms recoded, some in SQL and others as user-defined functions and types, to take full advantage of Teradata's parallel architecture. Gnau adds that the partnership has already led to the release of in-database scoring of data-mining models exported from SAS.
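In-database scoring is easier to picture with a small example. The sketch below is not the SAS or Teradata implementation; it assumes a simple linear model whose coefficients have been exported as plain numbers, and it uses SQLite as a stand-in database. The `model` dictionary, the `customers` table, and every value in them are hypothetical.

```python
import sqlite3

# Hypothetical coefficients, standing in for a model exported from a mining tool.
model = {"intercept": 1.5, "income": 0.002, "tenure_years": 0.25}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, income REAL, tenure_years REAL)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [
    (1, 1000.0, 2.0),
    (2, 500.0, 4.0),
])
conn.execute("CREATE TABLE scores (id INTEGER PRIMARY KEY, score REAL)")

# The model is expressed as a SQL formula, so the engine scores every row
# where the data lives; no customer rows are copied out to the client.
conn.execute(
    "INSERT INTO scores "
    "SELECT id, :intercept + :income * income + :tenure_years * tenure_years "
    "FROM customers",
    model,
)

scores = dict(conn.execute("SELECT id, score FROM scores"))
print(scores)
```

On a parallel platform the same pattern lets each node score its own slice of the table, which is where the speedups described in this article would come from.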

"The SAS-Teradata partnership is such a cool thing because it bridges a cultural divide between the humans and the Vulcans — the database guys and the analytics types," says Gnau.

SAS says neural networks and linear regressions, as well as a series of base-SAS procedures, are slated for Teradata optimization with SAS 9.2. A shift in data-integration from extract-transform-load (ETL) to extract-load-transform (ELT) is also planned. That change will push computationally intensive data manipulation into the database. Other candidates for Teradata optimization include SAS Risk Management, which would be ported to use Teradata's Financial Services Logical Data Model (FSLDM). Credit/risk analyses for money laundering detection are also ideal candidates for database-embedded analytics, according to SAS.
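The ETL-to-ELT shift can be sketched in a few lines. In this illustrative example, raw records are loaded into a staging table untouched, and the cleanup and aggregation then run as SQL inside the database. SQLite stands in for the warehouse, and the table names `staging_txn` and `daily_totals` and the sample rows are assumptions, not anything from SAS or Teradata.

```python
import sqlite3

# Hypothetical raw extract: amounts arrive as untidy text.
raw_rows = [
    ("2008-10-01", "acct-1", "  125.50 "),
    ("2008-10-01", "acct-1", "74.50"),
    ("2008-10-02", "acct-2", "300"),
]

conn = sqlite3.connect(":memory:")

# Extract and load: land the rows as-is, with no client-side processing.
conn.execute(
    "CREATE TABLE staging_txn (txn_date TEXT, account TEXT, amount_text TEXT)"
)
conn.executemany("INSERT INTO staging_txn VALUES (?, ?, ?)", raw_rows)

# Transform: trimming, type conversion, and aggregation all run in the engine.
conn.execute("""
    CREATE TABLE daily_totals AS
    SELECT txn_date,
           account,
           SUM(CAST(TRIM(amount_text) AS REAL)) AS total
    FROM staging_txn
    GROUP BY txn_date, account
""")

totals = conn.execute(
    "SELECT txn_date, account, total FROM daily_totals ORDER BY txn_date, account"
).fetchall()
print(totals)
```

Under classic ETL, the trimming and summing would happen in a separate tool before loading. Moving that step after the load is what pushes the heavy computation onto the database's own processors.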

Both SAS and Teradata have other development partners. For instance, Teradata's Gnau mentions data-mining vendors KXEN and Visual Numerics in the analytics arena, while SAS Global BI Product Marketing Manager Tammi Kay George says a long-standing SAS relationship with Teradata competitor Netezza could potentially result in similar embedding of analytics. For now, the SAS-Netezza interface between analytics and the DBMS is limited to SAS/Access, optimized for the Netezza platform, which taps Netezza data sources.
