Performing ETL With SQL - InformationWeek

InformationWeek is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them.Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

IoT
IoT
Software // Information Management

Performing ETL With SQL

Who needs an engine? Not Sunopsis V3, which employs SQL to bring a fresh approach to ETL.

The lifeline for any business intelligence solution is the process called extract, transform, and load (ETL). According to a recent survey by The Data Warehousing Institute (TDWI), ETL design and development work consumes 60 to 80 percent of the resources of an entire BI project. Selecting an ETL solution that fits your needs is an essential first step in building a successful data warehouse. Today, products with proprietary engines dominate the ETL market. Sunopsis brings a fresh perspective to the ETL architecture with its SQL-based product, Sunopsis v3. Instead of using an engine, Sunopsis uses the power of the database to process the data.

A Breed Apart

The ETL market is highly varied. It includes traditional players such as Informatica and Ascential Software, analytic tool specialists such as Business Objects and Cognos, and even big database vendors such as Oracle, IBM, and Microsoft. Major players such as Informatica PowerCenter and Ascential DataStage use dedicated engines to process data. The engine is a single point of control and thus can be administered and tuned for optimal ETL performance. On the other hand, products such as ETI Solution from Evolutionary Technologies International and Oracle Warehouse Builder generate code that can be run on source or target systems. Conceptually, Sunopsis embraces this approach too in that it avoids the overhead of using an engine. However, its uniqueness lies in the fact that it's able to generate SQL code for most databases available today.

Over the years, SQL has evolved from being just a query language into a rich development platform. Today, it encompasses complex analytic and data transformation functions such as rank, case, and rollup within its fold. Moreover, database vendors have invested considerable time and money in improving the performance of their queries. Sunopsis takes advantage of these developments and uses the database as its platform of choice and SQL as its primary language. A transformation can be processed where it works best — either on the source or on the target database. It can even be run on an independent staging database if need be. Furthermore, if you have existing database code in the form of stored procedures and functions, Sunopsis can exploit this work by incorporating the code into its workflow.

Pros

1. No ETL engine means less overhead — the engine doesn't become a bottleneck.
2. Easy to install.
3. Shorter learning curve, no propietary language to learn.
4. A well-integrated MOM tool along with the ETL tool.
5. Platform independent (written in Java).
6. A favorable price tag.

Cons

1. Can't tune a centralized engine to improve performance — the performance depends on the underlying databases.
2. Data quality features not fully evolved.
3. Doesn't ship with native database drivers.
4. Complex transformation functions not included.

Data Integration Platform

A good ETL tool should provide efficient interfaces to the heterogeneous systems that make up your environment. Sunopsis uses JDBC type 4 drivers to connect to databases and XML files. This approach offers a two-fold advantage over the earlier JDBC drivers. A type 4 driver provides improved performance because it connects directly to the database listener, with no client software needed. It also allows existing JDBC applications to use XML documents as true JDBC data sources. Sunopsis also provides Java Connectivity Architecture (JCA) 2.0 connectivity to ERP, CRM, and other packaged applications.

The current trend in the data integration market is toward consolidation. Vendors are rushing to combine their ETL architecture with the enterprise application integration (EAI) framework, so that they can offer a complete data integration solution. Sunopsis does a commendable job in this arena by providing its own message queue product, Sunopsis MQ. This is a message-oriented middleware (MOM) component that's compliant with the Java Message Services (JMS) specification. The best part about it is that the ETL product is also well knit with the EAI software. For example, Sunopsis ETL can seamlessly use the message queue as either a data source or a target.

We welcome your comments on this topic on our social media channels, or [contact us directly] with questions about the site.
Previous
1 of 2
Next
Comment  | 
Print  | 
More Insights
Slideshows
Top-Paying U.S. Cities for Data Scientists and Data Analysts
Cynthia Harvey, Freelance Journalist, InformationWeek,  11/5/2019
Slideshows
10 Strategic Technology Trends for 2020
Jessica Davis, Senior Editor, Enterprise Apps,  11/1/2019
Commentary
Study Proposes 5 Primary Traits of Innovation Leaders
Joao-Pierre S. Ruth, Senior Writer,  11/8/2019
White Papers
Register for InformationWeek Newsletters
Video
Current Issue
Getting Started With Emerging Technologies
Looking to help your enterprise IT team ease the stress of putting new/emerging technologies such as AI, machine learning and IoT to work for their organizations? There are a few ways to get off on the right foot. In this report we share some expert advice on how to approach some of these seemingly daunting tech challenges.
Slideshows
Flash Poll