Commentary
1/31/2008
05:10 PM
Rajan Chandras

ELT vs. ETL: Much Ado about Something

There's no doubt that ELT - yes, that's extract-load-transform (also called "pushdown"), not conventional extract-transform-load (ETL) - is now a mainstream capability. Informatica's inclusion of pushdown optimization in the recently released PowerCenter version 8.5 brings ELT the legitimacy it deserves... I fully expect pushdown will become a new frontier in the battle for ETL supremacy.

My recent blog post on Informatica not only led to what an ITBusinessEdge.com blog called a "mini-buzz" about the fate of the company, it also invited reader comments that, among other things, took opposing views on Informatica's ELT capabilities - and, yes, that's extract-load-transform (also called "pushdown"), not conventional extract-transform-load (ETL). There's no doubt that ELT is now a mainstream capability, and Informatica's inclusion of pushdown optimization in the recently released PowerCenter version 8.5 brings ELT the legitimacy it deserves.

At first sight, ELT seems at best a new twist on the conventional ETL approach (as well as a dyslexic's nightmare), but if done right, there's a lot to like about it (except the acronym). The central premise of ELT is the ability to send the ETL process, mid-stream, swooping down into the database, not unlike a bird that, mid-flight, takes a dip into a lake and resumes its flight without pause. But whereas a bird slows down when it dips into the lake, ELT pushdown actually speeds up the overall data movement process. Perhaps the best-known pure-play ELT vendor was Sunopsis, which Oracle acquired in 2006.
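To make the ETL/ELT distinction concrete, here is a minimal sketch in Python. It is not Informatica's or Sunopsis's API: an in-memory SQLite database stands in for the target database, and the table names, function names, and sample rows are all hypothetical. Both paths produce the same totals; the only difference is where the transform runs.

```python
import sqlite3

# Hypothetical source rows: (customer, amount in cents)
SOURCE_ROWS = [("acme", 1250), ("acme", 750), ("globex", 4000)]


def run_etl(conn, rows):
    """Conventional ETL: aggregate in the ETL engine, then load the result."""
    totals = {}
    for customer, cents in rows:
        totals[customer] = totals.get(customer, 0) + cents
    conn.execute("CREATE TABLE totals_etl (customer TEXT, total_cents INTEGER)")
    conn.executemany("INSERT INTO totals_etl VALUES (?, ?)", sorted(totals.items()))
    return conn.execute(
        "SELECT customer, total_cents FROM totals_etl ORDER BY customer"
    ).fetchall()


def run_elt(conn, rows):
    """ELT: load the raw rows first, then let the database do the transform."""
    conn.execute("CREATE TABLE staging (customer TEXT, cents INTEGER)")
    conn.executemany("INSERT INTO staging VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TABLE totals_elt AS "
        "SELECT customer, SUM(cents) AS total_cents FROM staging GROUP BY customer"
    )
    return conn.execute(
        "SELECT customer, total_cents FROM totals_elt ORDER BY customer"
    ).fetchall()


conn = sqlite3.connect(":memory:")
etl_result = run_etl(conn, SOURCE_ROWS)
elt_result = run_elt(conn, SOURCE_ROWS)
```

On three rows the two are indistinguishable; the speed argument for pushdown only shows up at volumes where set-based processing inside the database beats row-by-row work in the engine.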

Compared to pure ETL or pure ELT, I find mixed-mode ETL (mixing some pushdown capabilities into the overall ETL process) more appealing. ETL usually brings better organization to the data load process, but in-database processing is generally faster, so the ability to send part of the process (part of an Informatica "mapping," for example) into the database is attractive. The standard workaround today, when we need to do some heavy lifting in the database, is to write, say, stored procedures invoked from an ETL process. The downside of this approach - besides having to develop logic in two environments, ETL and database SQL - is that it is often not easy to make the mid-stream data available to the stored procedure, which leads to convoluted architecture and ETL design compromises. If you can code uniformly in one environment and then just send part of that code down to the database, you gain flexibility as well as performance.
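A mixed-mode flow of this kind might look roughly like the following sketch. Again, this is an illustration under stated assumptions rather than how PowerCenter implements pushdown: SQLite stands in for the database, and the step functions, the `midstream` temp table, and the `customer_region` lookup table are invented for the example. The first and last steps run in the engine; the middle step hands the mid-stream rows to the database for the join and aggregation.

```python
import sqlite3


def clean(rows):
    """Step 1, in the ETL engine: normalize names and drop blank ones."""
    for name, amount in rows:
        name = name.strip().lower()
        if name:
            yield name, int(amount)


def pushdown_join_and_sum(conn, rows):
    """Step 2, pushed down: stage the mid-stream rows, then join and
    aggregate inside the database instead of in the engine."""
    conn.execute("CREATE TEMP TABLE midstream (customer TEXT, cents INTEGER)")
    conn.executemany("INSERT INTO midstream VALUES (?, ?)", rows)
    return conn.execute(
        "SELECT r.region, SUM(m.cents) FROM midstream m "
        "JOIN customer_region r ON r.customer = m.customer "
        "GROUP BY r.region ORDER BY r.region"
    ).fetchall()


def label(rows):
    """Step 3, back in the ETL engine: format the aggregated rows."""
    return [f"{region}: {cents / 100:.2f}" for region, cents in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical lookup table that already lives in the database.
conn.execute("CREATE TABLE customer_region (customer TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO customer_region VALUES (?, ?)",
    [("acme", "east"), ("globex", "west")],
)

raw = [(" Acme ", "1250"), ("ACME", "750"), ("", "99"), ("globex", "4000")]
report = label(pushdown_join_and_sum(conn, clean(raw)))
```

The staging insert in step 2 is exactly the "making the mid-stream data available" chore described above. In this sketch it is written by hand; the appeal of tool-level pushdown is that the tool generates that hand-off, and the SQL, from a mapping designed in one environment.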

It's getting increasingly hard to stay excited about ETL (Indisputable Need + Near Ubiquity ≠ Enthrallment), but pushdown seems like the best thing that has happened to ETL since data quality and EII. I can see mixed-mode options, such as that offered by Informatica, giving ETL solution architects and designers a needed boost in designing sophisticated ETL solutions and in containing data load jobs within processing windows. I also fully expect that pushdown will become a new frontier in the battle for ETL supremacy - and once again, Informatica seems to have the edge.

But please, can we just continue calling all that ETL without coining a new acronym for a clever improvisation?
