
Kimball University: Should You Use An ETL Tool?

You can still hand-code an extract, transform and load system, but in most cases the self-documentation, structured development path and extensibility of an ETL tool are well worth the cost. Here's a close look at the pros and cons of buying rather than building.


Software licensing cost. The greatest disadvantage of ETL tools in comparison to hand-crafted systems is the licensing cost for the ETL tool software. Costs vary widely in the ETL space, from several thousand dollars to hundreds of thousands of dollars.

Uncertainty. We've spoken with many ETL teams that are uncertain – and sometimes misinformed – about what an ETL tool will do for them. Some teams under-value ETL tools, believing they are simply a visual way to connect SQL scripts together. Other teams unrealistically over-value ETL tools, imagining that building the ETL system with such a tool will be more like installing and configuring software than developing an application.

Reduced flexibility. A tool-based approach limits you to the tool vendor's abilities and scripting languages.

Build a Solid Foundation

There are some overarching themes in successful ETL system deployments, regardless of which tools and technologies are used. Most important, and most frequently neglected, is the practice of designing the ETL system before development begins. Too often we see systems that just evolved without any initial planning. These systems are inefficient and slow, they break down all the time, and they're unmanageable. The data warehouse team has no idea how to pinpoint the bottlenecks and problem areas of the system. A solid system design should incorporate the concepts described in detail in "Kimball University: The Subsystems of ETL Revisited," by Bob Becker.

Good ETL system architects will design standard solutions to common problems such as surrogate key assignment. Excellent ETL systems will implement these standard solutions most of the time but offer enough flexibility to deviate from those standards where necessary. There are usually half a dozen ways to solve any ETL problem, and each one may be the best solution in a specific set of circumstances. Depending on your personality and fondness for solving puzzles, this can be either a blessing or a curse.
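
As one illustration of a standard solution, surrogate key assignment can be sketched in a few lines of Python. This is a minimal sketch, not a production design: the class name SurrogateKeyAssigner, the key_for method and the CUST-... natural keys are all hypothetical, and it assumes the existing key map fits in memory and that keys are simple increasing integers.

```python
from itertools import count


class SurrogateKeyAssigner:
    """Hands out integer surrogate keys for natural (source-system) keys.

    Hypothetical helper for illustration only.
    """

    def __init__(self, existing=None):
        # existing: natural key -> surrogate key pairs already in the dimension
        self._keys = dict(existing or {})
        # Continue numbering after the highest key already assigned.
        self._next = count(max(self._keys.values(), default=0) + 1)

    def key_for(self, natural_key):
        # Reuse the key if this natural key was seen before;
        # otherwise assign the next unused integer.
        if natural_key not in self._keys:
            self._keys[natural_key] = next(self._next)
        return self._keys[natural_key]


# Made-up sample data: two customers already loaded, one new customer arriving.
assigner = SurrogateKeyAssigner(existing={"CUST-001": 1, "CUST-002": 2})
incoming = ["CUST-002", "CUST-107", "CUST-001", "CUST-107"]
assigned = [assigner.key_for(nk) for nk in incoming]
print(assigned)  # [2, 3, 1, 3]
```

Keeping the logic in one small, reusable component is what makes it a standard; a job that needs different behavior can still bypass it.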

One of the rules you should try to follow is to write data as seldom as possible during the ETL process. Writing data, especially to the relational database, is one of the most expensive tasks that the ETL system performs. ETL tools contain functionality to operate on data in memory and guide the developer along a path to minimize database writes until the data is clean and ready to go into the data warehouse table. However, the relational engine is excellent at some tasks, particularly joining related data. There are times when it is more efficient to write data to a table, even index it, and let the relational engine perform a join than it is to use the ETL tool's lookup or merge operators. We usually want to use those operators, but don't overlook the powerful relational database when trying to solve a thorny performance problem.
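
The trade-off between an in-memory lookup and a staged relational join can be sketched as follows. This is an illustrative sketch only: SQLite (through Python's standard sqlite3 module) stands in for the warehouse database, and the table names dim_customer, fact_sales and stage_sales, the function names and the sample rows are all assumptions. Both routes load the same fact rows; which one is faster depends on data volumes and the database engine, so measure before choosing.

```python
import sqlite3

# Made-up sample data: a small customer dimension and incoming sales rows.
customers = [(1, "CUST-001"), (2, "CUST-002"), (3, "CUST-107")]
sales = [("CUST-002", 19.99), ("CUST-107", 5.00), ("CUST-001", 42.50)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, natural_key TEXT)"
)
conn.execute("CREATE TABLE fact_sales (customer_key INTEGER, amount REAL)")
conn.executemany("INSERT INTO dim_customer VALUES (?, ?)", customers)


def load_with_memory_lookup(conn, rows):
    # Read the dimension once, resolve keys in memory, write the result once.
    lookup = {
        nk: sk
        for sk, nk in conn.execute(
            "SELECT customer_key, natural_key FROM dim_customer"
        )
    }
    resolved = [(lookup[nk], amount) for nk, amount in rows]
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?)", resolved)


def load_with_staged_join(conn, rows):
    # Write the raw rows to a staging table, index it,
    # and let the relational engine perform the join.
    conn.execute("CREATE TABLE stage_sales (natural_key TEXT, amount REAL)")
    conn.executemany("INSERT INTO stage_sales VALUES (?, ?)", rows)
    conn.execute("CREATE INDEX ix_stage_nk ON stage_sales (natural_key)")
    conn.execute(
        "INSERT INTO fact_sales "
        "SELECT d.customer_key, s.amount "
        "FROM stage_sales s "
        "JOIN dim_customer d ON d.natural_key = s.natural_key"
    )
    conn.execute("DROP TABLE stage_sales")


def fact_rows(conn):
    # Sorted so the two load strategies can be compared directly.
    return sorted(conn.execute("SELECT customer_key, amount FROM fact_sales"))


load_with_memory_lookup(conn, sales)
via_lookup = fact_rows(conn)

conn.execute("DELETE FROM fact_sales")
load_with_staged_join(conn, sales)
via_join = fact_rows(conn)

print(via_lookup == via_join)  # True
```

The lookup route writes to the database once; the staged route pays for an extra write and an index build in exchange for handing the join to the engine.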
