Kimball University: Should You Use An ETL Tool? - InformationWeek
IoT
IoT
Software // Information Management
News
4/6/2008
09:30 PM
50%
50%

Kimball University: Should You Use An ETL Tool?

You can still hand-code an extract, transform and load system, but in most cases the self-documentation, structured development path and extensibility of an ETL tool is well worth the cost. Here's a close look at the pros and cons of buying rather than building.

DISADVANTAGES OF ETL TOOLS

Software licensing cost.The greatest disadvantage of ETL tools in comparison to hand-crafted systems is the licensing cost for the ETL tool software. Costs vary widely in the ETL space, from several thousand dollars to hundreds of thousands of dollars.

Uncertainty. We've spoken with many ETL teams that are uncertain – and sometimes misinformed – about what an ETL tool will do for them. Some teams under-value ETL tools, believing they are simply a visual way to connect SQL scripts together. Other teams unrealistically over-value ETL tools, imagining that building the ETL system with such a tool will be more like installing and configuring software than developing an application.

Reduced flexibility. A tool-based approach limits you to the tool vendor's abilities and scripting languages. Build a Solid Foundation

There are some over-arching themes in successful ETL system deployments regardless of which tools and technologies are used. Most important — and most frequently neglected — is the practice of designing the ETL system before development begins. Too often we see systems that just evolved without any initial planning. These systems are inefficient and slow, they break down all the time, and they're unmanageable. The data warehouse team has no idea how to pinpoint the bottlenecks and problem areas of the system. A solid system design should incorporate the concepts described in detail in Kimball University: The Subsystems of ETL Revisited, by Bob Becker.

Good ETL system architects will design standard solutions to common problems such as surrogate key assignment. Excellent ETL systems will implement these standard solutions most of the time but offer enough flexibility to deviate from those standards where necessary. There are usually half a dozen ways to solve any ETL problem, and each one may be the best solution in a specific set of circumstances. Depending on your personality and fondness for solving puzzles, this can be either a blessing or a curse.

One of the rules you should try to follow is to write data as seldom as possible during the ETL process. Writing data, especially to the relational database, is one of the most expensive tasks that the ETL system performs. ETL tools contain functionality to operate on data in memory and guide the developer along a path to minimize database writes until the data is clean and ready to go into the data warehouse table. However, the relational engine is excellent at some tasks, particularly joining related data. There are times when it is more efficient to write data to a table, even index it, and let the relational engine perform a join than it is to use the ETL tool's lookup or merge operators. We usually want to use those operators, but don't overlook the powerful relational database when trying to solve a thorny performance problem.

Previous
2 of 3
Next
Comment  | 
Print  | 
More Insights
Comments
Newest First  |  Oldest First  |  Threaded View
How Enterprises Are Attacking the IT Security Enterprise
How Enterprises Are Attacking the IT Security Enterprise
To learn more about what organizations are doing to tackle attacks and threats we surveyed a group of 300 IT and infosec professionals to find out what their biggest IT security challenges are and what they're doing to defend against today's threats. Download the report to see what they're saying.
Register for InformationWeek Newsletters
White Papers
Current Issue
Top IT Trends to Watch in Financial Services
IT pros at banks, investment houses, insurance companies, and other financial services organizations are focused on a range of issues, from peer-to-peer lending to cybersecurity to performance, agility, and compliance. It all matters.
Video
Slideshows
Twitter Feed
InformationWeek Radio
Archived InformationWeek Radio
Join us for a roundup of the top stories on InformationWeek.com for the week of November 6, 2016. We'll be talking with the InformationWeek.com editors and correspondents who brought you the top stories of the week to get the "story behind the story."
Sponsored Live Streaming Video
Everything You've Been Told About Mobility Is Wrong
Attend this video symposium with Sean Wisdom, Global Director of Mobility Solutions, and learn about how you can harness powerful new products to mobilize your business potential.
Flash Poll