7/17/2006
12:25 PM

Kimball University: Integration for Real People

These step-by-step guidelines will help dimension managers and users drill across disparate databases.

Ralph Kimball

"Integration" is one of the older terms in data warehousing. Of course, almost all of us have a vague idea that integration means making disparate databases function together in a useful way. But as a topic, integration has taken on the same fuzzy aura as "meta data." We all know we need it; we don't have a clear idea of how to break it down into manageable pieces; and above all, we feel guilty because it is always on our list of responsibilities. Does integration mean that all parties across large organizations agree on every data element or only on some data elements?

This article decomposes the integration problem into actionable pieces, each with specific tasks. We'll create a centralized administration for all tasks, and we'll "publish" our integrated results out to a wide range of "consumers." These procedures are almost completely independent of whether you run a highly centralized shop on one physical machine, or whether you have dozens of data centers and hundreds of database servers. In all cases, the integration challenge is the same; you just have to decide how integrated you want to be.

Defining Integration

Fundamentally, integration means reaching agreement on the meaning of data from the perspective of two or more databases. When two databases agree in the specific sense described in this article, their results can be combined into a single data warehouse analysis. Without such an accord, the databases will remain isolated stovepipes that can't be linked in an application.

It's very helpful to separate the integration challenge into two parts: reaching agreement on labels and reaching agreement on measures. This separation, of course, mirrors the dimensional view of the world. Labels are normally textual, or text-like, and are either targets of constraints or are used as "row headers" in query results, where they force grouping and on-the-fly summarization. In a pure dimensional design, labels always appear in dimensions. Measures, on the other hand, are normally numeric, and, as their name implies, are the result of an active measurement of the world at a point in time. Measures always appear in fact tables in dimensional designs. The distinction between labels and measures is very important for our task of integration because the steps we must perform for each are quite different. Taken together, reaching agreement on labels and on measures defines integration.
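
As a rough illustration of why agreement on labels matters, the sketch below summarizes two independent sets of fact rows by a shared label and then lines the two answer sets up on their row headers. Everything named here is hypothetical and not taken from the article: the `product_category` label, the `sales_amount` and `units_shipped` measures, the sample rows, and the `summarize` and `drill_across` helpers. It assumes both sources already use identical label values; if one spelled a category differently, its rows would simply fail to line up.

```python
from collections import defaultdict

# Hypothetical fact rows from two separate databases. The label
# "product_category" is assumed to carry the same values in both.
sales_rows = [
    {"product_category": "Beverages", "sales_amount": 120.0},
    {"product_category": "Snacks", "sales_amount": 80.0},
    {"product_category": "Beverages", "sales_amount": 30.0},
]
shipment_rows = [
    {"product_category": "Beverages", "units_shipped": 40},
    {"product_category": "Snacks", "units_shipped": 25},
    {"product_category": "Snacks", "units_shipped": 5},
]


def summarize(rows, label, measure):
    """Group rows by a label and sum one numeric measure per group."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[label]] += row[measure]
    return dict(totals)


def drill_across(left, right):
    """Line up two summarized answer sets on their shared row headers.

    A header missing from one side yields None for that side's measure.
    """
    headers = sorted(set(left) | set(right))
    return [(header, left.get(header), right.get(header)) for header in headers]


sales = summarize(sales_rows, "product_category", "sales_amount")
shipments = summarize(shipment_rows, "product_category", "units_shipped")
report = drill_across(sales, shipments)

for category, sales_amount, units_shipped in report:
    print(category, sales_amount, units_shipped)
```

The label does the grouping and the matching, while the measures are only summed and carried along, which is one way to see why the two kinds of agreement call for different work.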
