Kimball University: Eight Guidelines for Low-Risk Enterprise Data Warehousing

New data sources and BI delivery modes make it that much harder for EDW initiatives to succeed. Here are eight recommendations for controlling project costs and reducing risks.

Integrate Using Conformed Dimensions

Enterprisewide integration has risen to the top of the list of EDW/BI technical drivers along with data quality and data latency. Dimensional modeling provides a simple set of procedures for achieving integration that can be effectively used by BI tools. Conformed dimensions enable BI tools to drill across multiple subject areas, assembling a final integrated report. The key insight is that the entire dimension (customer, for example) does not need to be made identical across all subject areas. The minimum requirement for a drill-across report is that at least one field be common across multiple subject areas. Thus, the EDW can define a master enterprise dimension containing a small but growing number of conformed fields. These fields can be added incrementally over time. In this way, we reduce the risk and cost of enterprise integration at the BI interface. This approach also fits well with our recommendation to develop the EDW/BI system incrementally.
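
To make the drill-across idea concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the two subject areas (sales and support), the sample rows, and the helper names `summarize` and `drill_across` do not come from any Kimball tool. The only conformed field is `segment`; the other customer attributes differ between the two subject areas, which is exactly the situation described above.

```python
from collections import defaultdict

# Hypothetical fact rows, each already joined to its own customer dimension.
# Only `segment` is conformed; `region_code` and `tier` are local attributes.
sales_rows = [
    {"segment": "Retail", "region_code": "NE", "revenue": 120.0},
    {"segment": "Retail", "region_code": "SW", "revenue": 80.0},
    {"segment": "Wholesale", "region_code": "NE", "revenue": 300.0},
]
support_rows = [
    {"segment": "Retail", "tier": "gold", "tickets": 4},
    {"segment": "Wholesale", "tier": "basic", "tickets": 1},
    {"segment": "Wholesale", "tier": "gold", "tickets": 2},
]


def summarize(rows, key, measure):
    """Query one subject area: total a measure grouped by the conformed field."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[measure]
    return dict(totals)


def drill_across(summaries):
    """Merge the separate answer sets on the shared row header (an outer join)."""
    keys = sorted(set().union(*(s.keys() for _, s in summaries)))
    return [
        {"segment": k, **{name: s.get(k) for name, s in summaries}}
        for k in keys
    ]


report = drill_across([
    ("revenue", summarize(sales_rows, "segment", "revenue")),
    ("tickets", summarize(support_rows, "segment", "tickets")),
])
for line in report:
    print(line)
```

Each subject area is queried on its own and the answer sets are merged afterward on the conformed field, so neither fact table needs to know about the other. Conforming a second field later would only widen the set of row headers available to `drill_across`.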

Manage Quality a Few Screens at a Time

In our articles and books, Kimball Group has described an effective approach to managing data quality by placing data quality screens throughout the data pipelines leading from the sources to the targets. Each data quality screen is a test. When the test fails or finds a suspected data quality violation, the screen writes a record in an error event fact table -- a dimensional schema hidden in the back room away from direct access by end users. The error event fact table lets EDW/BI administrators measure the volume and source of the errors encountered. A companion audit dimension summarizes the error conditions and is exposed to the end users along with every dimensional fact table.

The data quality screens can be implemented one at a time, allowing development of the data quality system to grow incrementally.
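
As a rough illustration of the pattern, the sketch below models each screen as a small Python function and the error event fact table as an in-memory list. The screen factories `not_null` and `in_range`, the column names, and the `orders_feed` source are hypothetical. The summary returned by `run_screens` is a simplified, batch-level stand-in for the audit dimension, not a full design.

```python
from datetime import datetime, timezone

# Stand-in for the back-room error event fact table.
error_event_fact = []


def not_null(column):
    """Build a screen that passes when the column has a value."""
    def screen(row):
        return row.get(column) is not None
    screen.__name__ = f"not_null_{column}"
    return screen


def in_range(column, low, high):
    """Build a screen that passes when the column lies within [low, high]."""
    def screen(row):
        value = row.get(column)
        return value is not None and low <= value <= high
    screen.__name__ = f"in_range_{column}"
    return screen


def run_screens(rows, screens, source):
    """Apply every screen to every row, log each failure, return a summary."""
    failures = 0
    for row_number, row in enumerate(rows):
        for screen in screens:
            if not screen(row):
                failures += 1
                error_event_fact.append({
                    "event_time": datetime.now(timezone.utc),
                    "source": source,
                    "screen": screen.__name__,
                    "row_number": row_number,
                })
    return {
        "source": source,
        "rows_checked": len(rows),
        "errors": failures,
        "quality": "clean" if failures == 0 else "suspect",
    }


rows = [
    {"order_id": 1, "qty": 5},
    {"order_id": None, "qty": 3},   # fails the not-null screen
    {"order_id": 3, "qty": -2},     # fails the range screen
]
screens = [not_null("order_id")]            # start with a single screen
screens.append(in_range("qty", 0, 1000))    # add another when it is ready
audit = run_screens(rows, screens, "orders_feed")
print(audit["errors"], audit["quality"])
```

Because the screens are just entries in a list, adding one more test does not disturb the ones already in place, which is the incremental growth described above.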

Use Surrogate Keys Throughout

Finally, a seemingly small recommendation to reduce your EDW/BI development risk: make sure to build all your dimensions (even Type 1 dimensions) with surrogate primary keys. This insulates you from surprises downstream when you acquire a new division that has its own ideas about keys. What's more, joins on compact integer surrogate keys generally run faster than joins on wide natural keys.
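
The acquisition scenario can be sketched in a few lines of Python. The `SurrogateKeyMap` class and the source system names are hypothetical, and a production key pipeline would persist the mapping in the database rather than in memory. The point is only that the warehouse assigns its own integer keys, so two source systems that reuse the same natural key never collide.

```python
from itertools import count


class SurrogateKeyMap:
    """Assign warehouse-owned integer keys to (source system, natural key) pairs."""

    def __init__(self):
        self._next_key = count(1)   # simple ascending integer sequence
        self._keys = {}

    def lookup_or_assign(self, source_system, natural_key):
        business_key = (source_system, natural_key)
        if business_key not in self._keys:
            self._keys[business_key] = next(self._next_key)
        return self._keys[business_key]


customer_keys = SurrogateKeyMap()

# The legacy system and the newly acquired division both use "C-1001".
legacy_key = customer_keys.lookup_or_assign("legacy_erp", "C-1001")
acquired_key = customer_keys.lookup_or_assign("acquired_division", "C-1001")
repeat_key = customer_keys.lookup_or_assign("legacy_erp", "C-1001")

print(legacy_key, acquired_key, repeat_key)
```

The same natural key from two sources receives two different surrogate keys, while a repeated lookup returns the key already assigned, so fact tables that reference the dimension are unaffected by the new division's key scheme.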

More Advice on Low-Risk EDWs

Many of these ideas have been described in Intelligent Enterprise as well as the Kimball Group Toolkit series of books and our monthly Design Tips. We are always interested in hearing your opinions about low-risk approaches to data warehousing. Write to me at [email protected]