Kimball University: Practical Steps for Designing a Dimensional Model

What does it take to develop a robust dimensional model? Here's how to get from requirements-gathering to final approval in a process that will ferret out the good, bad and ugly realities of your source data and help you avoid surprises, delays and cost overruns.
Dive Into the Details

With consensus on the bubble chart, the team then launches into table-by-table and column-by-column discussion, drilling into more detail about the requisite attributes and metrics, including definitions, transformation rules and data-quality concerns. The dimensional model will unfold through a series of design sessions, with each pass producing a more detailed design that's been repeatedly tested against your understanding of the business needs.

Don't attempt to schedule all-day design meetings; structure several one-hour morning and afternoon sessions so the lead modeler has time to update the documentation before the next meeting (and so team members can deal with the demands of their day jobs). Your initial sessions should focus on a more straightforward dimension table so the team experiences a quick win before dealing with the more controversial dimensions.

As you're iteratively fleshing out the design, team members will need to do more interrogation of the data. Thus, ongoing access to a profiling tool (or less sophisticated method) will be critical. Ferreting out the good, bad and ugly realities of your source data during the design process will minimize the surprises and corresponding overruns during the ETL design and development activity.

Throughout the design sessions, the data modeler or designated scribe should be filling in a detailed worksheet for each table with information such as the attribute or fact name, column description, sample values and decodes, change tracking rules for dimension attributes, and preliminary transformation business rules. These worksheets form the basis for the source-to-target mapping, which is further embellished by the physical designer and then ultimately handed off to the ETL team. In addition to the detailed worksheets, a member of the design team should also be logging open issues so they're captured in a single document to facilitate review and assignment at the end of every session.

Review the Results

The last phase of the modeling process involves reviewing and validating the model with interested parties, starting with the project team, extending to those in IT with intimate knowledge of the source systems, and concluding with the broader business community. Approval from this last group is critical before time and money is invested in the data implementation. Plan to devote much of this review to illustrating how the model will address sample questions from the requirements findings.

A typical design effort usually takes three to four weeks for a single business process dimensional model, but the time required will vary depending on the complexity of the business process, availability of preexisting conformed dimensions, experience of the modeling team, existence of well-documented business requirements, and the difficulty reaching consensus.

Designing a dimensional model with interested parties representing diverse skills requires commitment and cooperation, but the end result is a robust model that has been rigorously tested against both the business needs and data realities. That's exactly what you want before you move ahead with the implementation.

For more information about the dimensional modeling process, attend a Kimball University class such as the Dimensional Modeling in Depth or Data Warehouse Lifecycle in Depth courses. You can read more about the dimensional modeling process discussed above in The Data Warehouse Lifecycle Toolkit 2nd Edition, co-authored by Ralph Kimball, Margy Ross, Warren Thornthwaite, Joy Mundy and Bob Becker of the Kimball Group.