Slowly Changing Dimensions Are Not Always as Easy as 1, 2, 3

How do you deal with changing dimensions? Hybrid approaches fill gaps left by the three fundamental techniques

Series of Type 3 Attributes

Say you have a dimension attribute that changes with a predictable rhythm, such as annually, and the business needs to summarize facts based on any historical value of the attribute (not just the historically accurate and current, as we've primarily been discussing). For example, imagine the product line is recategorized at the start of every fiscal year and the business wants to look at multiple years of historical product sales based on the category assignment for the current year or any prior year.

This situation is best handled with a series of type 3 dimension attributes. On every dimension row, have a "current" category attribute that can be overwritten, as well as attributes for each annual designation, such as "2004 category" and "2003 category." You can then group historical facts based on any annual categorization.

This seemingly straightforward technique isn't appropriate for the unpredictable changes we described earlier. Customer attributes evolve uniquely. You can't add a series of type 3 attributes to track the prior attribute values ("prior-1," "prior-2" and so on) for unpredictable changes, because each attribute would be associated with a unique point in time for nearly every row in the dimension table.

Balance Power against Ease of Use

Before using hybrid techniques to support sophisticated change tracking, remember to maintain the equilibrium between flexibility and complexity. Users' questions and answer sets will vary depending on which dimension attributes are used for constraining or grouping. Given the potential for error or misinterpretation, hide the complexity (and associated capabilities) from infrequent users.

Hybrid SCDs can sometimes take you for a dizzying ride. Check with your ETL vendors to see if they support any of these techniques in their tools. Some do, which will greatly ease the burden on you.

Required Reading

Ralph Kimball, founder of the Kimball Group, teaches dimensional data warehouse and ETL design through Kimball University and critically reviews large warehouses. He has four best-selling data warehousing books in print, including The Data Warehouse ETL Toolkit (Wiley, 2004). Write to him at [email protected].

Margy Ross is president of the Kimball Group. she cowrote The Data Warehouse Lifecycle Toolkit (Wiley, 1998) and The Data Warehouse Toolkit, 2nd Edition (Wiley, 2002). Write to her at [email protected].