This article was originally published in November 2003.
Some data warehouse designers want to declare warehouse victory after merely replicating the organization's top five reports. They're satisfied with this level of deliverable because "that's what the users asked for." However, this approach is akin to paving the cow paths. In some communities, the roadways resemble a tangled web because early roads were built on preexisting cow paths. Unfortunately, the cows didn't meander along straight grid lines. Similarly, merely using the data warehouse to pave reporting "cow paths" doesn't push the organization beyond what it has today. This is where the analytic life cycle can help.
In "The Promise of Decision Support" (Dec. 5, 2002), we introduced the five-stage analytic life cycle.
Stage 1: Publish reports supports standard operational and managerial reporting on the current state of the business.
Stage 2: Identify exceptions pinpoints unusual performance situations that warrant further attention.
Stage 3: Determine causal factors seeks to understand the causal factors behind the exceptions.
Stage 4: Model alternatives synthesizes what's been learned to build a model for evaluating alternatives and trade-offs.
Stage 5: Track actions analyzes the effectiveness of the recommended actions and feeds the results back to the operational and data warehouse systems. We then return to Stage 1 to report on these results, thereby closing the loop.
To move beyond replicating reports, you can use the analytic life cycle for gathering more in-depth business requirements. It provides a framework to collaborate with users to understand their analytic processes. It forces data warehouse designers to ask the second- and third-level questions, the "hows" and "whys," to understand how the organization could leverage the data warehouse for analysis.
Most analyses start with a report, which details business performance metrics. Our challenge is to push beyond into the more detailed analytic requirements.
Let's walk through a real-world experience, buying a house, to understand how the analytic life cycle guides the analytic requirements gathering process. Say that you've been transferred to a new city and have to find a new house. What sort of process do you use to find that ideal house? You might start with a couple of real estate listings (and the guidance of a knowledgeable real estate agent) and begin asking a lot of questions:
For the data warehouse designer, reporting requirements are the starting point. You need to take the time to identify and understand which reports the business relies on to monitor its performance. However, users can't possibly look at all the data. You need to take the analysis process to the next level.
When house hunting, you need to limit your search; otherwise you'll be inundated by all the housing options (especially considering that houses are constantly moving on and off the market). You can reduce the number of housing options by identifying only those properties that meet a certain set of criteria. You've now moved into the identify exceptions stage (stage 2). In the housing example, these critical criteria might include:
Stage 2 guides the data warehouse designer to look for requirements that focus on identifying the factors and thresholds that identify unusual situations worthy of further analysis. The exception identification factors typically manifest themselves as new facts and dimension attributes.
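As a rough illustration of what "factors and thresholds" can look like once they're written down, here is a minimal Python sketch. The listings, the attribute names, and the threshold values are all hypothetical, invented for this example; they aren't taken from any real requirements session.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    """One property on the market (hypothetical attributes)."""
    address: str
    price: int
    bedrooms: int
    commute_minutes: int


# Hypothetical thresholds: the "exception" criteria for the house hunt.
MAX_PRICE = 400_000
MIN_BEDROOMS = 3
MAX_COMMUTE_MINUTES = 30


def meets_criteria(listing: Listing) -> bool:
    """Return True when a listing passes every threshold."""
    return (
        listing.price <= MAX_PRICE
        and listing.bedrooms >= MIN_BEDROOMS
        and listing.commute_minutes <= MAX_COMMUTE_MINUTES
    )


# Made-up sample data.
listings = [
    Listing("12 Oak St", 350_000, 3, 25),
    Listing("48 Elm Ave", 425_000, 4, 20),   # over budget
    Listing("7 Pine Ct", 310_000, 2, 15),    # too few bedrooms
    Listing("90 Maple Dr", 390_000, 4, 45),  # commute too long
    Listing("3 Birch Ln", 399_000, 3, 30),
]

# Only the properties worth a closer look survive the filter.
shortlist = [item for item in listings if meets_criteria(item)]

for item in shortlist:
    print(item.address)
```

In warehouse terms, each threshold here points at a fact or dimension attribute (price, bedroom count, commute time) that has to exist in the data before the filter can be applied.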
After identifying those factors that you'll use to scope your search, you need to understand why these drivers are critical to your housing decision. You need to understand the relationship between these driving factors (what makes them important) and the ultimate housing choice. You have now moved into the determine causal factors stage (stage 3). Here you refine your selection criteria, being more detailed in their definition and their corresponding acceptance criteria, such as:
During Stage 3, the data warehouse designer focuses on understanding why these variables are important, how they interrelate, and how they'll be used in making the final decision. This phase typically results in even more detailed dimension tables, new data sources (typically third-party or nonelectronic causal data), and statistical routines to quantify the cause and effect of the relationships.
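One of the simplest statistical routines of this kind is a correlation coefficient. The sketch below computes a Pearson correlation between a candidate driver and an outcome. The school ratings and sale prices are made-up numbers used only for illustration, and a strong correlation by itself suggests a relationship worth investigating; it doesn't prove cause and effect.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: neighborhood school rating (1-10) and
# sale price in thousands of dollars for five comparable houses.
school_rating = [4, 6, 7, 8, 9]
sale_price_k = [280, 320, 350, 390, 410]

r = pearson(school_rating, sale_price_k)
print(f"correlation between school rating and price: {r:.2f}")
```

A value near +1 or -1 would flag the factor as a candidate driver that deserves its own dimension attribute or data source; a value near 0 suggests it may not be worth the effort to acquire.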
After doing all the research and house tours, you can now create some sort of model to help you with the inevitable trade-offs in your final housing decision. You have now moved into the model alternatives stage (stage 4).
Models can be quite advanced statistical or spreadsheet algorithms, or simple heuristics, rules of thumb, or gut feeling. Whatever type of model is used, its basic purpose is to provide a framework against which these different trade-off decisions can be evaluated. The model doesn't make the simple decision mundane, but helps make the seemingly impossible decision manageable.
You can employ your housing "model" to help you with the following types of housing trade-off decisions, perhaps using weighted averages in a spreadsheet to make the decision more quantitative rather than entirely qualitative:
For the data warehouse designer, the analytics requirements gathering process focuses on the "model" that will be used in evaluating the different decision alternatives. This includes the metrics that will drive the ultimate decision (independent variables) and their relationships to the ultimate decision (dependent variable).
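A weighted-average model of the sort a spreadsheet would hold can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the three factors, their weights, and the 1-to-10 ratings for each house. The ratings play the role of independent variables and the overall score stands in for the dependent variable.

```python
# Hypothetical weights: how much each factor matters to this buyer.
WEIGHTS = {"price": 0.4, "commute": 0.3, "schools": 0.3}


def weighted_score(ratings, weights):
    """Weighted average of the ratings, normalized by the total weight."""
    if set(ratings) != set(weights):
        raise ValueError("ratings and weights must cover the same factors")
    total_weight = sum(weights.values())
    return sum(weights[name] * ratings[name] for name in weights) / total_weight


# Made-up ratings on a 1-10 scale (higher is better) for two finalists.
candidates = {
    "12 Oak St": {"price": 8, "commute": 6, "schools": 7},
    "3 Birch Ln": {"price": 5, "commute": 9, "schools": 9},
}

scores = {
    address: weighted_score(ratings, WEIGHTS)
    for address, ratings in candidates.items()
}
best = max(scores, key=scores.get)

for address, score in scores.items():
    print(f"{address}: {score:.2f}")
print(f"highest score: {best}")
```

Changing the weights and rerunning the scores is the code equivalent of the trade-off discussion: a buyer who weights price more heavily may well end up with a different finalist.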
And finally, once a decision has been made, you need to track the effectiveness of that decision in order to fine-tune the future decision process. That's the goal of the track actions stage (stage 5).
This stage is often skipped in the analytics process. Few people or organizations seem willing to spend the time to examine the effectiveness of their decisions. In our housing example, the same probably holds true. I'm not sure how many folks consciously examine the effectiveness of their decision until it comes time to sell their house. Only then do they quickly learn whether the general marketplace values the factors that they valued.