Kimball University: Extreme Status Tracking For Real Time Customer Analysis
Customer interactions create a wealth of timely data that marketing departments are eager to exploit. The customer status fact table provides a central switchboard for using this fast-moving data.
We live in a world of extreme status tracking, where our customer-facing processes are capable of producing continuous updates on the transactions, locations, online gestures, and even the heartbeats of customers. Marketing folks and operational folks love this data because it lets them decide in real time how to communicate with the customer. They expect these communications to be driven by a combination of traditional data warehouse history and up-to-the-second status tracking. Typical communications decisions include whether to recommend a product or service, judge the legitimacy of a support request, or contact the customer with a warning.
As designers of integrated enterprise data warehouses (EDWs) with many customer-facing processes, we must deal with a variety of source operational applications that provide status indicators or data-mining-based behavioral scores we would like to have as part of the overall customer profile. These indicators and scores can be generated frequently, maybe even many times per day; we want a complete history that may stretch back months or even years.
Though these rapidly changing status indicators and behavior scores are logically part of a single customer dimension, it is impractical to embed these attributes in a Type 2 slowly changing dimension. Remember that Type 2 perfectly captures history, and requires you to issue a new customer record each time any attribute in the dimension changes. Kimball Group has long pointed out this practical conflict by calling this situation a "rapidly changing monster dimension." The solution is to reduce the pressure on the primary customer dimension by spawning one or more "mini-dimensions" that contain the rapidly changing status or behavioral attributes. We have talked about such mini-dimensions for at least a decade.
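To make the conflict concrete, here is a minimal Python sketch of Type 2 behavior, assuming a toy in-memory customer dimension. The names `CustomerRow`, `apply_type2_change`, and the `behavior_score` attribute are hypothetical and chosen only for illustration; this is not a production loader.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class CustomerRow:
    customer_key: int                 # surrogate key, new for every version (assumed scheme)
    customer_id: str                  # durable natural key
    city: str                         # slowly changing attribute
    behavior_score: int               # rapidly changing attribute (hypothetical)
    effective_from: datetime
    effective_to: Optional[datetime] = None  # None marks the current version


def apply_type2_change(rows: List[CustomerRow], customer_id: str,
                       changed_at: datetime, **changes) -> List[CustomerRow]:
    """Expire the current row and append a full new version (Type 2)."""
    idx = next(i for i, r in enumerate(rows)
               if r.customer_id == customer_id and r.effective_to is None)
    current = rows[idx]
    # Close out the old version at the moment of change.
    rows[idx] = replace(current, effective_to=changed_at)
    # Issue a brand-new record carrying every attribute, changed or not.
    new_key = max(r.customer_key for r in rows) + 1
    rows.append(replace(current, customer_key=new_key,
                        effective_from=changed_at, effective_to=None,
                        **changes))
    return rows


rows = [CustomerRow(1, "C1", "Boston", 50, datetime(2024, 1, 1))]
# Three score updates within a single morning.
for hour, score in enumerate([55, 61, 58], start=1):
    apply_type2_change(rows, "C1", datetime(2024, 1, 1, hour), behavior_score=score)
```

Each score update copies the entire customer record, so one customer already occupies four rows after three updates. With scores arriving many times per day across millions of customers, the dimension grows with the update rate rather than with the number of customers, which is the pressure that mini-dimensions are meant to relieve.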
In our real-time, extreme status tracking world, we can refine the tried-and-true mini-dimension design by adding the following requirements. We want a "customer status fact table" that is...
a single source that exposes the complete, unbroken time series of all changes to customer descriptions, behavior, and status;
time-stamped precisely, to the second or even the millisecond, for all such changes;
scalable, to allow new transaction types, new behavior tags, and new status types to be added constantly, and scalable to allow a growing list of millions of customers each with a history of thousands of status changes;
accessible, to allow fetching the current, complete description of a customer and then quickly exposing that customer's extended history of transactions, behavior and status; and
usable as the master source of customer status for all fact tables in the EDW.
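As a rough illustration of the first four requirements, the following Python sketch uses the standard-library `sqlite3` module. The table layout, column names, and helper functions (`open_store`, `record_change`, `current_profile`, `history`) are assumptions made for this example, not the design the article goes on to describe. Storing one row per change with a `status_type` column is just one possible way to let new status types appear without schema changes.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with one row per status change (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE customer_status_fact (
            customer_id  TEXT NOT NULL,
            status_type  TEXT NOT NULL,
            status_value TEXT NOT NULL,
            changed_at   TEXT NOT NULL  -- ISO-8601 text with milliseconds
        )
        """
    )
    conn.execute(
        "CREATE INDEX ix_customer_time "
        "ON customer_status_fact (customer_id, changed_at)"
    )
    return conn


def record_change(conn, customer_id, status_type, status_value, changed_at):
    """Append a change; existing rows are never updated, so history stays unbroken."""
    conn.execute(
        "INSERT INTO customer_status_fact VALUES (?, ?, ?, ?)",
        (customer_id, status_type, status_value, changed_at),
    )


def current_profile(conn, customer_id):
    """Return the latest value of each status type for one customer."""
    rows = conn.execute(
        """
        SELECT f.status_type, f.status_value
        FROM customer_status_fact AS f
        WHERE f.customer_id = ?
          AND f.changed_at = (
              SELECT MAX(g.changed_at)
              FROM customer_status_fact AS g
              WHERE g.customer_id = f.customer_id
                AND g.status_type = f.status_type
          )
        """,
        (customer_id,),
    ).fetchall()
    return dict(rows)


def history(conn, customer_id):
    """Return the full time series of changes, oldest first."""
    return conn.execute(
        "SELECT changed_at, status_type, status_value "
        "FROM customer_status_fact WHERE customer_id = ? ORDER BY changed_at",
        (customer_id,),
    ).fetchall()


conn = open_store()
record_change(conn, "C1", "loyalty_tier", "gold", "2024-01-04T18:00:00.000")
record_change(conn, "C1", "fraud_score", "12", "2024-01-05T09:15:02.120")
record_change(conn, "C1", "fraud_score", "87", "2024-01-05T09:15:02.980")
profile = current_profile(conn, "C1")
timeline = history(conn, "C1")
```

Here `profile` holds the current description of the customer and `timeline` holds the complete history, both read from the same table. The timestamps are fixed-width ISO-8601 strings so that text comparison matches chronological order; a real warehouse would more likely use a native timestamp type.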