Kimball University: Building a Foundation for Smart Applications

Off-the-shelf apps may offer built-in analytics, but the best approach to supporting operational decisions is to rely on a solid data warehouse that cleans, integrates.

Supporting Transactional Workloads

Let's say that you've designed the warehouse databases to support structured reports, as well as ad hoc query and analysis. Let's also assume that during the process of designing the smart enterprise apps, a statistician or analyst has made heavy use of the information infrastructure. Don't be surprised to see the model builder issuing queries that strain system resources. In production, however, most smart apps put a small load on the system. All the work has been done in advance; at transaction time all that's required is a simple query and some modest computation. This is good news, because it means that the same database used to support reports and analysis can serve up the information needed for the ongoing operation of the smart enterprise application.

However, just because the database can handle the load doesn't mean it's a good idea. This approach brings us back to one of the fundamental reasons we started building data warehouses in the first place: Analytic workloads are not compatible with transaction workloads--even the read-only workloads that are the BI portion of the smart enterprise application.

Smart designers will isolate the data needed to support the smart enterprise application in a managed environment that's not shared with the classic BI workload of reporting and analysis. This downstream database is usually much smaller than the data warehouse because it's focused on a particular problem. Queries are known in advance and can be carefully tuned. This new database is typically subject to a much higher level of service than the data warehouse since it's really part of the transaction system.

Some smart apps require that historical information be combined with very low-latency data. For example, a CRM application might present a call-center screen that combines a customer's historical information with details about today. The majority of data warehouses include information up to yesterday, so how do you present a unified time series that includes both historical and real-time data?

The most common solution is to have the application query both the data warehouse and the transaction system and then combine the information for presentation. This approach works best if the transactional information requires little integration or transformation. A second popular approach is to add the low-latency data directly to the application database. Adding low-latency data to the more static warehouse database should be a last resort because it significantly complicates the architecture.

Spreading Bi Everywhere

The promise of BI is that all decision-makers in an organization, no matter how strategic, tactical or operational, should have the information they need to do their jobs as well as possible. Data warehousing and BI started decades ago by focusing on the business analyst who advised senior management. We've done a pretty good job of reaching up to the executives and down to middle management by providing BI portals with rich reports and dashboards. The promise for the future is to reach throughout the enterprise, to make our operational systems use the information we've warehoused in an active and automated way.

Joy Mundy is a member of The Kimball Group. She's focused on DW/BI consulting and education for more than a decade. Write to her at [email protected].