The Internet has raised the bar for BI expectations. We want to click a report and get an answer now. Complex query? Millions of records? Full table scan? Those are details that users really don't care about. Gone are the days of users being thankful for a weekly print out (conveniently scheduled during non peak processing periods). Gone, too, are the days when users were willing to click on a report only to stare at an hour glass for more than a few seconds, let alone minutes or hours.
It's clear why user expectations continue to rise: people can view stock prices in near-real time, voting results, sport scores, even traffic. So it would seem perfectly reasonable that corporate data access should be equally instantaneous.Indeed, a number of innovations have allowed query processing speeds to increase, even with exponential increases in query complexity and data volumes: things like more addressable RAM, faster CPUs, increased caching, alternative index types and better data warehouse designs have all contributed to increased query speeds.
The sad reality is that data analysis for many is still not at the speed of thought. It's at the speed of the slowest query or the largest table. When all other optimization strategies are not the solution, DBAs will often create summary tables so that users do not inadvertently hit large detail tables. While this seems like a good technical solution, it's awful for usability. Now a user has to decide: should the query hit a summary table or a detail table?
In theory, databases that support materialized views with automatic query rewrite would figure this out for the user. A query like "Sales by Month" should automatically go to the summary table; "Sales by Month by Customer" should go to a detail table, determined automatically by the database. Yet this automatic query rewrite is supported only by a few leading databases (Oracle and IBM DB2 have support it for years; Microsoft SQL Server 2005 now does; Teradata does not), and even with that support, I find few companies leveraging it for a myriad of reasons. Some cite lack of DBA awareness, painful implementations, memory problems and slower query response time.
Enter the BI tools that can figure out the best table to access for the fastest query: Business Objects, MicroStrategy, even relative newcomer Oracle BI Enterprise Edition (formerly Siebel Analytics). All have the ability to navigate summary tables in ways that are transparent to the user. Users see a single metric "Sales" and the database decides which sales table to access, a summary or a detail table. MicroStrategy goes further to be able also to handle partitions (2006 summary vs. 2005 summary).
Puzzling to me, then, are the BI vendors and database vendors who force users to choose multiple tables and to know which summary or detail table to use: Cognos, IBI, SAS, Hyperion, Microsoft to name a few. From a theoretical point of view, these BI vendors have taken the strategy that aggregate navigation should be handled by the databases, and I do agree this is the sensible place to handle such tasks. However, sometimes theory and reality never converge and customers need a stop-gap solution. The BI vendors that don't provide such a solution will be the face of BI that gets blamed for poor response time or that suffers usability issues when users must explicitly choose multiple summary and detail tables. It should be telling to all that third-party companies like Netezza, DATAAllegro and HyperRoll that focus on BI query performance are enjoying such success. The current ways of solving performance problems are inadequate, yet BI user expectations continue to increase.
How do you ensure fast query times today? Please vote for your favored approach, and feel free to add comments below on where you think these problems should be resolved.
Cindi Howson is the author of BIScorecard product reviews. Write her at [email protected].The Internet has raised the bar for BI expectations. We want to click a report and get an answer now. Complex query? Millions of records? Full table scan? Those are details that users really don't care about. People can view stock prices in near-real time, voting results, sport scores, even traffic. So why can't access to corporate data be equally instantaneous?