Increasing volumes and varieties of data, combined with self-service data access, are overwhelming existing reporting tools and infrastructure. To scale for digital-era demands, organizations are adopting new cloud, Hadoop-based data lake architectures and next-generation OLAP semantic layers. As early adopters make this move, they are learning from expensive “lift and shift” failures. “Lift and optimize” approaches are proving to be far more cost-effective and successful.
OLAP is not dead
Contrary to the rumors of its impending death that circulated when in-memory analytics entered the market years ago, OLAP is still not dead. The rules never changed. What changed was the data: exponential growth in data sources, varieties, and types rapidly surpassed what traditional OLAP solutions were designed to handle. Thus, new OLAP on Hadoop technologies such as Apache Kylin, AtScale, and Kyvos Insights were invented for the big data analytics world.
Today OLAP is in high demand. Last year, the CEO of a large system implementation firm in the United States said that the unicorn was not a data scientist. The firm received over 300 data scientist resumes for one open role, yet it could not find any OLAP talent with classic Kimball-style dimensional modeling skills. Surprisingly, the unicorn was someone with OLAP design skills.
Assess and optimize for the cloud
There are many benefits for both the business and IT in using new OLAP on Hadoop solutions. Speed, scale, simplicity, and lower cloud bills are usually cited as the top reasons for migrating from legacy on-premises OLAP offerings. OLAP on Hadoop solutions make big data analytics easy and familiar. Business users can work with them from the mainstream self-service BI tools that they already know and love, such as Excel, SAS, Tableau, Qlik, and TIBCO Spotfire. Without a user-friendly, governed semantic layer that can intelligently cache and pre-aggregate, self-service big data analytics would be overwhelming, frustrating, and expensive. Keep in mind that cloud analytics can be billed by compute or query scan usage, so every query answered from a cache or pre-aggregate instead of a full scan can reduce the bill.
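To make the cost argument concrete, here is a minimal Python sketch of pre-aggregation, assuming a service that bills by the number of rows scanned. The table, column names, and figures are invented for illustration and do not reflect any particular vendor's engine or pricing.

```python
from collections import defaultdict

# Hypothetical fact table: (region, product, sales_amount). Illustrative data only.
FACT_ROWS = [
    ("East", "Widget", 100.0),
    ("East", "Gadget", 250.0),
    ("West", "Widget", 75.0),
    ("West", "Gadget", 125.0),
    ("East", "Widget", 50.0),
    ("West", "Widget", 25.0),
]


def sales_by_region_raw(rows):
    """Answer the query by scanning every fact row."""
    totals = defaultdict(float)
    scanned = 0
    for region, _product, amount in rows:
        totals[region] += amount
        scanned += 1
    return dict(totals), scanned


def build_aggregate(rows):
    """Pre-aggregate once by (region, product), as a semantic layer might."""
    agg = defaultdict(float)
    for region, product, amount in rows:
        agg[(region, product)] += amount
    return dict(agg)


def sales_by_region_from_aggregate(agg):
    """Answer the same query by scanning only the smaller aggregate."""
    totals = defaultdict(float)
    scanned = 0
    for (region, _product), amount in agg.items():
        totals[region] += amount
        scanned += 1
    return dict(totals), scanned


raw_totals, raw_scanned = sales_by_region_raw(FACT_ROWS)
aggregate = build_aggregate(FACT_ROWS)
agg_totals, agg_scanned = sales_by_region_from_aggregate(aggregate)
```

Both paths return the same totals, but the aggregate path touches fewer rows. On a real fact table with billions of rows and an aggregate with thousands, that gap is what shows up on a scan-based bill.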
Implementing the right big data analytics infrastructure is vital for future success. It cuts capital expenditures, maintenance, and operating costs while also improving reporting agility. Since most OLAP on Hadoop solutions support standard analytics tools and the SQL and MDX languages, inefficient data copying, ETL routines, and siloed semantic model processes can be redesigned to use modern ingest pipelines, schema-on-read, and semantic layers shared across numerous BI and analytics tools. As you assess existing analytical environments before migrating from on-premises to cloud analytics architectures, review the hidden recurring pains and costs caused by analytical silos that a new approach could eradicate.
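The idea of a shared semantic layer can be sketched in a few lines of Python: measures and dimensions are defined once, and every tool's request is translated from that single definition instead of each tool keeping its own copy. The model structure, names, and `build_query` helper below are hypothetical and are not the API of any product mentioned in this article.

```python
# Hypothetical semantic model: one governed definition of each
# dimension and measure, shared by every BI tool that connects.
SEMANTIC_MODEL = {
    "table": "sales",
    "dimensions": {"region": "region", "year": "order_year"},
    "measures": {"revenue": "SUM(amount)", "orders": "COUNT(*)"},
}


def build_query(model, dimensions, measures):
    """Translate business names into SQL using the shared definitions."""
    dim_cols = [model["dimensions"][d] for d in dimensions]
    select_parts = [f"{model['dimensions'][d]} AS {d}" for d in dimensions]
    select_parts += [f"{model['measures'][m]} AS {m}" for m in measures]
    sql = f"SELECT {', '.join(select_parts)} FROM {model['table']}"
    if dim_cols:
        sql += f" GROUP BY {', '.join(dim_cols)}"
    return sql


# Two different "tools" asking for revenue get the same governed formula.
dashboard_sql = build_query(SEMANTIC_MODEL, ["region"], ["revenue"])
spreadsheet_sql = build_query(SEMANTIC_MODEL, ["year"], ["revenue", "orders"])
```

Because "revenue" is defined in one place, a change to its formula reaches every report at once, which is the opposite of the per-tool silos described above.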
Beware of breaking changes
Although “lift and shift” strategies might sound quick and easy, they also might not work. After analysts connect existing reports to big data sources, they will usually notice errors and timeouts. The errors are usually related to subtle differences in big data query syntax, missing script functions, or other unavailable elements. The timeouts are where OLAP on Hadoop, with its caching and pre-aggregation, saves the day.
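One way to surface missing functions before users hit them is a simple pre-migration audit of report queries. The Python sketch below is only a rough heuristic: the supported-function list and the sample queries are hypothetical, and the regular expression will also flag keywords that happen to precede a parenthesis (for example `IN (`), so treat its output as a starting point for review.

```python
import re

# Hypothetical set of functions the target engine supports; replace it
# with the list from your own platform's documentation.
SUPPORTED_FUNCTIONS = {"SUM", "COUNT", "AVG", "MIN", "MAX", "COALESCE"}

# Matches an identifier followed by "(", which usually means a function call.
FUNCTION_CALL = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)\s*\(")


def unsupported_functions(sql, supported=SUPPORTED_FUNCTIONS):
    """Return the sorted function names in `sql` that are not in `supported`."""
    found = {name.upper() for name in FUNCTION_CALL.findall(sql)}
    return sorted(found - supported)


# Hypothetical report queries keyed by report name.
REPORT_QUERIES = {
    "sales_summary": "SELECT region, SUM(amount) FROM sales GROUP BY region",
    "order_ages": "SELECT id, DATEDIFF(day, ordered_at, GETDATE()) FROM orders",
}

# Map each report to the functions that would need rewriting.
audit = {name: unsupported_functions(sql) for name, sql in REPORT_QUERIES.items()}
```

Reports with an empty list are candidates for a straight reconnect. The rest need a rewrite, which is work to plan for before the cutover.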
Several groups that rolled out OLAP on Hadoop shared two key lessons in interviews. The first lesson was to treat your big data analytics project as a real project. Don’t assume that you can just change a data source in your reports. As Benjamin Franklin is often quoted as saying, “an ounce of prevention is worth a pound of cure.” Don’t undermine big data analytics success by failing to understand and plan for the move.
The second lesson was to assess your analytics environment and look for optimization opportunities. Modern OLAP on Hadoop semantic layers can work with numerous analytical tools. As you invest in improving your analytics architecture, also review the end-to-end analytical processes. You may uncover a plethora of ways to reduce non-value-added work while also removing proprietary analytics silos.
[Jen Underwood returns to Interop ITX 2018 in Las Vegas on Monday, April 30, presenting a workshop Introduction to Machine Learning for Mere Mortals: Solving Common Business Problems with Data Science.]