The hype around the term "big data" has certainly fallen away, but that doesn't mean organizations have stopped working to incorporate much larger volumes of data into their analytics practices, sometimes using technologies such as Hadoop and Spark. As with many other technologies that have passed their peak hype, big data technologies have come a long way since that peak. They are closer to enterprise production than they were just a couple of years ago, although they certainly aren't mainstream yet.
The question is why. A full 26% of CIOs said that business intelligence and analytics would help them differentiate their businesses from their competitors, making it the top investment priority. But 91% of organizations had not yet reached a "transformational" level of maturity in data and analytics. That level is really where data and analytics are a central underpinning of the business, so important that the chief data officer will sit on the board of directors.
So where is the disconnect? AtScale, a company that connects business intelligence tools to Hadoop and other big data platforms, has had a front row seat to the evolution of big data in organizations both big and small. Since 2016 the company has released regular research on organizations' big data maturity levels, a sort of state of big data in organizations. AtScale has surveyed more than 5,593 data professionals at more than 429 companies globally, pulling from its own customer base and that of its survey partners, including all three Hadoop distribution vendors, plus Tableau, and the Linux and Apache Foundations. Given that base, the survey results skew toward organizations that are already using big data technologies such as Hadoop, or that are more inclined to do so. The 2018 report provides a snapshot of what those organizations' top challenges, opportunities, and concerns are today.
Among the more interesting findings is that organizations are a bit overconfident about their efforts in 2018. This year, 78% of respondents ranked their big data maturity as medium or high. But according to AtScale's own methodology for rating those same organizations, only 12% have a high level of maturity.
Siloed, Decentralized Analytics
One of the top challenges these organizations are facing is one that's been present in enterprises for decades -- a siloed approach to data and analytics in the organization. A full 55% of respondents are still dealing with siloed, decentralized analytics. AtScale reports that online and utilities verticals are leading in this area, having established centers of excellence. Financial services and telecommunications verticals are lagging.
The cloud may play a bigger role in data and analytics in the months and years ahead. A full 77% of respondents said they would use the cloud for big data. Further, 11% said they are planning to put Google BigQuery into production, and 60% are investigating BigQuery. More than 40% of respondents said they would consider the cloud instead of an on-premises solution.
Self-service Benefits and Challenges
While big data may be headed for the cloud in many cases, that doesn't mean the benefits of that move are universal. AtScale's survey found that 59% of respondents had deployed big data in the cloud, up from 53% last year. But the move had disrupted their end users' ability to access the data. Self-service access was available at 42% of organizations, down from 47% last year.
Microsoft Power BI Gains
Microsoft Power BI has gained a lot of ground in the past few years, and AtScale's survey shows just how much. Survey respondents were asked to name their BI tool of choice for big data, and the top three were Tableau, Microsoft Excel, and Power BI. That was a big jump in the rankings for Power BI, which had been in 7th place last year, AtScale said.
Fastest Growing Concern: Data Governance
Tools and platforms are proliferating in the enterprise, and decentralized, siloed data and analytics efforts are a concern for the data professionals surveyed. Data governance ranked as the number two concern in 2018, up from the fifth position in 2016. Skill sets have remained the number one challenge for the three years the survey has been conducted.
"On-premises Hadoop is hard," said AtScale CEO Dave Mariani, in an interview with InformationWeek. "It's hard to manage for all but a few enterprises that have the skill set. There are a number of companies that are skipping on-premises entirely."
With the popularity of cloud for deployment of big data analytics, next year's big concern may shift to another area entirely. Organizations may be concerned about the risk of cloud lock-in going forward -- aligning themselves with one cloud vendor and then finding it difficult to move to a different one if circumstances change or they find a provider that offers better alignment, according to Bruno Aziza, CMO at AtScale. These organizations are seeking a multi-cloud strategy, but as time goes on they may find themselves using more of one vendor's tools because they are better aligned with the platform.