2018 has been an important year for businesses across industries as they continue to digitize more of their operations. To do that well, they need good, quality data to improve the accuracy of their analytics and to train artificial intelligence (AI) applications.
Many organizations have been working hard to liberate data trapped in systems because they need access to the datasets. While the problem has not been solved completely, companies continue to unearth new insights that enable them to better control costs, create new revenue streams and increase profitability.
According to MicroStrategy's 2018 Global State of Analytics Report, which surveyed 500 analytics and business intelligence professionals, 57% have achieved faster and more efficient decision making through data and analytics use. Nearly two-thirds (64%) plan to invest more in data and analytics talent going into the 2020s, and almost three in five (57%) already employ a chief data officer. The CDO role (or its equivalent) is now considered essential for overseeing everything associated with data, ranging from knowledge of data assets and their use to articulating a strategy for using data to achieve competitive differentiation.
Data privacy and security concerns affect use
The journey to realize a more intelligent enterprise is not without its challenges, however. This year, many enterprises had to scramble to ensure compliance with the European Union's General Data Protection Regulation (GDPR) and are now facing the additional overhead that will be associated with the California Consumer Privacy Act, slated to go into effect in 2020. In the MicroStrategy survey, 49% said data privacy and security concerns are holding their organization back from using data more effectively. In addition, 45% reported that less than half of their organization's data is governed, a sobering statistic. The lack of solid governance exposes enterprises to potential privacy and security risks that need to be managed more effectively moving forward.
Organizations are under tremendous pressure to understand the data assets they have and have access to, because without awareness of the inventory, the condition of data assets, data ownership, and the rules restricting use, enterprises have a difficult time driving maximum value from their data and managing the potential risks and liabilities.
AI requires a mindset shift
Meanwhile, there is a major push to supplement data and analytics capabilities with AI. More vendors are touting their latest generation tools that may or may not actually take advantage of AI, and more specifically, machine learning and deep learning. Sadly, enterprise buyers may be unable to discern the difference between some vendors' dubious claims and the actual capabilities of offerings because they lack a basic understanding of AI-related concepts.
In addition, there are risks associated with AI-powered systems that business and IT leaders need to consider, such as the bias that may be present in the datasets used to train those systems and -- in the case of deep learning -- the potential inability of the system to explain the result and the reasoning that went into the result. Gartner also estimates that by 2020, 50% of new data transformation flows will integrate one or more machine learning algorithms, resulting in erroneous interpretations of data.
Data quality and metadata matter
Data quality is hardly a new concept, but the need for it is vital, not only to ensure more accurate analytics, but also to ensure that AI training data is reliable. Enterprises need to place greater emphasis on data quality out of necessity or suffer the consequences since "garbage in, garbage out" can erode brand reputation and customer trust, or worse. AI requires lots of training data. If the data used to train the system is of poor quality, it negatively affects the reliability of the insights and recommendations the system renders.
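To make "garbage in, garbage out" concrete, here is a minimal sketch of the kind of check a team might run before using a dataset for training. The function name, the field names and the sample records are all hypothetical, and real data quality tools cover far more (ranges, formats, freshness); this only counts missing values and exact duplicates.

```python
from collections import Counter


def quality_report(rows, required_fields):
    """Summarize missing values and exact duplicate rows in a list of dicts."""
    missing = Counter()
    seen = set()
    duplicates = 0
    for row in rows:
        # A field counts as missing when it is absent, None, or an empty string.
        for field in required_fields:
            value = row.get(field)
            if value is None or value == "":
                missing[field] += 1
        # A row is a duplicate when an earlier row had the same required values.
        key = tuple(row.get(field) for field in required_fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {
        "rows": len(rows),
        "missing_by_field": {f: missing[f] for f in required_fields},
        "duplicate_rows": duplicates,
    }


# Hypothetical training records, invented for illustration only.
sample = [
    {"customer_id": 1, "age": 34, "churned": "no"},
    {"customer_id": 2, "age": None, "churned": "yes"},
    {"customer_id": 2, "age": None, "churned": "yes"},
    {"customer_id": 3, "age": 51, "churned": ""},
]

report = quality_report(sample, ["customer_id", "age", "churned"])
print(report)
```

A report like this will not fix poor data, but it gives a team a quick signal about whether a dataset is fit to train on before any model is built.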
Given the sheer volume of data, increasing regulation and the data-intensive competitive environment, organizations need metadata management tools to enable effective data governance, data management and regulatory compliance. Metadata management is essential for understanding data and also for recognizing important patterns in data. It's also necessary to achieve effective data security and target marketing. According to Gartner, by 2020, 50% of information governance initiatives will be enacted with policies based on metadata alone.
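As a rough illustration of a governance policy "based on metadata alone," the sketch below decides whether a dataset may be used for a given purpose by looking only at its tags, never at the data itself. The class, the tag names and the policy table are assumptions made up for this example, not any vendor's API.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    """Hypothetical catalog entry: descriptive metadata only, no actual data."""
    name: str
    owner: str
    tags: set = field(default_factory=set)


# Hypothetical policy table: purposes that each sensitivity tag rules out.
BLOCKED_PURPOSES = {
    "pii": {"marketing", "model_training"},
    "financial": {"marketing"},
}


def is_use_allowed(meta, purpose):
    """Return False if any tag on the dataset blocks the requested purpose."""
    for tag in meta.tags:
        if purpose in BLOCKED_PURPOSES.get(tag, set()):
            return False
    return True


customers = DatasetMetadata("customers", "crm-team", {"pii"})
orders = DatasetMetadata("orders", "finance-team", {"financial"})
web_logs = DatasetMetadata("web_logs", "platform-team")

for dataset in (customers, orders, web_logs):
    for purpose in ("marketing", "model_training"):
        print(dataset.name, purpose, is_use_allowed(dataset, purpose))
```

The point of the design is that the rule engine never touches the underlying records, so the policy is only as good as the tagging; untagged datasets such as `web_logs` pass every check in this sketch, which is exactly why ungoverned data is a risk.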
Our top 30 vendors
It's tough to choose just a sampling of vendors out of thousands when driving value from data requires so many tools. The following three categories of vendors were selected based on current market trends and their ability to address current and emerging enterprise requirements at scale. Note that some vendors may have products in more than one category, but are only listed once to allow us to acknowledge the leadership of a greater number of companies. Check out the hot vendors in other segments of IT in our full Vendors to Watch package.
Birst (an Infor company; @BirstBI) offers a cloud-native BI, analytics and data visualization platform for the enterprise.
IBM (@IBMAnalytics) provides an end-to-end ecosystem of data, analytics and cognitive capabilities.
MicroStrategy (@MicroStrategy) markets a unified platform for enterprise analytics and mobility.
Oracle (@Oracle) offers enterprise SaaS application suites, database PaaS and IaaS. Its analytics platform provides visual analysis, discovery, advanced analytics, reporting, and forecasting capabilities.
Qlik (@qlik) markets a platform for self-service data visualization, reporting and guided and embedded analytics.
SAP (@SAP) offers analytics products including a cloud platform, databases, data warehousing, big data, BI and advanced analytics.
SAS (@SASsoftware) is an established vendor providing analytics, BI and data management software and services.
Sisense (@Sisense) sells BI and analytics tools that allow users to easily prepare, analyze and explore growing data from multiple sources.
Tableau Software (@tableau) is well-known for its data visualization capabilities, offering enterprise, cloud and embedded analytics.
ThoughtSpot (@thoughtspot) provides a business intelligence and big data analytics platform for exploring, analyzing, and sharing real-time business analytics.
Data Science and Machine Learning Platforms
Alteryx (@alteryx) enables self-service BI, data preparation, data blending and advanced analytics.
Anaconda (@anacondainc) is the most popular Python data science platform, available as an open source project and an enterprise product.
Amazon Web Services (@awscloud) offers a broad array of products and services for processing, analyzing and visualizing data easily and cost-effectively. It also offers several AI products and services spanning frameworks and infrastructure, API-driven services and machine learning platforms.
Databricks (@databricks) provides a platform for data science and engineering that handles all analytics processes from ETL to model training and deployment.
DataRobot (@DataRobot) offers a machine learning automation platform that allows anyone to build and deploy accurate predictive models, fast.
HAL24K (@Hal24k) is an intelligence lab that offers modeling, analysis and visualization through its SaaS-based platform.
H2O.ai (@h2oai) is an open source data science and machine learning platform.
KNIME (@knime) provides open source software for creating data science applications and services.
Microsoft (@Azure) offers the Azure platform, which includes a broad range of data and analytics, machine learning, and cognitive computing products and services.
RapidMiner (@RapidMiner) markets a data science platform that combines data preparation, machine learning and predictive model deployment.
Data Management and Governance
Advanced Metadata (@AdvMetadata) enables enterprises to understand their data through solutions including data landscaping, data quality and data analytics.
Anonos (@anonos) provides data management, security, and privacy solutions.
BigID (@bigidsecure) helps organizations secure their customer data and satisfy privacy regulations.
Cloudera and Hortonworks (@cloudera; @hortonworks), which have announced plans to combine their companies, are leading distributors of big data platforms based on open source Apache Hadoop.
D-ID (@D_ID_) is a startup marketing a solution that protects organizations' photos and videos from facial recognition.
Innosec (@InnoSecCyRisk) provides automated tools for managing GDPR compliance and performing privacy impact and risk assessments.
MarkLogic (@MarkLogic) markets an operational and transactional enterprise NoSQL database.
Talend (@Talend) offers a data management platform that includes data integration and data quality capabilities.
Teradata (@Teradata) is a major provider of databases, data warehousing and analytics platforms.
TIBCO Software (@TIBCO) provides data management that enables consistent accessibility, delivery, governance, and security of data to meet an organization's requirements.
(Cover image: Billion Photos/Shutterstock)