IBM responds to big data trendsetters, from Amazon and Cloudera to Splunk and Tableau. Here's what IBM offers now, what's coming and what stands out.
It's easy to lose track of the news streaming out of a large-scale event like this week's 11,000-attendee IBM Information On Demand (IOD) conference in Las Vegas. It's all the more difficult given IBM's penchant for spinning high-concept yarns about "smart," "predictive" and "cognitive" capabilities that can make sense of the unfathomable "2.5 quintillion bytes of data" that IBM says the world generates each day.
In contrast to Microsoft, Oracle and SAP, IBM is much less inclined to talk about discrete products than about capabilities that can be assembled into solutions with the help of IBM Global Business Services consultants who have deep industry expertise.
But announcements are kind of obligatory at big annual tech events, and IBM served up plenty of them at IOD, including a mix of recently released and soon-to-be-released big data and analytics services and capabilities. Here are key highlights of what's new, what's coming and what distinguishes IBM's offerings from similar-sounding offerings that already exist.
IBM SmartCloud Analytics Predictive Insights is software aimed at transforming the high-scale machine data spinning out of IT systems -- networks, servers, storage systems, applications and so on -- into business intelligence. In the past, these log files and event streams were either used for simplistic, stove-pipe monitoring and diagnosis or they were entirely ignored.
In the big data era, some have realized that IT monitoring and event data might reveal leading indicators that can help IT anticipate and prevent problems rather than diagnose failures after the fact. Where many IT monitoring systems are all about setting thresholds and alerts for one system at a time, the idea behind Predictive Insights is to combine large sets of information and find correlations and anomalies in data that yield predictive insights.
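To make the contrast with one-system-at-a-time thresholds concrete, here is a minimal Python sketch of one way to flag a broken relationship between two metrics. It is not IBM's method: the metric names, the sample numbers, the simple linear fit and the `z_limit` cutoff are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def fit_line(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def relationship_anomalies(base_x, base_y, new_x, new_y, z_limit=3.0):
    """Return indexes of new points that break the baseline x-to-y relationship."""
    slope, intercept = fit_line(base_x, base_y)
    # Spread of the baseline around its own fitted line.
    residuals = [y - (slope * x + intercept) for x, y in zip(base_x, base_y)]
    spread = stdev(residuals)
    flagged = []
    for i, (x, y) in enumerate(zip(new_x, new_y)):
        z = (y - (slope * x + intercept)) / spread
        if abs(z) > z_limit:
            flagged.append(i)
    return flagged

# Hypothetical baseline: requests per second vs. CPU percent on one server.
base_requests = [100, 120, 140, 160, 180, 200]
base_cpu = [20.5, 23.5, 28.5, 31.5, 36.5, 39.5]

# New readings: the last one pairs modest traffic with unusually high CPU.
new_requests = [110, 150, 130]
new_cpu = [22.0, 30.5, 38.0]

flagged = relationship_anomalies(base_requests, base_cpu, new_requests, new_cpu)
print(flagged)
```

In this made-up data, a 38% CPU reading would never trip a typical fixed alert such as "CPU above 80%," yet it is far off the level the traffic would predict, so only the combined view flags it.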
Consolidated Communications, a cable operator headquartered in Illinois, is using Predictive Insights to track some 80,000 streams of data across its systems to monitor the health of its video delivery network. By spotting anomalies that couldn't be seen by studying systems in isolation, the cable operator reports it has avoided service disruptions and related costs of approximately $300,000 per year.
Splunk has been a pioneer in doing big data analysis across myriad IT system sources, but IBM says the Predictive Insights service is different from Splunk and other offerings in that it's an analytic-correlation and pattern-detection environment rather than an open-ended search-and-discovery tool. In other words, it surfaces conditions worthy of investigation on its own rather than relying on humans to drive the analysis.
Also on the "what's new" list are an update to IBM's SmartCloud Virtual Storage Center and three advances tied to Hadoop. The Storage Center is software for your data center that applies machine learning and analytics to virtualized storage environments to automate complex migration and storage-tiering decisions.
Storage choices typically revolve around the tradeoffs between fast data-access speeds and cost of capacity. By analyzing usage patterns, the Storage Center identifies the best storage choice for a given set of data, automatically making the change without admin assistance or interruptions to data access. Storage Center reportedly helped IBM itself reduce per-terabyte storage costs by 50% at the company's Boulder, Colo., data center.
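The speed-versus-cost tradeoff can be sketched as a simple rule that maps observed usage to a storage tier. This is only an illustration of the general idea, not how Storage Center works: the tier names, the reads-per-day cutoffs and the `Volume` record are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Volume:
    name: str
    reads_per_day: float
    current_tier: str

# Hypothetical tiers, fastest and most expensive first,
# each with the minimum daily reads that justify it.
TIERS = [("ssd", 1000.0), ("sas", 50.0), ("nearline", 0.0)]

def recommend_tier(reads_per_day):
    """Pick the first (fastest) tier whose usage cutoff is met."""
    for tier, min_reads in TIERS:
        if reads_per_day >= min_reads:
            return tier
    return TIERS[-1][0]  # fall back to the cheapest tier

def plan_migrations(volumes):
    """List (name, from_tier, to_tier) for volumes sitting on the wrong tier."""
    plan = []
    for v in volumes:
        target = recommend_tier(v.reads_per_day)
        if target != v.current_tier:
            plan.append((v.name, v.current_tier, target))
    return plan

volumes = [
    Volume("orders-db", 5200.0, "sas"),     # hot data on a slower tier
    Volume("2009-archive", 3.0, "ssd"),     # cold data on the fast tier
    Volume("reports", 120.0, "sas"),        # already where it belongs
]
plan = plan_migrations(volumes)
print(plan)
```

A real product would weigh far more signals than a single read count, but the shape of the decision is the same: measure usage, compare it with the cost of each tier, and move only the data that is misplaced.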
The three Hadoop-related introductions are:
-- IBM PureData System for Hadoop. Released in September, this is IBM's Hadoop appliance incorporating the IBM BigInsights Hadoop distribution and complementary software. IBM says the difference from Apache, Cloudera, Hortonworks and other "standard" Hadoop deployments is four times faster performance thanks to cluster-management and high-performance computing capabilities adapted from IBM's Platform Computing acquisition.
-- InfoSphere Data Privacy for Hadoop. Coming later this quarter, this is a data-masking and data-activity-monitoring system that works across Hadoop as well as NoSQL and relational data sources, according to IBM. Data masking conceals sensitive data such as social security numbers at points of replication so companies can go beyond access controls to ensure data privacy. The data activity monitoring capability tells administrators who is accessing data and when data-access patterns are atypical -- even for authorized users.
-- InfoSphere Governance Dashboard. Another tool that works across multiple data sources including Hadoop, relational and non-relational databases, this dashboard gives data-management professionals an understanding of the lineage, state of quality and state of governance of data sets under management. The software is said to work hand-in-hand with ETL, data-privacy and data-security tools to ensure that governance policies are enforced.
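The data-masking idea mentioned in the list above can be illustrated with a short Python sketch. It is a generic example, not InfoSphere's implementation: the `mask_ssn` helper, the dashed number format it looks for and the choice to keep the last four digits are assumptions.

```python
import re

# Matches U.S. social security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")

def mask_ssn(text):
    """Replace all but the last four digits of each dashed SSN with X."""
    return SSN_PATTERN.sub(r"XXX-XX-\1", text)

def mask_record(record, fields):
    """Return a copy of a record with the named text fields masked."""
    return {
        key: mask_ssn(value) if key in fields else value
        for key, value in record.items()
    }

# Hypothetical record being copied to an analytics cluster.
record = {"name": "Pat Doe", "notes": "SSN 123-45-6789 verified by phone"}
masked = mask_record(record, fields={"notes"})
print(masked["notes"])
```

Applying the mask at the point where data is copied means the replica never holds the full number, which is the sense in which masking goes beyond access controls.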