It's a big week for big data, and IBM kicked off the news cycle Monday with a bevy of related announcements from its annual Information On Demand conference in Las Vegas. IBM's BigInsights platform, InfoSphere Streams, a new appliance-based offering targeting CMOs, and new analytics offerings are at the center of the news.
What the announcements have in common is that they all concern components of IBM's Big Data Platform, Big Blue's umbrella term for a diverse collection of data-management and analysis technology bridging the relational database and Hadoop worlds. The announcements also come ahead of an expected series of Hadoop-related announcements from this week's combined O'Reilly Strata and Cloudera Hadoop World conference in New York.
IBM's news starts with the BigInsights platform, which combines a distribution of open source Apache Hadoop software with proprietary IBM tools such as the BigSheets data-exploration interface. New in this release are built-in accelerators for text analytics and social media analytics that are said to speed time to insight into social sources such as Twitter and Facebook.
"Accelerators give you prebuilt capabilities to get at the relevant data in Hadoop and then do value-added analysis of that data," explained Phil Francisco, IBM's VP of big data product management.
The social media accelerator, for example, is designed to support efforts such as customer segmentation, helping platform users find the relevant customers "using MapReduce under the covers ... but without having to be an expert in MapReduce programming or Hadoop," Francisco said.
In another enhancement to BigInsights, IBM has added InfoSphere Data Explorer, a data exploration tool, to the software bundle. Unlike BigSheets, which lets you explore Hadoop data in a spreadsheet-like interface, InfoSphere Data Explorer can look across multiple data sources using data-federation and analysis technology from IBM's Vivisimo acquisition. With access to Hadoop as well as data warehouses, data marts, and possibly other sources, the software can automatically find correlations in data across these platforms.
"It's a way to discover and visualize relationships in the data across various sources," Francisco said.
IBM InfoSphere Streams is the company's software for event processing in real-time environments such as financial exchanges and communications networks. Companies in these markets need sub-second performance to spot trends and correlations in streaming data. What's new here is that IBM is adding a set of accelerators to help telcos and other communications service providers better understand service usage patterns and spot mobile coverage dead zones, fraud, or customers likely to churn.
Once Streams has quickly spotted problems and opportunities, those findings can be correlated with historical analyses done in IBM's PureData System for Analytics (the latest release of what used to be known as the IBM Netezza appliance). The result is much faster analysis than would be possible with a data warehouse alone, so call center agents, for example, can see pertinent results much closer to real time.
"The Streams platform is used to do very high-speed analytics as the data is streaming in; it finds the patterns in call detail records and other data on the network side," Francisco explained. "The PureData system is then used for deeper analysis of how trends or conditions are changing over time."