IBM responds to big data trendsetters, from Amazon and Cloudera to Splunk and Tableau. Here's what IBM offers now, what's coming and what stands out.
Tableau Software might be the darling of data visualization, but IBM says it's working on a more powerful way to simplify data analysis through Project NEO, which it describes as bringing data discovery to the masses. Currently in IBM's labs, NEO starts by simplifying the hard part of data analysis, which is mixing together disparate data sets without creating a mess.
IBM says NEO makes the data-modeling step a self-service proposition for business users thanks to a built-in ontology engine that handles metadata mapping behind the scenes. Users simply drag and drop desired sources into the NEO framework -- from a data warehouse, operational systems, cloud sources like Salesforce.com, or third-party enrichment databases such as Acxiom or Experian -- and the ontology engine does the mapping work.
A number of vendors have introduced business user-friendly data-mashup capabilities -- Microsoft, Oracle Endeca and MicroStrategy, to name a few -- but IBM says NEO's blend of data mashup with natural-language query and data visualization is unique.
Once a user chooses desired data sets, NEO's next trick is natural-language query, whereby users can ask questions and make requests in plain English, such as: "Show me the top sellers by region," "Who are the top salespeople?" or "List top customers in the Northeast." The NEO technology automatically pulls in the right data sources and presents the requested data or analysis in a suitable visualization.
NEO's final trick is displaying a series of highlights that cut across the data selected. For example, in addition to the requested visualization, you'll see small visualizations across the top of the interface highlighting related insights such as top sellers, bottom sellers, top customers, average customer spend or distinct counts, such as number of customers or number of sales by region. Built-in algorithms continue to surface new highlights as you explore and drill down in the primary data-visualization window. If one of the highlights grabs your interest, you simply click on the item and it moves to the center of analysis for drill-down exploration.
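To make the idea of highlights concrete, here is a minimal Python sketch of how summary figures like those described above could be derived from a handful of sales records. It is illustrative only: the record layout, the sample data and the `compute_highlights` function are assumptions made for this example, not part of NEO or any IBM API.

```python
from collections import defaultdict

# Hypothetical sample records; field names are assumptions for illustration.
sales = [
    {"customer": "Acme", "region": "Northeast", "product": "Widget", "amount": 120.0},
    {"customer": "Acme", "region": "Northeast", "product": "Gadget", "amount": 80.0},
    {"customer": "Bolt", "region": "South", "product": "Widget", "amount": 300.0},
    {"customer": "Cato", "region": "West", "product": "Gizmo", "amount": 100.0},
]


def compute_highlights(records):
    """Aggregate records into a few summary 'highlights'."""
    revenue_by_product = defaultdict(float)
    spend_by_customer = defaultdict(float)
    sales_by_region = defaultdict(int)

    for record in records:
        revenue_by_product[record["product"]] += record["amount"]
        spend_by_customer[record["customer"]] += record["amount"]
        sales_by_region[record["region"]] += 1

    # Rank products by total revenue, highest first.
    ranked = sorted(revenue_by_product.items(), key=lambda kv: kv[1], reverse=True)

    return {
        "top_seller": ranked[0][0],
        "bottom_seller": ranked[-1][0],
        "top_customer": max(spend_by_customer, key=spend_by_customer.get),
        "average_customer_spend": sum(spend_by_customer.values()) / len(spend_by_customer),
        "distinct_customers": len(spend_by_customer),
        "sales_by_region": dict(sales_by_region),
    }


highlights = compute_highlights(sales)
for name, value in highlights.items():
    print(f"{name}: {value}")
```

A real tool would recompute such figures as the user drills down and filters the data; this sketch computes them once over a fixed list.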
NEO will show up as early as January as a beta release before becoming generally available as part of a Cognos release in mid-2014. External data sources are expected to include Salesforce.com, Excel/.CSV uploads and popular third-party enrichment-data sources.
In another preview featured at the Information On Demand event, IBM demonstrated BLU Acceleration for Cloud. This is a forthcoming cloud-based version of IBM's BLU Acceleration for DB2 in-memory technology, which was announced in April and released in June.
BLU Acceleration for Cloud is not just a database-as-a-service; it's a complete in-memory data warehousing environment in the cloud. The obvious comparison here is the Amazon Web Services Redshift data warehousing service, but IBM insists there's no comparison given the in-memory, parallel-processing and unique storage and compression capabilities of BLU. IBM says BLU can crunch 10 terabytes down to 1 terabyte, bring that 1 terabyte into memory, and effectively crunch it again down to 10 gigabytes. With data-skipping techniques, BLU then focuses on the 1 gigabyte that matters in a query without wading through the other 9 gigabytes of irrelevant data. The result, according to IBM, is performance that is eight to 25 times faster than DB2 without BLU.
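The general idea behind data skipping can be shown with a short Python sketch. It uses a common approach in which each block of stored values carries its minimum and maximum, so a range query can pass over blocks that cannot contain a match. This is a generic illustration under that assumption, not IBM's implementation; the function names, block size and sample data are invented for the example.

```python
def build_blocks(values, block_size):
    """Split values into fixed-size blocks, recording each block's min and max."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append((min(chunk), max(chunk), chunk))
    return blocks


def scan_with_skipping(blocks, low, high):
    """Return values in [low, high] and the number of blocks actually read."""
    matches = []
    blocks_scanned = 0
    for block_min, block_max, chunk in blocks:
        # Skip any block whose value range cannot overlap the query range.
        if block_max < low or block_min > high:
            continue
        blocks_scanned += 1
        matches.extend(v for v in chunk if low <= v <= high)
    return matches, blocks_scanned


# Hypothetical data: 1,000 sorted values stored in 10 blocks of 100.
values = list(range(1000))
blocks = build_blocks(values, block_size=100)

matches, blocks_scanned = scan_with_skipping(blocks, low=250, high=260)
print(f"read {blocks_scanned} of {len(blocks)} blocks, found {len(matches)} values")
```

In this toy case the query reads one block out of 10, the same proportion as the 1 gigabyte out of 10 in IBM's example. Taken end to end, IBM's figures describe a 10,000-fold reduction, from 10 terabytes on disk to 1 gigabyte actually examined.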
BLU compression will reduce cloud-storage costs, while in-memory analysis and data skipping will ensure state-of-the-art performance. But the other distinction between BLU Acceleration for Cloud and its competitors is the inclusion of data warehousing tools, including InfoSphere Data Click for data loading, InfoSphere Data Architect and Data Studio for data modeling, and IBM Cognos BI for ad hoc query, dashboarding, data visualization and reporting. IBM did not disclose a release date for BLU Acceleration for Cloud.
As we've seen with other software giants including Microsoft and Oracle, IBM has often weighed in after the innovators on new market trends in recent years. But when it does weigh in, it brings as much of its enormous software portfolio as possible to bear. This is the pattern once again with this year's Information On Demand announcements.
Companies like Splunk were ahead of the game in connecting the dots across big data from IT systems. But IBM SmartCloud Analytics Predictive Insights, for example, blends IBM's InfoSphere Streams and Tivoli assets in with an analytics-driven, preemptive approach to IT systems analysis.
The likes of Tableau and Tibco Spotfire made data visualization a hot trend, but IBM is now simplifying back-end data blending as well as front-end analysis with natural-language query.
BLU wasn't the first option for in-memory analysis and it's following AWS and others into cloud-based data warehousing. But IBM has brought together an unprecedented collection of compression and performance-enhancing techniques in BLU, and it's adding complementary data-management and BI services for a more complete, single-vendor cloud environment.
In short, IBM's approach is to combine and refine the best of what's out there. The question is whether it is moving fast enough to prevent innovators from becoming entrenched among would-be customers.
Growth is a sign that new products and services are catching on, but growth has been sorely lacking in IBM's financial performance in recent quarters. True, it is hardware losses that have more than offset IBM's growth in categories including cloud and software. But IBM's software growth, at least, has been tepid compared to that of nimble innovators. We'll see if these new and coming offerings give it a much-needed shot in the arm.