With Hadoop, Big Data Analytics Challenges Old-School Business Intelligence - InformationWeek
Doug Henschen

Datameer and Karmasphere say their Hadoop-based platforms are what's needed for the next era of data analysis. Are you buying it?

What good is a "big data" analytics system that's capped at 1 terabyte?

Hadoop is actually misperceived as a solution that's exclusively about big data, according to Groschupf, who contends that it's suitable for small deployments where variable-data analysis is required. Datameer says possible workgroup uses include the same sorts of analyses Hadoop users might contemplate--like finding correlations among clickstreams, online signups, and e-mail campaigns--but with a week's worth of data instead of a year.

Datameer Personal is $300 per year, limited to 100 gigabytes per year, and creates a mini Hadoop environment on a PC, giving power users a development and design environment to do small-scale social media analytics.

Karmasphere, too, provides a reporting, analysis, and data-visualization platform for Hadoop, but there are a few crucial differences in its approach. First, instead of using a spreadsheet-style interface, Karmasphere offers a graphical user interface that powers a collaborative workflow and works with Hive, the data warehousing component built on top of Hadoop.

Karmasphere CEO Gail Ennis says Hive is standards-based, so people can move their reports to other Hadoop distributions, and that it's more scalable than Datameer's spreadsheet approach. Datameer's Groschupf counters that its spreadsheet is just an analysis design interface, so it doesn't have to scale. He also says that Hive (a tool also used by established BI vendors including MicroStrategy and Tableau) lacks support and currently offers less than 30% of the analyses supported in SQL.

"Hive is a crutch compared to an EMC Greenplum, HP Vertica, or Teradata system," Groschupf says. "Those who try to make it a standard data warehouse will fail."

Ennis concedes that Hive has its flaws, but she says "the constraints are going to be addressed very soon because Hive adoption and community development work is moving quickly."

Where Datameer 2.0 introduced workgroup and personal editions, Karmasphere is moving the other way, up from desktop software to a server-based product. Thus, Karmasphere 2.0 delivers new capabilities including a collaborative workspace that runs in browsers, a shared asset repository where users can access and version-control analyses, and an administrative console for managing user roles, permissions, and security. Also new in version 2.0 is the ability to import SAS and SPSS models and run them on Hadoop.

Karmasphere's pricing is based on the number of nodes in the cluster and the number of named users. Small, five-node/five-user systems start at around $10,000, and average deployments for 30- to 40-node clusters with 10 to 20 users are $40,000. Truly large-scale deployments with hundreds of nodes cost $250,000 to $300,000.

For companies building on Hadoop that aren't invested in so-called old-school BI or relational data warehousing, Datameer and Karmasphere should clearly be on the short list. If you're a SQL shop that's heavily invested in more conventional BI, it can't hurt to explore your Hadoop-integration options. Connectors to the Hadoop Distributed File System (HDFS) are commonplace. Less common are connectors to Hive, but keep your eye on growing maturity here.

There are also emerging HCatalog capabilities within Apache Hadoop software that have made it possible for data warehousing vendors including EMC and Teradata's Aster Data unit to tap Hadoop data as if they're indices in any conventional relational database.

There's always an imperative to leverage existing investments until they prove inadequate, so this might delay a quick embrace of Datameer and Karmasphere by BI and analytics pros deeply invested in their existing tools. Veterans also might not be very impressed by the state of evolution of these two very new products.

Can conventional BI tools match an analytic platform built for Hadoop? Groschupf says those just moving boiled-down result sets out of Hadoop and into conventional tools are simply perpetuating islands-of-transactions analysis. And tapping into Hive isn't much better because "you're losing the opportunity [for holistic analysis] because you're creating a static schema that only deals with structured information," he concludes.

The idea of holistic analysis is what the enterprise data warehouse was always about. For many, the enterprise data warehouse remains an elusive dream. Even for those who think they've achieved it, it has always been hard and expensive. We have yet to find out if "next-generation analytics" on top of Hadoop will fulfil the promise of doing so at a lower cost and across a wider variety and larger scale of data.
