6/24/2013
12:26 PM
Datameer Democratizes Advanced Big Data Analytics

Datameer 3.0 promises drag-and-drop machine learning on top of Hadoop, with clustering, column-dependency analysis, decision trees and predictive recommendations.

There's storing big data and reporting against big data, and then there's gaining insights from big data with advanced analytics. The third level of maturity delivers the most value, and it's what Datameer is after with Datameer 3.0, announced Monday and set for general release this fall.

Datameer is a data-integration, data-management and self-service analytics platform that runs on top of Hadoop, and it's used by notable customers including Sears Holdings and Cardinal Health to bring together and analyze high-scale structured and unstructured data sets on Hadoop. The options for analysis have heretofore included a spreadsheet-style interface and a short list of data visualizations and packaged analytics.

Datameer 3.0 introduces four powerful options for advanced analytics: clustering, column-dependency analysis, decision trees and recommendations. What these four have in common is that they are machine-learning analyses driven by algorithms, in which the data tells the analyst what's important.

"With functional analytics, you as human being have to decide what you're going to look for, filter and analyze," Stefan Groschupf, CEO of Datameer, told InformationWeek. "As you integrate more diverse data and the larger the data sets become, the more you need machine learning to help you figure out what's important."

The four styles of analysis were chosen for their popularity. Clustering is used to find groups in data, such as segments of important customers. Column-dependency analysis uncovers important relationships among dimensions of data, such as age, income, location and product purchases. Decision trees can be used, for example, to track conversion rates among different segments of customers in a sales funnel. And predictive recommendations are familiar to anyone who has seen Netflix movie recommendations or Amazon product-purchase suggestions.
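
To make the clustering idea concrete, here is a minimal sketch of k-means, one common clustering algorithm, applied to a single column of numbers. It is an illustration only and not Datameer's implementation, which the article does not describe; the function name `kmeans_1d`, the seeding rule and the customer-spend figures are all hypothetical.

```python
def kmeans_1d(values, k, iterations=20):
    """Plain k-means on a list of numbers; returns (centroids, labels)."""
    # Assumption: seed the centroids with k evenly spaced values
    # taken from the sorted input, so the result is deterministic.
    ordered = sorted(values)
    step = (len(ordered) - 1) / max(k - 1, 1)
    centroids = [float(ordered[round(i * step)]) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(
                sum(members) / len(members) if members else centroids[c]
            )
        if new_centroids == centroids:
            break  # no centroid moved, so the grouping is stable
        centroids = new_centroids
    return centroids, labels


# Hypothetical annual spend per customer, in dollars.
spend = [120, 150, 130, 2200, 2500, 2400, 140, 2300]
centroids, labels = kmeans_1d(spend, k=2)
```

With this made-up data the algorithm separates low spenders from high spenders without being told where the boundary lies, which is the sense in which the data, rather than the analyst, decides what the segments are.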

Datameer calls the four new analysis options Smart Analytics because they don't require the complex data-preparation, sampling and scoring procedures associated with advanced analytics, according to Groschupf. With Datameer 3.0, users drag and drop data-set descriptions from a list of everything available on the Hadoop cluster. Preview analyses give users a sense of what they'll discover before the complete analysis is executed at scale behind the scenes. Datameer's software handles all the complexities of MapReduce processing, with no coding required of end users, according to Groschupf.

"One of our beta customers that was spending $1 million per month on Google AdWords used these analyses and found that they could cut that spend to $400,000 per month by focusing on the keywords that were shown to be most likely to convert," Groschupf said.
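
The figures in that quote imply a 60% cut in monthly ad spend ($1 million down to $400,000). As a rough sketch of the underlying idea, the snippet below ranks keywords by conversion rate and keeps only those above a cutoff. It is not the customer's or Datameer's actual analysis; the keyword names, the numbers, the 3% threshold and the helper `keep_converting` are all hypothetical.

```python
# Hypothetical keyword stats: (monthly spend in dollars, clicks, conversions).
keywords = {
    "hadoop analytics": (300, 1000, 50),
    "big data": (500, 2000, 10),
    "spreadsheet bi": (200, 500, 20),
}


def keep_converting(stats, min_rate):
    """Return the keywords whose conversions/clicks ratio is at least min_rate."""
    return {
        kw: s for kw, s in stats.items()
        if s[1] and s[2] / s[1] >= min_rate  # skip keywords with zero clicks
    }


kept = keep_converting(keywords, min_rate=0.03)  # assumed 3% cutoff
old_spend = sum(s[0] for s in keywords.values())
new_spend = sum(s[0] for s in kept.values())
savings = 1 - new_spend / old_spend  # fraction of spend eliminated
```

In this toy data set, dropping the one low-converting keyword halves the spend; the same calculation on the quoted figures gives 1 - 400,000 / 1,000,000 = 0.6.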

The packaged functional analytics already available from Datameer include analyses that combine Salesforce.com data with Google AdWords data, Marketo leads, Web analytics or sentiment analysis of Twitter. More than 90 such packaged template applications are available from Datameer's app store, many of them developed by partners.

Datameer competes with Hadapt, Karmasphere, Platfora and other startups that offer business intelligence and analytics platforms designed to run on top of Hadoop. Groschupf said he isn't too worried about Cloudera Impala and other SQL-on-Hadoop options, such as Hortonworks Stinger, MapR-promoted Apache Drill or IBM Big SQL, because the universe of SQL-savvy professionals is in the low hundreds of thousands. Datameer's Smart Analytics, packaged analytics and spreadsheet tools, in contrast, are designed to be used by business analysts, he said.

"We're focused on the millions of business users who want easy-to-use tools and who don't want to have to wait for IT to help them make sense of that information," he said.
