Commentary
2/13/2014
09:57 AM
Doug Henschen

9 Key Big Data Developments From Strata

We analyze the important news from SAS, Hortonworks, MetaScale, and others at the Strata conference, as big data seeks a productive next chapter.

O'Reilly's Strata 2014 conference is in full swing in Santa Clara, Calif., this week, and show organizers are turning a page with the conspicuous absence of the term "big data" from the major themes and conference tracks. It's another sign that people are ready to go beyond the comic book version of what's happening with data.

"Making Data Work" is the aspirational theme of this year's conference, and the tracks promise a more nuanced novella with topics including "Connected World" (Internet of Things), "Data in Action" (real-world case studies), "Data Science" (skills, techniques, and strategies), "Ethics, Policy, and Privacy" (can we actually do anything about these?), "Design" (data-visualization and interfaces), and "Hadoop and Beyond" (tools and technologies).

Many vendors making announcements at Strata have yet to pick up on the emphasis on productivity over hyperbole. The big-data buzz talk seems to be ladled into press releases in inverse proportion to what can be stated about specific capabilities and, more importantly, named customers citing real-world business benefits. 

We'll skip the news here, therefore, about venture capital rounds and stealth companies and focus instead on nine more notable announcements from Strata in three categories:

Analytics at Scale

SAS In-Memory Statistics for Hadoop: SAS has progressed from an Access connector to Hadoop to delivering SAS Visual Analytics and SAS High-Performance Analytics products capable of running on Hadoop. New this week is SAS In-Memory Statistics for Hadoop, which takes advantage of the vendor's capabilities to perform data analysis on high-scale, in-memory clusters.

SAS In-Memory Statistics for Hadoop, to be released in the first half of this year, will enable multiple users to "simultaneously and interactively manage, explore, and analyze data, build and compare models, and score massive amounts of data in Hadoop." Selected data from Hadoop is loaded into memory once for iterative analysis across multiple users, avoiding time-consuming rounds of writing to and reading from disk.

SAS also promises to eliminate "a patchwork of tools" and "the need for different analytic programming languages," but this hints at a SAS-only world that might not go down well with open-source-minded Hadoop fans. Analysis options are said to include clustering, regression, generalized linear models, analysis of variance, decision trees, random decision forests, text analytics, and recommendation systems. We're eager to see how open this world might be and how it combines a memory cluster with a Hadoop cluster (or could they possibly be one and the same?).

Alpine Chorus: We gave you a preview of what Alpine Data Labs' new Alpine Chorus product offers in our recent "2014 Analytics, BI, and Information Management Survey." Alpine is calling it "The Sharepoint of Data Science."

The idea behind Chorus is to break down complex, iterative analytics workflows into discrete, understandable steps that can be shared with and controlled by business users. The goal is to eliminate the time-consuming back-and-forth between business users who know what they want and data wonks who were previously the only ones who could deliver results. Havas Media, the beta customer we interviewed in our report, said Chorus gives business users and data analysts a shared workflow and "a common language" for analytic exploration. Chorus can do its distributed "in-cluster" work on top of Hadoop if you choose, avoiding data movement from your high-scale data store.

Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of ...

Comments
D. Henschen
User Rank: Author
2/13/2014 | 3:24:34 PM
Re: Seems smart, but late?
Good questions, Lorna. I'd say Autonomy and Vertica have capabilities that aren't matched by open-source products. Maybe parts of what they do, but not head-to-head competition from purely open source products. Maybe I'm missing an option, though, so I welcome comment on alternatives.
Lorna Garey
User Rank: Author
2/13/2014 | 1:02:46 PM
Seems smart, but late?
HP opening up Autonomy and Vertica as platforms seems like a smart move, but what do you think are the odds it can get the developer ecosystem at this point that it needs to make a big play? What's the benefit for a developer to buy into Autonomy's code vs. going a more reliably open route?
RobPreston
User Rank: Author
2/13/2014 | 11:30:44 AM
All Data Not Big Data
The industry has started to use "big data" as almost synonymous with analytics, no matter the size of the data pool being analyzed. Nice to see a conference organizer grounding things in reality.
Laurianne
User Rank: Author
2/13/2014 | 11:15:37 AM
Big data talent
I am eager to hear about how the big data talent situation has evolved since last year's Strata conference. Anyone on the ground at the conference want to weigh in here?