Hortonworks Adds Cascading For Big Data App Development - InformationWeek

Hortonworks Adds Cascading For Big Data App Development

Hortonworks adds Concurrent's Cascading SDK to its Hadoop distribution to help developers operationalize big data applications.

Hortonworks wants to make it easier to build big data applications, so on Monday it announced that it will add software and support for the popular Cascading app-development framework to its Hadoop distribution.

Developed and supported by Concurrent, Cascading is a Java-based framework for app development popularized by Internet giants including eBay, LinkedIn, and Twitter, and increasingly used by more conventional enterprises to operationalize big data applications. Whereas data analysts tend to run interactive, ad hoc analyses across Hadoop, the Cascading framework is geared to application developers who have to create repeatable big data systems that run day after day. For example, one of Concurrent's cable company customers is using Cascading to develop applications based on set-top-box data now analyzed on Hadoop.

"This company brings in 19 terabytes of set-top-box data per day, and they need to build applications that consume that data, process it, and deliver data products to different constituents including marketing and sales," said Gary Nakamura, Concurrent's CEO, in a phone interview with InformationWeek.

Cascading shields developers from the complexities of Hadoop programming, and with recent updates it has been certified by Hortonworks to work with Hadoop 2.0 and its YARN resource management framework. Cascading will also make use of Apache Tez, a new execution framework for Hadoop 2.0 that eliminates the intermediate writes and delays associated with first-generation MapReduce programming.
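
As a rough illustration of what "intermediate writes" means, the sketch below runs the same two-stage computation twice: once as two chained jobs that persist the first stage's output before the second stage reads it back, and once as a single job in which the stages are connected directly. This is plain Python, not Cascading or Tez code; every function name and the dictionary standing in for storage are hypothetical and exist only for this example.

```python
from collections import Counter


def count_words(lines):
    """Stage 1: count word occurrences across all input lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def top_words(counts, n):
    """Stage 2: keep the n most frequent words, ties broken alphabetically."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


def run_as_chained_jobs(lines, n, storage):
    """Two separate jobs: stage 1 persists its output, stage 2 reads it back.

    `storage` is a dict standing in for a distributed file system.
    """
    storage["job1_output"] = dict(count_words(lines))  # intermediate write
    return top_words(storage["job1_output"], n)        # intermediate read


def run_as_single_dag(lines, n):
    """One job: stage 1 hands its result straight to stage 2."""
    return top_words(count_words(lines), n)


if __name__ == "__main__":
    sample = ["tune in tune out", "tune box"]
    fake_storage = {}
    print(run_as_chained_jobs(sample, 2, fake_storage))
    print(run_as_single_dag(sample, 2))
```

Both functions return the same result. The only difference is that the chained version leaves a materialized intermediate result in storage, which at Hadoop scale is the disk write and re-read that the article says Tez avoids.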

"We've gone a lot deeper with Hortonworks with this announcement so that the 6,500-plus deployments that we have of Cascading can migrate from using MapReduce to Apache Tez without any code changes," said Nakamura.

Concurrent is the developer behind Cascading and the Driven app performance management system for Cascading apps.

Concurrent's partnership with Hortonworks is non-exclusive, according to Nakamura, but he described Hortonworks as having "open arms to other technologies that help with the broader ecosystem and enterprise adoption of Hadoop." Nakamura didn't elaborate, but one reason for the tighter partnership with Hortonworks might be Cloudera's efforts to go beyond Hive with Impala, which provides an interactive SQL interface for Hadoop. Concurrent offers Cascading Lingual as a SQL-on-Hadoop interface for developers building analytic applications.

With this week's announcement, Hortonworks will ship the Cascading SDK as part of its Hortonworks Data Platform distribution, and it will also offer first- and second-tier support for the software.

Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of ...

D. Henschen
User Rank: Author
4/21/2014 | 2:09:28 PM
Hadoop as operating system analog or much, much more?
Given the divergent strategies of Hortonworks and Cloudera, we're going to see lots of examples of these partnerships playing out. Hortonworks is focused strictly on the platform, whereas Cloudera is going after the analytic opportunities on top of the platform. Hortonworks is partnered with Microsoft, SAP and Teradata, for example, whereas Cloudera is getting in the face of these and other incumbent data management companies including IBM and even Oracle (even though the latter is technically a partner). Cloudera has plenty of bucks, too, thanks in large part to its recently announced $780 million partnership with Intel, so look out world; who knows where its ambitions will end?