Hortonworks Updates Hadoop Data Platform
HDP 2.2 adds more than 100 new enterprise-ready features, including faster SQL querying on Hadoop.
October 17, 2014
Hadoop vendor Hortonworks has released an update to its Hortonworks Data Platform. HDP version 2.2 includes more than 100 new features across the Hadoop and Apache packages that comprise its distribution, the company said.
"This release represents six months of work within the [Apache Hadoop] community and is a major step forward for the enterprise-readiness of Hadoop," Hortonworks director of product marketing Jim Walker told InformationWeek in a phone call.
Hortonworks, which disdains proprietary extensions for Hadoop and markets itself as the only 100% open-source Hadoop distribution, puts Hadoop YARN at the architectural center of HDP. YARN (Yet Another Resource Negotiator) is a resource management layer introduced last year with Apache Hadoop 2.0.
As expected, HDP 2.2 will deliver Apache Spark on YARN. In addition, it'll support Apache Kafka on YARN. Kafka is a high-scale, fault-tolerant publish-subscribe messaging system, suitable for Internet of Things-type applications. And HDP 2.2 will bring automated cluster backup to the cloud for Microsoft Azure and Amazon S3.
But for enterprise users, the big news in HDP 2.2 is that it'll include phase 1 of the Stinger.next initiative. Stinger, a year-long project led by Hortonworks, aims to increase the speed of SQL querying on Hadoop. With HDP 2.2, users will be able to perform SQL INSERTs, UPDATEs, and DELETEs in Hive.
"In other words, Stinger is transforming Hive from a read-only SQL layer to a read-write engine," Gigaom Research's Andrew Brust wrote in his note about the various Hadoop announcements this week.
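To make the read-only versus read-write distinction concrete, here is a minimal sketch of the three statement types running inside a single transaction. It uses Python's built-in sqlite3 module as a stand-in, not Hive; the events table and the run_read_write_demo function are hypothetical names chosen for illustration, and Hive's own syntax and transaction setup may differ.

```python
import sqlite3


def run_read_write_demo():
    """Run INSERT, UPDATE, and DELETE in one transaction and return the remaining rows.

    SQLite is used here only as a stand-in for a read-write SQL engine;
    this is not Hive, and the table is hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, status TEXT)")

    # The connection context manager commits on success and rolls back on error.
    with conn:
        conn.executemany(
            "INSERT INTO events (id, status) VALUES (?, ?)",
            [(1, "new"), (2, "new"), (3, "new")],
        )
        conn.execute("UPDATE events SET status = 'done' WHERE id = 2")
        conn.execute("DELETE FROM events WHERE id = 3")

    rows = conn.execute("SELECT id, status FROM events ORDER BY id").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_read_write_demo())
```

A read-only layer would support only the final SELECT; the statements inside the transaction are what a read-write engine adds.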
"If you're an enterprise, you want an ANSI-compliant SQL on Hadoop," Forrester principal analyst Mike Gualtieri told InformationWeek during Forrester's Forum for Application Development & Delivery Professionals.
Enterprise customers, who may have relatively simple needs for Hadoop at the moment, nevertheless don't want to worry about their existing SQL statements running incorrectly, said Gualtieri, who believes SQL is Hadoop's "killer app in 2015."
Meanwhile, traditional database vendors have announced extensions for their systems that are able to interact with Hadoop clusters. These include Teradata's QueryGrid, IBM's Big SQL, Oracle's Big Data SQL, Microsoft's PolyBase, and Pivotal's HAWQ.
"The catch is, you have to run the SQL on their database, which then reaches out to Hadoop," Gualtieri said.
HDP 2.2 includes updated SQL semantics for ACID transactions in Apache Hive, as well as a cost-based optimizer for Hive that uses statistics to generate several execution plans and then chooses the most efficient path based on system resources.
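The general idea behind a cost-based optimizer can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Hive's implementation: it enumerates join orders for a handful of tables, estimates the cost of each from row-count statistics with a single assumed selectivity, and picks the cheapest. The names TableStats, estimate_cost, and choose_plan are hypothetical, and the sketch ignores system resources, which the article says the real optimizer takes into account.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class TableStats:
    """Hypothetical per-table statistics: just a name and a row count."""
    name: str
    row_count: int


def estimate_cost(join_order, selectivity=0.01):
    """Estimate a left-deep join plan's cost as the sum of intermediate result sizes.

    Assumes every join keeps the same fraction (selectivity) of the
    cross product, which is a deliberate simplification.
    """
    rows = join_order[0].row_count
    cost = 0.0
    for table in join_order[1:]:
        rows = rows * table.row_count * selectivity
        cost += rows
    return cost


def choose_plan(tables, selectivity=0.01):
    """Generate every join order and return the one with the lowest estimated cost."""
    candidates = permutations(tables)
    return min(candidates, key=lambda order: estimate_cost(order, selectivity))


if __name__ == "__main__":
    tables = [
        TableStats("clicks", 1_000_000),
        TableStats("users", 1_000),
        TableStats("regions", 10),
    ]
    best = choose_plan(tables)
    print([t.name for t in best], estimate_cost(best))
```

Under this cost model the largest table ends up joined last, because joining the two small tables first keeps the intermediate result small. Real optimizers prune the search space rather than enumerating every permutation.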
Finally, like other Hadoop vendors this week, Hortonworks announced beefed-up security for its Hadoop product. Specifically, HDP 2.2 brings a centralized approach to security policy via Apache Argus, now called Apache Ranger, which will handle security administration and policy enforcement across the cluster and across different engines.
Ranger is the open-source version of XA Secure's product. Hortonworks purchased XA Secure earlier this year and then proposed it to the Apache Software Foundation as an incubator project.
Hortonworks' Walker said Ranger can use LDAP and Active Directory repositories, so organizations can leverage their existing security infrastructures. It then adds policy frameworks and enforcement to the various engines that run inside Hadoop (Hive, HBase, Storm, and Knox).
An HDP 2.2 preview is available today for download at Hortonworks.com; general availability to customers will be in November 2014. A complete list of HDP features and enhancements can be found at http://hortonworks.com/products/hdp/.