Actian, HP Vertica Join SQL-On-Hadoop Bandwagon - InformationWeek




Actian and HP Vertica separately challenge Cloudera Impala, follow Pivotal in adapting their databases to run on the big data platform.


Actian on Tuesday joined the long list of companies that have introduced a way to support SQL access and querying on top of Hadoop. The announcement comes just a week after HP upgraded SQL-on-Hadoop functionality it introduced late last year through its Vertica database.

Actian and HP join Pivotal (with Greenplum-based HAWQ) and InfiniDB among companies extending existing relational database management systems to run on top of HDFS, Hadoop's distributed file system. Actian said it's going after Hadoop market-share leader Cloudera and its Impala offering, which was introduced last year as a faster, more SQL-compliant alternative to Hive.


The Actian Analytics Platform Hadoop SQL Edition, due out by the end of this month, beats Impala with even faster querying and ISO SQL-92 compliance, according to Actian CTO Mike Hoskins.

"We're offering full-functioning, SQL-complete functionality running natively on Hadoop, and we're also the highest-performing SQL database running on Hadoop," Hoskins told InformationWeek in a phone interview. "If you add those two together, we have an advantage that's hugely important for customers looking to empower their SQL users."

Actian internal research claims faster querying than Cloudera Impala.

Actian has acquired and consolidated into its Actian Analytics Platform technologies including the ParAccel and Vectorwise databases and Pervasive DataRush data-integration software. The new SQL-on-Hadoop option uses what's now called the Vector engine for parallelized querying on HDFS. Actian's testing shows its query performance will be as much as 30 times faster than Impala, Hoskins said.

HP introduced SQL-on-Hadoop capabilities on its columnar Vertica database late last year by eliminating its proprietary storage layer so it could work with Hadoop-native file formats including JSON, Parquet, Thrift, and others. In last week's release, dubbed Dragline, HP eliminated all separation between Hadoop and Vertica clusters.

"That means Vertica can coexist with the Hadoop cluster, and we can access and query against HDFS data leaving it where it is," said Eamon O'Neill, HP's Vertica product manager, in a phone interview with InformationWeek. Vertica can also run SQL queries against semi-structured data, including clickstreams and Web session data, according to O'Neill.

Actian's architecture does not require a separate cluster, but it appears to be a step behind HP in that it has to load new data, or convert existing data inside Hadoop, into its proprietary database storage format to support SQL querying. Actian says support for Hadoop-native file formats is on the roadmap for a future release.

There's more to the Actian and HP announcements. Actian, for example, boasts 200 connectors to enterprise data systems and YARN-certified data processing and ETL on top of Hadoop. HP upgraded Vertica with live aggregate lookups for customer personalization analysis, sentiment analysis against short text streams such as tweets, and improved workload-management features. But the big news for both companies is clearly SQL-on-Hadoop support.

Despite the profusion of options for using SQL against big data, Hive remains the most widely used query tool with Hadoop. On that front, Hortonworks says the latest generation of Hive offers greatly improved performance. Nonetheless, Hive and Impala both fall short of relational databases in SQL functionality, according to Forrester analyst Mike Gualtieri.

"Vendors have obsessed about performance, but the question is, can you run the queries you need to run?" Gualtieri told InformationWeek. "Impala still has work to do, but Actian, Pivotal, and Vertica are far more likely to support the queries that companies already have in use."


Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of ...

D. Henschen
User Rank: Author
6/3/2014 | 12:33:22 PM
SQL is one thing, not everything.
SQL is important, and that's why there have been so many announcements, but remember that the first and highest purpose for Hadoop is not to be an alternative platform for the same old structured data analyses. Hadoop's higher use is correlating structured and unstructured data and finding new insights in variable data such as clickstreams, log files, mobile data, social data and more. YARN will enable multiple modes of analysis. Spark, for example, is aspiring to support machine learning, streaming analysis, SQL and other ways of analyzing data. So SQL is one thing, but not everything.