6/3/2014
09:49 AM

Actian, HP Vertica Join SQL-On-Hadoop Bandwagon

Actian and HP Vertica separately challenge Cloudera Impala, follow Pivotal in adapting their databases to run on the big data platform.


Actian on Tuesday joined the long list of companies that have introduced a way to support SQL access and querying on top of Hadoop. The announcement comes just a week after HP upgraded SQL-on-Hadoop functionality it introduced late last year through its Vertica database.

Actian and HP join Pivotal (with Greenplum-based HAWQ) and InfiniDB among companies extending existing relational database management systems to run on top of Hadoop's HDFS file system. Actian said it's going after Hadoop market-share leader Cloudera and its Impala offering, which was introduced last year as a faster, more SQL-compliant alternative to Hive.


The Actian Analytics Platform Hadoop SQL Edition, due out by the end of this month, beats Impala with even faster querying and ISO SQL 92 compliance, according to Actian CTO Mike Hoskins.

"We're offering full-functioning, SQL-complete functionality running natively on Hadoop, and we're also the highest-performing SQL database running on Hadoop," Hoskins told InformationWeek in a phone interview. "If you add those two together, we have an advantage that's hugely important for customers looking to empower their SQL users."

Actian internal research claims faster querying than Cloudera Impala.

Actian has acquired and consolidated into its Actian Analytics Platform technologies including the ParAccel and Vectorwise databases and Pervasive DataRush data-integration software. The new SQL-on-Hadoop option uses what's now called the Vector engine for parallelized querying on HDFS. Actian's testing shows its query performance will be as much as 30 times faster than Impala, Hoskins said.

HP introduced SQL-on-Hadoop capabilities on its columnar Vertica database late last year by eliminating its proprietary storage layer so it could work with Hadoop-native file formats including JSON, Parquet, Thrift, and others. In last week's release, dubbed Dragline, HP eliminated all separation between Hadoop and Vertica clusters.

"That means Vertica can coexist with the Hadoop cluster, and we can access and query against HDFS data leaving it where it is," said Eamon O'Neill, HP's Vertica product manager, in a phone interview with InformationWeek. Vertica is also capable of running SQL queries against semi-structured data including clickstreams and Web session data, according to O'Neill.

Actian's architecture does not require a separate cluster, but it appears to be a step behind HP in that it has to load new data or convert existing data inside Hadoop into its proprietary database storage format to support SQL querying. Actian says support for Hadoop-native file formats is on the roadmap for a future release.

There's more to the Actian and HP announcements. Actian, for example, boasts 200 connectors to enterprise data systems and YARN-certified data processing and ETL on top of Hadoop. HP upgraded Vertica with live aggregate lookups for enhanced customer personalization analysis, sentiment analysis against short text streams such as tweets, and improved workload-management features. But the big news for both companies is clearly SQL-on-Hadoop support.

Despite the profusion of options for using SQL against big data, Hive remains the most widely used query tool with Hadoop. On that front, Hortonworks says the latest generation of Hive offers greatly improved performance. Nonetheless, Hive and Impala both fall short of relational databases in SQL functionality, according to Forrester analyst Mike Gualtieri.

"Vendors have obsessed about performance, but the question is, can you run the queries you need to run?" Gualtieri told InformationWeek. "Impala still has work to do, but Actian, Pivotal, and Vertica are far more likely to support the queries that companies already have in use."


Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of ...

Comments
Li Tan,
User Rank: Ninja
6/4/2014 | 4:40:08 AM
Re: SQL is one thing, not everything.
SQL is important but definitely not everything, and not even the core of the big data era. The real purpose of big data is its analysis capability. We need to correlate unstructured data and draw meaningful conclusions from it. So SQL is a method, a facility, but not the goal in itself.
D. Henschen,
User Rank: Author
6/3/2014 | 12:33:22 PM
SQL is one thing, not everything.
SQL is important, and that's why there have been so many announcements, but remember that the first and highest purpose for Hadoop is not to be an alternative platform for the same old structured data analyses. Hadoop's higher use is correlating structured and unstructured data and finding new insights in variable data such as clickstreams, log files, mobile data, social data and more. YARN will enable multiple modes of analysis. Spark, for example, is aspiring to support machine learning, streaming analysis, SQL and other ways of analyzing data. So SQL is one thing, but not everything.