MapR Ships Drill For SQL Analysis Of Big Data - InformationWeek


09:26 AM

MapR Ships Drill For SQL Analysis Of Big Data

MapR says Apache Drill SQL-on-Hadoop option supports flexible data exploration, more extensive SQL support than Cloudera Impala.


Plenty of Hadoop vendors and hangers-on are promising SQL-on-Hadoop capabilities, but in the process they're buying into the old, inflexible model-before-querying approach to data analysis.

Hadoop software distributor MapR on Tuesday announced it will start shipping Apache Drill software that it says delivers a more flexible, big-data-savvy data-exploration approach.

Unlike Apache Hive and Cloudera's Impala option for SQL analysis on Hadoop, MapR says Drill, which is based on Google Dremel, does not require IT people to anticipate queries and set up data models in advance. Instead, Drill is designed for data exploration first, and the list of compatible sources includes Hadoop sources such as HDFS, Hive, and HBase tables; NoSQL data from sources such as MongoDB and REST APIs; and self-describing data such as Avro, Parquet, and JSON files with nested structures.

[Want more on Cloudera's SQL option? Read Cloudera Impala Brings SQL Querying To Hadoop.]

"The model-first approach is the antithesis of the approach of exploring what big data is trying to tell you," said Jack Norris, MapR's chief marketing officer, in a phone interview with InformationWeek. "Drill allows schema discovery on the fly, support for modern data structures, and support for ANSI SQL."

Drill's approach is more flexible than that of Hive or Impala, said Norris, because data analysts can explore the data before they set up fixed schemas, ETL processes, or hardened production queries. Instead of fixing on a schema before the query engine can touch the data, Drill lets users explore first, and the engine automatically discovers source schemas and adjusts query plans accordingly as SQL queries are applied.
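To make the idea of schema discovery on the fly concrete, here is a minimal Python sketch. It is not Drill's implementation; the function name `discover_schema` and the sample records are hypothetical, chosen only to show how field paths and types can be read off self-describing JSON records without a model defined up front.

```python
import json

def discover_schema(records):
    """Collect every field path seen in the records, with the value types found there."""
    schema = {}

    def walk(value, path):
        if isinstance(value, dict):
            # Nested object: extend the dotted path for each key.
            for key, child in value.items():
                walk(child, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            # Repeated field: mark the path with [] and inspect each element.
            for item in value:
                walk(item, path + "[]")
        else:
            schema.setdefault(path, set()).add(type(value).__name__)

    for record in records:
        walk(record, "")
    return schema

# Hypothetical newline-delimited JSON; the two records do not share a fixed shape.
RAW_LINES = [
    '{"user": {"id": 1, "name": "ann"}, "tags": ["a", "b"]}',
    '{"user": {"id": 2}, "clicks": 7}',
]

records = [json.loads(line) for line in RAW_LINES]
schema = discover_schema(records)
for path in sorted(schema):
    print(path, sorted(schema[path]))
```

The second record adds a `clicks` field and omits `user.name`, yet nothing had to be declared in advance: the schema is simply whatever the data turns out to contain.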

Source: MapR

In addition to providing an SQL query interface, Drill exposes an ODBC connector through which data sources can be explored with simple desktop tools, like Microsoft Excel or Tableau Software, or through more sophisticated business intelligence suites. Though it's currently in a 0.5 (pre-production-ready) beta release, Drill supports 15 of the 22 SQL queries used in the TPC-H performance benchmark, whereas Cloudera Impala supports only two of those queries, according to MapR executives.

Though Drill is described by MapR as an open community, MapR is its chief advocate, and it is the only Hadoop vendor distributing the software. Cloudera, the leading Hadoop distributor by customer numbers, is pushing Impala, while Hortonworks is advancing the capabilities of Apache Hive, the most popular SQL-on-Hadoop tool available.

Currently in early beta, Drill is far from being recommended for production use, and MapR's announcement offered few beta customer references. Instead, partners and analysts offered their opinions on MapR's news.

"Apache Drill's ability to provide access to data in Hadoop without the need for centralized schemas and also NoSQL datasets with complex data structures including nested and repeated fields differentiates it from traditional approaches to SQL-on-Hadoop," stated Matt Aslett, research director, data platforms and analytics, 451 Research, in MapR's press release.
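The "nested and repeated fields" in that quote are the kind of structure a flat relational table cannot hold directly. As a rough illustration only (plain Python, not Drill's API; the `flatten` helper and the order records are hypothetical), a repeated field can be expanded into one row per element so that ordinary row-wise aggregation applies:

```python
def flatten(records, field):
    """Yield one row per element of the repeated field, copying the other fields."""
    for record in records:
        for item in record.get(field, []):
            row = {k: v for k, v in record.items() if k != field}
            row[field] = item
            yield row

# Hypothetical documents with a nested object (customer) and a repeated field (items).
orders = [
    {"order_id": 1, "customer": {"name": "ann"}, "items": ["pen", "ink"]},
    {"order_id": 2, "customer": {"name": "bob"}, "items": ["pad"]},
]

rows = list(flatten(orders, "items"))

# Count items per customer by reaching into the nested customer object.
item_counts = {}
for row in rows:
    name = row["customer"]["name"]
    item_counts[name] = item_counts.get(name, 0) + 1

print(rows)
print(item_counts)
```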


Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of ...

D. Henschen,
User Rank: Author
9/16/2014 | 1:28:57 PM
"Few" end-users, not "no" end users.
This article was revised to reflect that, in this early beta stage, there are few beta customers for Drill (not "no" end users, as I originally had it); it's just that they're tech industry insiders, including Cisco and Solutionary. At the launch of Impala there were a handful of beta customers, including Monsanto, as I recall, who were able to discuss their use of the tool. It's always good to get a sense of things from users of the technology rather than the filtered, canned insights of vendors, partners, and analysts. Watch the Apache Drill community for more real-world testimonials.