Oracle: SQL Best For Big Data Analysis - InformationWeek


12:55 PM
Doug Henschen

Oracle: SQL Best For Big Data Analysis

Oracle admits there's a place for Hadoop and NoSQL, but it's sticking with its relational-database-centric view of the big data opportunity.

and update individual fields in a record," said Kelly Stirman, MongoDB's director of products. "Even if they get better, you still can't scale these systems."

The model for scaling relational is "almost always larger hardware," according to Stirman, and even when there's a distributed option, like Oracle RAC, "it still requires shared storage, and it's not designed to be deployed across data centers."

Oracle does have its own NoSQL product, the Oracle NoSQL Database, and it was updated in April to a 3.0 release that Mendelsohn said "can go head-to-head with any NoSQL product." But he touted schema flexibility, not scale, as its calling. Oracle also has a Hadoop distribution (based on Cloudera) that runs on the Oracle Big Data Appliance. Sheer scalability is Hadoop's calling in Mendelsohn's book. But when it comes to accessing data, Mendelsohn said NoSQL and Hadoop fans are "creating problems for themselves" because they now have data fragmented across multiple platforms with no common language.


NoSQL products don't use SQL, so they offer "primitive, low-value APIs and simple filtering," Mendelsohn said. And Hadoop vendors started out by promoting MapReduce, "but it turned out to be too complicated for most people, and it's a slow, batch-processing environment."
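To make the complexity contrast concrete, here is a minimal sketch (not drawn from Mendelsohn's remarks) that computes the same per-user count twice: once as hand-written map, shuffle, and reduce phases, and once as a single SQL GROUP BY. It runs on Python's standard library with an in-memory SQLite database purely for illustration. The `events` sample data and all function names are hypothetical, and real Hadoop MapReduce jobs are distributed and typically written in Java.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (user_name, action) pairs.
events = [("ann", "click"), ("bob", "view"), ("ann", "view"), ("ann", "click")]


def map_phase(records):
    # Emit a (key, 1) pair for every record.
    for user_name, _action in records:
        yield user_name, 1


def shuffle_phase(pairs):
    # Group the emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Sum the values collected for each key.
    return {key: sum(values) for key, values in groups.items()}


def count_with_mapreduce(records):
    # Three explicit phases the programmer has to write and wire together.
    return reduce_phase(shuffle_phase(map_phase(records)))


def count_with_sql(records):
    # The same aggregation expressed as one declarative statement.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_name TEXT, action TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", records)
    rows = conn.execute(
        "SELECT user_name, COUNT(*) FROM events GROUP BY user_name"
    ).fetchall()
    conn.close()
    return dict(rows)
```

Both functions return the same counts for the sample data. The difference is in how much plumbing the programmer writes by hand, which is the gap the SQL-on-Hadoop projects mentioned below try to close.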

Mendelsohn observed dryly that NoSQL and Hadoop vendors are "figuring out that SQL is not such a bad idea." NoSQL vendors are "inching toward table abstraction" while the Hadoop vendors have multiple "Little SQL" SQL-on-Hadoop projects.

"When you look at their SQL implementations and their maturity compared to SQL, there's a big difference in the power of the language, the performance, the query optimization, and so on."

Mendelsohn explains Oracle Big Data SQL, which queries across NoSQL, Hadoop, and Oracle Database.

There's a lot of truth in these statements, but Oracle has its own answer for these gaps with the Oracle Big Data SQL query tool, which is designed to run SQL queries across Hadoop, NoSQL databases (just Oracle's, currently), and Oracle Database. You don't have to move high-scale data from those other platforms to Oracle Database. You just query it in place.
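The idea of querying data in place can be illustrated with a small analogy. The sketch below is not Oracle Big Data SQL and does not use its syntax. It uses SQLite's `ATTACH DATABASE` from Python's standard library, so that one SQL statement joins tables living in two separate stores. The store name `clickstore`, the table names, and the sample rows are all hypothetical.

```python
import sqlite3


def build_stores():
    # One connection stands in for the SQL engine. The attached database
    # stands in for a separate platform (all names here are hypothetical).
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS clickstore")
    conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE clickstore.clicks (customer_id INTEGER, page TEXT)")
    conn.executemany(
        "INSERT INTO main.customers VALUES (?, ?)",
        [(1, "Ann"), (2, "Bob")],
    )
    conn.executemany(
        "INSERT INTO clickstore.clicks VALUES (?, ?)",
        [(1, "/home"), (1, "/pricing"), (2, "/home")],
    )
    return conn


def clicks_per_customer(conn):
    # A single query joins data held in two different stores, without
    # first copying the click data into the customers database.
    query = """
        SELECT c.name, COUNT(k.page)
        FROM main.customers AS c
        JOIN clickstore.clicks AS k ON k.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """
    return conn.execute(query).fetchall()
```

Calling `clicks_per_customer(build_stores())` returns one row per customer with a click count. A production federated engine must also handle query pushdown, data-type mapping, and security across platforms, none of which this toy example attempts.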

"All your developers know how to program against it, all your standard BI tools and third-party tools just work, and it's how we've solved the problem of big-data analytics," Mendelsohn said. 

InformationWeek took a deep dive on Oracle Big Data SQL in July, and we came away impressed. Broad SQL access is a very good thing, and it's something other data-management vendors, namely Teradata with QueryGrid and Microsoft with PolyBase, are also working on. At Oracle OpenWorld the company also introduced a data-exploration and visualization tool for Hadoop called Oracle Big Data Discovery.

The good news for Oracle customers is that the company is acknowledging that there are other platforms in the world. You won't catch Mendelsohn or chairman and CTO Larry Ellison admitting to the cost advantages of NoSQL databases or Hadoop. But with Big Data SQL and Oracle Big Data Discovery, the company is providing tools that will help customers tap these platforms.

So there's progress, but the suggestion that SQL "solves big data analytics" doesn't do justice to all the data science, use of algorithms and machine learning, and other techniques unleashing big-data insight. Mendelsohn is a database champion, so it's no surprise to hear him touting SQL. Just keep in mind that SQL is important, but it's not the only important form of big-data analysis.


Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of ...
Li Tan,
User Rank: Ninja
10/4/2014 | 12:14:21 AM
Re: Our view
I concur with you. It's not so meaningful to argue whether NoSQL and Hadoop are much better than traditional RDBMS and SQL. They were designed in different eras and for different purposes. As an IT professional, you need to be able to pick the best weapon for your battle.
D. Henschen,
User Rank: Author
10/3/2014 | 5:11:42 PM
Re: Our view
That's a good point on transactional vs. analytical and a distinction that didn't come up in the Mendelsohn chat -- mostly because he was also talking about Hadoop, which is a high-scale platform for analytics. SQL does play as a language for transactional work, however, and I would say that Mendelsohn is correct in observing that NoSQL vendors need to mature their languages. Many have SQL-inspired languages, including Cassandra with CQL. Scalability and cost at scale are clearly problems for Oracle and other relational databases, so unless or until they can answer that calling, the coexistence of RDBMS and NoSQL is, as you say, assured.
User Rank: Apprentice
10/3/2014 | 2:34:37 PM
Our view
Nearly all of us here at DataStax have come out of the RDBMS world, have decades of experience with Oracle and other relational engines, and as such, we certainly respect the technology. 

But the enterprises we engage with today use a combination of RDBMS and NoSQL, with the former mostly serving system of record use cases, while the latter is optimized for system of engagement applications (although NoSQL is making strong inroads into system of record apps as well). The fact is, today's Internet Enterprise applications necessitate a new database foundation that provides a more flexible data model (which goes beyond just support for unstructured data and JSON), a simpler and more cost effective performance/scale model, and an optimized data distribution model that provides both full write/read anywhere capabilities as well as always-on availability. And these are things that a NoSQL database like Cassandra was built to provide. 

Robin Schumacher

VP of Products, DataStax
D. Henschen,
User Rank: Author
10/2/2014 | 4:45:20 PM
The other side of analytics
Mendelsohn is not Oracle's analytics guru. That's a separate unit that's working on R-based algos and other advanced analytics working on Exadata. Another tidbit of note from Mendelsohn: SAS is working on SaaS services based on the Oracle 12c multi-tenant database.