Oracle: SQL Best For Big Data Analysis

Oracle admits there's a place for Hadoop and NoSQL, but it's sticking with its relational-database-centric view of big data opportunity.

and update individual fields in a record," said Kelly Stirman, MongoDB's director of products. "Even if they get better, you still can't scale these systems."

The model for scaling relational is "almost always larger hardware," according to Stirman, and even when there's a distributed option, like Oracle RAC, "it still requires shared storage, and it's not designed to be deployed across data centers."

Oracle does have its own NoSQL product, the Oracle NoSQL Database, and it was updated in April to a 3.0 release that Mendelsohn said "can go head-to-head with any NoSQL product." But he touted schema flexibility, not scale, as its calling. Oracle also has a Hadoop distribution (based on Cloudera) that runs on the Oracle Big Data Appliance. Sheer scalability is Hadoop's calling in Mendelsohn's book. But when it comes to accessing data, Mendelsohn said NoSQL and Hadoop fans are "creating problems for themselves" because they now have data fragmented across multiple platforms with no common language.


NoSQL products don't use SQL, so they offer "primitive, low-value APIs and simple filtering," Mendelsohn said. And Hadoop vendors started out by promoting MapReduce, "but it turned out to be too complicated for most people, and it's a slow, batch-processing environment."

Mendelsohn observed dryly that NoSQL and Hadoop vendors are "figuring out that SQL is not such a bad idea." NoSQL vendors are "inching toward table abstraction" while the Hadoop vendors have multiple "Little SQL" SQL-on-Hadoop projects.

"When you look at their SQL implementations and their maturity compared to SQL, there's a big difference in the power of the language, the performance, the query optimization, and so on," Mendelsohn said.


Mendelsohn explains Oracle Big Data SQL, which queries across NoSQL, Hadoop, and Oracle Database.

There's a lot of truth in these statements, and Oracle has its own answer for these gaps: the Oracle Big Data SQL query tool, which is designed to run SQL queries across Hadoop, NoSQL databases (just Oracle's, currently), and Oracle Database. You don't have to move high-scale data from those other platforms into Oracle Database; you just query it in place.

"All your developers know how to program against it, all your standard BI tools and third-party tools just work, and it's how we've solved the problem of big-data analytics," Mendelsohn said. 

InformationWeek took a deep dive into Oracle Big Data SQL in July, and we came away impressed. Broad SQL access is a very good thing, and it's something other data-management vendors, namely Teradata with QueryGrid and Microsoft with PolyBase, are also working on. At Oracle OpenWorld the company also introduced a data-exploration and visualization tool for Hadoop called Oracle Big Data Discovery.

The good news for Oracle customers is that the company is acknowledging that there are other platforms in the world. You won't catch Mendelsohn or chairman and CTO Larry Ellison admitting to the cost advantages of NoSQL databases or Hadoop. But with Big Data SQL and Oracle Big Data Discovery, the company is providing tools that will help customers tap these platforms.

So there's progress, but the suggestion that SQL "solves big data analytics" doesn't do justice to all the data science, use of algorithms and machine learning, and other techniques unleashing big-data insight. Mendelsohn is a database champion, so it's no surprise to hear him touting SQL. Just keep in mind that SQL is important, but it's not the only important form of big-data analysis.
