Couchbase Bets On Standard NoSQL Query Language

With N1QL, Couchbase aims to do for NoSQL what SQL did for relational databases: create a standard query language that works across NoSQL systems. Bonus: it's SQL-compatible.

Charles Babcock, Editor at Large, Cloud

June 3, 2015

(Image: mrhighsky/iStockphoto)

NoSQL data systems, such as MongoDB, Cassandra, and Couchbase, have seen explosive growth because they can capture large amounts of unstructured data from websites and mobile applications.

NoSQL systems' flexibility and scalability make them highly useful for swiftly acquiring information from the Web or in e-commerce transactions. Their Achilles' heel has been their inconsistent and unpredictable approach to querying. Unlike the standard frontend for relational systems, SQL, no prevailing language for NoSQL systems has emerged.

That may be about to change, thanks to Yannis Papakonstantinou, a professor of computer science and engineering at UC San Diego, and his colleagues, who have developed SQL++, a new query language for popular NoSQL systems. Couchbase is introducing N1QL (pronounced "nickel"), which it developed independently of Papakonstantinou's work. N1QL, the first commercial implementation of an SQL++ language, will launch Wednesday in beta at the Couchbase Connect user conference in Santa Clara, Calif.

Unlike other NoSQL query languages, N1QL is compatible with SQL and adds only a few simple commands to allow the JSON objects in a Couchbase database to be stored, retrieved, and otherwise manipulated like the data elements in the rows and columns of a relational database. It will be part of the upcoming Couchbase Server 4.0, due out later this year.

"We think right now we're the only ones who have got an SQL-compatible query language. We hope in the future that we are not," said Couchbase CEO Bob Wiederhold. Couchbase is what's known as a document database, in which units of English-like text, such as the character strings found in email or in comments on online forums, can be captured, stored, and retrieved as a single entity. Couchbase, MongoDB, Cassandra, and other systems capture such unstructured data as JSON objects and store each object, often with different types of data inside, as a unit.

Facebook, Twitter, Google, and other large Web companies were early innovators in the NoSQL big data field and remain heavy users. Many enterprises, needing a way to capture website visitor and customer information, have since become heavy users as well.

Many such systems rely on a JSON object as the basic unit of data capture. JSON stands for JavaScript Object Notation. It started out for use in JavaScript programming, but can now be used with many languages. One JSON object can be nested inside another, so that a relationship between the data contained in each is also captured. For example, a customer's phone number can be a nested object inside a more general-purpose JSON customer object. The query system then knows how to retrieve a particular subset of data when the system user requires it.
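To make the nesting concrete, here is a minimal Python sketch. The customer document, its field names, and its values are hypothetical and chosen only for illustration; they do not come from Couchbase or any other vendor.

```python
import json

# Hypothetical customer document. The "phone" object is nested inside the
# more general customer object, so the relationship is stored with the data.
customer_json = """
{
  "name": "Ada Example",
  "email": "ada@example.com",
  "phone": {"type": "mobile", "number": "555-0100"}
}
"""

# Parse the JSON text into ordinary Python dictionaries.
customer = json.loads(customer_json)

# Retrieve only the nested subset that the caller asked for.
phone = customer["phone"]
print(phone["number"])
```

The whole customer is stored and fetched as one unit, yet the phone record can still be addressed on its own through the parent.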

N1QL adds the commands "Nest" and "Unnest" to a standard SQL system to give it NoSQL data-retrieval capabilities, said Ravi Mayuram, senior VP of products and engineering at Couchbase. Out of many relational database query languages, "only SQL has endured," which has given enterprises a large pool of database administrators, IT managers, and business analysts capable of using the language. Adding commands to SQL makes accessing data in a NoSQL system fit into the familiar pattern of working with SQL commands, he noted.
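The idea behind the two commands can be sketched in plain Python. This is an illustration of flattening and grouping in general, not N1QL syntax and not Couchbase's implementation; the function names `unnest` and `nest`, their parameters, and the sample records are all assumptions made for this example.

```python
def unnest(docs, array_field, alias):
    """Flatten: produce one row per element of each document's nested array."""
    rows = []
    for doc in docs:
        # Copy the parent's fields, leaving out the nested array itself.
        parent = {k: v for k, v in doc.items() if k != array_field}
        for element in doc.get(array_field, []):
            rows.append({**parent, alias: element})
    return rows


def nest(parents, children, parent_key, child_key, alias):
    """Group: attach matching child records to each parent as a nested array."""
    nested = []
    for parent in parents:
        matches = [c for c in children if c[child_key] == parent[parent_key]]
        # In this sketch, parents with no matching children get an empty list.
        nested.append({**parent, alias: matches})
    return nested


# Hypothetical sample data.
customers = [
    {"id": 1, "name": "Ada", "phones": ["555-0100", "555-0101"]},
    {"id": 2, "name": "Bob", "phones": ["555-0200"]},
]
orders = [
    {"customer_id": 1, "item": "book"},
    {"customer_id": 1, "item": "pen"},
    {"customer_id": 2, "item": "lamp"},
]

flat = unnest(customers, "phones", "phone")
grouped = nest(
    [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}],
    orders, "id", "customer_id", "orders",
)
```

Here `flat` holds three row-like records, one per phone number, which is the shape a SQL user expects to filter and join. `grouped` goes the other way, folding the flat order records back into an array under each customer.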

The prospect of a standard NoSQL query language that's compatible with SQL means a database administrator might be able to use the same tool to access data in a structured relational database and in an unstructured system.

Perhaps more important, Papakonstantinou's specification allows SQL++ to become a query language that works with MongoDB's Aggregation Pipeline API access path and that serves as a substitute for Cassandra's CQL query language or the AsterixDB system's AQL query language.

Papakonstantinou and co-authors wrote, "the SQL++ semantics can morph into the semantics of existing, semi-structured database query languages," citing the examples above. AsterixDB is an open source project started at UC Irvine that has been proposed as an Apache Software Foundation project. It is currently in a pre-project stage in the Apache Incubator.

The ability of SQL++ to work with different NoSQL systems would be a major step forward. Currently, Web, mobile, and Internet of Things application developers must learn the particular NoSQL system they think will be suitable for collecting the big data they expect the app to generate. "The lack of formal semantics inhibit[s] deep understanding of the query languages [of the various systems] and also impede[s] progress towards clean, powerful, declarative query languages," wrote Papakonstantinou, Kian Win Ong, and Romain Vernoux in their paper, "The SQL++ Query Language: Configurable, Unifying and Semi-Structured."

Figure 2: Yannis Papakonstantinou (Image: Flickr)

NoSQL systems tend to be more developer-driven than data-administration-driven. They have their own constructs, syntax, and conventions, which the relational database systems avoided when SQL became a standard. SQL's English-like commands, arranged in simple declarative sentences, worked with precision and consistency across all relational systems.

The lack of a shared query language has made each NoSQL system an individual learning and programming challenge, with developers needing to know the expressions of each system, along with its operational quirks, in order to get applications to work.

"N1QL is our query language, which will make it dramatically easier to query and store data in Couchbase," said Wiederhold. Couchbase customers include eBay, Nordstrom, Neiman Marcus, Orbitz, Tencent, Verizon, Wells Fargo, AT&T, Bally's, and Comcast.

The Papakonstantinou team's work has been funded by the National Science Foundation as the Forward project. The team has been trying to prove that NoSQL systems can implement some of the same consistent rules and commands that the relational world did.

Couchbase's N1QL is the first SQL++ language. Couchbase is now funding the Papakonstantinou team's research. But N1QL is unlikely to be the sole SQL++ offering for very long. Both SQL++ and N1QL are open source, and more implementations are likely to follow. Somewhere down the road, they may make it dramatically easier to manipulate data in combinations of relational and NoSQL systems, simplifying enterprise IT data management.

[Editor's note: This article has been updated to reflect more accurately the timeline and independence of Couchbase's development of N1QL.]

About the Author

Charles Babcock

Editor at Large, Cloud

Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive Week. He is a graduate of Syracuse University where he obtained a bachelor's degree in journalism. He joined the publication in 2003.
