A popular new movement aims to take SQL database management systems out of the stack. But when is this emerging approach right for you?
For more than 20 years, SQL has been the unquestioned ruler of the database world. But now it's facing its greatest challenge.
Many of the largest and fastest-growing databases in the world belong to Web search engines and social networks (such as Google, Facebook and Twitter) or other Internet companies (such as Zynga, maker of the games Farmville and Mafia Wars). Those companies routinely reject high-end commercial database management systems (DBMS) from the likes of Oracle or IBM. So do smaller companies in similar industries, which can only dream of running databases that big. Their reasons commonly include:
They don't want to pay license fees, and indeed have a strong bias toward open source software.
They don't need high-end features.
In fact, they don't need most SQL functionality.
They aren't that excited about writing SQL anyway (or about generating SQL via, say, an object-relational mapping layer).
License fees aside, they believe commercial database architectures and features get in the way of scalability.
In short, some of the largest and most innovative applications in the world are being built by people who don't see much value in DBMS in general, nor in high-end SQL DBMS in particular.
This rejection of proven products may sound like madness at first, but it turns out to make a certain sense. Databases are being built for single applications, and developers are optimizing performance, networking considerations, and/or software license fees at the expense of application extensibility. In such scenarios, DBMS lose their traditional roles as powerful DML (Data Manipulation Language) interpreters; rather, application programmers have to code every bit of data manipulation smarts themselves. Also falling by the wayside are most DBMS performance optimizations for different classes of queries. The data management subsystem is left with two jobs: reading and writing small amounts of data very quickly, and then backing up, replicating, and otherwise managing the data already stored.
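To make that division of labor concrete, here is a minimal sketch in Python. The KeyValueStore class, the key naming scheme, and the photos_for_user function are all hypothetical, invented for illustration; they do not represent any particular product's API. The store only reads and writes small values by key, and the application performs by hand the lookup that a SQL DBMS would express as a join.

```python
class KeyValueStore:
    """Hypothetical in-memory stand-in for a simple key-value data store."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Write one small value under one key.
        self._data[key] = value

    def get(self, key, default=None):
        # Read one small value by key.
        return self._data.get(key, default)


store = KeyValueStore()
# Key names such as "user:1" are an assumed convention, not a standard.
store.put("user:1", {"name": "alice", "photo_ids": ["p1", "p2"]})
store.put("photo:p1", {"caption": "beach"})
store.put("photo:p2", {"caption": "city"})


def photos_for_user(store, user_id):
    """Application-side 'join': fetch the user, then each referenced photo."""
    user = store.get(f"user:{user_id}")
    if user is None:
        return []
    photos = (store.get(f"photo:{pid}") for pid in user["photo_ids"])
    # Skip references to photos that are missing from the store.
    return [photo for photo in photos if photo is not None]


captions = [photo["caption"] for photo in photos_for_user(store, 1)]
print(captions)
```

Everything a query optimizer would normally decide (lookup order, handling of missing rows) is now the application's responsibility, which is exactly the trade-off described above.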
Consider the following use cases:
A large Web site exists primarily to accept and serve back photographs, songs, small snippets of text, and the contents of simple database records. Almost everything is keyed on user IDs. Throughput is massive. (Think of social networking or photo sharing sites.)
Most of what happens in the database is that counters keep getting incremented, and not for transactions in which real money changes hands. So write locks are terrible bottlenecks, and transaction integrity -- while nice-to-have -- is not essential. (Think of social gaming or article sharing sites.)
A central server coordinates application versions and some amount of data across a large number of occasionally connected instances, perhaps for a wide variety of applications. (Mobile computing and social gaming sites need this sort of functionality.)
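The counter-heavy use case can be illustrated with one common way of relieving write-lock contention: split a single hot counter into several independently locked shards. The sketch below is an assumption-laden illustration, not a description of how any of the companies mentioned actually do it; ShardedCounter, its shard count, and the worker function are hypothetical names. Reads take no global lock, so a total read while writers are active may be slightly stale, which is the kind of relaxed integrity these applications accept.

```python
import random
import threading


class ShardedCounter:
    """Hypothetical counter that spreads increments across locked shards."""

    def __init__(self, num_shards=8):
        self._shards = [0] * num_shards
        self._locks = [threading.Lock() for _ in range(num_shards)]

    def increment(self, amount=1):
        # Pick a shard at random so writers rarely wait on the same lock.
        index = random.randrange(len(self._shards))
        with self._locks[index]:
            self._shards[index] += amount

    def value(self):
        # No global lock: the sum can lag behind in-flight increments.
        return sum(self._shards)


def worker(counter, times):
    for _ in range(times):
        counter.increment()


counter = ShardedCounter()
threads = [
    threading.Thread(target=worker, args=(counter, 1000)) for _ in range(4)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# Once all writers have finished, the total is exact.
total = counter.value()
print(total)
```

In a real deployment the shards would typically live on different servers rather than in one process, but the principle is the same: no single lock stands in the way of every write.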
E-commerce aside, these use cases cover a large fraction of what's going on in Internet innovation. And while they all can be satisfied with traditional relational DBMS (which after all can be used to do pretty much anything), they all fit the RDBMS-unfriendly template that joins are inessential or secondary, transaction semantics are inessential or secondary, and two-phase commit is an overly restrictive way of replicating data.