If Not SQL, Then What?
A number of strategies have been used to address NoSQL needs, most of which can be roughly divided into four groups:
- Simple key-value store.
- Fully SQL/tabular (!).
A key-value store is like a relational DBMS in which there only can be a single, three-column entity-attribute-value table, and in which you can't do self-joins. (In that analogy, the key part of the key-value pair may be thought of as an entity-attribute composite.) Thus, any conception of "object" has to live in the application logic; the data management software is little more than an intelligent storage system. Key-value stores may have modest performance advantages over the more efficient implementations of other models, but otherwise there's little advantage to using a key-value store. (One exception: You might want to use a persistent data store -- such as Membase from Membase, Inc. (the former Northscale) -- as the target for porting an existing memcached-based application.) Most key-value store products, Membase included, have or soon are planned to have alternative interfaces with at least somewhat richer data models.
More powerful are the quasi-tabular systems such as Cassandra, HBase,or (the original one) Google BigTable. In these, you can store what are essentially rows without worrying about whether each row has values for the same set of columns. Thus, a quasi-tabular database is like a relational database -- albeit one with lots of NULL values -- but with its schema controlled by the application program rather than a DBA.
The most prominent NoSQL implementations at big-name Web companies are of Cassandra or HBase, with Facebook, Twitter, Digg, StumbleUpon, and many others having joined the bandwagon. Both Cassandra and HBase are open source projects; neither is deemed to yet have reached its 1.0 release. But they have significant production installations even so. The go-to vendors for Cassandra and HBase are Riptanoand Hadoopspecialist Cloudera,respectively. (HBase is closely tied to the Hadoop MapReduce project.)
There's also a new generation of SQL-based systems that seem to overcome some of the NoSQL community's objections to conventional SQL DBMS, including Schooner, Clustrix, dbShards,VoltDB,and Akiban. These often come in key-value flavors as well, with a performance advantage of less than 2:1 versus the SQL implementations. Schooner somewhat aside, most of these vendors are still in early days in terms of getting actual customers.