10/10/2010
03:33 PM

NoSQL Basics, Benefits and Best-Fit Scenarios

A popular new movement aims to take SQL database management systems out of the stack. But when is this emerging approach right for you?

If Not SQL, Then What?

A number of strategies have been used to address NoSQL needs, most of which can be roughly divided into four groups:

  • Simple key-value store.
  • Quasi-tabular.
  • Fully SQL/tabular (!).
  • Document/object.

DBMSs based on graph data models are also sometimes counted as part of NoSQL, as are the file systems that underlie many MapReduce implementations. But as a general rule, those data models are most effective for analytic use cases somewhat apart from the NoSQL mainstream.

A key-value store is like a relational DBMS in which there can be only a single, three-column entity-attribute-value table, and in which you can't do self-joins. (In that analogy, the key part of the key-value pair may be thought of as an entity-attribute composite.) Thus, any conception of "object" has to live in the application logic; the data management software is little more than an intelligent storage system. Key-value stores may have modest performance advantages over the more efficient implementations of other models, but otherwise there's little advantage to using a key-value store. (One exception: You might want to use a persistent data store -- such as Membase from Membase, Inc. (the former Northscale) -- as the target for porting an existing memcached-based application.) Most key-value store products, Membase included, already have, or are soon planned to have, alternative interfaces with at least somewhat richer data models.
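To make the entity-attribute-value analogy concrete, here is a minimal sketch in Python. It assumes a plain in-memory dictionary as a stand-in for the store; the helper names `put`, `get`, and `load_user` are hypothetical and do not correspond to the API of Membase or any other product.

```python
# Hypothetical in-memory stand-in for a key-value store (not a product API).
store = {}

def put(entity, attribute, value):
    # The key is an entity-attribute composite, as in the EAV analogy.
    store[(entity, attribute)] = value

def get(entity, attribute):
    # A missing key simply yields None; the store knows nothing about schemas.
    return store.get((entity, attribute))

def load_user(entity, attributes):
    # The store has no notion of an "object". The application assembles one
    # because it knows which attributes belong together.
    return {attribute: get(entity, attribute) for attribute in attributes}

put("user:42", "name", "Ada")
put("user:42", "email", "ada@example.com")
user = load_user("user:42", ["name", "email"])
```

Note that the knowledge of which attributes make up a "user" sits entirely in `load_user`, which is the sense in which the object model lives in application logic rather than in the data store.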

More powerful are the quasi-tabular systems such as Cassandra, HBase, or (the original one) Google BigTable. In these, you can store what are essentially rows without worrying about whether each row has values for the same set of columns. Thus, a quasi-tabular database is like a relational database -- albeit one with lots of NULL values -- but with its schema controlled by the application program rather than a DBA.
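The "relational database with lots of NULL values" comparison can be sketched in a few lines of Python. This is an illustration of the data model only, using ordinary dictionaries; the row keys, column names, and the `as_relational` helper are made up for the example and are not how Cassandra, HBase, or BigTable expose data.

```python
# Hypothetical sketch: each row key maps to whatever columns that row has.
rows = {
    "alice": {"email": "alice@example.com", "city": "Boston"},
    "bob": {"email": "bob@example.com", "last_login": "2010-10-01"},
}

# The application, not a DBA, decides to start storing a new column.
rows["alice"]["twitter"] = "@alice"

def as_relational(rows):
    # Take the union of all columns, then fill the gaps with None (NULL).
    columns = sorted({column for row in rows.values() for column in row})
    table = [
        [key] + [row.get(column) for column in columns]
        for key, row in sorted(rows.items())
    ]
    return columns, table

columns, table = as_relational(rows)
```

Viewed through `as_relational`, every column that any row uses becomes a column of the whole table, and each row shows `None` wherever it never stored a value.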

The most prominent NoSQL implementations at big-name Web companies are of Cassandra or HBase, with Facebook, Twitter, Digg, StumbleUpon, and many others having joined the bandwagon. Both Cassandra and HBase are open source projects; neither is deemed to have reached its 1.0 release yet. But they have significant production installations even so. The go-to vendors for Cassandra and HBase are Riptano and Hadoop specialist Cloudera, respectively. (HBase is closely tied to the Hadoop MapReduce project.)

There's also a new generation of SQL-based systems that seem to overcome some of the NoSQL community's objections to conventional SQL DBMS, including Schooner, Clustrix, dbShards, VoltDB, and Akiban. These often come in key-value flavors as well, with a performance advantage of less than 2:1 versus the SQL implementations. Schooner somewhat aside, most of these vendors are still in early days in terms of getting actual customers.

Finally, there are the NoSQL document/object stores, most notably CouchDB (which boasts a Lotus Notes-like replication model) and MongoDB (which has a standard NoSQL laundry list of replication options). These directly store JSON (JavaScript Object Notation) objects -- collections of name-value pairs. CouchDB and MongoDB also both have ways of indexing, querying, and/or updating individual "fields" within the document schema. CouchDB and MongoDB both have considerable numbers of users, generally for applications that don't seem to demand large data volumes or high throughput. The go-to vendor for CouchDB is CouchOne or, if you have a larger database, Cloudant. The company behind MongoDB is 10gen.
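As a rough illustration of what storing JSON objects and then indexing, querying, and updating individual fields can look like, here is a small Python sketch. It assumes an in-memory dictionary of documents plus one hand-built index on a `city` field; the function names are hypothetical and are not the CouchDB or MongoDB APIs.

```python
import json
from collections import defaultdict

docs = {}                          # document id -> parsed JSON object
index_by_city = defaultdict(set)   # hypothetical index on the "city" field

def save(doc_id, json_text):
    # Store the document as a collection of name-value pairs.
    doc = json.loads(json_text)
    docs[doc_id] = doc
    if "city" in doc:
        index_by_city[doc["city"]].add(doc_id)

def find_by_city(city):
    # Query one field through the index instead of scanning every document.
    return sorted(index_by_city.get(city, ()))

def update_field(doc_id, field, value):
    # Update a single field in place, keeping the index consistent.
    doc = docs[doc_id]
    if field == "city":
        if "city" in doc:
            index_by_city[doc["city"]].discard(doc_id)
        index_by_city[value].add(doc_id)
    doc[field] = value

save("u1", '{"name": "Ada", "city": "London", "tags": ["math"]}')
save("u2", '{"name": "Grace", "city": "New York"}')
save("u3", '{"name": "Alan", "city": "London"}')

london_before = find_by_city("London")
update_field("u3", "city", "Paris")
london_after = find_by_city("London")
```

The documents need not share the same fields (only `u1` has `tags`), yet a field that does exist can still be indexed and queried, which is the main step up from a bare key-value store.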
