A popular new movement aims to take SQL database management systems out of the stack. But when is this emerging approach right for you?
If Not SQL, Then What?
A number of strategies have been used to address NoSQL needs, most of which can be roughly divided into four groups:
Simple key-value store.
Quasi-tabular.
Fully SQL/tabular (!).
Document/object.
DBMSs based on graph data models are also sometimes counted as part of NoSQL, as are the file systems that underlie many MapReduce implementations. But as a general rule, those data models are most effective for analytic use cases somewhat apart from the NoSQL mainstream.
A key-value store is like a relational DBMS in which there can be only a single, three-column entity-attribute-value table, and in which you can't do self-joins. (In that analogy, the key part of the key-value pair may be thought of as an entity-attribute composite.) Thus, any notion of an "object" has to live in the application logic; the data management software is little more than an intelligent storage system. Key-value stores may have modest performance advantages over the more efficient implementations of other models, but otherwise there's little advantage to using one. (One exception: You might want to use a persistent data store -- such as Membase from Membase, Inc. (the former Northscale) -- as the target for porting an existing memcached-based application.) Most key-value store products, Membase included, either have or are soon planned to have alternative interfaces with at least somewhat richer data models.
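The entity-attribute-value analogy can be made concrete with a minimal sketch. It assumes a plain in-memory dictionary standing in for the store; the names `store`, `put`, `get`, and `load_user` are hypothetical and are not the API of Membase, memcached, or any other product.

```python
# Hypothetical in-memory stand-in for a key-value store (illustration only).
store = {}

def put(entity, attribute, value):
    # The key is an entity-attribute composite; the store treats it as opaque.
    store[(entity, attribute)] = value

def get(entity, attribute):
    # A lookup returns one value, or None if the key was never written.
    return store.get((entity, attribute))

def load_user(entity, attributes):
    # The "object" exists only here, in application code: the application
    # must know which attributes to request and issues one lookup for each.
    return {attr: get(entity, attr) for attr in attributes}

put("user:42", "name", "Ada")
put("user:42", "city", "London")

user = load_user("user:42", ["name", "city", "email"])
```

Note that the store itself cannot answer a question such as "which users live in London?" without the application scanning keys, which is the practical meaning of having no joins.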
More powerful are the quasi-tabular systems such as Cassandra, HBase, or (the original one) Google BigTable. In these, you can store what are essentially rows without worrying about whether each row has values for the same set of columns. Thus, a quasi-tabular database is like a relational database -- albeit one with lots of NULL values -- but with its schema controlled by the application program rather than a DBA.
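As a rough illustration of the quasi-tabular idea, the sketch below models a table as a mapping from row key to a per-row set of columns. It is an assumption-laden toy, not the Cassandra, HBase, or BigTable API; `table`, `put_row`, `column_names`, and `as_relational_rows` are hypothetical names.

```python
from collections import defaultdict

# Hypothetical stand-in for a quasi-tabular store: row key -> {column: value}.
table = defaultdict(dict)

def put_row(row_key, **columns):
    # Each row carries whatever columns the application chooses to write.
    table[row_key].update(columns)

def column_names():
    # The effective "schema" is just the union of columns written so far.
    return sorted({col for row in table.values() for col in row})

def as_relational_rows():
    # Viewed relationally, every column a row lacks shows up as NULL (None).
    cols = column_names()
    return [(key, *[row.get(c) for c in cols]) for key, row in table.items()]

put_row("user:1", name="Ada", city="London")
put_row("user:2", name="Grace", employer="US Navy")

rows = as_relational_rows()
```

Writing a row with a column nobody has used before needs no schema change here; the application simply starts writing it, which is the sense in which the schema is controlled by the program rather than a DBA.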
The most prominent NoSQL implementations at big-name Web companies are of Cassandra or HBase, with Facebook, Twitter, Digg, StumbleUpon, and many others having joined the bandwagon. Both Cassandra and HBase are open source projects; neither is deemed to have reached its 1.0 release yet, but both have significant production installations even so. The go-to vendors for Cassandra and HBase are Riptano and Hadoop specialist Cloudera, respectively. (HBase is closely tied to the Hadoop MapReduce project.)
There's also a new generation of SQL-based systems that seem to overcome some of the NoSQL community's objections to conventional SQL DBMS, including Schooner, Clustrix, dbShards, VoltDB, and Akiban. These often come in key-value flavors as well, with a performance advantage of less than 2:1 versus the SQL implementations. Schooner somewhat aside, most of these vendors are still in early days in terms of getting actual customers.
Finally, there are the NoSQL document/object stores, most notably