10/3/2013
05:21 PM

When NoSQL Makes Sense

IT leaders must know the trade-offs they face to get NoSQL’s scalability, flexibility and cost savings.

Download the entire Oct. 7, 2013, issue of InformationWeek, distributed in an all-digital format (registration required).

Scalability and flexibility. These are the two key attributes of NoSQL databases, the ones that have made them big data darlings. NoSQL databases haven't quite reached the hype heights of the Hadoop data management framework, but they're drawing a lot of attention and experimentation. Choose wisely among the many and varied NoSQL options, or the trade-offs needed to get scalability and flexibility might be your project's undoing.

The label NoSQL covers a diverse collection of databases that tend to have at least two elements in common: distributed computing architectures and schemaless design. The databases are scalable because they were built to store and manage data distributed across clusters of (typically) x86 commodity servers that can be scaled out by adding more machines. They're flexible because, unlike relational databases, NoSQL databases don't require a predefined schema (a.k.a. data model) that dictates a single way to organize data in columns and rows. In relational databases, those data models get ever more difficult to change as the database grows. That rigidity becomes a problem if a company's evolving business model requires it to use data in a way it never anticipated.
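
The contrast between a predefined schema and a schemaless design can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's API: the relational side uses the standard-library sqlite3 module, the schemaless side is a plain dictionary standing in for a document collection, and the customers table, the loyalty_tier field and the helper names are all hypothetical.

```python
import sqlite3

# Relational side: the schema is declared before any data is stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")


def insert_with_unplanned_field(connection):
    """Try to store a field the schema never anticipated (hypothetical example)."""
    try:
        connection.execute(
            "INSERT INTO customers (id, name, loyalty_tier) "
            "VALUES (2, 'Bob', 'gold')"
        )
        return "accepted"
    except sqlite3.OperationalError as exc:
        # SQLite refuses the insert because the column is not in the schema.
        return f"rejected: {exc}"


relational_result = insert_with_unplanned_field(conn)

# Schemaless side: a toy document collection; each record carries its own fields.
documents = {}
documents[1] = {"name": "Ada"}
documents[2] = {"name": "Bob", "loyalty_tier": "gold"}  # new field, no migration


def field_or_default(doc_id, field, default=None):
    """Read a field that may be missing from older records."""
    return documents[doc_id].get(field, default)
```

A relational database can add the column with ALTER TABLE; the point is that the change is an explicit schema migration, whereas the schemaless store accepts the new field immediately and leaves it to the application to cope with records that lack it.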

NoSQL databases are also simple and inexpensive compared with their relational counterparts. The simplicity contributes to fast development and fast performance at scale. Many (though not all) NoSQL databases are open source, so you can get started with free community software and add commercial support and helpful commercial add-on modules as your deployment grows. Given that the biggest dissatisfaction with existing databases comes from licensing costs and terms, free and open will look appealing to many IT teams, especially those bootstrapping a pilot project.

Do these characteristics make NoSQL right for your company or project? They might, but there are drawbacks to NoSQL, most notably the lack of SQL querying capabilities and of ACID (atomic, consistent, isolated and durable) transaction guarantees. Those drawbacks can be frustrating to relational database veterans. The capabilities of NoSQL databases are also diverse, so you have to find the right tool for the job.

"Adopt the technology by understanding what it's good at, and try that first," Rick Branson, infrastructure engineer at Instagram, told a recent NoSQL conference. That's good advice, so let's explore the diversity of options.

Scale Redefined

Whereas relational databases are general-purpose platforms, NoSQL databases have been developed to tackle particular, often extreme challenges. Amazon.com in 2007 came up with the Dynamo database to keep its massive, global e-commerce site always up and running. (It now sells DynamoDB as an Amazon Web Services online service.) Dynamo helped inspire Facebook's development of Cassandra, which it then contributed to open source. Relational databases just weren't designed to handle the quantity of data, number of users and ever-changing data requirements of outfits such as Amazon and Facebook.

Today, there are four important classes of NoSQL databases: key-value stores such as Riak and Redis; document databases such as MongoDB and Couchbase; wide-column databases such as Cassandra and HBase (the latter is part of the Hadoop framework); and graph databases such as Neo4j and AllegroGraph. All of those databases are well-known among Internet startups and established Web-scale companies.
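
To make the four classes concrete, here is a rough sketch of the shape of data each one stores, using only built-in Python structures. It does not use the API of any product named above; the keys, field names and the followers_of helper are hypothetical and chosen purely for illustration.

```python
# Key-value store: an opaque value looked up by a single key.
key_value = {"session:42": b"opaque-bytes"}

# Document database: a nested, self-describing record per key.
document_store = {
    "user:42": {
        "name": "Ada",
        "tags": ["admin"],
        "address": {"city": "London"},
    }
}

# Wide-column database: row key -> named columns; rows need not share columns.
wide_column = {
    "user:42": {"name": "Ada", "login:2013-10-01": "ok"},
    "user:43": {"name": "Bob"},
}

# Graph database: nodes connected by labeled edges.
graph_edges = [
    ("Ada", "FOLLOWS", "Bob"),
    ("Bob", "FOLLOWS", "Cy"),
]


def followers_of(name):
    """Return every node with a FOLLOWS edge pointing at `name`."""
    return [src for src, rel, dst in graph_edges if rel == "FOLLOWS" and dst == name]
```

The structures differ mainly in how much of the record the database itself can see: nothing beyond the key in the key-value case, individual fields or columns in the document and wide-column cases, and relationships in the graph case.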

Instagram, for example, implemented Cassandra in the fall of 2012 (a few months after Facebook acquired the company for $1 billion). The image-sharing service draws more than 150 million users a month, and each day those people add 55 million images and register 1.2 billion likes. The initial Cassandra deployment was a six-node cluster to replace a centralized security server-logging application that had been running on Redis, another NoSQL database. Redis is a key-value store designed for speed, thanks to its in-memory design. But Instagram's applications were growing quickly, outstripping RAM capacity that would have been expensive to scale, says Branson.

To read the rest of the article,
download the Oct. 7, 2013, issue of InformationWeek.

 

Comments
D. Henschen,
User Rank: Author
10/14/2013 | 3:16:56 PM
re: When NoSQL Makes Sense
I just had a chat with Matt Pfeil, co-founder of DataStax, who took some exception to my classification of four NoSQL database types as detailed in the full (downloaded) article. He says Cassandra really started as a key-value store and had functionality added to handle wide-column use cases. The same is true of most wide-column products, Pfeil said, and the key point is that Cassandra and other wide-column databases can address key-value use cases. In his view, the boundary between key-value and wide-column isn't really important.

I'd make the point that key-value products that don't handle wide-column (more complex data) use cases shouldn't be in the same category. Maybe "wide column" undersells the capability of such databases, but eliminating the distinction would seem to oversell the capabilities of simple key-value stores.
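
The asymmetry in that exchange can be shown with a toy sketch. The two classes below are hypothetical in-memory stand-ins, not the API of Cassandra or of any key-value product, and the "value" column name is an assumption made for the example. A wide-column store can serve a key-value workload by using one column per row, while a plain key-value store sees only an opaque blob and leaves any field-level access to the application.

```python
import json


class KeyValueStore:
    """Toy key-value store: the value is opaque bytes to the store."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]


class WideColumnStore:
    """Toy wide-column store: each row holds individually addressable columns."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, column, value):
        self._rows.setdefault(row_key, {})[column] = value

    def get(self, row_key, column):
        return self._rows[row_key][column]

    def columns(self, row_key):
        return sorted(self._rows[row_key])


# Wide-column store covering a key-value use case: one column per row.
wide = WideColumnStore()
wide.put("session:42", "value", "token-abc")
session_token = wide.get("session:42", "value")

# Wide-column store reading a single column of a richer row.
wide.put("user:42", "name", "Ada")
wide.put("user:42", "city", "London")
city_from_wide = wide.get("user:42", "city")

# Key-value store: the application must fetch and decode the whole blob.
kv = KeyValueStore()
kv.put("user:42", json.dumps({"name": "Ada", "city": "London"}).encode())
city_from_kv = json.loads(kv.get("user:42"))["city"]
```

Both routes return the same city, but only the wide-column sketch lets the store itself address a single column; in the key-value sketch that work moves into application code.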