Commentary
NoSQL Needed For Cloud-Sized Data
At the Under the Radar showcase for cloud start-ups, I was struck by how the relational database, one of the defining technologies of a previous era, has become outmoded in this one. In example after example, it was obvious that SQL and structured data tables are no longer the right way to go about handling data.
That statement has to do with a particular type of data, the kind that gets generated copiously in a day's activity on the Internet. Each day sees 15 million tweets, 60 million Facebook updates and 1.6 billion people active online in a variety of other ways. It's hard for relational systems to keep up. They have to work hard at decomposing this data, storing it in tables and building indexes on it. They work so hard at it that you don't really want your system to undertake the task. It's too expensive.
"When you scale up relational systems, you introduce single points of failure... You lose the advantage of their precision but you gain the overhead," as you try to make the system work on a larger and larger data set, said John Quinn, VP of engineering at Digg, the social networking site, and lead-off speaker at Under the Radar's cloud event April 16 on the Microsoft campus in Mountain View, Calif.
Those NoSQL systems you've been hearing about, on the other hand, scale out by distributing their operations across more nodes in a server cluster. "There's nothing wrong with relational database… You just need to use the right tool for the right job," Quinn said, throwing in the fact that NoSQL stands for "Not Only SQL," although there were a few knowing smiles at that one.
Quinn is a leading member of the generation that doesn't want to try to capture terabytes of data with relational systems. At Digg, he prompted the changeover from MySQL, the open source relational database, to Cassandra, a key-value store system. Cassandra performs many of the data sorting operations of a relational database but allows data reads to be done before an update has been fully applied. The practice sometimes leads to momentary consistency problems, since one user of the data might get a version that differs slightly from the next user's, although both sought identical sets.
The large, distributed key-value store system "sacrifices consistency to slave lag," meaning it tolerates the lapse between when an update occurs on a distributed node and when it's replicated on other servers. In most NoSQL systems, assured consistency is less of an issue, and less of a virtue, than in relational systems.
The NoSQL approach allows "tune-able consistency. You can trade off consistency for speed," Quinn noted.
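To make the trade-off concrete, here is a minimal sketch of tunable consistency in Python. It is a toy model, not Cassandra's actual API: the `TunableStore` class, its `w` and `r` parameters and the version counter are all assumptions made for illustration. A write is applied to only `w` of the replicas, and a read consults only `r` of them.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One node's copy of the data: key -> (version, value)."""
    data: dict = field(default_factory=dict)


class TunableStore:
    """Toy store (hypothetical) with N replicas and per-request consistency."""

    def __init__(self, n_replicas: int = 3):
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.version = 0  # increases by one on every write

    def write(self, key, value, w: int):
        """Apply the write to the first `w` replicas only.

        The remaining replicas keep their old copy, which stands in
        for replication lag in this simplified model.
        """
        self.version += 1
        for replica in self.replicas[:w]:
            replica.data[key] = (self.version, value)

    def read(self, key, r: int):
        """Ask the last `r` replicas and return the newest value they hold."""
        answers = [rep.data[key] for rep in self.replicas[-r:] if key in rep.data]
        if not answers:
            return None
        # Pick the answer with the highest version number.
        return max(answers, key=lambda answer: answer[0])[1]
```

In this sketch, a write with `w=1` followed by a read with `r=1` returns quickly but can hand back a stale value, because the replica that was read never saw the write. When `r + w` is greater than the number of replicas, the read set and write set overlap on at least one node, so the read sees the latest write at the cost of waiting on more nodes.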
Because a server in a NoSQL system automatically creates duplicates of the data on at least one other node, a server in the cluster can fail without any data being lost; the NoSQL system keeps processing and the application keeps running. Besides Cassandra, the NoSQL systems in the public arena include MongoDB, Voldemort, and CouchDB. Google and Amazon operate their own internally.
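The replication behavior described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any of the named systems is implemented: the `Cluster` class, its `copies` parameter and the hash-based placement of keys are hypothetical choices made for the example.

```python
import zlib


class Cluster:
    """Toy cluster (hypothetical) that stores each key on `copies` nodes."""

    def __init__(self, node_names, copies: int = 2):
        self.order = list(node_names)
        self.nodes = {name: {} for name in self.order}
        self.alive = set(self.order)
        self.copies = copies

    def _owners(self, key: str):
        """Pick a starting node by hashing the key, then its neighbors."""
        start = zlib.crc32(key.encode("utf-8")) % len(self.order)
        return [self.order[(start + i) % len(self.order)]
                for i in range(self.copies)]

    def put(self, key: str, value):
        # Write the value to every live owner of the key.
        for name in self._owners(key):
            if name in self.alive:
                self.nodes[name][key] = value

    def get(self, key: str):
        # Return the value from the first live owner that has it.
        for name in self._owners(key):
            if name in self.alive and key in self.nodes[name]:
                return self.nodes[name][key]
        raise KeyError(key)

    def fail(self, name: str):
        """Simulate a server failure: the node stops answering requests."""
        self.alive.discard(name)
```

With `copies=2`, failing the first owner of a key still leaves one live copy, so `get` keeps answering. A real system would also have to re-replicate the data onto a new node after the failure, which this sketch leaves out.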
Quinn did implicitly point to a potential NoSQL shortcoming. Indexes are associated with relational systems, and if you do need one with NoSQL, you may need an external system to build it. So far, the NoSQL systems have only rudimentary indexing.
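As a rough illustration of what building an index outside the store can involve, the Python sketch below wraps a key-value store with a separately maintained secondary index. Both classes and the `find_by_field` method are hypothetical names chosen for the example; they do not correspond to any particular product's interface.

```python
from collections import defaultdict


class KeyValueStore:
    """Stand-in for a key-value store: lookups work by primary key only."""

    def __init__(self):
        self._rows = {}

    def put(self, key, row):
        self._rows[key] = row

    def get(self, key):
        return self._rows.get(key)


class IndexedStore:
    """Wraps the store and keeps a secondary index outside of it."""

    def __init__(self, store, field_name):
        self.store = store
        self.field_name = field_name
        self._index = defaultdict(set)  # field value -> set of primary keys

    def put(self, key, row):
        old = self.store.get(key)
        if old is not None:
            # Drop the stale index entry before indexing the new row.
            self._index[old[self.field_name]].discard(key)
        self.store.put(key, row)
        self._index[row[self.field_name]].add(key)

    def find_by_field(self, value):
        """Return all rows whose indexed field equals `value`."""
        keys = sorted(self._index.get(value, ()))
        return [self.store.get(key) for key in keys]
```

The cost of this design is that every write touches two structures, and the application, rather than the database, is responsible for keeping them in step.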
Trade-offs like these are why the NoSQL enthusiasts say their systems are not for financial or other time-sensitive transactions. Relational systems are. On the other hand, if you're updating your Zynga Farmville plot, then Cassandra makes a lot of sense for capturing that information.
Of the 24 companies presenting at this event, six had big data handling, analytics or storage systems in mind. They included Sones, Cloudant, GenieDB, GoodData, neotechnology and Maxiscale.
Each start-up presented its business and product plans in six minutes at the event, then faced questioning from a three-judge panel of reviewers.