Apache's Cassandra Adds Column Data Analysis

A keynote at the Cassandra Summit outlined features in the 0.7 release of the NoSQL database system, notably support for secondary indexes.
Cassandra Summit attendees came from throughout the United States, as well as from Japan, Switzerland, and Australia, Ellis said.

Cassandra was downloaded 14,000 times during the month of July from its distribution server at the Apache Foundation. Four thousand people a day visit the project.

Cassandra is one of a growing set of so-called NoSQL systems that, like relational databases, organize data into rows and tables but dispense with two-phase commits and other transaction guarantees found in relational databases. To handle masses of website data efficiently, a NoSQL system tolerates slight delays in propagating updates that a relational system would have to apply all at once. Thus, two users issuing the same query at the same time might get slightly different answers while Cassandra catches up on data updates.
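The catch-up behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Cassandra's implementation: the names ToyCluster, write, read, and catch_up are hypothetical, and the model simply applies each write to one replica at once and to the others later.

```python
class ToyCluster:
    """Hypothetical in-memory model of delayed update propagation."""

    def __init__(self, replica_count=3):
        # Each replica is a plain dict holding its own view of the data.
        self.replicas = [{} for _ in range(replica_count)]
        # Updates that have not yet reached a replica: (index, key, value).
        self.pending = []

    def write(self, key, value):
        # Assumption: the first replica sees the write immediately,
        # while the remaining replicas receive it only on catch_up().
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key, replica_index):
        # A reader is answered by whichever replica it happens to reach.
        return self.replicas[replica_index].get(key)

    def catch_up(self):
        # Apply every delayed update in order, then clear the backlog.
        for index, key, value in self.pending:
            self.replicas[index][key] = value
        self.pending.clear()


cluster = ToyCluster()
cluster.write("user:1", "v1")
cluster.catch_up()

cluster.write("user:1", "v2")
fresh = cluster.read("user:1", replica_index=0)  # sees the new value
stale = cluster.read("user:1", replica_index=2)  # still sees the old value

cluster.catch_up()
converged = cluster.read("user:1", replica_index=2)
```

In this sketch the two readers disagree only until catch_up() runs, after which every replica returns the same value.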

NoSQL systems try to build in high reliability for operations on a server cluster by creating copies of data on three different nodes. If a piece of hardware fails, an original and a backup copy remain. Ellis is working on a Hinted Handoff feature in Cassandra that allows processing of data to continue while a node is temporarily absent from the cluster. A fourth copy is created, with a thread directed to the missing node telling it to update its data set when it comes back online. When the node reappears, the fourth copy is deleted and the system proceeds with three copies, as before.
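The handoff idea described above can be sketched in a few lines. This is a simplified model rather than Cassandra's actual code: ToyRing, Node, write, and rejoin are hypothetical names, and the sketch assumes the standby node that holds the extra copy is itself available.

```python
class Node:
    """Hypothetical cluster node with its own data and held hints."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}
        # Extra copies held for absent nodes: (target_name, key, value).
        self.hints = []


class ToyRing:
    def __init__(self, names):
        self.nodes = {name: Node(name) for name in names}

    def write(self, key, value, replicas, standby):
        # Assumption: the caller names the replica nodes and one standby.
        for name in replicas:
            node = self.nodes[name]
            if node.up:
                node.data[key] = value
            else:
                # The replica is absent: keep an extra copy on the standby,
                # addressed to the missing node.
                self.nodes[standby].hints.append((name, key, value))

    def rejoin(self, name):
        # The node returns: replay the copies held for it, then drop them.
        node = self.nodes[name]
        node.up = True
        for holder in self.nodes.values():
            kept = []
            for target, key, value in holder.hints:
                if target == name:
                    node.data[key] = value
                else:
                    kept.append((target, key, value))
            holder.hints = kept


ring = ToyRing(["a", "b", "c", "d"])
ring.nodes["c"].up = False  # node c drops out of the cluster
ring.write("row1", "value1", replicas=["a", "b", "c"], standby="d")
hints_while_down = len(ring.nodes["d"].hints)

ring.rejoin("c")  # node c comes back and receives the missed write
```

The write succeeds while node c is away, and once c rejoins the extra copy on d is discarded, leaving the usual three copies.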

"The discussion was pretty technical," said Ellis in a follow-up e-mail message, "but so was the audience."

He said the beta release of Cassandra 0.7 will be tested by users for a month and then, with revisions, become the final release. The target date for Cassandra 1.0 is still undecided, he added.

In his introduction, Boebel said Ellis organized the original Cassandra project while at Rackspace and built the community around it. When Ellis and fellow Rackspace employee Matt Pfeil left the company to found Riptano, a Cassandra support firm, Rackspace backed the move by investing in the new venture. Riptano supplies consulting and technical support for Cassandra. Ellis is CTO and Pfeil is CEO of Riptano.

"Riptano is doing very well -- we're up to 11 employees now, mostly engineers. So far we've basically been riding the coattails of Cassandra's success," said Ellis.
