The social networking site opts for the open source Cassandra data management system in what's becoming a not-uncommon move.
Guess what? Cassandra is going to tweet. The open source Cassandra data management system is going to replace the MySQL database system at Twitter, the latest of several MySQL replacements at social networking sites, according to Ryan King, a software engineer at Twitter.
Facebook and Digg, which used to rely on the open source MySQL database system, now part of Oracle, have already made the switch.
Cassandra can be run on large server clusters and is capable of taking in very large amounts of data at a time, performing sorts and calling up relevant data quickly. It's an example of the new types of data handling systems that are powering large Web applications, particularly social networking sites, which deal with hundreds of thousands or millions of users.
The implementers of Cassandra and other cloud-based systems, which include Hadoop, Google's Big Table, MemCacheDB, Voldemort, CouchDB and MongoDB, are often referred to as the NoSQL movement. Their proponents consider traditional relational databases, which use the SQL data access language, unsuitable for the superlarge tasks that confront them.
King is quoted in an interview posted last week on the MyNoSQL blog stating that Twitter wanted a system that could keep up with its growth, as tweets have gone from 2 million a day to over 50 million in 2009.
"We have a lot of data. The growth factor in that data is huge and the rate of growth is accelerating," he said in the blog posting.
King also said in the interview that Twitter sought a system that had no single point of failure, could execute highly scalable writes, and had a healthy open source community behind it.
Cassandra is an Apache Software Foundation project that originally came out of Facebook, which created it to manage its masses of data. The project recently moved out of first-year, incubator status at Apache to full project status and has an active developer group.
Jonathan Ellis is chair of the Cassandra project management committee at Apache, in effect the project's general manager. In an interview Friday, he said an application can load Cassandra with data from a relational database, and Cassandra will work with that data as well as data from other sources. The implementers of "NoSQL" style systems don't necessarily rule out working with Oracle, IBM's DB2, MySQL, Sybase or Microsoft's SQL Server.
At the same time, Ellis wants to keep the rebellious-sounding NoSQL name meaning something distinct from a SQL-based database system. Its hard sound has been explained away by some as meaning "Not only SQL." But Ellis says nothing doing. "It has a combative connotation" and that's appropriate. "It's catchy. People remember it. It's bringing attention to the way you don't have to keep doing things the way relational databases have dictated" for 30 years now.