Twitter Drops MySQL For Cassandra - InformationWeek

The social networking site opts for the open source Cassandra data management system in what's becoming a not-uncommon move.

He went on to explain that where relational systems rely on strictly defined tables with a set number of columns per row, Cassandra can effortlessly expand the number of rows, or data items being grouped together in the system.

Cassandra can likewise expand across a server cluster without application programmer intervention. More hardware can be brought online, and Cassandra can activate itself on a new node, calling out to the cluster load balancer for work to do.

On the other hand, Cassandra doesn't do joins, where related information from multiple tables is brought together into a single result in response to an SQL query. It doesn't guarantee referential integrity, where a record that refers to another record is known to point at one that actually exists. It also can't process transactions the way relational systems do, with a guarantee that each transaction will either be completed or discarded, Ellis concedes.
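To make the missing join concrete, here is a minimal Python sketch of the work an application would have to do for itself when the data store will not combine related records. The `users` and `tweets` data and the `join_in_application` function are hypothetical names invented for illustration; this is not Cassandra's API.

```python
# Hypothetical data: two collections that a relational system would hold
# as separate tables and combine with a JOIN.
users = {1: {"name": "alice"}, 2: {"name": "bob"}}
tweets = [
    {"user_id": 1, "text": "hello"},
    {"user_id": 2, "text": "hi"},
    {"user_id": 1, "text": "again"},
]


def join_in_application(users, tweets):
    """Pair each tweet with its author's name, as a SQL join would."""
    joined = []
    for tweet in tweets:
        author = users.get(tweet["user_id"])
        if author is None:
            # Nothing in the store guarantees the referenced user exists,
            # so the application has to decide what to do here.
            continue
        joined.append({"name": author["name"], "text": tweet["text"]})
    return joined


result = join_in_application(users, tweets)
```

The lookup loop is trivial at this size; the point is only that the responsibility moves from the database to the application code.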

Since big Web systems need to assimilate large masses of data and make it available, frequently in read-only fashion, systems like Cassandra concentrate on more immediate goals than the pristine data handling rules of relational systems. "Most Web applications do more reads than writes," said Ellis, which changes a key priority in a big online data handling system.

Relational databases with big jobs tend to be put on one piece of big, expensive hardware, such as an IBM mainframe or UltraSPARC Unix server, because it's hard to run SQL systems on large clusters. Huge amounts of overhead are generated as the data is divided up and the systems constantly check with each other on the integrity of the data being used.

Cassandra, on the other hand, takes to clusters like guppies to water. You can add more machines to a cluster running Cassandra without disrupting its operation, and soon the work is spread over a larger base, Ellis said.
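The article does not say how the work gets respread when a machine joins. One well-known technique for this kind of growth is consistent hashing, sketched below in Python as an assumption rather than as Cassandra's actual code. The `HashRing` class, the node names, and the `user:N` keys are all hypothetical.

```python
import bisect
import hashlib


def ring_position(key):
    """Map a string to a stable integer position on the ring."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with one position per node."""

    def __init__(self):
        self._positions = []  # sorted ring positions
        self._nodes = {}      # ring position -> node name

    def add_node(self, name):
        pos = ring_position(name)
        bisect.insort(self._positions, pos)
        self._nodes[pos] = name

    def node_for(self, key):
        # A key belongs to the first node at or after its position,
        # wrapping around to the start of the ring.
        pos = ring_position(key)
        idx = bisect.bisect_left(self._positions, pos) % len(self._positions)
        return self._nodes[self._positions[idx]]


ring = HashRing()
for name in ("node-a", "node-b", "node-c"):
    ring.add_node(name)

keys = ["user:%d" % i for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")  # bring a fourth machine into the cluster
after = {k: ring.node_for(k) for k in keys}

# Keys that changed owner when the new node joined.
moved = [k for k in keys if before[k] != after[k]]
```

In this scheme every key in `moved` ends up on the new node, and all other keys stay where they were, which is why a machine can join without reshuffling the whole data set.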

That's why systems like Cassandra are gaining currency with Twitter and other social networking sites that deal with millions of users and masses of data. Yahoo, for example, uses a cluster of 4,000 servers to run Hadoop to index the results of a comprehensive Web crawl. The job still takes 73 hours, but it would take a lot longer if done by conventional relational database means.

Eric Evans, a co-worker of Ellis' at Rackspace in San Antonio, Texas, a managed service and cloud service provider, came up with the name "NoSQL," and Ellis hopes it continues to be used for Cassandra-type systems.

Social applications in the cloud may suddenly expand the amount of information they are collecting per user, as Facebook and Twitter frequently do. With a traditional database, that would mean a DBA would have to stop the database system, redefine the tables to add more columns and restart to collect the data.

With Cassandra, "You don't need to worry about a set number of rows. You can add hundreds, thousands or millions of new rows. It lets you have as many rows as you want," said Ellis.
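A toy Python model can illustrate what collecting new information without redefining a table looks like. It is an in-memory stand-in, not Cassandra's API; the `store` dictionary, the `insert` helper, and the sample fields are hypothetical.

```python
from collections import defaultdict

# Toy stand-in for a schemaless store: row key -> {field name: value}.
# No table definition fixes which fields a row may carry.
store = defaultdict(dict)


def insert(row_key, field, value):
    """Write one value; rows and fields are created on first use."""
    store[row_key][field] = value


insert("alice", "bio", "Likes databases")
insert("alice", "location", "San Antonio")
insert("bob", "bio", "New here")

# Later the application starts collecting something new for one user.
# Nothing is stopped, redefined, or restarted.
insert("bob", "favorite_color", "green")

fields_per_row = {key: sorted(fields) for key, fields in store.items()}
```

Here `alice` and `bob` end up with different sets of fields, the situation that would force a table redefinition in the traditional database described above.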
