Beyond Server Farms In The Sky

In a previous entry, I wrote about how weeks of outages had forced Twitter, the popular microblogging site, to scale back on service features in an effort to keep its servers from going down.
In a previous entry, I wrote about how weeks of outages had forced Twitter, the popular microblogging site, to scale back on service features in an effort to keep its servers from going down.The outages were caused by increased demand on the Twitter system caused by the estimated 3 million daily Tweet messages. Speculating on ways to solve Twitter performance problems lead me to focus on both personnel and technology, including Twitter's hiring of two new operations engineers, John Adams and Rudy Winnacker, who came over from Google, where he has worked as a system administrator for the past 5 years. On the technological front, the news that Jeff Bezos of Amazon is one of Twitter's new investment partners made me think Twitter might be able to solve its problems by using Amazon's AWS Cloud services to handle traffic spikes. Now the news, that Amazon's S3 online storage service has itself experienced significant downtime has made me rethink that suggestion.

Twitter runs on MySQL on Red Hat Enterprise Linux, on a managed hosting service NTT Managed Hosting Platform, which obviously was not designed to handle the current load. Twitter is one of a number of popular Web sites that have been built on the LAMP architecture. LAMP is a stack of simple, yet powerful technologies that to this day is behind a lot of popular Web sites: Linux, Apache, MySQL, and Perl. (In Twitters' case, the scripting engine is Ruby on Rails rather than Perl, but Twitter's architecture is still basically LAMP). The main problem with Twitter is the performance bottleneck caused by the MySQL relational database. Query-based relational database system just don't scale very well, especially for social-messaging apps like Twitter where the data is not that suitable to partitioning into multiple databases. For example, a popular user like Google Search Engine Optimization guru Matt Cutts can have more than 2,000 Twitter followers, which means each tweet to or from him must be written and rewritten thousands of times. In spite of the fact that Twitter limits messages to 140 characters, you're still talking about a huge number of SQL queries that have the potential to bottleneck in a RDBMS.

Cloud computing offers one solution to the RDBMS bottleneck problem, but just because you can have multiple points of failure doesn't mean that your system won't fail, as we've seen with the case of Amazon's S3 online storage service. Derided at the moment as "could computing" or "fog computing," a cloud computing solution may be premature -- at least for the next few years -- as a workable solution for websites that require high-availability backup. A more workable solution might be a distributed, fault-tolerant and schema-free document-oriented database like CouchDB, currently being incubated by the open-source Apache project. Neither a relational or object-oriented database, a CouchDB database is a flat collection of uniquely named documents. CouchDB also provides a RESTful HTTP API for reading and updating (add, edit, delete) these database documents in addition to supporting incremental replication with bi-directional conflict detection and resolution. A distributed DBMS like CouchDB might just be the ticket to meet the backup needs of high-demand, high-availability social-messaging websites like Twitter, at least until a workable cloud computing solution appears on the horizon.

Editor's Choice
Samuel Greengard, Contributing Reporter
Cynthia Harvey, Freelance Journalist, InformationWeek
Carrie Pallardy, Contributing Reporter
John Edwards, Technology Journalist & Author
Astrid Gobardhan, Data Privacy Officer, VFS Global
Sara Peters, Editor-in-Chief, InformationWeek / Network Computing