Commentary

Roger Smith
 

Beyond Server Farms In The Sky

In a previous entry, I wrote about how weeks of outages had forced Twitter, the popular microblogging site, to scale back on service features in an effort to keep its servers from going down.

The outages were caused by increased demand on the Twitter system from an estimated 3 million tweets a day. Speculating on ways to solve Twitter's performance problems led me to focus on both personnel and technology, including Twitter's hiring of two new operations engineers, John Adams and Rudy Winnacker, who came over from Google, where he worked as a system administrator for the past five years. On the technological front, the news that Jeff Bezos of Amazon is one of Twitter's new investment partners made me think Twitter might be able to solve its problems by using Amazon's AWS cloud services to handle traffic spikes. Now the news that Amazon's S3 online storage service has itself experienced significant downtime has made me rethink that suggestion.

Twitter runs on MySQL on Red Hat Enterprise Linux, hosted on a managed service, NTT Managed Hosting Platform, which obviously was not designed to handle the current load. Twitter is one of a number of popular Web sites that have been built on the LAMP architecture. LAMP is a stack of simple yet powerful technologies that to this day is behind a lot of popular Web sites: Linux, Apache, MySQL, and Perl. (In Twitter's case, the scripting engine is Ruby on Rails rather than Perl, but Twitter's architecture is still basically LAMP.) The main problem with Twitter is the performance bottleneck caused by the MySQL relational database. Query-based relational database systems just don't scale very well, especially for social-messaging apps like Twitter, where the data is not well suited to partitioning into multiple databases. For example, a popular user like Google search engine optimization guru Matt Cutts can have more than 2,000 Twitter followers, which means each tweet to or from him must be written and rewritten thousands of times. Even though Twitter limits messages to 140 characters, you're still talking about a huge number of SQL queries that have the potential to bottleneck in an RDBMS.
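To put rough numbers on that write amplification, here is a minimal back-of-the-envelope sketch. It assumes a fan-out-on-write design in which every tweet is copied once per follower plus once for the author, which is an assumption for illustration rather than a documented detail of Twitter's schema. The function names and the average follower count of 100 are hypothetical; only the 2,000-follower and 3-million-tweets-a-day figures come from the text above.

```python
def timeline_writes(followers: int) -> int:
    """Rows written for one tweet under the assumed fan-out-on-write model:
    one copy per follower timeline, plus the author's own copy."""
    return followers + 1


def daily_writes(tweets_per_day: int, avg_followers: int) -> int:
    """Total rows written per day if every tweet fans out to avg_followers."""
    return tweets_per_day * timeline_writes(avg_followers)


if __name__ == "__main__":
    # A single tweet from a user with 2,000 followers (figure from the article).
    print(timeline_writes(2000))
    # 3 million tweets a day (from the article) at a hypothetical
    # average of 100 followers per author.
    print(daily_writes(3_000_000, 100))
```

Even at a modest assumed average, the daily write count runs into the hundreds of millions, which is why the 140-character limit does little to relieve the database.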



Cloud computing offers one solution to the RDBMS bottleneck problem, but just because you can have multiple points of failure doesn't mean that your system won't fail, as we've seen in the case of Amazon's S3 online storage service. Derided at the moment as "could computing" or "fog computing," cloud computing may be premature, at least for the next few years, as a workable solution for Web sites that require high-availability backup. A more workable solution might be a distributed, fault-tolerant, schema-free, document-oriented database like CouchDB, currently being incubated by the open-source Apache project. Neither a relational nor an object-oriented database, a CouchDB database is a flat collection of uniquely named documents. CouchDB also provides a RESTful HTTP API for reading and updating (adding, editing, deleting) these database documents, and it supports incremental replication with bi-directional conflict detection and resolution. A distributed DBMS like CouchDB might be just the ticket to meet the backup needs of high-demand, high-availability social-messaging Web sites like Twitter, at least until a workable cloud computing solution appears on the horizon.
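As a rough illustration of what that RESTful document API looks like from a client, the sketch below builds (but does not send) the HTTP requests for creating, reading, and deleting a document. It assumes a CouchDB server at localhost on port 5984, a hypothetical database named tweets, and URL-safe document IDs; the helper function names are invented for this example and are not part of CouchDB itself.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:5984"  # assumed local CouchDB instance
DB = "tweets"                       # hypothetical database name


def put_document(doc_id: str, doc: dict) -> Request:
    """Build a PUT request that creates or updates a named document.
    Updating an existing document would also require its current _rev."""
    body = json.dumps(doc).encode("utf-8")
    return Request(
        f"{BASE_URL}/{DB}/{doc_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def get_document(doc_id: str) -> Request:
    """Build a GET request that reads a document by its unique name."""
    return Request(f"{BASE_URL}/{DB}/{doc_id}", method="GET")


def delete_document(doc_id: str, rev: str) -> Request:
    """Build a DELETE request; the revision being deleted goes in ?rev=."""
    return Request(f"{BASE_URL}/{DB}/{doc_id}?rev={rev}", method="DELETE")


if __name__ == "__main__":
    req = put_document("tweet-1", {"user": "example", "text": "hello"})
    print(req.get_method(), req.full_url)
    # Passing req to urllib.request.urlopen() would send it to a running server.
```

Because each document is addressed by its own URL and carries its own revision, there is no shared schema or cross-table join to coordinate, which is the property that makes replication across nodes simpler than with a relational store.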

