The InformationWeek -- Blogs
InformationWeek's Analytics Weblog

Topics:   Analytics

Beyond Server Farms In The Sky


Posted by Roger Smith, Jul 24, 2008 06:34 PM

In a previous entry, I wrote about how weeks of outages had forced Twitter, the popular microblogging site, to scale back on service features in an effort to keep its servers from going down.


The outages stemmed from the growing load on the Twitter system, driven by an estimated 3 million tweets a day. Speculating on ways to solve Twitter's performance problems led me to focus on both personnel and technology. On the personnel front, Twitter hired two new operations engineers, John Adams and Rudy Winnacker, who came over from Google, where he had worked as a system administrator for the past five years. On the technological front, the news that Jeff Bezos of Amazon is one of Twitter's new investment partners made me think Twitter might be able to solve its problems by using Amazon's AWS cloud services to handle traffic spikes. Now the news that Amazon's S3 online storage service has itself experienced significant downtime has made me rethink that suggestion.

Twitter runs on MySQL on Red Hat Enterprise Linux, hosted on the NTT Managed Hosting Platform, a managed hosting service that obviously was not designed to handle the current load. Twitter is one of a number of popular Web sites that have been built on the LAMP architecture. LAMP is a stack of simple yet powerful technologies that to this day is behind a lot of popular Web sites: Linux, Apache, MySQL, and Perl. (In Twitter's case, the scripting engine is Ruby on Rails rather than Perl, but Twitter's architecture is still basically LAMP.) The main problem with Twitter is the performance bottleneck caused by the MySQL relational database. Query-based relational database systems just don't scale very well, especially for social-messaging apps like Twitter, where the data is not well suited to partitioning into multiple databases. For example, a popular user like Google search engine optimization guru Matt Cutts can have more than 2,000 Twitter followers, which means each tweet to or from him must be written and rewritten thousands of times. Even though Twitter limits messages to 140 characters, you're still talking about a huge number of SQL queries that have the potential to bottleneck in an RDBMS.
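To make the write amplification concrete, here is a minimal sketch of the "one row per follower" pattern described above. It is not Twitter's actual schema or code: the table names, the `build_demo_db` and `post_tweet` helpers, and the use of an in-memory SQLite database in place of MySQL are all assumptions made purely for illustration.

```python
import sqlite3


def build_demo_db(follower_count):
    """Create an in-memory database with one popular user and N followers.

    The schema is hypothetical; it only illustrates fan-out on write.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE followers (user TEXT, follower TEXT)")
    conn.execute("CREATE TABLE timeline (owner TEXT, author TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO followers VALUES (?, ?)",
        [("popular_user", f"follower_{i}") for i in range(follower_count)],
    )
    conn.commit()
    return conn


def post_tweet(conn, author, body):
    """Copy one message into every follower's timeline; return rows written."""
    if len(body) > 140:
        raise ValueError("message exceeds 140 characters")
    followers = conn.execute(
        "SELECT follower FROM followers WHERE user = ?", (author,)
    ).fetchall()
    # One INSERT per follower: a single short message becomes thousands of writes.
    conn.executemany(
        "INSERT INTO timeline VALUES (?, ?, ?)",
        [(follower, author, body) for (follower,) in followers],
    )
    conn.commit()
    return len(followers)


if __name__ == "__main__":
    db = build_demo_db(2000)
    rows = post_tweet(db, "popular_user", "Scaling is hard.")
    print(f"1 tweet produced {rows} timeline rows")
```

Under these assumptions the cost of a single post grows linearly with the follower count, no matter how short the message is, which is why the 140-character limit does little to relieve the database.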

Cloud computing offers one solution to the RDBMS bottleneck problem, but spreading a system across many machines doesn't mean that it won't fail, as we've seen with the case of Amazon's S3 online storage service. Derided at the moment as "could computing" or "fog computing," cloud computing may be premature, at least for the next few years, as a workable solution for Web sites that require high-availability backup. A more workable solution might be a distributed, fault-tolerant, schema-free, document-oriented database like CouchDB, currently being incubated by the open-source Apache project. Neither a relational nor an object-oriented database, a CouchDB database is a flat collection of uniquely named documents. CouchDB also provides a RESTful HTTP API for reading and updating (adding, editing, deleting) these documents, and it supports incremental replication with bi-directional conflict detection and resolution. A distributed DBMS like CouchDB might be just the ticket to meet the backup needs of high-demand, high-availability social-messaging Web sites like Twitter, at least until a workable cloud computing solution appears on the horizon.
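For readers who haven't seen a document database addressed over HTTP, the sketch below shows roughly what reading and updating a document looks like. It assumes a CouchDB server on localhost at its default port 5984 and a database named `tweets`; the database name, the document fields, and the helper functions are hypothetical. The helpers only build the requests, so nothing touches the network until `send` is called.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:5984"  # assumption: local CouchDB, default port
DB_NAME = "tweets"                  # hypothetical database name


def doc_url(doc_id, rev=None):
    """Each document lives at /<database>/<document id>."""
    url = f"{BASE_URL}/{DB_NAME}/{quote(doc_id, safe='')}"
    if rev is not None:
        url += f"?rev={quote(rev, safe='')}"
    return url


def build_put(doc_id, doc, rev=None):
    """Add a document, or edit one by supplying its current revision."""
    body = dict(doc)
    if rev is not None:
        body["_rev"] = rev  # a stale revision is rejected as a conflict
    return urllib.request.Request(
        doc_url(doc_id),
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def build_get(doc_id):
    """Read a document by its unique name."""
    return urllib.request.Request(doc_url(doc_id), method="GET")


def build_delete(doc_id, rev):
    """Delete a specific revision of a document."""
    return urllib.request.Request(doc_url(doc_id, rev), method="DELETE")


def send(request):
    """Perform the request against a running server and decode the JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running and the database already created, `send(build_put("tweet-1", {"user": "alice", "text": "hello"}))` would store a document with no schema declared in advance. The revision passed on edits and deletes is what lets the server detect conflicting updates.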
