The InformationWeek -- Blogs

InformationWeek's Analytics Weblog

Topics:   Analytics


Beyond Server Farms In The Sky


Posted by Roger Smith, Jul 24, 2008 06:34 PM

In a previous entry, I wrote about how weeks of outages had forced Twitter, the popular microblogging site, to scale back on service features in an effort to keep its servers from going down.


The outages resulted from increased demand on the Twitter system, driven by an estimated 3 million tweets a day. Speculating on ways to solve Twitter's performance problems led me to focus on both personnel and technology, including Twitter's hiring of two new operations engineers, John Adams and Rudy Winnacker, who came over from Google, where he had worked as a system administrator for the past 5 years. On the technological front, the news that Jeff Bezos of Amazon is one of Twitter's new investment partners made me think Twitter might be able to solve its problems by using Amazon's AWS cloud services to handle traffic spikes. Now the news that Amazon's S3 online storage service has itself experienced significant downtime has made me rethink that suggestion.

Twitter runs on MySQL on Red Hat Enterprise Linux, on a managed hosting service, the NTT Managed Hosting Platform, which obviously was not designed to handle the current load. Twitter is one of a number of popular Web sites that have been built on the LAMP architecture. LAMP is a stack of simple yet powerful technologies that to this day is behind a lot of popular Web sites: Linux, Apache, MySQL, and Perl. (In Twitter's case, the scripting engine is Ruby on Rails rather than Perl, but Twitter's architecture is still basically LAMP.) The main problem with Twitter is the performance bottleneck caused by the MySQL relational database. Query-based relational database systems just don't scale very well, especially for social-messaging apps like Twitter, where the data is not well suited to partitioning across multiple databases. For example, a popular user like Google search engine optimization guru Matt Cutts can have more than 2,000 Twitter followers, which means each tweet to or from him must be written and rewritten thousands of times. Even though Twitter limits messages to 140 characters, that is still a huge number of SQL queries with the potential to bottleneck in an RDBMS.
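To make the write amplification concrete, here is a minimal sketch of what copying one tweet into every follower's timeline looks like in SQL. It is not Twitter's actual schema or code: the `followers` and `timelines` tables and both function names are hypothetical, and Python's built-in `sqlite3` stands in for MySQL so the example runs anywhere.

```python
import sqlite3


def build_demo_db(author_id=1, follower_count=2000):
    """Create an in-memory database with one author and N followers.

    The schema is hypothetical and exists only for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE followers (author_id INTEGER, follower_id INTEGER)")
    conn.execute(
        "CREATE TABLE timelines (owner_id INTEGER, author_id INTEGER, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO followers VALUES (?, ?)",
        [(author_id, author_id + 1 + i) for i in range(follower_count)],
    )
    conn.commit()
    return conn


def fan_out_tweet(conn, author_id, text):
    """Copy one tweet into each follower's timeline; return rows written."""
    followers = conn.execute(
        "SELECT follower_id FROM followers WHERE author_id = ?", (author_id,)
    ).fetchall()
    conn.executemany(
        "INSERT INTO timelines (owner_id, author_id, text) VALUES (?, ?, ?)",
        [(row[0], author_id, text) for row in followers],
    )
    conn.commit()
    return len(followers)


if __name__ == "__main__":
    db = build_demo_db()
    rows = fan_out_tweet(db, 1, "One short message, at most 140 characters")
    print(f"1 tweet became {rows} row writes")
```

Under these assumptions, a single short message from a user with 2,000 followers turns into 2,000 inserts, and the cost grows with the follower count rather than with the length of the message.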

Cloud computing offers one solution to the RDBMS bottleneck problem, but just because your system has no single point of failure doesn't mean that it won't fail, as we've seen in the case of Amazon's S3 online storage service. Derided at the moment as "could computing" or "fog computing," cloud computing may be premature, at least for the next few years, as a workable solution for websites that require high-availability backup. A more workable solution might be a distributed, fault-tolerant, schema-free, document-oriented database like CouchDB, an open-source project currently in incubation at the Apache Software Foundation. Neither a relational nor an object-oriented database, a CouchDB database is a flat collection of uniquely named documents. CouchDB also provides a RESTful HTTP API for reading and updating (add, edit, delete) these database documents, in addition to supporting incremental replication with bi-directional conflict detection and resolution. A distributed DBMS like CouchDB might be just the ticket to meet the backup needs of high-demand, high-availability social-messaging websites like Twitter, at least until a workable cloud computing solution appears on the horizon.
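For readers who have not seen a document database's HTTP interface, the sketch below builds the kind of requests CouchDB's RESTful API expects: PUT to add or edit a uniquely named document, GET to read it, and DELETE to remove it. Several things here are assumptions rather than anything from Twitter or this post: the server address (a local install on CouchDB's default port, 5984), the `tweets` database name, and the helper function names. The functions only construct requests, so nothing is sent until you pass one to `urllib.request.urlopen`, and details may differ between CouchDB versions.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:5984"  # assumption: local CouchDB, default port
DB_NAME = "tweets"                  # hypothetical database name


def doc_url(doc_id):
    """URL of a single, uniquely named document."""
    return f"{BASE_URL}/{DB_NAME}/{quote(doc_id, safe='')}"


def put_request(doc_id, doc, rev=None):
    """Build a PUT that adds a document, or edits it when rev is supplied.

    CouchDB expects the current revision (_rev) when updating an
    existing document, which is how it detects conflicting edits.
    """
    body = dict(doc)
    if rev is not None:
        body["_rev"] = rev
    return urllib.request.Request(
        doc_url(doc_id),
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def get_request(doc_id):
    """Build a GET that reads a document."""
    return urllib.request.Request(doc_url(doc_id), method="GET")


def delete_request(doc_id, rev):
    """Build a DELETE for a specific revision of a document."""
    url = f"{doc_url(doc_id)}?rev={quote(rev, safe='')}"
    return urllib.request.Request(url, method="DELETE")


if __name__ == "__main__":
    req = put_request("tweet-0001", {"user": "example", "text": "hello"})
    print(req.get_method(), req.full_url)
    # To actually send it against a running server:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Because each document lives at its own URL and carries its own revision, there is no schema to migrate and no join to run, which is the property that makes this style of database easier to replicate across machines.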




