2/8/2011
05:48 PM

Google Spills Megastore's Secrets

A recently published paper sheds light on an important but seldom discussed Google storage system.

Google's success owes a lot to its computing infrastructure. The company's engineers have developed and deployed innovations such as MapReduce, a way to process large data sets; Bigtable, a distributed storage system; Sawzall, an interpreted programming language for analyzing large distributed data sets; the Google File System, a distributed file system; and Google Workqueue, a system for scheduling batch jobs across clusters.

To this list, add Megastore, the storage system that supports Google App Engine, among other applications. Megastore has been in use at Google for several years. It was discussed at the SIGMOD 2008 conference, but detailed information about the technology was published only recently, in conjunction with last month's Conference on Innovative Data Systems Research (CIDR).

The paper detailing the technology, "Megastore: Providing Scalable, Highly Available Storage for Interactive Services," describes a storage system tailored to modern interactive online services.

"Megastore blends the scalability of a NoSQL datastore with the convenience of a traditional RDBMS in a novel way, and provides both strong consistency guarantees and high availability," the paper states. "We provide fully serializable ACID semantics within fine-grained partitions of data."

Web applications today, the paper says, have to be highly scalable, have to compete for users through rapid development, have to be responsive in terms of latency, have to provide users with data consistently -- no spreadsheets vanishing into the cloud -- and have to be available at all times.

"These requirements are in conflict," the paper states. "Relational databases provide a rich set of features for easily building applications, but they are difficult to scale to hundreds of millions of users. NoSQL datastores such as Google's Bigtable, Apache Hadoop's HBase, or Facebook's Cassandra are highly scalable, but their limited API and loose consistency models complicate application development. Replicating data across distant data centers while providing low latency is challenging, as is guaranteeing a consistent view of replicated data, especially during faults."

Having dismissed traditional relational database management systems (RDBMSes) and open source databases like MySQL, the paper also knocks "expensive commercial database systems like Oracle [which] significantly increase the total cost of ownership in large deployments in the cloud."

Megastore is designed to replicate each write synchronously across a wide-area network with reasonable latency, and to support graceful failover between data centers. In doing so, it aims to strike a middle ground between the scalability of NoSQL databases and the convenience of a traditional RDBMS.
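To see why synchronous wide-area replication puts a floor under write latency, consider the rough sketch below. It assumes a write commits once a majority of replicas have acknowledged it. That rule, the function name, and the round-trip times are illustrative assumptions, not details taken from the article.

```python
def commit_latency_ms(replica_rtts_ms):
    """Time until a majority of replicas have acknowledged a write.

    replica_rtts_ms: round-trip times (ms) from the writer to each replica,
    including the local one. The majority rule is an assumption made for
    illustration only.
    """
    if not replica_rtts_ms:
        raise ValueError("need at least one replica")
    majority = len(replica_rtts_ms) // 2 + 1
    # The write commits when the majority-th fastest acknowledgement arrives.
    return sorted(replica_rtts_ms)[majority - 1]


# Hypothetical round-trip times (ms) to three data centers.
rtts = {"local": 2, "nearby": 40, "distant": 150}
print(commit_latency_ms(list(rtts.values())))  # 40
```

Under this model, the slowest replica does not set the pace, but the distance to the second-closest data center does, which is one way to picture why write latency grows with data center spread.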

James Hamilton, a VP and distinguished engineer at Amazon.com, has noted the limited public information about Megastore in several personal blog posts over the years and expressed qualified admiration for the technology when Google's paper was published. "Supporting consistent read and full ACID update semantics is impressive although the limitation of not being able to update an entity group at more than a 'few per second' is limiting," he wrote.

The paper states that over 100 production applications use Megastore as their storage service and that most of Google's customers see availability of 99.999% or higher for these applications. The average read latency for these applications is in the tens of milliseconds range and average write latency ranges from 100 to 400 milliseconds, depending on data center distance and the size of the write operation.
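For a sense of scale, "five nines" of availability can be converted into a yearly downtime budget. The short calculation below is a back-of-the-envelope sketch. The function name is made up for illustration, and it assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability):
    """Maximum yearly downtime (minutes) implied by an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


print(round(downtime_minutes_per_year(0.99999), 2))  # 5.26
```

In other words, 99.999% availability leaves a little over five minutes of downtime per year.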

Google declined to comment, preferring to let the paper speak for itself.
