Google's Spanner Database Offers Global ScaleSpanner, Google's latest database technology, relies on atomic clocks to ensure the consistency of distributed data.
Google began deploying Spanner in 2011 as part of an effort to rewrite F1, its internal advertising backend. A Spanner deployment, as befits Google's ambitions, is called a universe. The company has three universes: a test/playground universe, a development/production universe, and a production-only universe.
F1 was originally based on a MySQL database that handled tens of terabytes of data. Its initial structure involved a sharding scheme--a way to divide the database--that assigned customer data to a fixed shard. This allowed the use of indexes and sophisticated processing of queries, but proved difficult and costly when resharding--a resource-intensive manual process--was necessary to accommodate database growth.
Because of the complexity of resharding F1 data, Google's engineers began trying to limit the growth of the F1 MySQL database by storing data using Bigtable, a NoSQL data storage technology that the company developed in 2004 and continues to rely on. But Bigtable limited both the transactional data visibility Google needed and the ability to query across the entire data set.
[ What's new with OpenStack? Read OpenStack Poised For Cloud Leadership. ]
Google has always put a premium on the ability to operate at scale and Spanner represents a continuation of that goal.
"Spanner automatically reshards data across machines as the amount of data or the number of servers changes, and it automatically migrates data across machines (even across datacenters) to balance load and in response to failures," Google's research paper explains. "Spanner is designed to scale up to millions of machines across hundreds of datacenters and trillions of database rows."
Google fellow Jeff Dean, one of more than two dozen Google engineers credited as co-authors of the paper, described early work on Spanner back in 2009 at the third ACM SIGOPS International Workshop on Large Scale Distributed Systems and Middleware (LADIS).
Spanner provides applications with the ability to dynamically control where data resides--data stored close to the user can be delivered faster and data replicas stored near each other can be written to faster--and how many times data is replicated. The system can move data dynamically and transparently between data centers to balance resource usage. What's more, it can read and write data consistently on a global scale--something that is difficult for databases to do.
The key to Spanner is what Google calls the TrueTime API, which allows timestamps to be reliably applied to database operations around the globe. Without sufficiently accurate timestamps, network latency and clock variations can lead to synchronization errors in distributed systems.
To achieve a globally consistent time system, Google has deployed a set of atomic clocks and GPS hardware in every data center. Without redundant time references, Spanner would become unreliable, which isn't what Google wants with its revenue-generating ad system. The paper's authors argue that future distributed systems should no longer accept loosely synchronized clocks.
"This is more or less NoSQL throwing in the towel," said Jim Starkey, CTO of NuoDB, a company that makes a cloud database system with goals similar to Spanner, in a phone interview.
Google pushed MySQL to the breaking point, says Starkey, noting that the company's data was heavily sharded and didn't scale. NoSQL, in the form of Bigtable, wasn't the answer for Google's distributed ad system because it only promises eventual data consistency. When accessing data in real-time, he explained, the information might not be consistent throughout the system. NoSQL trades guarantees of atomicity, consistency, isolation, durability--the ACID properties that ensure data reliability during database operations--for performance in distributed systems at scale.
"Trying to get ACID transactions on Bigtable is a programming nightmare and too difficult for most programmers to do," said Starkey.
This is something Google itself acknowledges in its paper: "[W]e have also consistently received complaints from users that Bigtable can be difﬁcult to use for some kinds of applications: those that have complex, evolving schemas, or those that want strong consistency in the presence of wide-area replication."
Starkey says Spanner is very interesting technology, but he's not sure whether it will be useful for other businesses that don't deal with the same distributed data challenges as Google. Even Google, he suggests, may not end up using it for its external-facing applications like Gmail or Picasa, because those apps don't have the same data access requirements as F1.
"The question is can Google really get the throughput that they need for their applications," said Starkey. "My take on it is that Spanner is quite complicated to configure, install, and manage. But it's really exciting to see another company going out there to deal with databases in a way that validates the assumptions we made."