CockroachDB: Ultimate In Database Survival - InformationWeek
6/4/2015
10:00 AM


Cockroach Labs announces $6.25 million in funding for its big data system that it says will survive calamity and maintain data integrity.


"The joke is -- maybe it's not a joke -- that cockroaches will survive World War III," began Spencer Kimball, haltingly, at CoreOS Fest in San Francisco May 5, as he explained what CockroachDB software is all about. "Mainly," he said, "CockroachDB survives."

Kimball is CEO of Cockroach Labs, a company founded in February with experienced engineers from Google. His CoreOS Fest talk was an early airing of the goals of the CockroachDB system. Kimball worked at Google for more than nine years and helped develop its Colossus distributed file system, experience that serves him well as he leads a team trying to build a database system that can operate around the globe and never go down.

On Thursday, Cockroach Labs is announcing $6.25 million in funding for the nine-employee company after a sustained background effort to get CockroachDB established as a widely accepted open source project. Benchmark Capital leads the Series A round, with Google Ventures participating.

CockroachDB is now "on the cusp of alpha," or release for early, non-production use by developers, Kimball said in advance of the funding announcement.


CockroachDB is open source code that tries to match the characteristics of Spanner, Google's database system for spanning the globe. Spanner makes indexes of Web crawler information instantly available for the Google Search engine. With split-second timing it manages the user lookups and ad servings that accompany individual searches.

(Image: Danil Melekhin/iStockphoto)


Many enterprises would adopt Spanner if they could. But it's not open source, and it depends on other Google technologies, such as Colossus, which are not available for external operations. CockroachDB is an attempt to provide a standalone system that has Spanner's scalability, survivability, and data integrity. Data inside Spanner is consistent around the globe, with updates managed by its own atomic clock system, which bypasses the Network Time Protocol (NTP). CockroachDB plans to duplicate Spanner's scalability and survivability, but most of its users won't need their own atomic clock system, so it's skipping that part.

The main goal is to get a distributed database that is highly survivable and maintains precisely synchronized data throughout the system, no matter how broadly it's distributed. For existing big data systems and for most NoSQL systems, data consistency throughout the system is still a distant goal. "Consistency is very important. But consistency is very, very hard," Kimball said.

Data consistency, like that provided by Spanner, is the bugaboo of proliferating NoSQL systems, which can gorge on huge amounts of data. But as database interactions pile up, the guarantee of data consistency declines. Most NoSQL systems boast "eventual consistency," where the results of data writes will eventually catch up with data reads.

That makes transactions a big problem for NoSQL systems. The user can't be sure the information used in an attempted transaction reflects the most recent changes. Precise data consistency requires assured transactions that finish updating the system before any reads are executed against the target data.

Cockroach is shooting for data consistency across the system, no matter how many locations the database has been propagated to, Kimball said.

On Facebook, it doesn't necessarily matter if the number of "likes" for your picture of granny's baked beans is off by one or two respondents for three seconds. But for financial and other types of transactions, including those that update the database, "eventual consistency" is anathema to accurate operations.

"After you've written something, you should read what you just wrote one or two milliseconds later," but that's not the case with all NoSQL system interactions. "Most NoSQL systems only supply eventual consistency," he said.
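The difference Kimball describes can be illustrated with a toy model. The sketch below is not how CockroachDB or any particular NoSQL system is built; the class names (Replica, EventuallyConsistentStore, ConsistentStore) and the single-process simulation are assumptions made purely for illustration. It shows a read that misses a just-acknowledged write under eventual consistency, and the same read succeeding when the write is acknowledged only after every copy has applied it.

```python
class Replica:
    """One copy of the data. Queued writes stay invisible until applied."""

    def __init__(self):
        self.data = {}
        self.pending = []  # writes received but not yet applied

    def receive(self, key, value):
        self.pending.append((key, value))

    def apply_pending(self):
        for key, value in self.pending:
            self.data[key] = value
        self.pending.clear()


class EventuallyConsistentStore:
    """Hypothetical store: a write returns once the primary has it.

    Followers only queue the write, so a read served by a follower
    can return stale data until sync() runs.
    """

    def __init__(self, n_followers=2):
        self.primary = Replica()
        self.followers = [Replica() for _ in range(n_followers)]

    def write(self, key, value):
        self.primary.data[key] = value
        for follower in self.followers:
            follower.receive(key, value)  # queued, not yet visible

    def read(self, key, follower=0):
        # Reads are served by a follower, as in many scaled-out setups.
        return self.followers[follower].data.get(key)

    def sync(self):
        # Stands in for background replication catching up.
        for follower in self.followers:
            follower.apply_pending()


class ConsistentStore(EventuallyConsistentStore):
    """Hypothetical store: a write returns only after every copy applied it."""

    def write(self, key, value):
        super().write(key, value)
        self.sync()


if __name__ == "__main__":
    eventual = EventuallyConsistentStore()
    eventual.write("balance", 100)
    print("eventual, right after write:", eventual.read("balance"))
    eventual.sync()
    print("eventual, after catch-up:", eventual.read("balance"))

    strict = ConsistentStore()
    strict.write("balance", 100)
    print("consistent, right after write:", strict.read("balance"))
```

The consistent variant pays for its guarantee by waiting on every copy before acknowledging the write, which is one reason the property is hard to provide at global scale.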

CockroachDB, like its namesake, is able to propagate itself without human intervention. If new servers are added to the cluster, it recognizes the fact, propagates data to them, and adds them to its processing operations. Increases in traffic will trigger a horizontal scaling out by CockroachDB. The loss of a server or servers will prompt it to seek additional compute power elsewhere and rebalance the load.
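The rebalancing behavior described above can be sketched in a few lines. This is a simplified illustration, not CockroachDB's actual algorithm: the function name rebalance, the chunk ids, and the rule of evening out chunk counts are all assumptions, and the sketch tracks a single copy of each chunk rather than multiple replicas.

```python
def rebalance(assignment, servers):
    """Even out chunk counts across the current set of servers.

    assignment maps a server name to the list of chunk ids it holds.
    servers is the current membership; it may include newly added
    (empty) servers and may omit servers that have been lost.
    Returns a new mapping; the input is not modified.
    """
    # Chunks that lived on servers no longer in the cluster need a new home.
    orphans = [chunk
               for server, chunks in assignment.items()
               if server not in servers
               for chunk in chunks]

    balanced = {server: list(assignment.get(server, [])) for server in servers}

    # Place each orphan on whichever server currently holds the fewest chunks.
    for chunk in orphans:
        target = min(balanced, key=lambda s: len(balanced[s]))
        balanced[target].append(chunk)

    # Move chunks from the fullest server to the emptiest until the
    # difference between them is at most one chunk.
    while True:
        fullest = max(balanced, key=lambda s: len(balanced[s]))
        emptiest = min(balanced, key=lambda s: len(balanced[s]))
        if len(balanced[fullest]) - len(balanced[emptiest]) <= 1:
            return balanced
        balanced[emptiest].append(balanced[fullest].pop())


if __name__ == "__main__":
    cluster = {"a": [1, 2, 3, 4], "b": [5, 6, 7, 8]}

    grown = rebalance(cluster, ["a", "b", "c"])  # server "c" joins
    print({s: len(c) for s, c in grown.items()})

    shrunk = rebalance(grown, ["a", "c"])        # server "b" is lost
    print({s: len(c) for s, c in shrunk.items()})
```

A real system would also weigh traffic, disk usage, and data center location, and would have to restore lost copies from surviving replicas; the sketch balances on chunk count alone.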

If CockroachDB achieves its objectives, then database failure and recovery will become rare events. CockroachDB spreads its data around in small 32 or 64 MB chunks on different servers, with multiple copies of the database engine knowing where each chunk is. When a large amount of data is needed for processing, it streams in from many sources "using the CPU, memory, and network bandwidth of many nodes" to reduce the data latency. Multiple copies of the data are kept to ensure the loss of one copy won't leave any gaps in the database.

The size of the distributed data chunks may vary by user, Kimball said, but most will keep them in the 32 to 256 MB range for quick movement of data between cluster nodes and between data centers.
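To make the chunk-and-copy idea concrete, here is a minimal sketch of splitting a data set into fixed-size chunks and placing several copies of each on different servers. The function names, the round-robin placement rule, and the choice of three copies are assumptions for illustration only; the article says only that "multiple copies" are kept, and this is not CockroachDB's placement logic.

```python
MB = 1024 * 1024


def split_into_chunks(total_bytes, chunk_bytes=64 * MB):
    """Return (start, end) byte offsets that cover total_bytes in fixed-size chunks."""
    return [(start, min(start + chunk_bytes, total_bytes))
            for start in range(0, total_bytes, chunk_bytes)]


def place_replicas(chunks, servers, copies=3):
    """Assign each chunk to `copies` distinct servers, round-robin.

    copies=3 is an assumed value chosen for this example.
    """
    if copies > len(servers):
        raise ValueError("need at least as many servers as copies")
    placement = {}
    for i, chunk in enumerate(chunks):
        # Consecutive positions modulo the server count are always distinct
        # as long as copies <= len(servers).
        placement[chunk] = [servers[(i + r) % len(servers)] for r in range(copies)]
    return placement


def survives_loss(placement, lost_server):
    """True if every chunk keeps at least one copy after one server is lost."""
    return all(any(server != lost_server for server in holders)
               for holders in placement.values())


if __name__ == "__main__":
    chunks = split_into_chunks(200 * MB)  # a 200 MB data set, 64 MB chunks
    servers = ["a", "b", "c", "d"]
    placement = place_replicas(chunks, servers)
    for chunk, holders in placement.items():
        print(chunk[0] // MB, "to", chunk[1] // MB, "MB on", holders)
    print("survives loss of any one server:",
          all(survives_loss(placement, s) for s in servers))
```

Smaller chunks move between nodes faster but mean more placement entries to track, which is the trade-off behind the 32 to 256 MB range Kimball mentions.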

If CockroachDB works as planned, it will be a distributed database that is self-mapping and self-balancing. It will also be able to recover from a major loss of hardware without disrupting operations. Failover and mean time to recovery will become outmoded terms.

But in addition to that, it will have precisely synchronized data throughout a distributed system. "It will be consistent all the time, a wonderful feature," Kimball said.

Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive ...

Comments
Charlie Babcock, User Rank: Author
6/5/2015 | 1:58:14 PM
Cockroach team includes GIMP authors
The VP of engineering at Cockroach Labs is Peter Mattis. He and Spencer Kimball are already well known as the authors of GIMP, or GNU Image Manipulation Program, the open source substitute for Photoshop. They developed GIMP in 1995 while attending the University of California at Berkeley.
Charlie Babcock, User Rank: Author
6/4/2015 | 2:22:32 PM
Behind the name: designed for survivability
At CoreOS Fest, Spencer launched into a litany of the cockroach's characteristics that make survivable software worthy of the name: you can hold a cockroach under water for 45 minutes and it comes back up and scoots off; they're hard to drown. You can withhold food from a cockroach for three months and it still functions. Hard to believe, but best of all, you can cut off its head and the body still lives for several days. "It's one of nature's most successful designs," he says.
nasimson, User Rank: Ninja
6/4/2015 | 1:46:23 PM
Exciting times ahead
Despite the disgusting name, this sounds like the holy grail of databases. These are the very features that mega companies like Oracle and Microsoft were trying to get into their popular database platforms. Exciting times ahead!