Software // Information Management
Commentary
9/5/2013
00:06 AM
Connect Directly
LinkedIn
Google+
Twitter
RSS
E-Mail
50%
50%

The Man Who Tortures Databases

Have doubts about NoSQL consistency? Meet Kyle Kingsbury's Call Me Maybe project. Here's the number.

It's an exciting time in the database world. We're casting aside the relational database management system shackles for NoSQL (and "NewSQL," but I'll just use NoSQL as a term for both) systems that let us achieve better availability and scalability by relaxing data consistency requirements. That is, NoSQL systems are built to scale horizontally -- so you can run lots and lots of different servers, minimizing the impact of any of them going down -- and to handle the complexities that arise when you spread data across lots of servers.

Well, at least that's what the marketing literature says.

The problem is that it's very difficult to validate the claims of NoSQL database vendors with any kind of rigorous testing. Sure, it's easy enough to set up a NoSQL database server, unplug one or more machines from the network, and verify that the vendor's claims are accurate for that use case. But in the real world, the number of ways that machines can be separated from one another across a network (otherwise known as partitioning) means there are far more potential snafus than just a complete disconnect. Think very slow response times (when do you timeout?), servers that can only send packets, servers that can only receive packets, and situations where servers cannot tell whether the network is down or not. Yet, in practice, when selecting a vendor, the only test most IT teams can do to verify that a database server is fault-tolerant for its infrastructure is the "unplug" test. If the system passes, we assume the rest of the marketing claims are correct, and we move forward. Maybe that's a good assumption, maybe not.

Which brings us to Kyle Kingsbury, also known as "Aphyr." In May, Kingsbury embarked on a Carly Rae Jepsen-inspired quest to test more rigorously how various databases handle different types of partitions. Kingsbury's Call Me Maybe project ("Hey I just met you / Our network's crazy / But here's my data / So store it maybe") revolves around Jepsen, an open source project that he built in his spare time over the course of a few hundred hours. Jepsen makes testing different types of partitions much easier. A few points: Kingsbury customizes analysis to each database's documentation and marketing claims about how it's supposed to work, and he runs everything on the same physical machine so nodes have perfectly synchronized clocks (DBs run in virtualized containers so that they look like separate hosts). And he doesn't test really nasty situations, like partially connected networks, where each server can see only some of the other servers.

The focus of Kingsbury's analyses with Jepsen is to evaluate how strongly consistent these databases can be -- that is, if you set up a network partition before data can be written everywhere, and then you have a network recovery sometime later, what's happened to your data? And for each of the database servers he's tested so far (PostgreSQL, Redis, MongoDB and Riak have been written up; Cassandra, Kafka, NuoDB and ZooKeeper are in the works and will be presented at Strange Loop), he's identified behavior that is unexpected, based on documentation and marketing. From finding bugs (with a particular configuration, MongoDB treated network errors as successful acknowledgements, which 10gen quickly fixed) to design flaws (Redis' Sentinel protocol can't lead to strong consistency because it's asynchronous) to the fact that default configurations are rarely a good choice, Kingsbury shows that we can't rely on vendor's marketing statements or product documentation, or even a simple "pull the plug" test.

What that means for IT is that, first and foremost, you need to test the application design and database software you're planning to use before you commit. Draw the topology of your application, and then ask questions about what happens as different types of network partitions come into play. The Jepsen project may help you do much more thorough testing than you could afford to do on your own.

Clearly, we're moving past the period where an application lives on a single database server; instead, the disparate data that drives a given application will live in different databases. That said, I reached out to Kingsbury to ask if he has guidance for organizations looking at different NoSQL options, beyond his blog posts. His response: Decide what objectives you have for your application, and make a database selection based on API, performance and consistency requirements. He also says that, for most applications, the best option is likely to have both strongly consistent data stores and strongly available data stores, and be clear on which is best suited for any particular data set.

Seven more questions Kingsbury recommends IT ask about database architectures:

-- What if we cut the network here?

-- What if we isolate a master?

-- What if there's no connected majority?

-- What if there's asymmetric connectivity?

-- How about intermittent connectivity?

-- What if clocks are unsynchronized?

-- What if a node pauses for a few minutes and then comes back?

He also has some personal preferences: Redis for caching, Cassandra for very large clusters and high-volume writes, Hadoop for extremely large amounts of data where slow query time is OK, Riak as an AP key-value store where conflicting versions of an object can be resolved, ZooKeeper for small pieces of CP state data, and PostgreSQL for relational data. (Here's an explanation of CP vs. AP, based on the CAP theorem.)

I have two other recommendations. First, Kingsbury and Peter Bailis, a graduate student at UC Berkeley, have co-written an excellent and detailed overview of network partitions, titled "The Network Is Reliable," that provides a tremendous amount of evidence that many networks are anything but reliable. And if you'd like a shorter -- but still technical -- overview of Jepsen and Kingsbury's analyses, check out "Jepsen: Testing the Partition Tolerance of PostgreSQL, Redis, MongoDB and Riak."

Ultimately, the problem isn't that databases are fallible and have bugs. We should expect that. It's also not that you're going to lose data. This too is expected; you just need to know how much and why and communicate that to the data owners. The problem is that we're building mission-critical applications using distributed databases and simply assuming that those databases will work as advertised. That is an incorrect and dangerous assumption: Distributed systems don't fail as cleanly as single-node systems, so we need to ask more questions than just, "What happens if this node fails?" Find out, maybe.

Comment  | 
Print  | 
More Insights
Comments
Threaded  |  Newest First  |  Oldest First
D. Henschen
50%
50%
D. Henschen,
User Rank: Author
9/5/2013 | 1:53:49 PM
re: The Man Who Tortures Databases
Good, hands-on stuff here, and a real eye-opener for anybody who assumes these next-gen databases come without compromises. It doesn't surprise me, then, that most of the NoSQL case examples I've seen have been applications that aren't quite as mission critical as, say, running a bank or trading floor.
jemison288
50%
50%
jemison288,
User Rank: Moderator
9/7/2013 | 12:56:12 AM
re: The Man Who Tortures Databases
Although given the reliability of NASDAQ's systems, I'm not sure they could do much worse...
Lorna Garey
50%
50%
Lorna Garey,
User Rank: Author
9/5/2013 | 2:26:41 PM
re: The Man Who Tortures Databases
Good enough analysis that I don't mind having "Call Me Maybe" stuck in my head. You mention Strange Loop - I hadn't heard of that show before, but it looks interesting. Would you recommend it for DBAs?
jemison288
50%
50%
jemison288,
User Rank: Moderator
9/6/2013 | 2:18:27 AM
re: The Man Who Tortures Databases
Yes, I would recommend Strange Loop for anyone looking to witness some really bright young people who are showing a number of really interesting things.
The Agile Archive
The Agile Archive
When it comes to managing data, donít look at backup and archiving systems as burdens and cost centers. A well-designed archive can enhance data protection and restores, ease search and e-discovery efforts, and save money by intelligently moving data from expensive primary storage systems.
Register for InformationWeek Newsletters
White Papers
Current Issue
InformationWeek Tech Digest - September 10, 2014
A high-scale relational database? NoSQL database? Hadoop? Event-processing technology? When it comes to big data, one size doesn't fit all. Here's how to decide.
Flash Poll
Video
Slideshows
Twitter Feed
InformationWeek Radio
Archived InformationWeek Radio
A look at the top stories from InformationWeek.com for the week of September 7, 2014.
Sponsored Live Streaming Video
Everything You've Been Told About Mobility Is Wrong
Attend this video symposium with Sean Wisdom, Global Director of Mobility Solutions, and learn about how you can harness powerful new products to mobilize your business potential.