MongoDB Counters Couchbase Performance Claims
Commentary | Doug Henschen | 3/31/2015 12:06 PM

A MongoDB-sponsored test reports superior durability and throughput compared with Cassandra and Couchbase, but is it another case of biased, vendor-sponsored research?

Last week it was Couchbase. On Tuesday, March 31, MongoDB pointed to third-party research that shows its product delivers superior performance to that of its rivals. Whose research can you believe?

As we reported last week, Couchbase started this database-performance claim war by offering research conducted by Avalon Consulting LLC -- clearly sponsored by Couchbase -- that shows the Couchbase NoSQL database management system beating MongoDB on multiple performance measures.

The key point of Avalon's whitepaper was that the matchup was against MongoDB 3.0, that vendor's latest release featuring the recently acquired WiredTiger storage engine. The new storage engine substantially improves the product's write performance and scalability, according to MongoDB, yet by Avalon's measures, Couchbase had higher throughput and concurrency in every test.

[Want more on this database performance flap? Read Couchbase Claims Performance Gains Against NoSQL Rivals.]

MongoDB naturally begged to differ with Avalon's findings, noting that the Couchbase configuration used in the test harnessed three times more hardware than the MongoDB configuration, while the latter deployment had an automatic-failover feature turned off, contrary to MongoDB best practices. "If MongoDB were configured comparably to Couchbase in these tests, the results would be dramatically different," stated Kelly Stirman, MongoDB's director of products, in a comment on that story.

The research sponsored and released by MongoDB on Tuesday was carried out by United Software Associates. It compared MongoDB to Cassandra and Couchbase. The report states that all three products were tested on identical hardware with the Yahoo! Cloud Serving Benchmark (YCSB), measuring insert, update, and read performance.

Predictably, MongoDB won on every measure in United Software Associates' tests, including different workloads and measures of database throughput, durability, and balanced combinations of both. The key twist in this test is that in all cases it featured a single database server and a single client server, a configuration that hardly stresses scalability or the highly distributed nature of typical NoSQL database deployments -- or at least those of Cassandra and Couchbase. For MongoDB, it's common to see single-server deployments, according to Stirman.

A MongoDB-sponsored Yahoo! Cloud Serving Benchmark test shows superior performance to rivals, but the test was conducted on a small-scale, single-server deployment.

"Databases are often deployed on a single server, and we know that based on profiles of about 60,000 MongoDB deployments that we have access to via our cloud-management tool," said Stirman in a phone interview with InformationWeek. "When run in a distributed fashion, all of these systems are comprised of multiple, individual servers, so you have to start by looking at what a single server delivers [in terms of performance]."

Contrary to this suggestion, scaled-out performance -- much less scaled-out performance across multiple data centers -- is rarely a clear multiple of single-server performance. In fact, Stirman acknowledged that "it's harder to do an apples-to-apples comparison that way because these products scale out in very different ways."

While Avalon's research featured multi-server configurations and tested concurrency in excess of 500 simultaneous users, United's research tested a single database server and a single client server, with no mention of concurrency demands. On the other hand, Avalon's tests used very different hardware configurations for the two products tested, and MongoDB contends its deployment best practices were ignored.

MongoDB's sponsored research was covered by nondisclosure agreements as of this writing, so we'll have to leave it to Cassandra promoter DataStax and to Couchbase to share their take on the tests in the comments area below. Suffice it to say that the most reliable tests of database performance are independently verified tests, such as TPC benchmarks. Sponsored research invariably delivers exactly what the sponsors pay for: a winning result.

Even better than an abstract benchmark like a TPC test is a proof-of-concept test using your own data and your own anticipated workloads. Only this type of real-world testing will tell you how products will perform in your environment. In the bargain, your people will also gain experience with the features, security, manageability, and ease of development of the products. On this point we're in agreement with Stirman of MongoDB.

"There's a long list of things you should look at, and performance is part of that consideration," he said.


Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of ...
Comments
shane.k.j | User Rank: Strategist | 4/20/2015 9:22:46 AM
Re: Couchbase Response

I appreciate you taking the time to respond to our concerns. Couchbase Server continues to demonstrate great performance and scalability in clustered benchmarks. Now, we see MongoDB demonstrating solid performance in single-node benchmarks. It's not a surprise.

 

I think MongoDB is well-suited to single-node deployments, but you can't estimate the performance of a cluster by benchmarking the performance of a single node. After all, it doesn't account for replication or the performance trade-off between availability and consistency caused by the master/slave architecture of MongoDB.

 

If you were having difficulties with the latest client library, we would have been happy to help. Thumbtack benchmarked Couchbase Server 2.5 before the 2.x libraries were available. They were released with Couchbase Server 3.0, the release you benchmarked. It's not appropriate to benchmark a current release with a library that was intended for the previous release. At the very least, you should have performed the benchmark with the current 1.x release (1.4.9) rather than an outdated one (1.1.8).

We'll have to agree to disagree on CAS.

However, durability is not defined by "writing data to disk." It's a property: data will not be lost if a node fails. A distributed database is durable when replication is synchronous. For example, on AWS, persistence to disk is not enough for instances with ephemeral SSDs. If the instance is stopped, the data is lost. However, if the data is replicated, it is not.
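
For illustration, here is a minimal sketch of this kind of replication-aware write using the Couchbase Java SDK 2.x: the call does not return until the write has been replicated to at least one other node and persisted on the active node. The cluster address, bucket name, and document are hypothetical placeholders, not part of the benchmark under discussion.

    import com.couchbase.client.java.Bucket;
    import com.couchbase.client.java.Cluster;
    import com.couchbase.client.java.CouchbaseCluster;
    import com.couchbase.client.java.PersistTo;
    import com.couchbase.client.java.ReplicateTo;
    import com.couchbase.client.java.document.JsonDocument;
    import com.couchbase.client.java.document.json.JsonObject;

    public class DurableWriteSketch {
        public static void main(String[] args) {
            // Hypothetical cluster address and bucket name.
            Cluster cluster = CouchbaseCluster.create("127.0.0.1");
            Bucket bucket = cluster.openBucket("default");

            JsonDocument doc = JsonDocument.create("user::1001",
                    JsonObject.create().put("name", "Ada").put("visits", 1));

            // upsert() blocks until the write has been replicated to at least
            // one additional node (ReplicateTo.ONE) and written to disk on the
            // active node (PersistTo.MASTER).
            bucket.upsert(doc, PersistTo.MASTER, ReplicateTo.ONE);

            cluster.disconnect();
        }
    }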

usain | User Rank: Apprentice | 4/16/2015 5:19:18 AM
Re: Couchbase Response

Thanks for sharing your thoughts, Shane. We at USA have some differing opinions and have outlined our responses to your comments below:

1) Well, they chose to benchmark single node deployments.
RESPONSE: Single server test results are absolutely relevant as they represent the building blocks of any system. As noted by the author of YCSB, it is important to first test performance based on a single node, and then to test scalability. The better the performance of a single node, the fewer nodes will be required to meet the demands of a specific application. YCSB was designed by the team at Yahoo! to test vertical as well as horizontal scaling. We plan to test horizontal scaling in a future report.

2) They chose to have Couchbase Server perform two operations per write (read+update) instead of one, but not MongoDB.
RESPONSE: As indicated in your blog post and comments, Couchbase requires two operations to perform an update, whereas MongoDB requires one. In Couchbase, as you know, updates must be performed by the client, and the client must also manage conflict resolution. This adds to the work and ongoing maintenance for application development teams using Couchbase, and it impacts the performance and scalability of the system. We implemented the client correctly, following Couchbase's documentation. We catch and retry any CASMismatchException errors that the server throws when an update would overwrite another update made to the same document after it was initially read. As noted in your blog comments, Couchbase plans to add an equivalent feature in the next release. We will include this feature in future tests when it becomes available.
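
For illustration, here is a minimal sketch of that read-then-compare-and-swap retry pattern, written against the Couchbase Java SDK 2.x API; the benchmark itself used the older 1.x client, so treat this as an approximation of the approach rather than the exact code United Software Associates ran. The document ID and field name are hypothetical, and the document is assumed to already exist with an integer "visits" field.

    import com.couchbase.client.java.Bucket;
    import com.couchbase.client.java.document.JsonDocument;
    import com.couchbase.client.java.error.CASMismatchException;

    public class CasUpdateSketch {
        // Read the document, modify it, and write it back guarded by the CAS
        // value obtained on the read; retry if another writer got there first.
        static void incrementVisits(Bucket bucket, String id) {
            while (true) {
                JsonDocument doc = bucket.get(id);   // operation 1: read (carries the CAS value)
                doc.content().put("visits", doc.content().getInt("visits") + 1);
                try {
                    bucket.replace(doc);             // operation 2: write, rejected if the CAS changed
                    return;
                } catch (CASMismatchException e) {
                    // Another update landed after our read; loop and try again.
                }
            }
        }
    }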

3) They used a two-year-old client library for Couchbase Server, but not MongoDB. It waits at least 100ms before checking if writes are durable. The latest waits as little as 10μs.
RESPONSE: We used the same client library that was used in Couchbase's own benchmark, conducted by Thumbtack less than a year ago (we did update it to the latest patch release). We tried to use 2.1.1, but it performed worse and was not able to complete the full test without Timeout exceptions. Since then, someone on the Couchbase forums has reported the issue, possibly the same one we encountered.

4) While single node deployments require sync disk writes for durability, distributed databases do not. They can leverage sync replication to perform writes on multiple nodes.
RESPONSE: Durability is defined in terms of writing data to disk, not memory. This is true whether using a single server or multiple servers. If your servers lose power or crash, data in RAM that has not been persisted to disk will be lost. However, if deployed correctly across racks and data centers, potential data loss can be minimized by replicating to multiple servers. All three products provide similar capabilities in this area, and we plan to evaluate this in our next round of tests.
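
For comparison, here is a minimal sketch, using the MongoDB 3.x Java driver, of the two durability levers discussed above: acknowledging a write only after it has been journaled to disk, versus only after a majority of replica set members hold it. The host, database, and collection names are hypothetical placeholders.

    import com.mongodb.MongoClient;
    import com.mongodb.WriteConcern;
    import com.mongodb.client.MongoCollection;
    import org.bson.Document;

    public class WriteConcernSketch {
        public static void main(String[] args) {
            MongoClient client = new MongoClient("127.0.0.1");

            // Acknowledge only after the write has been flushed to the on-disk journal.
            MongoCollection<Document> journaled = client.getDatabase("ycsb")
                    .getCollection("usertable")
                    .withWriteConcern(WriteConcern.JOURNALED);
            journaled.insertOne(new Document("_id", "user1").append("field0", "value0"));

            // Acknowledge only after a majority of replica set members have the write.
            MongoCollection<Document> replicated = client.getDatabase("ycsb")
                    .getCollection("usertable")
                    .withWriteConcern(WriteConcern.MAJORITY);
            replicated.insertOne(new Document("_id", "user2").append("field0", "value0"));

            client.close();
        }
    }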

We attempted to provide links to evidence but this site does not support URLs in comments.

United Software Associates
nosql-benchmarks@usain.com
usain | User Rank: Apprentice | 4/16/2015 3:14:47 AM
Re: Couchbase Response

Thanks for posting, Robin. Here are some of my thoughts based on your response.
We consulted Jonathan Ellis's blog post that you mention to ensure we were following best practices for benchmarking Cassandra. We did use the same version of YCSB across all tests, and we incorporated the YCSB client code for each vendor, including the fork that Jonathan Ellis referenced in his post. If you feel there are some configurations that did not follow your best practices, we would love to hear from you.

All three products provide capabilities in terms of the ability to scale out across many servers and data centers. We plan to test these configurations in our next round of tests.

United Software Associates
nosql-benchmarks@usain.com
RSCHUMACHER400 | User Rank: Apprentice | 4/3/2015 2:43:00 PM
Re: Couchbase Response

As Jonathan Ellis wrote in "How not to benchmark Cassandra," there are a few inflection points that can tell you a lot about how a system will scale. One is when the dataset no longer fits in memory. Another is when the indexes no longer fit in memory. Smaller datasets may not be representative of what you will see as you push past those thresholds.

Here we have a tiny 20 million row dataset, where both data and indexes trivially fit in memory. This is the best possible scenario for MongoDB, since the WiredTiger engine slows dramatically for larger datasets. Cassandra's log-structured engine continues to deliver consistent performance for larger-than-memory workloads.

Jonathan also explained that "it's important to take care that the same thing is being measured across the board."  Here, United Software Associates isn't even using the same version of the test suite across the different databases.  This can easily make a meaningful difference in the observed results.

Finally, it's telling that MongoDB isn't trying to compete in a clustered scenario. If your dataset fits in memory on a single machine, then, as this benchmark implies, it may well be that MongoDB is your best choice. But for modern applications requiring performance and availability at scale across multiple machines and datacenters, look to Cassandra.
shane.k.j | User Rank: Strategist | 4/1/2015 10:50:32 AM
Couchbase Response

MongoDB performs well when it 1) is limited to a single node, 2) doesn't store a lot of data, and 3) doesn't support a lot of users. This is a sweet spot for MongoDB. However, a single node deployment can't meet the rigorous demands of production deployments. Couchbase Server, on the other hand, shines when deployed as a distributed database.

Benchmark Issues

1) Well, they chose to benchmark single node deployments.

2) They chose to have Couchbase Server perform two operations per write (read+update) instead of one, but not MongoDB.

3) They used a two-year-old client library for Couchbase Server, but not MongoDB. It waits at least 100ms before checking if writes are durable. The latest waits as little as 10µs.

4) While single node deployments require sync disk writes for durability, distributed databases do not. They can leverage sync replication to perform writes on multiple nodes.

blog.couchbase.com/mongodb-rules-single-node-deployments

 

Shane K Johnson
Product Marketing
Couchbase