Carfax Selects MongoDB To Drive 11 Billion Records - InformationWeek

Carfax Selects MongoDB To Drive 11 Billion Records

Vehicle-history service switches to open source, NoSQL database with an eye to exploring its massive data set in new ways.

There's a 30-year-old relational database up on blocks at Carfax's Columbia, Mo., office.

On Tuesday, the Web service, which supplies used-vehicle history reports to millions of consumers and 30,000 dealerships every year, announced plans to retire its VMS-based RDBMS and switch to MongoDB, the open source, document-oriented database developed and supported by 10gen.

"VMS has been a very valuable OS for us," Carfax CTO Joedy Lenz told InformationWeek in a phone interview. "Unfortunately, with our data volumes, it became fairly expensive to operate and maintain." The production VMS system will be retired within 12 months, he said.

Carfax's Vehicle History Report, created in 1986, is the largest vehicle-history database ever assembled, with nearly 11.5 billion records and growing at 1 billion new records a year. It comprises information from more than 75,000 sources, such as U.S. and Canadian motor vehicle departments, service and repair facilities, insurance companies, and police departments.

When it takes over the driver's seat, MongoDB will run across 50 servers. Lenz declined to name the hardware vendor. "Using inexpensive commodity servers means they can scale out," 10gen CEO Max Schireson told InformationWeek by phone.

Although MongoDB is an open source product, 10gen claims some 500 customers worldwide who pay for its consulting and services. The customer list includes not only marquee Web brands like eBay and Craigslist but also traditional businesses, including three of the top 10 global banks and telcos.

Another advantage of using MongoDB is its built-in redundancy. If a node fails, work is picked up by one or more secondary nodes.
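
The failover idea can be pictured with a toy model. The sketch below is an illustration only: it is not MongoDB's actual election protocol, and the node names and the "first healthy node wins" rule are assumptions made for the example.

```python
# Toy model of primary/secondary failover.
# Assumption: the first healthy node in the list acts as primary.
# Real replica sets elect a primary through a voting protocol.
from dataclasses import dataclass


@dataclass
class Node:
    name: str            # hypothetical node name
    healthy: bool = True


def current_primary(nodes):
    """Return the first healthy node, standing in for an elected primary."""
    for node in nodes:
        if node.healthy:
            return node
    raise RuntimeError("no healthy nodes left")


nodes = [Node("node-a"), Node("node-b"), Node("node-c")]
before = current_primary(nodes).name

nodes[0].healthy = False          # simulate the primary failing
after = current_primary(nodes).name

print(before, "->", after)
```

In this sketch, work moves to a surviving node as soon as the first one is marked unhealthy. That is the behavior the article describes, shown here in a much simplified form.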

Carfax already runs a seven-node VMS system, but Lenz said that in early performance testing MongoDB ran transactions up to four times faster. Speed and cost savings, however, weren't the only reasons Carfax decided to migrate to a NoSQL architecture.

Unlike their relational predecessors, NoSQL databases like MongoDB, Cassandra and Riak use a flexible, schema-less design that is especially well suited for massive amounts of variable data.
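
To make "schema-less" concrete, here is a minimal Python sketch that uses plain dictionaries as stand-ins for documents. The field names, VINs, and values are hypothetical and are not drawn from Carfax's data. The point is only that records from different sources can carry different fields and still sit in one collection.

```python
# Hypothetical vehicle-history records. Each source contributes its own fields,
# and no shared table schema is declared up front.
records = [
    {"vin": "VIN-001", "source": "dmv", "event": "title_issued", "state": "MO"},
    {"vin": "VIN-001", "source": "repair_shop", "event": "service",
     "odometer": 42000, "work": ["oil change", "brakes"]},
    {"vin": "VIN-002", "source": "insurer", "event": "claim",
     "damage": {"area": "rear", "severity": "minor"}},
]


def history_for(vin, docs):
    """Return every record for one vehicle, whatever shape each record has."""
    return [d for d in docs if d.get("vin") == vin]


def with_field(field, docs):
    """Return only the records that happen to carry a given field."""
    return [d for d in docs if field in d]


for doc in history_for("VIN-001", records):
    print(doc["source"], doc["event"])
```

A relational design would typically need a column, or a separate table, for every field that any source might report. In this sketch, a new kind of record is simply another dictionary.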

"Mongo does [transaction processing] with the added benefit of analytics and data mining," he said. "The sky's the limit ... we're just scratching surface."

As NoSQL products like MongoDB win new adherents, relational database vendors haven't been sitting still. Just last month, Oracle announced a major upgrade, MySQL 5.6, which includes features for high-scale deployments. For example, Oracle announced it would support direct access to data through the Memcached API, which is up to nine times faster than accessing data through SQL parsing.
