3/8/2013
10:01 AM

Carfax Selects MongoDB To Drive 11 Billion Records

Vehicle-history service switches to open source, NoSQL database with an eye to exploring its massive data set in new ways.

There's a 30-year-old relational database up on blocks at Carfax's Columbia, Mo., office.

On Tuesday, the Web service, which supplies used-vehicle history reports to millions of consumers and 30,000 dealerships every year, announced plans to retire its VMS-based RDBMS and switch to MongoDB, the open source, document-oriented database developed and supported by 10gen.

"VMS has been a very valuable OS for us," Carfax CTO Joedy Lenz told InformationWeek in a phone interview. "Unfortunately, with our data volumes, it became fairly expensive to operate and maintain." The production VMS system will be retired within 12 months, he said.

Carfax's Vehicle History Report database, created in 1986, is the largest vehicle-history database ever assembled, with nearly 11.5 billion records and growth of about 1 billion new records a year. It comprises information from more than 75,000 sources, such as U.S. and Canadian motor vehicle departments, service and repair facilities, insurance companies, and police departments.


When it takes over the driver's seat, MongoDB will run across 50 servers. Lenz declined to name the hardware vendor, but 10gen CEO Max Schireson told InformationWeek by phone, "Using inexpensive commodity servers means they can scale out."

Although MongoDB is an open source product, 10gen claims some 500 customers worldwide who pay for its consulting and services. The customer list includes marquee Web brands like eBay and Craigslist as well as traditional businesses, including three of the top 10 global banks and telcos, among others.

Another advantage of using MongoDB is its built-in redundancy. If a node fails, work is picked up by one or more secondary nodes.
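
To make the failover idea concrete, here is a minimal Python sketch of a primary node handing off to a secondary. It is a toy model under stated assumptions, not MongoDB's actual election mechanism: the class name, node names, and the "next healthy node becomes primary" rule are all hypothetical, and real replica sets decide the new primary through a vote among members.

```python
class ToyReplicaSet:
    """Toy model of primary/secondary failover (not MongoDB's real election)."""

    def __init__(self, node_names):
        if not node_names:
            raise ValueError("need at least one node")
        # Assumption: the first healthy node in the list acts as primary.
        self.healthy = list(node_names)

    @property
    def primary(self):
        """Name of the node currently accepting writes, or None if all are down."""
        return self.healthy[0] if self.healthy else None

    def fail(self, name):
        """Mark a node as down; if it was primary, the next node takes over."""
        if name in self.healthy:
            self.healthy.remove(name)

    def write(self, doc):
        """Route a write to the current primary, or raise if no node is left."""
        if self.primary is None:
            raise RuntimeError("no healthy nodes available")
        return (self.primary, doc)


replica_set = ToyReplicaSet(["node1", "node2", "node3"])
first_write = replica_set.write({"event": "service"})
replica_set.fail("node1")  # simulate losing the primary
second_write = replica_set.write({"event": "claim"})
```

In this sketch the second write lands on a different node than the first without the caller changing anything, which is the property the redundancy claim describes.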

In fact, Carfax already uses a seven-node VMS system. However, Lenz shared that in early performance testing, MongoDB ran transactions up to four times faster. But speed and cost savings weren't the only reasons Carfax decided to migrate to a NoSQL architecture.

Unlike their relational predecessors, NoSQL databases like MongoDB, Cassandra and Riak use a flexible, schema-less design that is especially well suited for massive amounts of variable data.
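
As a rough illustration of what schema-less means in practice, the Python sketch below stores records of different shapes side by side and queries them by whatever fields they happen to have. It uses plain dictionaries rather than a database driver, and every field name and value is a made-up assumption, not Carfax's data model.

```python
# Hypothetical vehicle-history events. Every field name and value here is an
# assumption made for illustration, not Carfax's actual data model.
records = [
    {"vin": "VIN-A", "source": "dmv", "event": "title_issued", "state": "MO"},
    {"vin": "VIN-A", "source": "repair_shop", "event": "service",
     "odometer_miles": 42150, "work": ["oil change", "tire rotation"]},
    {"vin": "VIN-B", "source": "insurer", "event": "claim",
     "damage": {"area": "rear", "severity": "minor"}},
]


def find(collection, **criteria):
    """Return documents whose fields equal every given criterion.

    Documents that lack a field simply do not match; no schema change is needed.
    """
    return [
        doc for doc in collection
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


def history_for(collection, vin):
    """Collect all events for one vehicle, whatever shape each event has."""
    return find(collection, vin=vin)


vin_a_history = history_for(records, "VIN-A")
insurer_claims = find(records, source="insurer")
```

A relational table would need a column (or a separate table) for each of these fields up front; here a new source can add fields such as a nested damage description without touching existing records.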

"Mongo does [transaction processing] with the added benefit of analytics and data mining," Lenz said. "The sky's the limit ... we're just scratching [the] surface."

As NoSQL products like MongoDB win new adherents, relational database vendors haven't been sitting still. Just last month, Oracle announced a major upgrade, MySQL 5.6, which includes features for high-scale deployments. For example, Oracle announced it would support direct access to data through the Memcached API, which is up to nine times faster than accessing data through SQL parsing.
