Startup 10gen, sponsor of the MongoDB open source project, this week launched commercial support for MongoDB, a NoSQL-style data system for use with large Web applications or very large data sets.
MongoDB derives its name from "humongous" and is a document-oriented database that can scale out as demand increases by adding nodes to a server cluster. In addition to rapid scale-out, the project includes traditional database functions, such as dynamic queries and indexes, but does not rely on the fixed schemas of relational databases.
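To make the document model concrete, here is a minimal sketch in plain Python. It is not MongoDB's API: dictionaries stand in for documents, and the helpers find and build_index are hypothetical names invented for illustration. It only shows the idea that records in one collection can carry different fields yet still be queried and indexed.

```python
# Illustrative sketch only: plain dicts stand in for documents.
# `find` and `build_index` are hypothetical helpers, not MongoDB calls.

# Documents in one collection need not share the same fields (no fixed schema).
articles = [
    {"_id": 1, "title": "NoSQL grows up", "tags": ["nosql", "databases"]},
    {"_id": 2, "title": "Scaling reads", "author": {"name": "A. Writer"}},
    {"_id": 3, "title": "Indexes explained", "tags": ["databases"], "views": 120},
]


def find(collection, **criteria):
    """Dynamic query: return documents whose fields equal the given values."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


def build_index(collection, field):
    """Map each value of `field` to the _ids of the documents that contain it."""
    index = {}
    for doc in collection:
        values = doc.get(field, [])
        if not isinstance(values, list):
            values = [values]
        for value in values:
            index.setdefault(value, []).append(doc["_id"])
    return index


# Query on a field that only some documents have.
popular = find(articles, views=120)

# Build a lookup on "tags" so tag searches avoid scanning every document.
tag_index = build_index(articles, "tags")
```

A real document database would persist the data and maintain indexes automatically as documents change; the sketch keeps everything in memory to stay self-contained.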
10gen aims to boost enterprise adoption of MongoDB through professional services and technical support offerings, said Dwight Merriman, co-founder and CEO of 10gen.
Merriman is a co-founder and former CTO of DoubleClick, where he architected the online advertising management system sold to Google in 2007 for $3.1 billion.
MongoDB joins the emerging class of non-relational data stores, sometimes referred to as NoSQL systems because of their departure from the strict rules of relational databases.
NoSQL systems seek to separate large-scale online data handling from relational database systems. Other efforts include the Apache Software Foundation's open source Cassandra project, initially developed at Facebook; Apache's CouchDB open source document data system; Google's Bigtable; and Amazon.com's internal Dynamo system.
Several social networking sites, including Twitter, Facebook and Digg, have recently announced a move away from MySQL to rely more heavily on Cassandra. These systems trade the guarantee of proper transaction handling for faster updates and the ability to serve fast, large-scale reads that keep up with hits on a Web site. It's possible to get two different answers to the same query with such systems, critics say, because their data lacks referential integrity.
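The "two different answers" concern can be sketched with a toy model. The code below assumes a hypothetical store with one primary and one replica, where writes reach the replica only when replicate() runs. The class names and the key are invented for illustration and do not describe how Cassandra or any other product works internally.

```python
# Toy model of asynchronous replication (hypothetical, for illustration only).

class Node:
    """One copy of the data."""

    def __init__(self):
        self.data = {}


class AsyncStore:
    """Writes land on the primary at once and reach the replica later."""

    def __init__(self):
        self.primary = Node()
        self.replica = Node()
        self.pending = []  # writes not yet copied to the replica

    def write(self, key, value):
        self.primary.data[key] = value
        self.pending.append((key, value))

    def replicate(self):
        """Apply queued writes to the replica, in order."""
        for key, value in self.pending:
            self.replica.data[key] = value
        self.pending.clear()


store = AsyncStore()
store.write("followers:alice", 10)
store.replicate()

# A new write arrives; the replica has not caught up yet.
store.write("followers:alice", 11)
before = (store.primary.data["followers:alice"],
          store.replica.data["followers:alice"])

# Once replication runs, both copies agree again.
store.replicate()
after = (store.primary.data["followers:alice"],
         store.replica.data["followers:alice"])
```

Between the second write and the second replicate() call, a reader gets 11 from the primary and 10 from the replica. That window is the price paid for not making every write wait on every copy.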
But advocates say they don't use their NoSQL systems for transactions, and that asynchronous updates yield many performance and scaling advantages. The NoSQL systems avoid joins, a precise but performance-inhibiting feature of relational systems that combines related data from different database tables.
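As a rough illustration of what avoiding joins means, the sketch below contrasts two layouts in plain Python. The table names, fields and helper functions are all hypothetical; the point is only that a document can embed related data that a relational design would keep in a second table and look up at query time.

```python
# Relational-style layout: posts refer to authors by key (hypothetical data).
authors = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
posts = [
    {"id": 10, "author_id": 1, "title": "Hello"},
    {"id": 11, "author_id": 2, "title": "Scaling out"},
]


def posts_with_authors(post_rows, author_rows):
    """Join step: look up each post's author in the second table."""
    return [
        {"title": row["title"], "author": author_rows[row["author_id"]]["name"]}
        for row in post_rows
    ]


# Document-style layout: each post embeds the author data it needs.
post_documents = [
    {"id": 10, "title": "Hello", "author": {"name": "Ada"}},
    {"id": 11, "title": "Scaling out", "author": {"name": "Grace"}},
]


def titles_and_authors(documents):
    """No join: everything needed is already inside each document."""
    return [
        {"title": doc["title"], "author": doc["author"]["name"]}
        for doc in documents
    ]


joined = posts_with_authors(posts, authors)
embedded = titles_and_authors(post_documents)
```

Both calls produce the same result. The embedded layout reads one record per post, but the author's name is now duplicated in every post and must be updated in each copy if it changes.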
"We don't think this is going to be a small niche. It will creep out into the enterprise," Merriman said in an interview as 10gen launched its support offerings March 29. When his company first made MongoDB available as a free download last year, downloads numbered a few hundred a month. Traffic has since built rapidly to 30,000 downloads a month, he said. The activity reflects an interest in systems "where you want high performance, moving binary data (such as unstructured documents or video) is a really good fit," Merriman said. MongoDB won't be used in futures trading systems because it does not focus on completing long-running transactions, he added.
"It's clean and elegant and developers like using it," Merriman said. It's in production use at SourceForge, Electronic Arts, the New York Times, and Boxed Ice.
10gen will charge $5,000 per server for a year of "gold-level," around-the-clock MongoDB support. Basic and silver support are also available.
IDC analyst Carl Olofson thinks the architecture's got potential.
"This technology will be used for such activities as data warehouse definition and preparation, master data management initiatives and large scale enterprise data indexing and cataloging projects," Olofson wrote in a research note.