It's an aggressive step forward just three weeks after the products were announced at Oracle OpenWorld in San Francisco. But it comes as Oracle rivals are also embracing emerging platforms for large-scale data processing. Microsoft announced October 12 that it, too, would release software based on the open source Apache Hadoop project. IBM and EMC each released Hadoop-based software earlier this year, and Oracle's planned Big Data Appliance will include both the NoSQL database and an Oracle distribution of Apache Hadoop.
Oracle's NoSQL Database is available for download on the Oracle Technology Network. The software is based in part on the open source Berkeley DB database, which Oracle acquired along with Sleepycat Software in 2006. But where Berkeley DB is a single-node database, Oracle NoSQL adds a new programming interface and support for partitioning data for highly distributed processing.
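To make the partitioning idea concrete, here is a minimal Python sketch of hash-based key partitioning, one common way distributed key-value stores spread records across nodes. It is not Oracle's implementation, which the article does not describe; the function names `partition_for` and `distribute`, the partition count, and the sample keys are all assumptions made for illustration.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition number using a stable hash (illustrative only)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap it
    # into the range 0..num_partitions-1.
    return int.from_bytes(digest[:8], "big") % num_partitions


def distribute(keys, num_partitions):
    """Group keys by the partition each one hashes to."""
    partitions = {p: [] for p in range(num_partitions)}
    for key in keys:
        partitions[partition_for(key, num_partitions)].append(key)
    return partitions


if __name__ == "__main__":
    # Hypothetical keys; a real workload might use clickstream or meter IDs.
    sample_keys = [f"user/{i}" for i in range(1000)]
    layout = distribute(sample_keys, 4)
    for partition, members in layout.items():
        print(partition, len(members))
```

Because the hash is stable, the same key always lands on the same partition, which is what lets a client route a read or write to the right node without consulting a central index.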
"We've tested the database across multiple hundreds of nodes and it scales very well," said Marie-Anne Neimat, VP of database development at Oracle. "It's targeted at customers who need the scalability to do things like track Web clickthroughs, smart-meter data, network management, or personalization."
The primary attractions of NoSQL databases are their scalability and schema flexibility. The latter lets organizations add and exploit new data attributes as needed. In contrast, the schema (or data model) for a conventional relational database such as IBM DB2, Microsoft SQL Server, MySQL, or Oracle Database has to be revised with each change in data.
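The contrast can be sketched in a few lines of Python. In this illustration a plain dictionary stands in for a schema-flexible key-value store, and SQLite (from the Python standard library) stands in for a conventional relational database; neither is a product discussed here, and the `profiles` table and `favorite_genre` attribute are hypothetical.

```python
import sqlite3

# Schema-flexible side: a dict stands in for a key-value store.
profiles = {}
profiles["user/1"] = {"name": "Ada"}
# A later record carries a new attribute; earlier records are untouched.
profiles["user/2"] = {"name": "Grace", "favorite_genre": "jazz"}

# Relational side: the table's columns are fixed when it is created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profiles (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO profiles VALUES (?, ?)", ("user/1", "Ada"))


def insert_with_genre(connection):
    """Try to store a row that uses the new attribute."""
    connection.execute(
        "INSERT INTO profiles (id, name, favorite_genre) VALUES (?, ?, ?)",
        ("user/2", "Grace", "jazz"),
    )


try:
    insert_with_genre(conn)
    schema_change_needed = False
except sqlite3.OperationalError:
    # The column does not exist yet, so the schema must be revised first.
    schema_change_needed = True
    conn.execute("ALTER TABLE profiles ADD COLUMN favorite_genre TEXT")
    insert_with_genre(conn)

rows = conn.execute(
    "SELECT id, name, favorite_genre FROM profiles ORDER BY id"
).fetchall()
print(schema_change_needed, rows)
```

On the relational side the first insert fails until the table is altered, while the dictionary accepts the new attribute immediately. The trade-off is that the flexible store leaves it to the application to cope with records that lack the attribute.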
Fast-moving, dynamic businesses such as large-scale e-commerce companies and social networks have been big users of NoSQL products. Facebook, for example, runs on open source Cassandra, a transactional NoSQL database that enables it to make frequent schema changes and add new attributes and features to profiles and social network interactions.
Another attractive attribute of Cassandra and other open source products is their low cost, as they're designed to scale out on commodity hardware.
"There's an order-of-magnitude difference in the speed, performance, and cost of deploying conventional relational databases and Cassandra," said Billy Bosworth, CEO at Cassandra enterprise support and system monitoring and management software provider DataStax.
As an example, DataStax customer Constant Contact was considering a $2.5 million investment in a relational database approach that would have taken nine months to deploy; it ended up choosing Cassandra and deploying a system in three months for $250,000, according to Bosworth.
Oracle's NoSQL database software can run on commodity hardware, but Bosworth charged that Oracle's main goal is to deliver the product as part of the Big Data Appliance, which is yet another Oracle engineered system based on Sun hardware and designed to complement the Exadata product line. Oracle has not discussed pricing of the Big Data Appliance, but Bosworth said he expects high prices and the threat of vendor lock-in will ensure continued interest in alternatives such as Cassandra.
"Oracle clearly wants to take you into their whole red stack, but we think plenty of people will say, 'I need some leverage against Oracle, just purely from a procurement standpoint," Bosworth said. "The second point is that Cassandra is very mature, and we've had companies running very mission-critical applications for some time." Netflix, for example, runs large parts of its infrastructure on Cassandra and has done so for more than a year, Bosworth said.
In addition to running the NoSQL database, Oracle's coming appliance will include an open source distribution of Apache Hadoop software, which Oracle said it will back with enterprise service support. Oracle will also bundle in Oracle Linux and the Oracle Java HotSpot Virtual Machine, and it will license a new Oracle Data Integrator that will tap into Hadoop.
The Big Data Appliance itself will be a cross between the Exadata Database Machine and the Exadata Storage Expansion Rack, mixing x86 processing power with high-capacity disk storage. A full rack will provide up to 432 terabytes of storage.
In a statement, Oracle claimed its new NoSQL database will be easier to install, configure, and manage than competitive offerings, and it touted the security blanket of Oracle support.