Oracle on Monday confirmed its rumored plan for a Hadoop appliance, but that's just the beginning of a comprehensive big-data strategy the company unveiled at its annual OpenWorld conference in San Francisco. The company also will release Hadoop data-management software, a NoSQL database, and an enterprise-focused release of R analytics software, and will back it all with Oracle support.
The breadth of the announcement was surprising given that less than a year ago Oracle was discounting the importance of the NoSQL (not only SQL) movement. With the Oracle Big Data Appliance, the Exalytics Business Intelligence Machine, and a battery of planned software announced at the event, Oracle will take on the full spectrum of data, including fast-growing data types such as Web log files, social media data, and mobile data with geospatial information.
The big caveat: No actual release dates were announced for the Big Data Appliance or related software. If that turns out to be a long way off, Oracle will effectively be stalling for time, dissuading Oracle customers from experimenting with third-party NoSQL and Hadoop alternatives.
The Big Data Appliance is a cross between the Exadata Database Machine and the Exadata Storage Expansion Rack. It mixes x86 processing power with high-capacity disk storage. A full rack will provide up to 432 terabytes of storage. Because release dates and pricing have yet to be announced, there's no telling whether it will be competitive with the rock-bottom costs of the commodity hardware deployments usually associated with Hadoop.
The appliance will include an open-source distribution of Apache Hadoop software, which Oracle says it will back with enterprise service support. Oracle will also bundle in Oracle Linux and the Oracle Java HotSpot Virtual Machine, and it will license a new Hadoop-supportive version of Oracle Data Integrator (ODI). Modules added to the data-integration software are said to automatically generate the Java code needed to transform data and run MapReduce processes on Hadoop. There's also a new Oracle Loader for Hadoop that will move data from the Big Data Appliance back into Oracle data warehouses.
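For readers unfamiliar with the MapReduce pattern that such generated code targets, here is a minimal sketch in plain Python. It is not ODI output and does not use any Hadoop API; the function names, the three-field log format, and the sample data are all assumptions made for illustration. It counts requests per path in a handful of Web log lines by mapping each line to a key-value pair, grouping the pairs by key, and reducing each group to a total.

```python
from collections import defaultdict

# Hypothetical log format (assumption): "<method> <path> <status>"
SAMPLE_LOG = [
    "GET /home 200",
    "GET /cart 200",
    "GET /home 304",
]

def map_phase(line):
    """Map step: emit (path, 1) for one log line; skip malformed lines."""
    parts = line.split()
    if len(parts) >= 2:
        yield parts[1], 1

def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce step: collapse one key's values into a single count."""
    return key, sum(values)

def run_job(lines):
    """Run map, shuffle, and reduce in one process (Hadoop distributes these)."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

if __name__ == "__main__":
    print(run_job(SAMPLE_LOG))
```

On a real cluster the map and reduce steps run in parallel across many machines, which is the part that takes the Hadoop expertise discussed here.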
The Big Data Appliance could, in theory, stand on its own. But Andy Mendelsohn, Oracle's senior vice president of server technologies, says the appliance is intended to work together with Oracle Exadata, to which it can be tied with fast InfiniBand connections. The ODI code-generation and loading capabilities are intended to make Hadoop more accessible to mainstream data-management professionals. But Mendelsohn acknowledged that advanced applications such as customer sentiment analysis will require much deeper Hadoop expertise.
"Hadoop is way up there on the hype cycle, but the reality is that there are not a lot of people that can use it as yet," Mendelsohn said. "It's a niche technology today, but over the next three to five years, we'll be working on making it more accessible to less-sophisticated organizations."
The Big Data Appliance software portfolio will also include the Oracle NoSQL Database. Based on the open-source Berkeley DB product acquired with Sleepycat Software in 2006, the new product is a key-value store capable of interpreting new data on the fly without a predefined relational schema. As such, the database is said to provide the flexibility and scalability associated with NoSQL databases. Oracle claims it will be easier to install, configure, and manage than competitive offerings, with the added security blanket of Oracle support.
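To make "no predefined relational schema" concrete, here is a minimal in-memory sketch of the key-value idea in Python. It is not the Oracle NoSQL Database API or the Berkeley DB API; the class, method names, and sample records are hypothetical. The point is that each value is stored under its key as-is, so two records can carry different fields without any table definition being changed.

```python
class KeyValueStore:
    """Toy key-value store (illustrative assumption, not a real product API)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Store any value under a key; no schema is checked."""
        self._data[key] = value

    def get(self, key, default=None):
        """Return the value for a key, or a default if it is absent."""
        return self._data.get(key, default)

    def delete(self, key):
        """Remove a key; return True if it existed."""
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
# The two records have different shapes; neither required a schema change.
store.put("user:1", {"name": "Ada"})
store.put("user:2", {"name": "Lin", "last_location": (37.78, -122.40)})

if __name__ == "__main__":
    print(store.get("user:2"))
```

A production store adds persistence, replication, and partitioning of keys across servers, none of which this sketch attempts.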
Oracle's planned distribution of the open-source R statistical environment will be adapted for use on large-scale data within the Oracle database, rather than on the desktops and laptops where analysts typically use the software. Oracle R Enterprise will run existing R applications and use the R client directly against data stored in Oracle Database 11g. This will vastly increase scalability, performance, and security, according to Oracle, and it comes with the promise of software support. Oracle will ship the open-source distribution along with Linux. Separate R packages with database-specific extensions for Oracle 11g will be bundled with that database.
Everything associated with the Big Data Appliance is also intended to work hand in hand with the Exalytics Business Intelligence Machine that Oracle announced on Sunday. Oracle filled in missing details about the in-memory appliance on Monday, noting that the machine will run Oracle Business Intelligence Foundation software. That means Exalytics will be able to support all current Oracle Business Intelligence Enterprise Edition (OBIEE) applications unchanged.
The appliance also runs an in-memory, parallel-processing version of the Oracle Essbase OLAP database, so it will support existing Corporate Performance Management applications without changes. That's a marked contrast to SAP's HANA appliance, which to date has required a new generation of purpose-built in-memory applications.
Paired with the Big Data Appliance and R software, Exalytics will bring instantaneous in-memory analysis to bear on the results of Hadoop MapReduce jobs and on R statistical and predictive models and graphical analyses. The results can then be delivered through the dashboarding and reporting capabilities of OBIEE.
One way or another, Oracle had to act, as competitors including EMC, IBM, and Teradata have been stepping up their games in analytics and big data analysis. Oracle had to check those boxes. We'll see how soon it will actually become an active participant in the emerging NoSQL and big data markets.