The big data news kicked off this week's IBM Information OnDemand (IOD) conference in Las Vegas, where the company also announced a new iPad app for IBM Cognos, geospatial analysis capabilities added to SPSS Statistics software, and new information integration and master data management (MDM) capabilities supported by two InfoSphere server software upgrades.
IBM introduced its InfoSphere BigInsights software in May. The software package includes a distribution of Apache Hadoop, the Pig language for MapReduce programming, connectors to IBM's DB2 database, and IBM BigSheets, a browser-based, spreadsheet-metaphor interface for exploring data within Hadoop. Hadoop's key appeals are its scalability and its flexibility in handling fast-growing and non-relational data such as social network comments, weather data, log files, genomic data, and even video.
Wind turbine manufacturer Vestas, a new IBM customer highlighted at the IOD event, is using BigInsights software to analyze weather data to figure out the optimal placement of individual turbines and large wind farms.
IBM's move to support big data processing in the cloud comes just weeks after rivals Oracle and Microsoft took their first steps toward embracing Hadoop. Oracle announced in early October that it will introduce its own distribution of Apache Hadoop software, and on Monday it confirmed that an Oracle Big Data Appliance that will run the software will be available in the first quarter of 2012.
Microsoft announced October 12 that it, too, will release an Apache Hadoop-based software distribution (expected some time in 2012, though no quarter was specified). It also announced it will launch a beta Hadoop service on its SQL Azure cloud computing platform before the end of this year.
EMC, IBM, Oracle, Microsoft, and other data warehousing vendors have added integrations with Hadoop so they can move data to and from their relational databases. In what's described as a differentiator, IBM sees two opportunities for the use of big data, according to Rod Smith, the company's vice president of emerging Internet technologies.
"The integration of big data into data warehousing environments is important, but the other constituent is the business professional," said Smith in an interview with InformationWeek. "How do people who don't want to learn Pig, program Java, or master Hadoop interact with big data?" Smith said IBM is alone in offering a simple interface like BigSheets as a way for business people to analyze large data sets.
IBM is going one step further to make big data processing accessible by putting BigInsights on its SmartCloud (with basic and enterprise versions), so organizations can learn and experiment with big-data processing and analysis without investing in supporting hardware or hiring deployment experts. Customers will be able to set up and move data into Hadoop clusters in less than 30 minutes, according to the company, and data-processing rates will start at 60 cents per cluster, per hour.
No details were offered on the corresponding cost of maintaining big data sets in the cloud, but you can be sure that storage is not free. Developer sandbox environments are available for both the basic- and enterprise-level services.
IBM is beating Microsoft and Oracle to market with a production-ready cloud service (whereas Microsoft's year-end release will be a beta service), but it is not the first company to make Hadoop available in the cloud. Amazon released a Hadoop-based Elastic MapReduce service on its Elastic Compute Cloud last year.
In other news at the IOD conference, a new IBM Cognos Mobile iPad app, available for download from the Apple App Store, is said to deliver a highly visual business intelligence experience. It lets Cognos software users view existing reports, dashboards, and scorecards related to sales, customers, financial data and other attributes. The app is said to support both online and offline viewing, and it's an improvement over a previously available Cognos mobile support option for iPad.
SPSS Statistics release 20.0, another IOD announcement, includes a new mapping feature that lets users add geographic dimensions to analyses and reports so organizations can target, forecast, and plan by region, territory or other geographic breakdowns.
The 8.7 release of InfoSphere Information Server improves support for big data as both a source and target for extract/transform/load integration. The server's parallel processing engine is said to support massive scalability, an operations console helps manage system usage across all integration jobs, and a new connector for IBM Netezza is touted as ensuring optimized and balanced loading.
On the data governance front, InfoSphere Master Data Management 10 incorporates business process management software so you can embed MDM work steps into business processes. Services-based interfaces make it easier for developers to connect MDM to consuming applications. Lastly, MDM capabilities are said to support big data analytics; for instance, a shared matching engine helps maintain a single version of the truth around customer, product, supplier, employee, account and other data attributes.