Inside IBM's Big Data, Hadoop Moves - InformationWeek

InformationWeek is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them.Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

IoT
IoT
Data Management // Big Data Analytics
Commentary
4/3/2013
11:40 AM
Doug Henschen
Doug Henschen
Commentary
Connect Directly
Google+
LinkedIn
Twitter
RSS
E-Mail
50%
50%

Inside IBM's Big Data, Hadoop Moves

IBM DB2 adds in-memory analysis and compression tricks, while PureData System for Hadoop arrives as an appliance. But will IBM beat other tortoises in the Hadoop race?

13 Big Data Vendors To Watch In 2013
13 Big Data Vendors To Watch In 2013
(click image for larger view and for slideshow)

IBM is making a series of high-profile analytics announcements on Wednesday from its Almaden Research Center in San Jose, Calif., a fitting location at the epicenter of big data activity. The themes of the announcements will sound familiar because they've been the subject of announcements by plenty of competitors in recent years, but IBM contends it's setting new standards of performance.

There are two major announcements. This first is BLU Acceleration, a combination of compression, in-memory analysis and vector-processing techniques that IBM says will drive huge improvements in relational database performance. BLU is set for release in the second quarter, and it will benefit DB2 first and foremost. IBM is also bringing the technology to the Informix database with this release and, according to sources, to the Netezza database in future releases.

The second announcement is IBM PureData System for Hadoop, an appliance-based platform that customers will be able to scale up by simply adding more boxes. The hardware will run an upgraded version of IBM's InfoSphere BigInsights Hadoop distribution that's also being announced on Wednesday.

In typical IBM fashion, the announcements are loaded with bravado about the billions the company has spent acquiring software companies in recent years, but there's plenty of substance behind the buzz words.

[ Want more on big data analytics announcements? Read 6 Big Data Advances: Some Might Be Giants. ]

With BLU Acceleration, IBM is taking advantage of the same breakthroughs in low-cost memory and processing power that SAP has been talking about in connection with its Hana in-memory database. BLU is not IBM's answer to Hana, however. The focus for now is strictly on analytics and does not, as yet, address transaction processing, so it's more of a competitive response -- bar raising, IBM contends -- to the likes of Teradata, HP Vertica and EMC Greenplum, and data warehousing uses of Oracle Exadata and Oracle Exalytics.

The techniques employed by BLU include hybrid row and columnar storage, advanced compression, data skipping, vector processing and leveraging of increasingly affordable memory to speed processing. We've seen all these techniques before -- mixed columnar and row from Teradata, HP-Vertica and EMC Greenplum, data skipping from InfoBright and IBM's Netezza database, vector processing from Actian (formerly Ingress), and aggressive use of memory from multiple vendors -- but IBM is alone in putting all of these techniques together.

With BLU, IBM says it will be able to crunch 10 terabytes down to 1 terabyte; bring that 1 terabyte into memory; and effectively crunch it again down to 10 gigabytes. With the data-skipping technology, the database can then focus in on the 1 gigabyte that matters to a query without wading through repeating or irrelevant data. Your mileage may vary, as the saying goes, but IBM reports that BLU improves performance by 8X to 25X over the last DB2 release (10.1).

All these enhancements will undoubtedly reassure existing DB2 customers that the database roadmap is keeping up with state-of-the-art features. "What's novel in IBM's approach is that it's doing acceleration in several ways," analyst Robin Bloor of Bloor Group told InformationWeek. Whether it breaks new performance benchmarks and changes market share dynamics remains to be seen.

"The 25X figure is very aggressive, but when people are making purchase decisions, they do proof-of-concept benchmarks and put one database up against another," said Bloor. "You don't actually know what kind of performance you'll get until you've done the comparisons."

What IBM has yet to address with BLU, and what will likely require more extensive use of memory, is transaction processing as well as analytics. That's what SAP is doing with Hana, it's what Microsoft has announced it will do with project Hekaton (expected in 2015) and it's what Oracle is rumored to be working on for a future Oracle Database release.

"We do see an evolution of this technology beyond reporting and analytic workloads, but I can't comment on a timeframe for that," IBM's Tim Vincent, IBM fellow, VP and chief technology officer told InformationWeek. If the same pattern holds that IBM took in introducing the BLU technologies, it might wait to see what others do before attempting to do them one better.

We welcome your comments on this topic on our social media channels, or [contact us directly] with questions about the site.
Previous
1 of 2
Next
Comment  | 
Print  | 
More Insights
Slideshows
7 Technologies You Need to Know for Artificial Intelligence
Jessica Davis, Senior Editor, Enterprise Apps,  7/1/2019
Commentary
A Practical Guide to DevOps: It's Not that Scary
Cathleen Gagne, Managing Editor, InformationWeek,  7/5/2019
News
Data Science Salary Survey Reveals Market Shift
Jessica Davis, Senior Editor, Enterprise Apps,  6/27/2019
White Papers
Register for InformationWeek Newsletters
Video
Current Issue
A New World of IT Management in 2019
This IT Trend Report highlights how several years of developments in technology and business strategies have led to a subsequent wave of changes in the role of an IT organization, how CIOs and other IT leaders approach management, in addition to the jobs of many IT professionals up and down the org chart.
Slideshows
Flash Poll