Inside IBM's Big Data, Hadoop Moves - InformationWeek

4/3/2013
11:40 AM
Doug Henschen

Inside IBM's Big Data, Hadoop Moves

IBM DB2 adds in-memory analysis and compression tricks, while PureData System for Hadoop arrives as an appliance. But will IBM beat other tortoises in the Hadoop race?

IBM was ahead of both Oracle and Microsoft in embracing Hadoop, and it took a different path by introducing its own basic and enterprise distributions of its BigInsights Hadoop software in May 2011. Oracle and Microsoft entered the market in 2012 through partnerships with Cloudera and Hortonworks, respectively.

Now that IBM is announcing its own appliance, the PureData System for Hadoop, the in-house path will give it the advantage of offering a "100% IBM solution with our software distribution and our hardware," said Nancy Kopp, IBM's director of big data, in an interview with InformationWeek.

There will be two key differentiators from the Hadoop appliances that are either on the market (from EMC and Oracle) or in the works (from Teradata), Kopp said. "We saw that there's a key use case emerging for Hadoop as an archival system, so we've built archive capabilities right into the appliance," said Kopp. This will enable customers to offload data from warehouses for cold storage or archival compliance. The data is still active, however, so you can retrieve and restore to faster analytic databases.

The second differentiator, according to Kopp, is a family of analytic accelerators starting with three: one for social data, one for text analytics and one for machine data. "The accelerators will make it easier to develop applications that take advantage of these data types," said Kopp, and she added that new accelerators will join the family in the future.

Beating the likes of Oracle and Microsoft on Hadoop is one thing. The question now is whether these giants will be the tortoises that ultimately finish ahead of big data hares like Cloudera and MapR. Cloudera, in particular, is way out ahead in bringing Hadoop to large enterprises, with hundreds of deployments. By contrast, you seldom hear about BigInsights, and IBM refuses to disclose the number of customers running the software. At least one customer, MoneyGram, was set to participate in Wednesday's announcement.

IBM has addressed the same key Hadoop drawbacks that other distributors have tackled, including reliability and availability concerns tied to Hadoop's NameNode and the limited and slow SQL query capabilities of Apache Hive. On this last point, the upgraded BigInsights distribution, announced Wednesday and set for release in the second quarter, will include BigSQL, IBM's answer to SQL-on-Hadoop analysis.

EMC is set to release its remedy for Hive shortcomings with its Pivotal release later this month, but it looks like IBM will have BigSQL ahead of Cloudera's Impala, Hortonworks' Stinger and MapR's Drill initiatives.

As to the tortoise-and-hare question, Bloor says vendors that control the hardware will have advantages.

"My money would be on the boys with the iron, because they can look at the big picture, and as long as they get their pricing correct, then they're probably going to be able to do a better job than vendors that are limited to software," he said.

That suggests that IBM -- as well as EMC/VMware, HP, Intel, Oracle and no doubt others to come -- will have advantages. Which tortoise will win? We'll have to wait years to find out.
