Commentary
4/3/2013
11:40 AM
Doug Henschen

Inside IBM's Big Data, Hadoop Moves

IBM DB2 adds in-memory analysis and compression tricks, while PureData System for Hadoop arrives as an appliance. But will IBM beat other tortoises in the Hadoop race?

IBM was ahead of both Oracle and Microsoft in embracing Hadoop, and it took a different path by introducing its own basic and enterprise distributions of its BigInsights Hadoop software in May 2011. Oracle and Microsoft entered the market in 2012 through partnerships with Cloudera and Hortonworks, respectively.

Now that IBM is announcing its own appliance, the PureData System for Hadoop, the in-house path will give it the advantage of offering a "100% IBM solution with our software distribution and our hardware," said Nancy Kopp, IBM's director of big data, in an interview with InformationWeek.

There will be two key differentiators from the Hadoop appliances that are either on the market (from EMC and Oracle) or in the works (from Teradata), Kopp said. "We saw that there's a key use case emerging for Hadoop as an archival system, so we've built archive capabilities right into the appliance," said Kopp. This will enable customers to offload data from warehouses for cold storage or archival compliance. The data is still active, however, so customers can retrieve it and restore it to faster analytic databases.

The second differentiator, according to Kopp, is a family of analytic accelerators starting with three: one for social data, one for text analytics and one for machine data. "The accelerators will make it easier to develop applications that take advantage of these data types," said Kopp, and she added that new accelerators will join the family in the future.

Beating the likes of Oracle and Microsoft on Hadoop is one thing. The question now is whether these giants will be the tortoises that ultimately finish ahead of big data hares like Cloudera and MapR. Cloudera, in particular, is way out ahead in bringing Hadoop to large enterprises, with hundreds of deployments. By contrast, you seldom hear about BigInsights, and IBM refuses to disclose the number of customers running the software. At least one customer, MoneyGram, was set to participate in Wednesday's announcement.

IBM has addressed the same key Hadoop drawbacks that other distributors have tackled, including reliability and availability concerns tied to Hadoop's NameNode and the limited and slow SQL query capabilities of Apache Hive. On this last point, the upgraded BigInsights distribution announced Wednesday and set for release in the second quarter will include BigSQL, IBM's answer to SQL-on-Hadoop analysis.

EMC is set to release its remedy for Hive shortcomings with its Pivotal release later this month, but it looks like IBM will have BigSQL ahead of Cloudera's Impala, Hortonworks' Stinger and MapR's Drill initiatives.

As to the tortoise-and-hare question, Bloor says vendors that control the hardware will have advantages.

"My money would be on the boys with the iron, because they can look at the big picture, and as long as they get their pricing correct, then they're probably going to be able to do a better job than vendors that are limited to software," he said.

That suggests that IBM -- as well as EMC/VMware, HP, Intel, Oracle and no doubt others to come -- will have advantages. Which tortoise will win? We'll have to wait years to find out.
