1/30/2014
09:06 AM
Doug Henschen

16 Top Big Data Analytics Platforms

Data analysis is a do-or-die requirement for today's businesses. We analyze notable vendor choices, from Hadoop upstarts to traditional database players.

Revolutionary. That pretty much describes the data analysis time in which we live. Businesses grapple with huge quantities and varieties of data on one hand, and ever-faster expectations for analysis on the other. The vendor community is responding by providing highly distributed architectures and new levels of memory and processing power. Upstarts also exploit the open-source licensing model, which is not new, but is increasingly accepted and even sought out by data-management professionals.

Apache Hadoop, a nine-year-old open-source data-processing platform first used by Internet giants including Yahoo and Facebook, leads the big-data revolution. Cloudera introduced commercial support for enterprises in 2008, and MapR and Hortonworks piled on in 2009 and 2011, respectively. Among data-management incumbents, IBM and EMC-spinout Pivotal each has introduced its own Hadoop distribution. Microsoft and Teradata offer complementary software and first-line support for Hortonworks' platform. Oracle resells and supports Cloudera, while HP, SAP, and others act more like Switzerland, working with multiple Hadoop software providers.

In-memory analysis gains steam as Moore's Law brings us faster, more affordable, and more-memory-rich processors. SAP has been the biggest champion of the in-memory approach with its Hana platform, but Microsoft and Oracle are now poised to introduce in-memory options for their flagship databases. Focused analytical database vendors including Actian, HP Vertica, and Teradata have introduced options for high-RAM-to-disk ratios, along with tools to place specific data into memory for ultra-fast analysis.

Advances in bandwidth, memory, and processing power also have improved real-time stream-processing and stream-analysis capabilities, but this technology has yet to see broad adoption. Several vendors here offer complex event processing, but outside of the financial trading, national intelligence, and security communities, deployments have been rare. Watch this space and, particularly, new open source options as breakthrough applications in ad delivery, content personalization, logistics, and other areas push broader adoption.

Our slideshow includes broad-based data-management vendors -- IBM, Microsoft, Oracle, SAP -- that offer everything from data-integration software and database-management systems (DBMSs) to business intelligence and analytics software, to in-memory, stream-processing, and Hadoop options. Teradata is a blue chip focused more narrowly on data management, and like Pivotal, it has close ties with analytics market leader SAS.

Plenty of vendors covered here offer cloud options, but 1010data and Amazon Web Services (AWS) have staked their entire businesses on the cloud model. Amazon has the broader selection of products of the two, and it's an obvious choice for those running big workloads and storing lots of data on the AWS platform. 1010data has a highly scalable database service and supporting information-management, BI, and analytics capabilities that are served up private-cloud style.

The jury is still out on whether Hadoop will become as indispensable as database management systems. Where volume and variety are extreme, Hadoop has proven its utility and cost advantages. Cloudera, Hortonworks, and MapR are doing everything they can to move Hadoop beyond high-scale storage and MapReduce processing into the world of analytics.

The niche vendors here include Actian, InfiniDB/Calpont, HP Vertica, Infobright, and Kognitio, all of which have centered their big-data stories around database management systems focused entirely on analytics rather than transaction processing. German DBMS vendor Exasol is another niche player in this mold, but we don't cover it here as its customer base is almost entirely in continental Europe. It opened offices in the U.S. and U.K. in January 2014.

This collection does not cover analytics vendors, such as Alpine Data Labs, Revolution Analytics, and SAS. These vendors invariably work in conjunction with platforms provided by third-party DBMS vendors and Hadoop distributors, although SAS in particular is blurring this line with growing support for SAS-managed in-memory data grids and Hadoop environments. We also excluded NoSQL and NewSQL DBMSs, which are heavily (though not entirely) focused on high-scale transaction processing, not analytics. We plan to cover NoSQL and NewSQL platforms in a separate, soon-to-be-published collection.

Now dig in and learn more about these big data analytics platform vendors and how they compare.

 

Comments
D. Henschen,
User Rank: Author
2/3/2014 | 12:47:07 PM
Re: A collection of marketing flyers from 16 vendors
Excellent take, Raj. The likes of IBM, Oracle and Teradata have certainly checked the Hadoop box, but I wonder how hard they push it or whether they try to keep it in a high-scale storage role while favoring their incumbent technologies for the analysis. Cloudera and MapR are saying you can do more and challenge incumbent technologies while Hortonworks stops short of such bold claims -- clearly not wanting to challenge partners Microsoft, Teradata and SAP. The independent DBMS vendors have various strategies and capabilities around working with Hadoop, and they generally don't challenge EDW vendors -- only the high-scale data mart/analytics opportunity. All of these vendors offer "Big Data Analytics Platforms," but they're coming at it from different angles.
anon4507650351,
User Rank: Apprentice
2/3/2014 | 9:03:38 AM
Re: Important Big Data Context
Hi Laurianne, as mentioned in another post, Doug's article provides a useful analysis of the Big Data marketplace from a platforms perspective. Since we are still in the infancy of BD (vs BI), the market is changing rapidly and there will be significant consolidation in the next year or two (acquisitions of small players by the giants). I predict that the largest beneficial use of BD will be for machine learning through the Internet of Things (sensors, etc). There are currently only a small number of players focusing on this aspect of BD. It will be interesting to track the development of Hadoop, specifically whether companies go pure open source à la Hortonworks or veer more towards proprietary. I suspect the latter.

Raj
anon4507650351,
User Rank: Apprentice
2/3/2014 | 8:52:05 AM
Re: A collection of marketing flyers from 16 vendors
Srini/Doug, I have been involved in coming up with my company's Big Data strategy including a BDaaS offering and go-to-market strategy. My research, including deep dive sessions (architecturally, technically and commercially) with many of the players listed here, indicates the same findings as Doug's article. Each vendor approaches the problem from their own perspective based on their previous expertise (e.g. hardware specialisation like HP, storage perspective like EMC/Pivotal, pureplay DB/analytics, etc). Some like IBM and Oracle have thrown billions of $$ at the problem, but mostly use BD as pull through for HW/SW sales. There are a relatively small number of pure BD companies like Palantir. Most are glorified BI specialists jumping on the BD bandwagon who cannot go one level below their vaporware. There seems to be a gap in the market for an end-to-end offering which is technology independent. This is very difficult to achieve and implement since it is the large players who keep up better with the changing landscape. So overall, a good high level analysis, Doug, which allows interested parties to narrow down the playing field before commencing more deep dive analysis.

Raj
srini s,
User Rank: Apprentice
1/31/2014 | 6:10:40 PM
Re: A collection of marketing flyers from 16 vendors
Doug - What I meant by 2006 was not the freshness of the information presented, but the fact that big data is a household term (of sorts) and an authoritative article in this space in 2014 needs more depth and breadth.

With respect to the comparison on Analytic DBMS, In-memory etc, it was a good metric but a bad choice for this post. In fact, it would have been good if it was a single matrix comparing the 16 technologies (and a few more) against these options (and a few more). Still, I am not able to find out who your intended audience was. It seems to be catering to management, tech, architects and infrastructure admins partially but none fully :(

Nevertheless, I apologize for coming on strong. Take it as constructive criticism, if you can :)

Cheers

Srini

 
D. Henschen,
User Rank: Author
1/31/2014 | 3:00:19 PM
Re: A collection of marketing flyers from 16 vendors
Srini, Thanks for your comments. Actian, Cloudera, Hortonworks, MapR, and Pivotal didn't exist in 2006, and most have arrived since 2010. Among the giants, IBM, Microsoft, Oracle, SAP and Teradata have only added support for Hadoop in the last two to three years. "Connecting elements" across all 16 include insight on their offerings for analytical DBMS, in-memory options, streaming options, Hadoop distributions, and hardware/software appliances. And if you read each analysis, I think that you'll find that it's far from a regurgitation of marketing brochures. There are plenty of insights into strategy, market approach, strengths and weaknesses and more.  
srini s,
User Rank: Apprentice
1/31/2014 | 2:36:25 PM
A collection of marketing flyers from 16 vendors
This is a shallow post to introduce a newbie to a biased view of what is available in the market (on Hadoop and its related technology) if you want to venture into big data. Instead of sitting through a marketing presentation of each of these vendors, or going through their websites, you can see it here. I would have personally preferred the links to the home page of these technologies. It doesn't seem to take an unbiased approach on merits and pitfalls. Why would you choose one over the other? I would have liked a comparison of the technologies based on their focus (which of the VVV each attacks), the TCO, and some quantifying of the BIG and qualifying of the ANALYTICS in big data analytics. Both Cloudera and Hortonworks are there. We have HP and IBM too and AWS... meh! Could have included some SAN storage providers too :) There seems to be no plane for comparison. There is no connecting element between pages. I would have liked a better organization of thoughts.

This would have been a ground breaking post in 2006. But now.. pass!

-Srini

 
D. Henschen,
User Rank: Author
1/31/2014 | 2:03:48 PM
Read the introduction
This is about platforms for big-data analysis -- as in DBMSs and Hadoop -- and I state very clearly in the intro that it does not address analytics companies -- SAS, Qlik, and others you mention -- that focus almost entirely on analytics alone and that tend to work with these platforms. Nor does this address NoSQL or NewSQL databases, which we'll address in a separate collection.
EB Quinn,
User Rank: Apprentice
1/31/2014 | 8:00:42 AM
Only 16?
There are many more of these "top" solutions, and some of these on the "top" list make no sense whatsoever. What are the criteria? Does it have to include Hadoop? MPP? Advertiser? Market share? Some of the vendors you have included have very little market share, like Pivotal and Microsoft (unless you are including Excel). Where is SAS? Palantir? Qlik? Platfora? They may not have Hadoop distributions, but they can work with Hadoop just fine (and Platfora is natively based on Hadoop). Agree with the comment that HPCC should be on this list, way higher than a bunch of the others in terms of proven high end analytics with actual customers. What is a "platform?" Cringing.

 

 

 
HM,
User Rank: Strategist
1/30/2014 | 3:03:02 PM
Big Data Solution
Doug, one other open source technology worth considering at the top of the decision tree is HPCC Systems from LexisNexis, a data-intensive supercomputing platform for processing and solving big data analytical problems. Their open source Machine Learning Library and Matrix processing algorithms assist data scientists and developers with business intelligence and predictive analytics. Its integration with Hadoop, R and Pentaho extends its capabilities further, providing a complete solution for data ingestion, processing and delivery. In fact, both libhdfs and webhdfs implementations are available. More at http://hpccsystems.com/h2h

 
 
cbabcock,
User Rank: Strategist
1/30/2014 | 2:41:29 PM
Revolutionary times, now and then
Doug, you're right, this is a revolutionary time in data management. The last time was when relational databases first appeared. I remember sitting through meeting after meeting with then-major vendors, Software AG of North America in Reston, Va. (Adabas), and John Cullinane, CEO at Cullinane Software (IDMS). They were furious that their partner, IBM, was coming into "their" data management market. "There's no need for IBM to do that." Well, IBM had invented ad hoc data handling and SQL queries. "Their" systems didn't do that. It was a quick lesson in how vendors get flanked. NoSQL systems don't have the elegance of relational's design, but they're too useful with unstructured data to get swept back under the rug.