Big Data. Big Decisions
InformationWeek
Special Coverage Series


Hadoop Spurs Big Data Revolution

What's Ahead?

Companies already using Hadoop invariably have bigger plans. AOL is moving critical applications to its 700-node production environment, which is described as a highly reliable and controlled deployment, providing data down to granular levels of detail. The 300-node R&D environment is where many of the company's most advanced Ph.D. analytics experts work on cutting-edge projects. Cloudera provides the enterprise support for both deployments, helping AOL with bug fixes, software upgrades, and service problems.

At ComScore, it will be several months before Hadoop can scale up and replace its data processing grid, Brown says. That move was delayed in part because ComScore switched from Cloudera's Hadoop distribution to MapR's, which ComScore licensed through EMC Greenplum. MapR's version of Hadoop will let ComScore switch from HDFS to the more mature and widely used Network File System. NFS will enable the company to easily move data back and forth among Hadoop, Sybase IQ, and other data sources and systems, something it couldn't do with HDFS, Brown says.

EMC and partner MapR introduced new Hadoop software and support options this spring, as did IBM with its BigInsights offering. IBM partner Karmasphere, which provides Hadoop development and analytics tools, recently introduced a virtual appliance for BigInsights, designed to speed development of MapReduce jobs and related analytics projects. Microsoft has promised a Windows Server-friendly distribution of Hadoop supported by Yahoo spin-off Hortonworks, another enterprise-focused Hadoop tools and support provider. It's a safe bet that Oracle, too, will find ways to differentiate its Hadoop offering beyond the promised delivery of the Oracle Big Data Appliance.

Only the largest vendors have had the chutzpah to announce their own Hadoop software distributions and support plans. But dozens of others have added integrations and support tools, so they can move data into and out of Hadoop and analyze data sets after they're boiled down by MapReduce processing. That list includes data warehouse vendors Hewlett-Packard, ParAccel, and Teradata; data integration vendors Informatica, Pervasive, Talend, and Syncsort; and business intelligence and analytics vendors Jaspersoft, Pentaho, and SAS.

The latest wave of Hadoop announcements is coming from application developers and service providers. Amazon has offered a Hadoop-based service on its Elastic Compute Cloud since 2009. IBM launched a BigInsights service on its SmartCloud Enterprise platform in October. And Microsoft is promising a beta Hadoop-based service on the SQL Azure cloud platform by year's end.

Hadoop's Many Pieces

Hadoop Subprojects
Hadoop Common: Common utilities that support the other Hadoop subprojects
Hadoop Distributed File System: Distributed file system that provides high-throughput access to application data
Hadoop MapReduce: Software framework for distributed processing of large data sets on compute clusters

Other Hadoop-Related Apache Projects
Chukwa: Data-collection system for managing large distributed systems
HBase: Scalable, distributed database that supports structured data storage for large tables
Hive: Data warehouse infrastructure that provides data summarization and ad hoc querying
Mahout: Scalable machine learning and data mining library
Pig: High-level data-flow language and execution framework for parallel computing
ZooKeeper: High-performance coordination service for distributed applications

Data: Apache Software Foundation

SunGard plans to launch a Hadoop-based managed service that will let customers run MapReduce jobs. No word on when, but CTO Indu Kodukula says the company will run MapR software on EMC Greenplum's modular appliance. It will aim the service at customers that expect to operate 100 TB or more of data but aren't ready to commit to building out their own infrastructure to support Hadoop.

"Most of the requests that we've received to support Hadoop come from large financial customers that have an enormous amount of data and interest in blending in external sources, but they don't entirely know whether the results are going to be meaningful," Kodukula says. Rather than spending first and risking failure, they'd rather experiment with a managed service, he says.

On the apps front, Tidemark introduced an innovative cloud-based performance management application in October built on an "elastic computation grid based on in-memory technology coupled with Hadoop MapReduce processing." That's a mouthful, but it's simpler than it sounds. The in-memory technology is used for the fast analyses you expect in a performance management app (think Cognos TM1, QlikTech, SAP Hana, and Tibco Spotfire-style financial analyses delivered via the cloud). The Hadoop MapReduce part speeds answers to big data problems and blends mixed data types that might not conform to a fixed schema.

Tidemark customer U.S. Sugar, for example, is mixing weather data with the information it gets from growers related to seeds, chemical treatments, and acres planted to better understand and predict crop production. And Acosta, a marketing services firm that works with consumer products companies, is analyzing consumer sentiments expressed in social media to do a better job of stocking products in support of marketing campaigns.
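
To make the idea of blending mixed data with MapReduce concrete, here is a minimal, single-machine sketch in Python. It is not Tidemark's or U.S. Sugar's implementation and does not use Hadoop's API; the record fields, region names, and figures are all hypothetical, invented only to show the map, shuffle, and reduce steps that a framework such as Hadoop MapReduce would distribute across a cluster.

```python
from collections import defaultdict

# Hypothetical records; none of these values come from the article.
weather = [
    {"source": "weather", "region": "north", "rain_mm": 120},
    {"source": "weather", "region": "south", "rain_mm": 80},
]
growers = [
    {"source": "grower", "region": "north", "acres": 500},
    {"source": "grower", "region": "north", "acres": 300},
    {"source": "grower", "region": "south", "acres": 200},
]


def map_record(record):
    """Map step: emit a (key, value) pair keyed by region."""
    yield record["region"], record


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_region(region, records):
    """Reduce step: combine both record types for one region."""
    acres = sum(r["acres"] for r in records if r["source"] == "grower")
    rain = [r["rain_mm"] for r in records if r["source"] == "weather"]
    avg_rain = sum(rain) / len(rain) if rain else None
    return {"region": region, "total_acres": acres, "avg_rain_mm": avg_rain}


def run_job(records):
    """Run map, shuffle, and reduce in one process for illustration."""
    pairs = (pair for rec in records for pair in map_record(rec))
    groups = shuffle(pairs)
    return [reduce_region(key, vals) for key, vals in sorted(groups.items())]


results = run_job(weather + growers)
for row in results:
    print(row)
```

The two inputs share no schema beyond a common key, which is the point: each record passes through the map step unchanged, and the reduce step decides how to combine whatever arrives for a given region.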

All this support for Hadoop will naturally encourage broader experimentation and is likely to boost adoption. According to a recent InformationWeek survey of 431 business technology professionals involved with information management tools, only about 3% have made extensive use of Hadoop or other NoSQL platforms, while 11% have made limited use of them (see chart, below). With all the hype around Hadoop, those figures should begin to rise.

Chart: Limited Hadoop Use -- So Far

It may be that we're at the apex of Gartner's hype cycle, so beware the trough of disillusionment in the months ahead. For one thing, expect a cacophony of confusing commercial messages. Customer success stories and emerging applications will be the best way to gauge Hadoop's progress.

Once Hadoop is proven and mission critical, as it is at AOL, its use will be as routine and accepted as SQL and relational databases are today. It's the right tool for the job when scalability, flexibility, and affordability really matter. That's what all the Hadoopla is about.

Read the sidebar:
Hadoop's Flexibility Wins Over Online Data Provider
