Big Data. Big Decisions
InformationWeek
Special Coverage Series

Commentary

Doug Henschen

Executive Editor, InformationWeek

Sears Hadoop Plans: Check Out Data Warehousing's Future

Will Hadoop become the new enterprise data warehouse? Sears' CTO is not alone in seeing a shift in how we'll use relational databases.

Radio did not spell the end of newspapers, nor television the end of radio, nor the Internet the end of television. But each advance fundamentally changed the use of the prior platform. And so it will be with Hadoop and relational databases.

If the example of Sears can serve as our guide, Hadoop will become a popular central corporate data repository -- perhaps even the leading data repository eventually. It will take over that role not only because it can handle huge volumes of data more cost-effectively than relational databases, but also because it easily ingests varied and complex data without first conforming it to a predefined schema, as you have to do when using a database. You can save all your data for the long term and apply a schema when you need to use it, rather than imposing a schema before the data is loaded onto the platform.
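
The idea of applying a schema at read time can be illustrated with a small sketch. It is plain Python rather than anything Hadoop-specific, and the records, field names, and the `read_with_schema` helper are all hypothetical, invented here for illustration. The raw lines are stored untouched, and two different schemas are applied to the same stored data only when it is read.

```python
import json

# Hypothetical raw records, stored exactly as they arrived.
# Fields vary from record to record; nothing was conformed at load time.
RAW_LINES = [
    '{"store": "0042", "sku": "A1", "amount": "19.99", "ts": "2012-10-01"}',
    '{"store": "0042", "sku": "B7", "amount": "5.00", "coupon": "FALL10"}',
    '{"store": "0107", "sku": "A1", "amount": "19.99"}',
]


def read_with_schema(lines, schema):
    """Apply a schema at read time.

    `schema` maps a field name to a (converter, default) pair. Fields not
    named in the schema are ignored; missing fields take the default.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            field: convert(record[field]) if field in record else default
            for field, (convert, default) in schema.items()
        }


# Two different views over the same raw data, defined only when needed.
SALES_SCHEMA = {"store": (str, None), "amount": (float, 0.0)}
COUPON_SCHEMA = {"sku": (str, None), "coupon": (str, None)}

sales = list(read_with_schema(RAW_LINES, SALES_SCHEMA))
coupons = list(read_with_schema(RAW_LINES, COUPON_SCHEMA))
```

A relational load would have required choosing one table layout up front. Here the coupon field, which only some records carry, costs nothing to keep and can be picked up later by any schema that asks for it.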

At Sears, Hadoop was first deployed three years ago, and it has since become the central hub of all data management activity for the retailer. CTO Phil Shelley tells InformationWeek that Hadoop is giving Sears the flexibility and scale to make use of all the company's data. "We keep all the raw, transactional data, and because there's enough horsepower in Hadoop, you can then transform it into any form you want whenever you want on the fly rather than having to create cubes or aggregations," Shelley explains.

Hadoop has essentially become the enterprise data store at Sears, but that's not quite the same thing as an enterprise data warehouse. The difference is analysis, some of which can be done with the batch, MapReduce processing native to Hadoop. But the retailer is still using relational databases in many situations. InfoBright's columnar database, for example, is used for fast analysis of data aggregations that used to be created -- with much IT time and expense -- as multi-dimensional OLAP cubes. Cube building is now a thing of the past. Instead, fresh data sets are moved from Hadoop into InfoBright on a daily basis.
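
The batch pattern described above, in which raw transactions are boiled down into an aggregate that can then be handed to a faster analytic store, follows the map, shuffle, and reduce shape. The sketch below imitates that shape in plain Python. It does not use the Hadoop API, and the transaction data and function names are hypothetical, chosen only to illustrate the flow.

```python
from collections import defaultdict

# Hypothetical raw transactions: (store, day, amount in cents).
TRANSACTIONS = [
    ("0042", "2012-10-01", 1999),
    ("0042", "2012-10-01", 500),
    ("0107", "2012-10-01", 1250),
    ("0042", "2012-10-02", 725),
]


def map_phase(records):
    """Emit one (key, value) pair per raw record."""
    for store, day, cents in records:
        yield (store, day), cents


def shuffle(pairs):
    """Group values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


# The aggregate is rebuilt from raw data on each batch run; a result
# like this is what would be exported to a downstream database.
daily_totals = reduce_phase(shuffle(map_phase(TRANSACTIONS)))
```

Because the raw records are retained, changing the grouping key (say, by SKU instead of by store and day) means rerunning the job with a different map function rather than redesigning a cube.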

In another example, Sears' massive Teradata deployment continues to run high-scale, mission-critical analytical applications. "Teradata is still an important platform for us whenever we need a high-speed SQL interface," explains Shelley. "That could be when we're integrating with SAS [analytics] or doing custom analytics with SQL."

That puts Teradata in the role of analytic data mart, however, as opposed to its usual place as the enterprise data warehouse that holds all important data. Nonetheless, Sears is using more Teradata than ever, says Teradata, and perhaps that's because Hadoop enables the retailer to store and retain more data than ever. Sears is now saving data that it used to throw out, and it's indefinitely retaining data that it used to keep for only 90 days or two years. More data for analysis brings more analysis.

Lots of Hadoop users share Shelley's perspective on how it can become a central hub for data management -- longtime Hadoop shop JP Morgan Chase started envisioning this role years ago. In fact, at last month's Strata New York event it seemed that the focus on Hadoop has shifted. The questions are no longer "what is Hadoop" and "does it make sense for my company?" People are now asking, "do I have the people I need to run Hadoop," and "how will I analyze and make use of all that information?"

For now, moving boiled-down data sets from Hadoop into existing relational environments will be part of the answer, but that approach involves data-movement delays that plenty of practitioners would like to avoid. "The BI industry has still got its head in the sand mostly because they're all still thinking about moving and copying data," Shelley tells InformationWeek. "These vendors need to get their act together and write tools that run natively on Hadoop and don't copy the data and use ETL to move it into their environment."
