Comments
Big Data Debate: End Near For ETL?
SurajS949,
User Rank: Apprentice
7/11/2014 | 12:10:52 PM
My easier choice ...
Further to this thread: sitting between (or before) Hadoop and a traditional DW is IRI CoSort (www.iri.com), which doesn't rely on more machines or costly big ETL packages. It combines transforms in the file system on huge file and relational sources, and it handles semi-structured and unstructured data. With its Eclipse front end, it also addresses the issues Informatica's CTO correctly identifies. ELT isn't the way to go either, since big transforms either tax the DB (and thus query response) or require a costly appliance (which, like Hadoop, throws hardware at a software problem). The benefits of a paradigm shift to a new IT fabric are not always worth the risk.
icokruger,
User Rank: Apprentice
1/22/2014 | 7:26:59 AM
ETL

"Hadoop is not a Data Integration Solution," I will describe the gaps between Hadoop and a proper data integration solution. To be sure, there are many gaps in Hadoop when compared with a traditional data integration solution. But what is it about the Hadoop infrastructure that attracts such interest despite these significant gaps? There is a reason Sears has made the decisions it has. There is a reason why many more organizations are aggressively pushing forward to integrate data in Hadoop despite its functional gaps.

 

In the era of Big Data, Hadoop's architecture is fundamentally superior for supporting many of the most commonly deployed data integration functions. First and foremost, it can deliver the scale and compute capabilities required to support the information the business demands at a cost that is sustainable. For this reason, organizations are flocking to Hadoop even if key functional capabilities must be written by hand today. Hadoop makes it easy to scale computing power horizontally with low-cost components. This architectural benefit is absolutely core to successfully performing the large-scale ETL required for processing Big Data. Hadoop's ability to persist data (lots of it, in any format) is a new architectural component long missing from traditional data integration platforms. More importantly, this architecture looks like it will also support a broader range of data integration functions.

 

The compute and analysis capabilities of the Hadoop architecture support the requirements of data profiling and data quality. In many ways, data profiling and quality are Big Data problems, particularly with today's growing data sets. We are testing this in our ETL solution at NSFAS: why profile a sample when I have the entire dataset? The ability to support metadata seems obvious, and while HCatalog is immature, it is evolving. Witness the introduction of Navigator 1.0 in Cloudera's 4.2 release, which provides basic data governance capabilities. Not only does the core architecture support advanced data integration functionality, it also offers a superior framework for building it, enabling vendors to deliver these features at a rapid pace.

 

The main problem Big Data creates is an architectural one, not a functional one. Perhaps it is fair to say that today, Hadoop is not a Data Integration solution


