Comments
Sears Hadoop Plans: Check Out Data Warehousing's Future
JHADDAD3380,
User Rank: Apprentice
11/16/2012 | 2:05:06 AM
re: Sears Hadoop Plans: Check Out Data Warehousing's Future
I see Hadoop as a key component of a big data analytics strategy, one that complements and must integrate with the rest of an enterprise information management infrastructure, which may include legacy systems (such as the mainframe), relational databases, ERP and CRM systems, cloud applications, data warehouse appliances, and so on. Data volumes are growing exponentially, and the variety of data is increasing as well: social media, sensor devices, call detail records, industry-standard data (e.g. HL7 in healthcare; FIX, SWIFT, and market data in financial services), log files, and the list goes on.

It certainly makes sense to store much of the raw multi-structured and unstructured data in Hadoop rather than in a traditional relational database. However, even if you assume that more and more data will be stored in Hadoop over time, you still need to access an ever-increasing variety of data from multiple organizations, residing in different systems and formats. You then need to parse and transform that data on Hadoop before you can do any useful analysis.

I'm hearing from data scientists that about 80% of the work in a big data project is data integration. In fact, in one study of 35 data scientists, one of them stated, "I spend more than half my time integrating, cleansing, and transforming data without doing any actual analysis. Most of the time I'm lucky if I get to do any 'analysis' at all." (Kandel, et al. Enterprise Data Analysis and Visualization: An Interview Study. IEEE Visual Analytics Science and Technology (VAST), 2012). The need for data integration is greater today than it ever has been. The challenge is to make data integration easier and more productive on emerging technologies such as Hadoop. Informatica's PowerCenter Big Data Edition (http://bit.ly/U25Cn8) provides a no-code development environment to visually design data integration flows and then execute them on Hadoop so that data scientists can spend more of their time doing analysis rather than integrating data.

