10 Lessons Learned By Big Data Pioneers
August 23, 2011 02:10 PM
How can you prepare for the big data era? Consider this expert advice from IT pros who have wrestled with the thorny problems, including data growth and unconventional data.
Lesson 6: Hadoop Helpers Ease Loading and Processing Pains
The Hadoop market is expected to grow into the billions of dollars, and supporting products and integrations are quickly emerging. Well-known data-integration vendors Informatica, Pervasive Software, SnapLogic, and Syncsort, for example, have all announced products or integrations aimed at making it faster and easier to work with this young processing platform.
Pervasive Software's DataRush tool optimizes concurrent, parallel processing within Hadoop. Data provider InfoChimps uses DataRush in combination with Hadoop instances running in Amazon's Elastic Compute Cloud. InfoChimps CTO Philip Kromer says he has seen 2x to 4x performance increases in tests of DataRush involving hundreds of gigabytes, cutting 16-hour jobs down to four to eight hours. That lets InfoChimps reduce computing costs and harvest that much more data from Twitter and other non-relational data sources.
Informatica, SnapLogic, Syncsort, and others are making it possible to load, sort, and aggregate data using a single tool set across conventional databases and Hadoop deployments. A single, familiar approach and tool set should make it easier for your data management professionals to do their work.