The open source data processing platform has won over Web giants with its low cost, scalability, and flexibility. Now Hadoop will make its way into more enterprises.
There's a revolution happening in the use of big data, and Apache Hadoop is at the center of it.
Excitement around Hadoop has been building since its release as an open source distributed data processing platform five years ago. But within the last 18 months, Hadoop has taken off, gaining customers, commercial support options, and dozens of integrations from database and data-integration software vendors. The top three commercial database suppliers--Oracle, IBM, and Microsoft--have adopted Hadoop.
IBM introduced its Hadoop-based InfoSphere BigInsights software in May, and last month Oracle and Microsoft separately revealed plans to release Hadoop-based distributions next year. Both companies plan to provide deployment assistance and enterprise-grade support, and Oracle has promised a prebuilt Oracle Big Data Appliance with Hadoop software already installed.
Will Hadoop turn out to be as significant as SQL, introduced more than 30 years ago? Hadoop is often tagged as a technology exclusively for unstructured data. By combining scalability, flexibility, and low cost, it has become the default choice for Web giants like AOL and ComScore that are dealing with large-scale clickstream analysis and ad targeting scenarios.
But Hadoop is headed for wider use. It's applicable to all types of data and destined to go beyond clickstream and sentiment analysis. For example, SunGard, a hosting and application service provider for small and midsize companies, plans to introduce a cloud-based managed service aimed at helping financial services companies experiment with Hadoop-based MapReduce processing. And software-as-a-service startup Tidemark recently introduced a cloud-based performance management application that will use MapReduce to bring mixed data sources into product and financial planning scenarios.