There's a revolution happening in the use of big data, and Apache Hadoop is at the center of it.
Excitement around Hadoop has been building since its release as an open-source distributed data processing platform five years ago. But within the last 18 months, Hadoop has taken off, gaining customers, commercial support options, and dozens of integrations from database and data-integration software vendors. The top three commercial database suppliers (Oracle, IBM, and Microsoft) have all adopted Hadoop.
IBM introduced its Hadoop-based InfoSphere BigInsights software in May, and last month Oracle and Microsoft separately revealed plans to release Hadoop-based distributions next year. Both companies plan to provide deployment assistance and enterprise-grade support, and Oracle has promised a prebuilt Oracle Big Data Appliance with Hadoop software already installed.
Will Hadoop turn out to be as significant as SQL, introduced more than 30 years ago? Hadoop is often tagged as a technology exclusively for unstructured data. By combining scalability, flexibility, and low cost, it has become the default choice for Web giants like AOL and ComScore that are dealing with large-scale clickstream analysis and ad targeting scenarios.
But Hadoop is headed for wider use. It applies to all types of data and is destined to go beyond clickstream and sentiment analysis. For example, SunGard, a hosting and application service provider for small and midsize companies, plans to introduce a cloud-based managed service aimed at helping financial services companies experiment with Hadoop-based MapReduce processing. And software-as-a-service startup Tidemark recently introduced a cloud-based performance management application that will use MapReduce to bring mixed data sources into product and financial planning scenarios.
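To make the MapReduce model mentioned above concrete, here is a minimal sketch of the kind of clickstream counting a Hadoop job performs, written in the style of a Hadoop Streaming mapper and reducer. The tab-separated log format (timestamp, user, URL) and the function names are illustrative assumptions, not any vendor's actual schema; the shuffle step that Hadoop runs between the two phases is simulated locally with a sort.

```python
# Illustrative MapReduce sketch: count clicks per URL from a clickstream log.
# Assumed record format: "timestamp<TAB>user<TAB>url" (hypothetical schema).
from itertools import groupby
from operator import itemgetter

# Hypothetical sample clickstream records.
sample = [
    "2011-11-01T00:00:00\talice\t/home",
    "2011-11-01T00:00:01\tbob\t/home",
    "2011-11-01T00:00:02\talice\t/pricing",
]

def mapper(lines):
    """Map phase: emit a (url, 1) pair for every well-formed record."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            yield fields[2], 1

def reducer(pairs):
    """Reduce phase: sum counts per URL.

    Hadoop delivers each reducer its pairs grouped and sorted by key,
    which is what makes the groupby() below valid.
    """
    for url, group in groupby(pairs, key=itemgetter(0)):
        yield url, sum(count for _, count in group)

if __name__ == "__main__":
    # Local stand-in for Hadoop's shuffle: sort mapper output by key.
    shuffled = sorted(mapper(sample))
    for url, count in reducer(shuffled):
        print(url, count)
```

On a real cluster, the same mapper and reducer logic would run in parallel across many nodes over input split into blocks, which is where Hadoop's scalability advantage comes from; the program itself stays this simple.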