Will Microsoft's Hadoop Bring Big Data To Masses?
Microsoft has "spent tremendous engineering time" integrating Hadoop smoothly with the existing Windows security and management capabilities in Active Directory and Microsoft System Center, and making its data easier to access through SQL Server and Excel, Doug Leland, Microsoft general manager of SQL Server marketing, said in an interview with InformationWeek last week.
HDInsight can also run within virtual Windows Servers using Microsoft's Hyper-V hypervisor. Microsoft is trying to make that option simpler, too, Leland said, by developing templates that would act as pre-configured instances of HDInsight Server. These could be spun up or shut down at will, giving HDInsight Server the same ability that cloud services like Azure have to expand or contract a big data cluster.
Hortonworks Offers Similar Hadoop Features
Hortonworks sells a version of Apache Hadoop that already offers many of the advantages touted for HDInsight, including integrated and automated installation, server-management and data-integration modules, and an adaptation of the Apache HCatalog, which allows data to be shared among different Hadoop installations. (See the Hortonworks Data Platform data sheet for more details.)
Though portions of the Hadoop framework are perfectly capable of managing traditional relational data, Microsoft made a point of positioning SQL Server as its preferred database management system for structured data, while HDInsight big data installations manage unstructured data and federations or mergers of multiple external data sets into a larger, big data platform.
"We need to accelerate the process of enabling the masses to benefit from the power and value of Apache Hadoop in ways where they are virtually oblivious to the fact that Hadoop is under the hood," Shaun Connolly, Hortonworks' VP of corporate strategy, wrote in a blog post. "Doing so will help ensure time and energy is spent on enabling insights to be derived from big data, rather than on the IT infrastructure details required to capture, process, exchange and manage this multi-structured data."
The need to query both structured and unstructured data -- and the question of what it actually means to integrate existing data management systems with Hadoop -- are ongoing problems for big data-enamored corporations, according to Forrester analyst Boris Evelson, writing before the HDInsight announcement.
The connection between Hadoop data and Excel is enabled via ODBC or Sqoop connectors that can extract data from Hadoop so it can be imported into SQL Server. Though Hortonworks' materials focus on the value of connecting big data to traditional SQL Server databases, Microsoft's announcements make clear that it views SQL Server 2012 as its primary data-management option both in the cloud and on premises.
"The next frontier is all about uniting the power of the cloud with the power of data to gain insights that simply weren't possible even just a few years ago," Microsoft VP Ted Kummert said in a Microsoft press release. "Microsoft is committed to making this possible for every organization, and it begins with SQL Server 2012."
Kevin Fogarty is a freelance writer covering networking, security, virtualization, cloud computing, big data and IT innovation. His byline has appeared in The New York Times, The Boston Globe, CNN.com, CIO, Computerworld, Network World and other leading IT publications.