Hadoop, big data's reigning open source platform, will play a pivotal role in the coming decades as businesses use new and improved business intelligence (BI), security and management tools to exploit the platform's full potential.
Or so says Quentin Clark, corporate VP of data platforms at Microsoft, who took a break from his vacation last week to stop by the Hadoop Summit in San Jose, Calif.
"We believe Hadoop is the cornerstone of a sea change coming to all businesses," Clark told the conference crowd in his June 27 keynote, in which he outlined Microsoft's big data strategy and how the company is integrating Hadoop with its varied selection of enterprise products and services.
Microsoft's data platform includes everything from its SQL Server database products to BI features in Excel. In addition, playing well with Hadoop's open source community and leading Hadoop proponents like Hortonworks is a key component of Redmond's big data strategy.
"Microsoft believes we have a responsibility to help bring Hadoop into the enterprise, and into this world," said Clark. "Hadoop will be the way that a large class of non-relational data is processed and managed. And it is our responsibility to help bring that to maturity and bring it forward."
Clark also expressed Microsoft's intention to "stick to the principles of open source" by contributing to the Hadoop project, rather than simply using it and adding "stuff that (is) proprietary to ourselves."
Last week, Hortonworks announced management packs for Microsoft System Center Operations Manager and Microsoft System Center Virtual Machine Manager, which let administrators use both tools to manage the Hortonworks Data Platform (HDP) distribution.
"Microsoft is here because we're in this partnership with Hortonworks, with the community, deeply involved to help get Hadoop out, literally, to a billion users," Clark said.
Why a billion? That's Microsoft's estimate of the number of Excel users on the planet. The company is positioning itself as a big data player with a powerful set of BI tools, most notably Excel, for enterprise users.
Microsoft recently released a preview of Data Explorer for Excel 2013, a self-service BI add-in that allows business workers to import data from a variety of sources, including Hadoop. It also recently announced the availability of SQL Server 2012 Parallel Data Warehouse (PDW), a massively parallel processing data warehousing appliance designed for Hadoop integration.
Clark briefly demonstrated Data Explorer at the Hadoop Summit, as well as an Excel visualization tool called GeoFlow, which lets users view data sets in 3-D on Bing Maps.
He also discussed Microsoft's effort to bring Hadoop into the public cloud via Windows Azure.
"HDInsight ... is the Hadoop service that's available as part of Windows Azure," he said. "So whether it's on-prem or in the cloud, we now have these offerings out there and the momentum has been tremendous."
Clark compared big data's nascent evolution to the growth of commercial air travel in the 20th century, particularly the development of the Boeing 747 jumbo jet.
"It wasn't until there were enough runways that were long enough before 747s really took off, so to speak," he said. But once the technology was in place, the jet helped revolutionize global air travel.
"And it is that scale of change that we believe Hadoop represents to the world," he added.
Big data faces many technological hurdles, however, including Hadoop's complexity and the lack of easy-to-use tools for end users.
"Today when we talk to enterprises, they talk to us about a lack of skills. They're like, 'We don't have people who know this technology yet,'" said Clark. "So the user base is not really ready yet. They expect that things will be integrated into the BI tools that they already understand."
Enterprises have security concerns as well. "They talk to us a lot about integration with their existing security and manageability systems," Clark noted. "They need this Hadoop thing to show up in a way that lets them integrate into their existing environment."