Intel will fold its Hadoop distribution and invest in Cloudera. It's good news for the entire Hadoop community.
Intel announced Thursday that it has partnered with and taken an equity stake in the Hadoop software distributor Cloudera.
As part of the deal, Intel will fold its Hadoop software and technology investments into Cloudera's distribution. Cloudera, in turn, will make the Intel architecture its "preferred platform" and will support a range of next-generation technologies, including Intel data fabrics, flash memory, and security features.
What changes in the wake of this deal isn't really clear. Cloudera had already scored $160 million in new venture capital backing before Intel's investment. (Terms were not disclosed, but one report put Intel's investment at north of $90 million.) And as Cloudera CEO Tom Reilly noted during Thursday's announcement, Intel-based servers were already powering "nearly 100%" of its customer deployments. In short, with or without this deal, Cloudera would be a well-funded company offering software that runs on lots of Intel chips.
Aside from trading compliments about their respective companies and enthusiasm about the deal, Reilly and Diane Bryant, senior vice president and general manager of Intel's data center group, talked about what a huge market they expect Hadoop to become. Bryant cited Intel research in which 82% of CIOs surveyed said big data platforms promise significant business value, while only 6% said they have deployed the technology. That gap ensures there will be plenty of growth, the executives said.
It had to be hard for Cloudera's rivals to listen to this love fest. Bryant said Cloudera's software will become "Intel's preferred platform," and she vowed to help make it "the best big data platform for our industry." She also called Cloudera "the leading contributor to the Hadoop open source community." That description did not sit well with Hortonworks, which has tried to document that it has done more to develop the open source Apache Hadoop project than any other Hadoop distributor.
"It was an interesting choice of words to say 'contributing,' rather than 'committing,'" Herb Cunitz, Hortonworks' president, told InformationWeek in a phone interview. "Contributing to open source says, 'I have an idea.' Committing to open source says, 'I took that idea and put it into code that benefits the entire community.' There's a big difference."
Diane Bryant, senior vice president and general manager of Intel's data center group, and Cloudera CEO Tom Reilly announce their partnership.
Bryant also brought up concerns about open source fragmentation and proprietary components built around the open source core of Hadoop. Hortonworks has raised those concerns more than any other Hadoop distributor, criticizing Cloudera's approach of reserving functionality for its proprietary management software. But Bryant said that the Intel-Cloudera partnership would "give enterprises the confidence that this market will not fragment and that they can count on [Cloudera] as the leading big data platform."
It was no shock that Intel gave up its own Hadoop distribution. There were few discernible signs of market adoption -- at least in North America. Intel insists its distribution is the leader in India and China, but when asked by InformationWeek, it declined to share deployment stats or third-party-verified market share figures.
What distinguished Intel's distribution was software optimizations designed to take advantage of Intel chip processing power, bandwidth, and security features. These optimizations improve performance "up to 10X," according to Bryant. Reilly insisted that the merging of that intellectual property into Cloudera's distribution "is not a handoff," but rather a combination of hardware and software engineering talent. "Intel has a roadmap of innovation for Hadoop," and it will show up in future releases, he said.
What's unclear is whether these chip-based optimizations will be available only to Cloudera. When Intel introduced its distribution in North America early last year, it said it would share these advances with the entire Hadoop community, but it also said that its management software for exploiting chips would "remain unique to Intel."
It's easy to guess that this software will now show up uniquely in Cloudera Manager software. But Reilly also said Thursday: "Everything we do together with Intel is going to go into open source. We want to impact the overall industry and have every distribution benefitting." It appears we'll have to wait for future releases to see whether chip optimization is just for Cloudera customers or something for the entire community.
Make no mistake -- the Intel partnership is a very good thing for Cloudera. It's gaining tens of millions of dollars and a well-known, global ally with hundreds of thousands of hardware and software partners. Having an industry infrastructure giant promoting and endorsing Hadoop is obviously going to help Cloudera, and Cloudera repeatedly insisted that the deal will benefit the entire Hadoop community.
As Jack Norris, CMO at the rival Hadoop vendor MapR, said of the deal, "The rising tide in the Hadoop ecosystem raises all boats."
Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of ...