Two major announcements from Pivotal on Tuesday signal a continuing shift toward open source products for big data, and for enterprise software in general. The first is that Pivotal will create what it is calling the "world's first open source big data product suite," by releasing its Pivotal HD, Pivotal HAWQ, Greenplum Database, and GemFire products all as open source projects.
The second is that Pivotal will join Hortonworks, GE, IBM, Verizon, SAS, Infosys, and others in creating an Open Data Platform initiative designed to improve the Hadoop ecosystem. Combined, this news serves as a wake-up call to CIOs who may still be wary of open source software.
The fact that Pivotal would make such a big move with a product that drew over $100 million in subscriptions last year shows the company's belief that the enterprise software market is changing. Big companies "saw success in the first generation of open source and adopting it now is more comfortable," Sundeep Madra, VP of Pivotal's data product group, said in an interview with InformationWeek. "Many CIOs used to say 'open source over my dead body,' and now there are a lot of dead bodies on Wall Street."
Pivotal execs say the move is all about community and responsiveness. Rather than waiting for vendors to get around to adding wanted features, open source code lets customers participate in the creation of the features they most want. "It is part of a continuum on a customer journey," Madra said. "Customers have been working with us for a long time. This is what differentiates us."
Of course, this isn't an isolated move; open source contributions of this kind have become a major trend. Pivotal execs say they were inspired by the success they had with the open source Cloud Foundry. Even Microsoft recently open-sourced .NET Core. And big data has always relied on major open source products, including Hadoop, MapReduce, MongoDB, and many others.
One major driver of this shift to open source is the growing demand to manage massive quantities of data and analyze that data in real time, especially for Internet of Things initiatives. A key obstacle to delivering on the promise of the IoT has been a lack of data standards, along with the fact that many enterprises struggle to go from storing big data to acting on it in real time. That's where the Open Data Platform initiative comes into play, since it's designed to rationalize a fragmented Hadoop community. The problem with Hadoop, according to Madra, is that many enterprises have to run multiple copies of Hadoop in-house. The initiative hopes to focus on a core of Hadoop offerings to eliminate that fragmentation.
Madra compared Hadoop's situation today with that of Linux and Unix. Unix came in many different flavors and commercial versions, while Linux's common core offered a single open source alternative, which gave it an advantage. The initiative hopes to avoid the "many flavors of Unix" problem and stick with a common core version of Hadoop.
As Ben Fathi, Chief Technology Officer of VMware, says in the press release announcing the Open Data Platform initiative, "The Open Source movement is fundamentally changing the way that software is being developed in the industry today. Common frameworks and standards such as Open Data Platform will help solidify Open Source as a proven option for enterprises."
A strong community of open source developers, in both the Pivotal community and the larger Hadoop community, is organizing to tackle some of big data's biggest challenges. That should be welcome news to even the most open source-averse CIO.