Big Data in the Cloud: Avoiding Vendor Lock-in

More enterprise application and big data vendors are pursuing a cloud-agnostic strategy -- supporting AWS, Microsoft Azure, and Google Cloud. Big data platform provider Hortonworks is taking that strategy one step further.

Jessica Davis, Senior Editor

June 20, 2018

(Image: Jessica Davis/InformationWeek)

Avoiding vendor lock-in has become a major concern in the cloud era. If you deploy your infrastructure in one vendor's public cloud, say Amazon Web Services (AWS), what will you do if the service fails to meet your standards or if the prices increase? What if the features you need are no longer supported?

Sure, there are a couple of other big public cloud providers -- Microsoft Azure and Google Cloud. And more vendors are making sure that their technologies are supported across multiple public clouds. But it can be complicated and time-consuming to move your applications and data from one cloud to another. Don't expect the public cloud vendors to make it easier. They don't want to lose your business to a competitor.

Containers are making it easier for organizations to move their workloads from cloud to cloud. But different public cloud vendors use different schemas and security setups. Therein lies an opportunity for other vendors.

"We've kind of joked that it's easy to get data into the cloud, but really hard to get it out," said Arun Murthy, co-founder and chief product officer at Hortonworks, in an interview with InformationWeek.

Founded as a Hadoop distributor, Hortonworks has expanded its mission to offer its own enterprise-grade platform of data and analytics technologies. The two other Hadoop distributors, Cloudera and MapR, have done the same, creating platforms that include multiple open source big data and analytics technologies such as Spark and Kafka.

Hortonworks has been focused on enabling its platform for the cloud and helping enterprises with vendor lock-in concerns. To that end, in October 2017 it released DataPlane Service, a data fabric technology that takes some of the complexity out of working with multiple cloud vendors.

Forrester Research rated DataPlane as a Strong Performer in its Q2 2018 Forrester Wave Report for Big Data Fabric. The research firm said the service builds on existing open source projects such as Apache Ranger for security and Apache Atlas for managing metadata and governance services.

Murthy told InformationWeek that Hortonworks' fabric homogenizes data and workloads, including things like shared security and governance. Different cloud vendors take different approaches to security and governance, he said. DataPlane lets you manage your data and workloads in a consistent way across the major public cloud platforms and on-premises, too.
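As a rough illustration of the idea Murthy describes, here is a minimal Python sketch of one access policy enforced the same way across several storage backends. It is not DataPlane's API or implementation: the names ObjectStore, InMemoryStore, AccessPolicy, and GovernedFabric are all hypothetical, and the in-memory stores merely stand in for real cloud or on-premises storage.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Protocol, Set


class ObjectStore(Protocol):
    """Hypothetical minimal interface that every backend must satisfy."""

    name: str

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


@dataclass
class InMemoryStore:
    """Stand-in for a real cloud or on-premises object store (assumption)."""

    name: str
    _objects: Dict[str, bytes] = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


@dataclass(frozen=True)
class AccessPolicy:
    """A single policy definition shared by all backends."""

    readers: Set[str]
    writers: Set[str]


class GovernedFabric:
    """Routes reads and writes to named backends under one shared policy."""

    def __init__(self, stores: Iterable[ObjectStore], policy: AccessPolicy) -> None:
        self._stores = {store.name: store for store in stores}
        self._policy = policy

    def write(self, user: str, store: str, key: str, data: bytes) -> None:
        # The same check applies no matter which backend is targeted.
        if user not in self._policy.writers:
            raise PermissionError(f"{user} may not write")
        self._stores[store].put(key, data)

    def read(self, user: str, store: str, key: str) -> bytes:
        if user not in self._policy.readers:
            raise PermissionError(f"{user} may not read")
        return self._stores[store].get(key)

    def copy(self, user: str, src: str, dst: str, key: str) -> None:
        # Moving data between backends goes through the same policy checks.
        self.write(user, dst, key, self.read(user, src, key))


if __name__ == "__main__":
    fabric = GovernedFabric(
        stores=[InMemoryStore("cloud-a"), InMemoryStore("cloud-b"), InMemoryStore("on-prem")],
        policy=AccessPolicy(readers={"analyst", "engineer"}, writers={"engineer"}),
    )
    fabric.write("engineer", "cloud-a", "sales.csv", b"region,total\n")
    fabric.copy("engineer", "cloud-a", "on-prem", "sales.csv")
    print(fabric.read("analyst", "on-prem", "sales.csv"))
```

The point of the design is that callers name a backend but never deal with its individual security model; the policy lives in one place, which is the kind of consistency the article attributes to a data fabric.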

Hortonworks made a series of announcements this week in conjunction with its DataWorks Summit 2018, expanding its partnerships with the big cloud vendors and updating its platform to improve cloud deployment. Among the platform updates are enhancements that deliver a consistency layer for non-consistent cloud stores. The company has also added support for agile application deployment via containerization, support for deep learning applications, a real-time database enabled via Apache Hive 3.0, and enhanced security and governance to help with regulatory compliance, including GDPR.

Hortonworks announced expansions of its partnerships with Google Cloud, Microsoft Azure, and IBM Cloud. The announcements focus on improved integration and easier deployments for organizations using the Hortonworks Data Platform, DataFlow, and DataPlane on the various cloud services. Murthy said that the goal is to be cloud agnostic.

"We do a lot of work with the specifics of how to interface with these particular clouds," he said. "That is how we look at our cloud strategy."

Many other vendors of enterprise applications and services are on the same path, offering their software in multiple clouds. By giving customers many choices, these vendors help mitigate the risk of cloud vendor lock-in and allow customers to use their preferred cloud vendor.

Murthy pointed to the benefits often cited by the open source community for all open source software, too. He said that Hortonworks' open source roots mean that it lowers costs, and it lowers risks. Because it is open source, there's no risk of a lock-in to a particular vendor's stack, he said. Plus, development happens faster because an entire community works on the software, not just a single vendor.

IBM's relationship with Hortonworks may be a little tighter than the others, however. In June 2017, IBM ended its efforts to offer its own Hadoop distribution and instead partnered with Hortonworks to offer Hortonworks Data Platform. IBM wrote about the expanded partnership in a blog post this week, and the integrated IBM Hosted Analytics with Hortonworks is now available as a service in the IBM Cloud.

Overall, Hortonworks' Murthy describes his company's announcements and approach as the embodiment of its "cloud-first" mentality. Beyond vendor lock-in concerns, organizations must also navigate data compliance concerns across multiple countries.

Hortonworks' cloud partnership expansion announcements and announcements from other major vendors demonstrate a pivotal shift in momentum towards the cloud for small and enterprise companies alike. Now vendors are starting to make it easier.

"We want to let you leverage the best parts of the cloud without a lift and shift," Murthy said. "We want to give customers a consistent experience across all the cloud providers."


About the Author(s)

Jessica Davis

Senior Editor

Jessica Davis is a Senior Editor at InformationWeek. She covers enterprise IT leadership, careers, artificial intelligence, data and analytics, and enterprise software. She has spent a career covering the intersection of business and technology. Follow her on Twitter: @jessicadavis.
