Cloudera says Amazon Web Services partnership will help customers move beyond Hadoop sandboxes to bring high-scale production platforms into the cloud.
Cloudera and Amazon Web Services (AWS) announced on Thursday a partnership that promises simple and efficient ways to quickly bring production-scale, Hadoop-based Cloudera deployments into the cloud.
Many customers already run Cloudera's software on AWS, but they do so in a two-step process in which they license and download the software from Cloudera and then deploy it on AWS cloud infrastructure. That two-step approach is still in place as of today, but the partnership calls for that process to be collapsed into a single step. The new approach will not only speed and ease deployment but also bring Cloudera's entire stack to the cloud for production use, according to Tim Stevens, Cloudera's VP of business and corporate development.
"The cloud use cases to date have typically been for research and development, testing, and proof-of-concept experimentation," said Stevens in a phone interview with InformationWeek. "Now we're bringing a certified and supported, enterprise-data-hub experience to the Amazon cloud."
"Enterprise data hub" is Cloudera's vision for the broad use of its platform as the first destination and center of data-management activity within enterprises. Where test-and-dev deployments on AWS have been limited to Cloudera's core open-source Hadoop distribution, Stevens said the partnership will bring more extensive deployments including Cloudera Manager, Cloudera Impala, and Cloudera Search.
What kinds of companies are interested in running high-scale, production Hadoop clusters in Amazon's cloud? The most obvious candidates are Internet firms and media companies that already run on AWS infrastructure and store vast quantities of data in its cloud. But large telecommunications companies, financial services firms, and retailers are also looking to bring at least some of their data into the cloud, says Stevens.
"Retailers, for example, have systems to capture and analyze purchases taking place in stores, and they have separate systems to capture and analyze purchases taking place online," he explained. "Some retailers want to marry those two datasets together in the cloud while others want to do it on-premises." Those leaning toward the cloud seek agility and want to stay out of the business of running Hadoop clusters.
Cloudera isn't the first Hadoop distributor to forge cloud partnerships. MapR struck up relationships with both Amazon and Google in 2012, and MapR instances have been available directly on AWS for more than a year. Hortonworks has partnered with Microsoft, and the Windows-based version of its software has been available as a service on Azure since early this year. IBM, Pivotal, and Teradata have also introduced cloud deployment options for their Hadoop offerings.
Cloudera ramped up its cloud activity in October with the introduction of Cloudera Connect: Cloud. That program counts Savvis (a CenturyLink company), SoftLayer (an IBM company), T-Systems, and Verizon Cloud as service partners, and Amazon is now joining the club.
Cloudera will provide the first line of support for AWS deployments, and it will use its Cloudera Manager systems-management software to determine whether any problems experienced are related to software or hardware.
"If it's a hardware or infrastructure problem, we have a hotline into the Amazon support structure to be able to address that right away," says Stevens.
Will most enterprise cloud deployments of Cloudera's software continue in the proof-of-concept mold? Only time will tell as companies weigh agility and low-touch infrastructure management against the cost, control, and perceived security of storing high-scale data in the cloud.
Doug Henschen is executive editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data, and analytics. He previously served as editor-in-chief of Intelligent Enterprise, editor-in-chief of Transform Magazine, and executive editor at DM News.