Cloudera Director 2.0 Debuts, Google Donates To Apache Foundation: Big Data Roundup

Cloudera has updated a key tool for managing big data, Google has contributed its Cloud Dataflow platform to the Apache Foundation Incubator, and plenty more in this Big Data Roundup for the week ending January 24.

Hadoop distributor Cloudera has issued an update to one of its offerings, Google has submitted one of its efforts to the Apache Foundation as a potential incubator project, MariaDB has raised some funds, and Netflix CEO Reed Hastings talks about the limits of data. Plus, we look at how data could help you predict this year's Academy Awards, all in our Big Data Roundup for the week ending January 24.

Cloudera Director 2.0

Let's start with Cloudera's news. The Hadoop distribution company updated Cloudera Director, its big data deployment and management tool. Cloudera said that version 2.0 simplifies running common Hadoop workloads in the cloud, such as ETL and modeling, business intelligence and analytics, and application delivery. The tool is designed for both scale and production environments in the cloud, according to Cloudera VP of products Charles Zedlewski.

Cloudera has added spot instance support, to decrease hosting costs for transient workloads, and automatic job submission, to spin up and terminate clusters on a per-job basis. Cloudera also said that the newest release of Cloudera, version 5.5, has introduced support for Apache Hive and Apache Spark on Amazon S3, so users can continue to use their choice of tools, independent of where the data resides.

In addition, the new version adds cluster cloning, to expand the end-user base, and cluster repair, to fix clusters without affecting end users. And for application delivery workloads, Cloudera Director 2.0 has integrated high availability and Kerberos configurations within the overall bootstrap workflow, making them easier to set up, Cloudera said.


The new release works across major cloud platforms, including AWS and Google Cloud Platform, and it includes the Open Cloud Connector to enable integration with other preferred or private clouds. Users who want to deploy on Microsoft Azure can provision Cloudera Enterprise via the Azure Marketplace, the company said.

Google Cloud Dataflow

Google this week submitted a proposal for Cloud Dataflow to be accepted as an Apache Foundation Incubator project. Cloud Dataflow is a platform for processing big data in the cloud. It features an open source, Java-based SDK intended to make it easy to integrate with other cloud-based analytics tools. That includes letting organizations keep using their existing tool investments and integrate them even as they adopt more advanced technologies.

Google announced the submission in a blog post this week. The search giant said that it has submitted the project along with participants from Cloudera, Data Artisans, Talend, Cask, and PayPal.

MariaDB Raises Funds

MariaDB, which offers an open-source relational database, has raised $9 million in equity funding for advanced technology development and to accelerate sales, the company announced this week. The round included investments from Intel Capital and California Technology Ventures. The company also announced the appointment of Michael Howard as its new CEO and Michael "Monty" Widenius as CTO.

AI, Machine Learning, And IoT

At InformationWeek, we've put together some interesting coverage in the last week about adoption of AI and machine learning in the enterprise, as well as some significant open source machine learning moves by big vendors in recent months. We also explored how businesses and other enterprises are using the Internet of Things (IoT) to drive value.

We also examined how companies are monetizing data now and in the months and years ahead.

Netflix And Data

Netflix is known for leveraging data to drive its business, including decisions on creating new original shows. CEO Reed Hastings recently offered a bit of a caveat, though. Speaking at the DLD Conference in Munich, Germany, last week, Hastings said: "We start with the data. But the final call is always gut. It's informed intuition," according to a VentureBeat report. Hastings pointed to one Netflix exec, Ted Sarandos, as the man who has the "golden gut," but also cited a distributed group of executives and managers who make the final call over content.

And The Oscar Goes To...

Finally, FiveThirtyEight provided some insights this week on how to game Academy Award predictions. We'd tell you all about it, but... Spoilers!