Spark Spreads, Apache Arrow Accepted: Big Data Roundup

Databricks announced a free community edition of Spark along with free training materials. Apache Arrow became a project within the Apache Software Foundation. And SAP announced support for Spark in its Predictive Analytics platform. We've got that and more in our big data roundup for the week of Feb. 21, 2016.

Jessica Davis, Senior Editor

February 21, 2016

It's been a busy week in big data land. We've got news about a free community edition of Apache Spark plus more news from Spark distributor Databricks, a new Apache Software Foundation project for big data called Arrow, Gartner's Magic Quadrant for Advanced Analytics, and more.

Let's start with the news from Databricks, the main commercial distributor of Apache Spark. This week at Spark Summit East in New York, the company rolled out a beta release of Databricks Community Edition, a free version of its cloud-based big data platform. It comes with a set of training resources, including a massive open online course (MOOC), "Introduction to Big Data with Apache Spark."

According to Databricks, the new service provides data scientists and IT pros with the technology they need to get started with Spark, including access to a microcluster, a cluster manager, and a notebook environment. The free version will be generally available in the second quarter.

Databricks said it will continue to develop Spark tutorials and training materials to be part of the Community Edition over time.

"As developers at heart, we find value in empowering professionals to tackle big data problems, and as a result, we are committed to the development of the Spark engine and the healthy growth of the community," said Ion Stoica, executive chairman at Databricks, in a prepared statement. "We're happy to contribute back to the community by releasing Community Edition of Databricks for free and we're excited to see how users experiment with the platform."

During Spark Summit East, Databricks also launched Databricks Dashboards as an expansion of its enterprise Spark platform. Databricks said the Dashboards are intended to let data pros transform complex results into visual formats that are easy for business users to consume.

Speaking of Spark, SAP announced support for the technology in its Predictive Analytics 2.5 platform, and also announced the acquisition of a mobile visualization company to enhance its advanced analytics portfolio.

Apache Arrow

Apache Arrow has been accepted as a full-fledged project by the Apache Software Foundation. The technology is designed to improve the performance and speed of big data components that work together as part of a larger system.

The project is backed by Tomer Shiran and Jacques Nadeau, the founders of Dremio, who are also the force behind Apache Drill. The technology is designed to enable various projects within the big data Hadoop ecosystem to talk to each other more easily, and to enable multiple development languages to work with the ecosystem.

Magic Quadrant, CDO, And Privacy Shield

Gartner released its 2016 Magic Quadrant report for Advanced Analytics, and we covered the details in a report this week. Plus, we've got an update on the state of the chief data officer, and a look at best approaches for enterprises to take as they await interpretations of the EU-US Privacy Shield.

Cognitive Computing Competition Prize: $5 Million

Finally this week, we've got some news about a new cognitive computing competition. The IBM Watson AI XPRIZE is a $5 million competition challenging teams to develop and demonstrate how humans can collaborate with cognitive technologies to tackle the world's greatest challenges.

Every year leading up to TED2020, teams will go head-to-head at IBM's World of Watson annual conference to compete for interim prizes and the chance to advance to the next year's competition. Three finalist teams will deliver TED Talks in 2020 to provide demonstrations of what they have achieved, according to the competition's website.

A panel of judges will evaluate ideas for technical validity, and the winners will be chosen by the TED and XPRIZE communities "based on the audacity of their mission and the awe-inspiring nature of the teams' TED Talks in 2020."

IBM said it believes the competition can accelerate the creation of landmark breakthroughs.

About the Author

Jessica Davis is a Senior Editor at InformationWeek. She covers enterprise IT leadership, careers, artificial intelligence, data and analytics, and enterprise software. She has spent a career covering the intersection of business and technology. Follow her on Twitter: @jessicadavis.
