Databricks on Thursday announced a developer-certification program for Apache Spark, the hot big-data-analysis platform that handles everything from machine learning, streaming-data analysis, and SQL querying to R-based analytics, graph analysis, and MapReduce processing.
Spark is quickly gaining interest thanks to its appeal as an all-purpose analysis platform with in-memory performance advantages. Developed by Databricks, the open-source Spark platform has attracted more than 300 contributors since its launch in 2012.
To date, Databricks has certified software distributors and support providers, a list that includes Cloudera, Hortonworks, IBM, Oracle, and SAP. But with demand to deploy the software growing, would-be customers were putting pressure on Databricks to promote development as well as software distribution and support.
"Lots of organizations are ready to build apps with Spark, and they're asking us for help, but we're not a professional services organization," said Arsalan Tavakoli, Databricks' head of business development, in a phone interview with InformationWeek. "The two goals of the certification program are to help Spark developers demonstrate their skill set and to help companies looking to build applications find qualified experts."
The new Spark developer-certification program will be announced at next month's Strata Conference. Event organizer (and tech publisher) O'Reilly has partnered with Databricks to promote the program to its audience and will manage registrations and handle customer service for the program. Databricks will develop the course content, exams, and certification standards.
Databricks and O'Reilly have yet to settle on costs, course length, and whether the course will be entirely online or will involve in-person testing. In either case, Databricks experts will proctor the final exam, said Tavakoli.
Now in its 1.1 release, Spark is available through Databricks-certified software distributors including BlueData, Cloudera, DataStax, Guavus, Hortonworks, IBM, MapR, Oracle, Pivotal, SAP, and Stratio. Three of these distributors, Cloudera, DataStax, and MapR, have also won Databricks certification to provide Spark software support, though that's the natural next step for all distributors.
Among the growing signs of Spark's maturation, Tavakoli said there are now more than 40 users of Spark Streaming in production, and Spark SQL has matched the MLlib machine-learning component of the platform in terms of active development.