headhunters, universities, MOOC providers, and others offering various forms of training to big-data-analysis wannabes. Here is a sampling of related developments in 2014:
- Big Data Salaries Top BI, Data Warehousing
- 2014 US IT Salary Survey: BI & Analytics
- 10 Big Data Online Courses
- Big Data Job Hunting: Unconventional Advice
- UC Berkeley Breeds Data Scientists Online: $60K, 18 Months
With education and training opportunities flourishing, there will no doubt be waves of new talent available by the time we meet the graduating classes of 2015, 2016, and 2017.
4. Cloud options multiply
Hadoop, NoSQL databases, analytics tools and platforms: You name the technology and businesses are likely to start experiments in the cloud. And many will stay there, having no interest in deploying servers and administering software on premises. That's certainly true of small and midsize practitioners we've met. Vendors are responding to the demand. Here are a few of the notable cloud-oriented big data announcements made in 2014:
- IBM Buys Cloudant, Eyes Amazon's Turf
- Microsoft Brings Predictive Analysis To Azure
- MongoDB Debuts On Microsoft Azure, Google Compute Engine
- MongoDB Launches Offensive As Rivals Rev Up
- HP Cloud Adds Big Data Options
5. Focus turns to analysis
Data platforms will inevitably be commoditized. The real value in data is delivered through analysis, not just putting it all in a lake or on a data hub. Apache Spark offers a compelling promise: Machine learning, SQL, R-based analytics, graph network analysis, and streaming analysis all on one system. Support soared in 2014. Here's a sampling of related coverage this year:
- Databricks Spark Plans: Big Data Q&A
- Will Spark, Google Dataflow Steal Hadoop's Thunder?
- Hortonworks Invests In Spark On Hadoop
- DataStax Cassandra Release Packs More Than Spark
- MapR Brings Spark In-Memory Analysis To Hadoop
Spark detractors (perhaps threatened) are whispering that Spark is too green or that niche alternatives (like Apache Storm for streaming analysis) might be better. Spark developer Databricks is responding with system tweaks and benchmark tests said to prove scalable performance.
Rest assured, the real competitive battle in big data will be to lead in providing tools and capabilities for data analysis. Multiple commercial vendors (including Actian, Pivotal, and Teradata) seem to be aping the multi-analysis-engine platform strategy, and Cloudera's recent acquisition of DataPad, which offered a Python-based data analysis library, showed it's headed deeper into analytics.
Fasten your seatbelts -- it's going to be a competitive, and interesting, 2015.