Strata + Hadoop World in San Jose, Calif., gave us plenty of news to cover in this week's edition of the big data roundup. We've got updates from Microsoft, Hortonworks, Intel and its Trusted Analytics Platform, and some training advances from Confluent, the company behind Apache Kafka.
Let's start with Microsoft. The company spent the week talking up some new programs, particularly for developers, at its Build 2016 conference in San Francisco. Microsoft executives also came down to San Jose to talk about recent announcements and investments in big data, AI, analytics, and machine learning.
During an interview with InformationWeek, Joseph Sirosh, Microsoft Data Group corporate vice president, laid out some of the news. He provided an overview of advanced analytics at scale with R Server for HDInsight, as well as the most recent version of Spark for HDInsight, both now available in preview. HDInsight is Microsoft's managed Apache Hadoop, Spark, R, HBase, and Storm cloud service. These new versions let customers train and run machine learning models on larger datasets than could be handled before, the company said.
Microsoft also announced application integration to provide easier access to big data apps. The company said that customers can now deploy popular big data applications with HDInsight without the need for coding or scripting. Microsoft also announced the general availability of the Azure Data Catalog.
In addition, the company is expanding its support for Jupyter Notebooks, a tool for data scientists, with R support in Azure ML Studio.
"We are innovating along three major dimensions -- cloud, data, and intelligence," Sirosh told InformationWeek. "The cloud is a new way of delivering data, computing, and intelligence, and we are delivering these as fully managed services."
MemSQL this week rolled out version 5 of its platform. The company said that the release ushers in a new era of capturing and querying data simultaneously for real-time analytics. Eric Frenkiel, CEO and cofounder of the company, announced the release in his keynote address at Strata + Hadoop World.
"There's no doubt that we live in an on-demand economy," he said. "We all want access to data now." MemSQL's new version is designed to deliver it, he said.
Among the new features is Streamliner, which enables one-click creation of programmable pipelines, including one-click deployment of integrated Apache Spark.
The new version also enables the merging of transactions and analytics into a single database through Hybrid Transaction/Analytical Processing (HTAP), with concurrent support for OLTP and OLAP queries.
"The really exciting thing is the flexibility," Frenkiel said. Organizations can put a custom predictive model into an in-memory system. They can put Streamliner to work for IoT, he said.
Hadoop distribution company Hortonworks released DataFlow 1.2 this week, adding more processors to bring the total supported to 130. They include Kafka, Couchbase, Microsoft Azure Event Hub, and Splunk, company executives told InformationWeek in an interview. Support for Apache Kafka and Apache Storm adds streaming analytics capabilities, which customers are increasingly demanding.
Hortonworks CTO Scott Gnau gave a talk during Strata + Hadoop about how we are at the tipping point in the big data revolution, as the structured enterprise data of years past is about to be dwarfed by the sheer quantity of other types of data.
Intel and TAP
Intel this week provided an update on its Trusted Analytics Platform, more commonly known as TAP. The chipmaker's platform consists of a package of open source big data and analytics tools that organizations can use for free to boost their own data science initiatives. Several healthcare organizations and, more recently, a power company consortium have embraced the platform. Now, Intel will make it easier for companies to get up and running with the platform by partnering with cloud services provider Rackspace to offer one-click deployment.
Apache Kafka Training
Confluent -- the startup spun out of LinkedIn to further develop and promote the Apache Kafka big data message-streaming technology -- has announced that it will offer new public training to help organizations build skills with the platform. Confluent University will offer the training onsite and at public venues. Confluent CTO Neha Narkhede told InformationWeek in an interview that two training options are currently offered: a three-day developer course and a two-day operations course.
Winning a Sweet Ride
Finally, at Strata + Hadoop World, StrongDM ran a different kind of competition to draw attention to its technology, an access control tool that keeps the wrong people from getting access to information that should stay private. The tool is designed to prevent employees from accidentally compromising sensitive data.
StrongDM was raffling off a 1997 Honda Civic owned by one of the cofounders, an old BlackBerry smartphone, and a 14.4k modem. We're not sure who won these fabulous prizes, but StrongDM took home the Audience Favorite award for its technology, so congrats are due to CEO and cofounder Schuyler Brown and the rest of the team.