One of the frequently cited use cases for real-time data comes from Uber, which uses data and real-time analytics to predict when your driver will arrive to pick you up. Uber built this service with Apache Kafka, among other technologies. The power and momentum of real-time data, as seen in that Uber application, brought more than 500 users to the Kafka Summit in San Francisco this week.
Hosted by Confluent, the distributor of the messaging and streaming technology, the inaugural event featured customer use cases. It also served as the backdrop for a handful of partners to make their own announcements around the technology.
Apache Kafka was created inside LinkedIn to manage that social network's massive data messaging flows. The technology was then spun out as an Apache Software Foundation big data project, with a separate company, Confluent, formed around it.
"The whole Hadoop community is pivoting to data in motion," Gartner VP of research Merv Adrian told InformationWeek in an interview. "Most of the conversation about Hadoop had been about data at rest, but increasingly the whole community is thinking about real-time or near real-time."
Apache Kafka is one of the open source projects today that fulfills that need.
"Kafka is one of the emerging stars here," Adrian said. The technology is supported by four out of the five Hadoop distributors (Cloudera, Hortonworks, IBM, and Amazon), according to Adrian. The one Hadoop distribution company that doesn't officially support Kafka as part of its stack, MapR, announced a new training program at the Kafka Summit this week that is designed to help developers connect its platform with Kafka.
MapR Offers Free Training
MapR announced free on-demand stream processing training for real-time analytics and IoT applications. The Hadoop distributor said the new training will enable Apache Kafka developers to extend their real-time analytics and IoT applications. The training covers MapR Streams, which provides Kafka compatibility, according to a statement released by MapR.
Confluent's Kafka Users Survey
Confluent itself announced the results of a survey of more than 100 Kafka users around the world. Twenty-nine percent of the respondents work for organizations with more than $1 billion in annual sales. The survey, conducted by Researchscape International and sponsored by Confluent, revealed that 88% of the respondents said Kafka would be a mission-critical part of their data and application infrastructure by 2017.
A total of 72% of respondents use Kafka for stream processing to enable all incoming data to flow in a continuous stream, and 68% said they plan to incorporate more stream processing over the next six to twelve months.
Organizations are using the technology for a wide variety of applications, including application monitoring, recommendation and decision engines, security and fraud detection, IoT applications, and dynamic pricing applications.
"We see more and more organizations embracing real-time data and stream processing, and Kafka is at the heart of that shift," said Jay Kreps, one of Kafka’s co-creators and the CEO and cofounder of Confluent, in a prepared statement.
Striim Partners With Confluent
End-to-end streaming integration and intelligence platform company Striim also announced a new partnership with Confluent in conjunction with the Kafka Summit. Striim said the partnership would bring real-time change data capture to the Kafka ecosystem.
"At Striim, we help companies free data from enterprise databases," said Sami Akbay, founder and EVP of Striim, in a prepared statement. "Our partnership with Confluent makes it easy for our joint customers to release their most valuable data and make it available in real time to the organization in a way that is orchestrated, secure and reliable."
Confluent-Certified Connectors for Kafka
The news out of the Summit this week built on Confluent's announcement last week that it was adding more than a dozen new connectors through Kafka Connect, including HDFS, JDBC, Cassandra, and S3. These new connectors enable new real-time data streams for Kafka.
More Disruption Ahead
Adrian told InformationWeek that Kafka and other technologies like it are examples of how the software development and delivery model has changed. Over the last decade, technologies have been developed, have proved their general usefulness, and have then moved out into the broader community. Hadoop, Java, MySQL, Linux, and others have fit this pattern.
"It's a new way for products to emerge compared to the way they used to come out of R&D labs," Adrian said. "This is different. We are in our second decade of this happening. The pace and delivery of software has been completely transformed."