IBM Unveils Data Science Experience Dev Environment - InformationWeek

6/7/2016
09:06 AM

IBM rolls out a new integrated development environment in the cloud to build on its investment in Apache Spark and enable data scientists to collaborate with each other, regardless of their preferred language or data source.

IBM has introduced an integrated development environment called the Data Science Experience, designed to build on the $300 million investment the company made in the Apache Spark ecosystem in 2015.

The project, which was announced Tuesday, is an extension of IBM's commitment to Spark, which IBM Analytics product development VP Rob Thomas calls "the analytics operating system." The announcement is timed to coincide with this week's Spark Summit in San Francisco.

"I've asserted that everybody would be using Spark in the future, and that seems to be coming true even faster than we thought," he told InformationWeek in an interview. Thomas said that the Data Science Experience is the first enterprise app for this analytics operating system.

"This is an experience that is optimized and built on Spark," Thomas said. "You can think of it as the first integrated development environment for real-time and high-performance analytics. We are enabling data scientists to build machine learning apps with Apache Spark and do that regardless of their skill set or their tool of choice."

The technology is an extension of the open source project Jupyter Notebook, a web application that allows data scientists to create and share documents that contain live code, equations, visualizations, and explanatory text. The Jupyter technology is used for data cleaning and transformation, numerical simulation, statistical modeling, machine learning, and more, according to the project's home page. Thomas said the Data Science Experience also leverages Spark and SystemML, which is the machine learning optimizer that IBM contributed to open source last year.

"When you build this type of environment on open source suddenly you can start to integrate a collaboration across a variety of different areas," he said. "That's also why we also are announcing an ecosystem initiative with a number of partners who are launching with us including H2O, RStudio, and Lightbend. Because of this open framework, we can bring a whole ecosystem of partners to this."

IBM's Data Science Experience is a cloud-based development environment that consolidates multiple open source tools including RStudio, Python, libraries from machine learning startup H2O.ai, and Notebooks, thus letting developers use familiar tools and still collaborate with other developers who may use other tools. The goal is to help developers get their applications into production faster.

The Data Science Experience also builds on the capabilities of IBM's current Data Scientist Workbench, which connects to multiple data sources and has more than 7,000 registered users.

Thomas said the Data Science Experience changes the approach to data science by making it a team sport.

"No matter what type of skill you have, whether it's R, or Python, or Scala, or SPSS, you can work in the Data Science Experience," he said. "You can collaborate and share datasets and collaborate and share models, and it doesn't require you to know the other languages."

This type of collaboration tool is new in the data science world, Thomas said. Platforms like this one haven't existed in the past because so many tools were proprietary.

"We think this will really change the adoption of data science and machine learning in every enterprise," he said.

IBM first signaled its big commitment to Spark a year ago when Thomas told InformationWeek that Spark "is the future of enterprise data." In November, IBM's SystemML was accepted into the Apache Incubator.

Jessica Davis has spent a career covering the intersection of business and technology at titles including IDG's Infoworld, Ziff Davis Enterprise's eWeek and Channel Insider, and Penton Technology's MSPmentor. She's passionate about the practical use of business intelligence, ...
