Microsoft Open Sources Deep Learning, AI Toolkit On GitHub - InformationWeek
Data Management // Big Data Analytics
09:06 AM
Connect Directly

Microsoft Open Sources Deep Learning, AI Toolkit On GitHub

Previously available to academic researchers, Microsoft's Computational Network Toolkit (CNTK) now has a friendlier open source license.

8 Ways To Monetize Data
8 Ways To Monetize Data
(Click image for larger view and slideshow.)

On Monday Microsoft joined its peers, including Google, Facebook, and Yahoo, in offering a deep learning framework to support artificial intelligence applications.

The company released its Computational Network Toolkit (CNTK) as an open source project on GitHub, thus providing computer scientists and developers with another option for building the deep learning networks that power capabilities like speech and image recognition.

CNTK has been available to academic researchers since last April under a more restrictive license.

There are already several dozen deep learning toolkits and modules available. But the pace at which this technology is appearing has quickened. According to artist and developer Kyle McDonald, the average interval between deep learning framework releases was 47 days in the 2010-2014 period. Last year, he claimed in a tweet, that interval shrank to 22 days.

That may be because AI has become a major focus at leading technology companies. In early 2015, Facebook open sourced modules for the Torch deep learning toolkit. Then in November, Google released TensorFlow. In January this year, Baidu released Warp-CTC. Even Yahoo joined in, releasing a dataset derived from the Yahoo News Feed to fuel machine learning systems.

(Image: Microsoft)

(Image: Microsoft)

Microsoft attributes the surge in interest to the growing number of researchers running machine learning algorithms supported by deep neural networks -- systems modelled on the processes in human brain. Microsoft says that many researchers believe such systems can enhance artificial intelligence applications.

The rapid improvements over the past few years in the speech recognition capabilities of applications like Apple's Siri and Google Translate, and in the image recognition capabilities of Google Photos, suggest that belief is well-founded. As mobile and Internet-connected devices proliferate, AI can be expected to become even more important as a way to facilitate function without traditional keyboard-based interaction.

But corporate interest in releasing such toolkits isn't entirely altruistic. By making software used internally available as open source code, these companies benefit from contributions that improve their code. By encouraging external research talent to become familiar with internal toolsets, they make the path by which these people could become employees a bit easier to traverse.

Xuedong Huang, Microsoft's chief speech scientist, extolled the speed of CNTK in a blog post. "The CNTK toolkit is just insanely more efficient than anything we have ever seen," he said.

CNTK can take advantage of the number-crunching power GPUs on single computers (Windows or Linux) or computing clusters.

TensorFlow can utilize distributed GPUs too, but only on Linux machines. TensorFlow runs on OS X without CUDA parallel GPU support (perhaps not for long). It also can be run on Windows through Docker, which likewise limits GPU usage. Windows support through Bazel appears to be planned.

[See AI, Machine Learning Rising In The Enterprise.]

One disadvantage of CNTK is that it requires C++. TensorFlow supports Python as well as C++. However, Microsoft is planning to add support for Python and C#. It's also developing an Azure cloud service, referred to as Project Philly, that will provide the ability to run CNTK, among other applications, across multiple virtual GPUs.

In a Facebook post expressing support for an assessment of deep learning frameworks conducted by Microsoft researcher Kenneth Tran, Yann LeCun, director of AI at Facebook, contends that Torch has the fewest deficiencies among deep learning frameworks. "Torch has an almost perfect rating on all counts," he notes. "Theano and TensorFlow lack speed, Tensorflow and Caffe lack flexibility."

Ultimately, however, these toolkits depend upon data, and neither of the companies providing deep learning tools are offering third-parties access to the massive datasets they use to train their models. To use that data, start with a job application.

What have you done to advance the cause of Women in IT? Submit your entry now for InformationWeek's Women In IT Award. Full details and a submission form can be found here.

Thomas Claburn has been writing about business and technology since 1996, for publications such as New Architect, PC Computing, InformationWeek, Salon, Wired, and Ziff Davis Smart Business. Before that, he worked in film and television, having earned a not particularly useful ... View Full Bio

We welcome your comments on this topic on our social media channels, or [contact us directly] with questions about the site.
Comment  | 
Print  | 
More Insights
Newest First  |  Oldest First  |  Threaded View
User Rank: Ninja
1/28/2016 | 10:33:03 PM
Toolset is still valuable
Even without data, the toolkits are still of use. Not just these Tech companies, utility companies, govt, media, hospitals, Telecoms have a lot of data on which the toolkit can be applied on.
A Data-Centric Approach to the US Census
James M. Connolly, Executive Managing Editor, InformationWeekEditor in Chief,  10/12/2018
10 Top Strategic Predictions for 2019
Jessica Davis, Senior Editor, Enterprise Apps,  10/17/2018
AI & Machine Learning: An Enterprise Guide
James M. Connolly, Executive Managing Editor, InformationWeekEditor in Chief,  9/27/2018
Register for InformationWeek Newsletters
Current Issue
The Next Generation of IT Support
The workforce is changing as businesses become global and technology erodes geographical and physical barriers.IT organizations are critical to enabling this transition and can utilize next-generation tools and strategies to provide world-class support regardless of location, platform or device
White Papers
Twitter Feed
Sponsored Live Streaming Video
Everything You've Been Told About Mobility Is Wrong
Attend this video symposium with Sean Wisdom, Global Director of Mobility Solutions, and learn about how you can harness powerful new products to mobilize your business potential.
Sponsored Video
Flash Poll