Microsoft Open Sources Deep Learning, AI Toolkit On GitHub
Previously available to academic researchers, Microsoft's Computational Network Toolkit (CNTK) now has a friendlier open source license.
On Monday, Microsoft joined its peers, including Google, Facebook, and Yahoo, in offering a deep learning framework to support artificial intelligence applications.
The company released its Computational Network Toolkit (CNTK) as an open source project on GitHub, thus providing computer scientists and developers with another option for building the deep learning networks that power capabilities like speech and image recognition.
There are already several dozen deep learning toolkits and modules available, and the pace at which they appear has quickened. According to artist and developer Kyle McDonald, the average interval between deep learning framework releases was 47 days in the 2010-2014 period. In a tweet, he claimed that the interval shrank to 22 days last year.
That may be because AI has become a major focus at leading technology companies. In early 2015, Facebook open sourced modules for the Torch deep learning toolkit. Then in November, Google released TensorFlow. In January this year, Baidu released Warp-CTC. Even Yahoo joined in, releasing a dataset derived from the Yahoo News Feed to fuel machine learning systems.
Microsoft attributes the surge in interest to the growing number of researchers running machine learning algorithms supported by deep neural networks -- systems modeled on processes in the human brain. Microsoft says that many researchers believe such systems can enhance artificial intelligence applications.
The rapid improvements over the past few years in the speech recognition capabilities of applications like Apple's Siri and Google Translate, and in the image recognition capabilities of Google Photos, suggest that belief is well-founded. As mobile and Internet-connected devices proliferate, AI can be expected to become even more important as a way to facilitate function without traditional keyboard-based interaction.
But corporate interest in releasing such toolkits isn't entirely altruistic. By making software used internally available as open source code, these companies benefit from contributions that improve their code. And by encouraging external researchers to become familiar with internal toolsets, they make it a bit easier to hire those people later.
Xuedong Huang, Microsoft's chief speech scientist, extolled the speed of CNTK in a blog post. "The CNTK toolkit is just insanely more efficient than anything we have ever seen," he said.
CNTK can take advantage of the number-crunching power of GPUs on single computers (Windows or Linux) or on computing clusters.
TensorFlow can utilize distributed GPUs too, but only on Linux machines. TensorFlow runs on OS X without CUDA parallel GPU support (perhaps not for long). It also can be run on Windows through Docker, which likewise limits GPU usage. Windows support through Bazel appears to be planned.
One disadvantage of CNTK is that it requires C++. TensorFlow supports Python as well as C++. However, Microsoft is planning to add support for Python and C#. It's also developing an Azure cloud service, referred to as Project Philly, that will provide the ability to run CNTK, among other applications, across multiple virtual GPUs.
In a Facebook post expressing support for an assessment of deep learning frameworks conducted by Microsoft researcher Kenneth Tran, Yann LeCun, director of AI at Facebook, contends that Torch has the fewest deficiencies among deep learning frameworks. "Torch has an almost perfect rating on all counts," he notes. "Theano and TensorFlow lack speed, Tensorflow and Caffe lack flexibility."
Ultimately, however, these toolkits depend upon data, and none of the companies providing deep learning tools is offering third parties access to the massive datasets they use to train their models. To use that data, start with a job application.
Thomas Claburn has been writing about business and technology since 1996, for publications such as New Architect, PC Computing, InformationWeek, Salon, Wired, and Ziff Davis Smart Business. Before that, he worked in film and television, having earned a not particularly useful ...