Microsoft's Project Oxford Gets Emotional With Machine Learning, AI

Microsoft announces new services in its Project Oxford suite of developer tools based on machine learning and artificial intelligence.

Larry Loeb, Blogger, Informationweek

November 11, 2015

3 Min Read


Microsoft has unveiled plans to release new tools in its Project Oxford that will help developers take advantage of the latest advances in machine learning and artificial intelligence, including a tool that can recognize emotion. The company made the announcement Wednesday at the Future Decoded conference in London.

Project Oxford, a suite of developer tools based on Microsoft's machine learning and artificial intelligence research, was introduced in May at the Build conference. It uses a Web-based RESTful interface to add voice, text, and image services. Project Oxford face, speech, and computer-vision APIs have also been included as part of the Cortana Analytics Suite.
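Because the services are exposed over a RESTful interface, a developer calls them with an ordinary HTTPS POST. The following sketch builds such a request with Python's standard library; the endpoint URL, header names, and key value are assumptions based on the beta-era service, not details given in the article, so check the Project Oxford portal for current values.

```python
import urllib.request

# Illustrative beta-era endpoint -- an assumption, not taken from the article.
EMOTION_URL = "https://api.projectoxford.ai/emotion/v1.0/recognize"

def build_emotion_request(image_bytes: bytes, subscription_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a raw image."""
    return urllib.request.Request(
        EMOTION_URL,
        data=image_bytes,
        headers={
            "Content-Type": "application/octet-stream",
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )

req = build_emotion_request(b"\x89PNG...", "YOUR_SUBSCRIPTION_KEY")
# urllib.request.urlopen(req) would return JSON describing each detected
# face; it is left commented out here because it requires a live key.
print(req.full_url)
```

Sending the request with a valid subscription key would return the service's JSON response; everything up to that point is plain HTTP, which is what makes the suite approachable from any language.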

The emotion tool released Wednesday "can be used to create systems that recognize eight core emotional states -- anger, contempt, fear, disgust, happiness, neutral, sadness or surprise -- based on universal facial expressions that reflect those feelings," according to a Microsoft blog post about the announcement.

{Image 1}

However, not all emotions are detected with the same level of confidence, according to reports. Because the tool can only handle static images at the moment, emotions such as happiness can be detected with a higher level of confidence than other emotions such as contempt or disgust.
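Since each of the eight states comes back with its own confidence score, client code typically just picks the highest one. The helper below is a hypothetical sketch (the dictionary keys mirror the eight categories the article lists, but the exact response field names are an assumption):

```python
def dominant_emotion(scores: dict) -> tuple:
    """Return the (emotion, confidence) pair with the highest score."""
    emotion = max(scores, key=scores.get)
    return emotion, scores[emotion]

# Example scores shaped like the eight categories the article lists.
sample = {
    "anger": 0.01, "contempt": 0.02, "disgust": 0.01, "fear": 0.01,
    "happiness": 0.89, "neutral": 0.03, "sadness": 0.02, "surprise": 0.01,
}
print(dominant_emotion(sample))  # ('happiness', 0.89)
```

A real application would also compare that top score against a threshold before acting on it, given the uneven confidence the article describes for emotions like contempt and disgust.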

The emotion tool is now available to developers as a public beta. Other new tools announced Wednesday will be released as beta versions by the end of the year. They include video, Custom Recognition Intelligent Services, speaker recognition, and updates to face-detection APIs.

The video tool, which is based on some of the same technology found in Microsoft's Hyperlapse, helps analyze and edit videos by tracking faces, detecting motion, and stabilizing shaky video.

Custom Recognition Intelligent Services (CRIS) can tailor voice recognition for a specific situation, such as a noisy venue. The tool can also be used to help an app better understand people who have traditionally had trouble with voice recognition, such as non-native speakers or those with disabilities. Microsoft said that it will be available as an invite-only beta by the end of the year.

The speaker recognition tool can be used to identify who is speaking by learning the particulars of an individual's voice. The tool might be useful for identifying a speaker in a conference call, for example. It will be available as a public beta by the end of the year.

[No, AI Won't Kill Us All.]

Microsoft also announced updates to Project Oxford's face APIs to include facial hair and smile prediction tools, along with improved visual age estimation and gender identification.

Connecting speaker recognition and face detection may also serve as the foundation of an authentication system for users similar to what Google is doing with the still-under-development Project Abacus.

All of the real work goes on inside of Microsoft's Azure Cloud platform, and that seems to be the way Microsoft wants it. These kinds of features serve as a gateway to Azure for developers, and may grow into a competitive advantage against other commodity cloud providers.

About the Author(s)

Larry Loeb

Blogger, Informationweek

Larry Loeb has written for many of the last century's major "dead tree" computer magazines, having been, among other things, a consulting editor for BYTE magazine and senior editor for the launch of WebWeek. He has written a book on the Secure Electronic Transaction Internet protocol. His latest book has the commercially obligatory title of Hack Proofing XML. He's been online since uucp "bang" addressing (where the world existed relative to !decvax), serving as editor of the Macintosh Exchange on BIX and the VARBusiness Exchange. His first Mac had 128 KB of memory, which was a big step up from his first 1130, which had 4 KB, as did his first 1401. You can e-mail him at [email protected].
