Natural Language Processing: Big Data's Role
Understanding human language, and the intent behind our words, remains a daunting challenge for computers. UK startup The Outside View aims to help.
February 13, 2014
Computers do many things faster and more efficiently than the human brain, but they're decidedly inferior when it comes to extracting meaning from human language. As BigData-Startups.com founder Mark van Rijmenam writes in a recent blog post, the key stumbling block here is that computers understand "unambiguous and highly structured" programming language, while human language is a minefield of nuance, emotion, and implied intent.
Van Rijmenam also quotes a Chronicle of Higher Education post by Geoffrey Pullum, a professor of general linguistics at the University of Edinburgh. Pullum outlines three prerequisites for computers to master human language: "First, enough syntax to uniquely identify the sentence; second, enough semantics to extract its literal meaning; and third, enough pragmatics to infer the intent behind the utterance, and thus discerning what should be done or assumed given that it was uttered."
Pragmatics poses the biggest challenge in this field, van Rijmenam writes. "Often a lot more is told by what is not said in a sentence or conversation than what is said."
There's no shortage of technology firms working on natural language processing solutions that teach computers better human language skills. For example, the London startup The Outside View is developing a voice analytics tool that studies the mood and quality of sales calls to determine the likelihood of a sale closing.
Rob Symes, CEO of The Outside View, and Jason Filos, head of its voice analytics effort, told us how predictive analytics can help sales teams unleash the potential of their voice recordings, CRM data, and email.
"I come from a family of entrepreneurs, and we're all salespeople," Symes said. "We've always measured how many times you do something -- how many emails, how many phone calls. We've measured the quantity, but what we're trying to do is measure the quality of those interactions." Chief executives "know who their top billers are and who their bottom salespeople are. But the key question is why. That's what we're trying to answer."
Analyzing unstructured data, including the contents of email and voice calls, is tricky business. Email messages often contain abbreviations and spelling errors. "So the first step is to make sense of what has been said," Filos said. "In order to do this… you need some very clever algorithms, which have only been around for the past 10 to 15 years or so."
Phone conversations add layers of complexity to the problem. "What we're aiming to do is -- in a non-intrusive way -- to detect the emotion of the people who are engaging in the conversation," Filos said. "We've been setting up a system which basically takes the audio file from phone calls and, as a first step, makes a transcript of it."
Of course, speech-to-text conversion isn't new, but that's just the start of the data analysis. "We then try to classify, as a first step, two basic emotions," Filos said. These are a "normal" or "neutral" state and a more agitated state, such as anger or excitement.
This approach poses a challenge: The speaker could intentionally skew the results. "For example, you can pretend that you're angry, or you could be lying," Filos said. "So there are a lot of false positives involved. The basic idea is to classify or estimate the emotions of the people we're talking to over the phone and then compute the probability of whether this was a successful call or not."
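To make the pipeline Filos describes more concrete, here is a minimal Python sketch of its last two stages: labeling each transcribed utterance as "neutral" or "agitated," then turning those labels into a rough probability that the call succeeded. This is an illustration only, not The Outside View's method. The cue-word list, the `Utterance` type, the function names, and the `weight` and `bias` values are all hypothetical, and a production system would presumably rely on trained acoustic and text models rather than keyword matching.

```python
import math
from dataclasses import dataclass

# Hypothetical cue words for illustration; a real system would use trained models.
AGITATED_CUES = {"angry", "unacceptable", "ridiculous", "furious", "amazing", "fantastic"}


@dataclass
class Utterance:
    speaker: str
    text: str  # one transcribed turn from the call


def classify_emotion(utterance: Utterance) -> str:
    """Return 'agitated' if the text has a cue word or '!', otherwise 'neutral'."""
    words = {w.strip(".,!?").lower() for w in utterance.text.split()}
    if words & AGITATED_CUES or "!" in utterance.text:
        return "agitated"
    return "neutral"


def success_probability(utterances, weight=-3.0, bias=1.0):
    """Map the share of agitated utterances to a probability via a logistic function.

    weight and bias are made-up values, not parameters fitted to real call data.
    """
    if not utterances:
        raise ValueError("need at least one utterance")
    labels = [classify_emotion(u) for u in utterances]
    agitated_share = labels.count("agitated") / len(labels)
    return 1.0 / (1.0 + math.exp(-(bias + weight * agitated_share)))


if __name__ == "__main__":
    call = [
        Utterance("rep", "Thanks for your time today."),
        Utterance("client", "This delay is unacceptable!"),
    ]
    print(round(success_probability(call), 3))
```

The negative `weight` encodes an assumption that agitation lowers the odds of a sale. As Filos notes, agitation can also be excitement, and speakers can feign emotion, so a real model would need to learn that relationship from labeled calls rather than assume it.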
Symes said The Outside View's voice analytics software is being tested under NDA by several companies, including a "very, very large international media company in the UK," which he would not name. "The main thing for us is getting enough user case studies to show there's a benefit."