Expert Analysis: Is Sentiment Analysis an 80% Solution? - InformationWeek

Expert Analysis: Is Sentiment Analysis an 80% Solution?

Sentiment-analysis technologies aren't perfect. But what critics are missing is the value of automation, the inaccuracy of human assessment, and the many applications that require only "good-enough" accuracy.

So perhaps human sentiment analysis isn't as good as folks suppose; certainly not 100%. Try a few examples yourself. Imagine that you're not just reading, that you're generating data for monitoring/measurement/analysis purposes. Are these tweets positive, negative, both, or neutral -- at the tweet, sentence, and feature [e.g., public option, Obama, GOP, Avatar] levels?

Here's a thought for those against a public option. Sign a legal doc removing u from any government health care EVER - U pay no matter how $

Obama trying bipartisanship delayed #hcr and allowed GOP to redefine in negatively. Takes willing sides for cooperation #p2 #tcot

Seattle's hippest pastor says "Avatar is Satan."

Of course, I cherry-picked those examples to illustrate that it's sometimes difficult to assign sentiment polarity precisely. Take the first example. It's not explicitly pro a health care public option, is it, even while it's implicitly against public-option opponents? At the tweet level, is it pro, con, neither, or both?

Automated Sentiment Analysis

Regarding automated systems, Bing Liu says "acceptable accuracy and even the measure of it is quite tricky because sentiment analysis is a multi-faceted problem with several sub-problems. For most practical applications, they all need to be solved ... In terms of precision and recall of opinion orientation classification (not other sub-problems), I believe a precision around 90% will be sufficient, but some companies asked for near 100% precision based on my practical experience. (They need to be educated!) Recall is a slightly different issue. A reasonable value will be OK as one does not need to catch every sentence with opinions to find the problems of a product."

Mikko Kotila says leading providers such as Sysomos and Radian6 estimate their automated sentiment analysis and scoring systems to be 80% accurate. Without citing examples, I asserted that many systems don't do even that well, though they don't have to in order to be useful. But can anyone do better?

Dave Nadeau, creator of restaurant-review start-up InfoGlutton, can. According to Nadeau, InfoGlutton is trained on restaurant reviews from 25 sources for more than 100 restaurants. The proprietary corpus is made of 6,000 reviews totaling 40,000 sentences. Nadeau offers the statistics that:

  • "InfoGlutton sentiment analysis at sentence level is 89.5% accurate, with classifiers tuned for very high (~92%) precision for the positive and negative sentiments.
  • InfoGlutton sentiment analysis at review level is 94% accurate, with classifiers tuned for very high (~96%) precision for the positive and negative sentiments."

Accuracy Beyond Precision

My earlier Twitter examples allow me to introduce the notion that there's more to accuracy than classification precision. Accuracy in text analytics and search is typically computed from both precision and recall. Recall is the proportion of target features (documents, entities, whatever) that were found; precision is the proportion of those found that were found correctly. Doesn't it go without saying that human methods will never match the recall, speed, and reach of automated methods?
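To make the two measures concrete, here is a minimal Python sketch. It assumes that "computed from both precision and recall" means something like the F1 score, the harmonic mean of the two; the article does not name a specific formula, so that choice is an assumption. The function name and the counts are hypothetical, chosen only for illustration.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) from raw counts.

    precision: share of the items the system flagged that were correct
    recall:    share of the real targets that the system found
    F1:        harmonic mean of the two (an assumed way to combine them)
    """
    found = true_positives + false_positives      # everything the system flagged
    targets = true_positives + false_negatives    # everything it should have flagged
    precision = true_positives / found if found else 0.0
    recall = true_positives / targets if targets else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, not taken from any vendor: 100 opinionated sentences
# exist, the system flags 80 of them, and 72 of those flags are correct.
p, r, f1 = precision_recall_f1(true_positives=72, false_positives=8,
                               false_negatives=28)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

With these made-up counts, precision is 0.90 but recall is only 0.72, so the combined score lands near 0.80. A system can clear a 90% precision bar and still look like an "80% solution" once recall is counted.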

What do you make of this tweet?

Le service apres vente de Toshiba est vraiment... MAUVAIS!!!! (Probleme: Regza LED)

Hint: "vraiment mauvais" is French for truly bad, and "le service après vente" is post-sales service. I'm sure you recognize "Toshiba" and I'd infer that "Regza LED" is a product. It's a computer's ability to find and analyze sources across languages, and to operate 24/7, scanning volumes of data that would overwhelm humans, that gives an edge to automation.

This example illustrates recall. If Toshiba (or their rivals) monitored only English-language sources, they would miss A LOT of relevant material. The tweet I offered above is in French; I can only guess at the volume of material posted in Japanese, Korean, Spanish, Arabic, Russian, and other languages. I'm not claiming that every automation solution can handle every language, nor that your efforts have to be exhaustive. But I will say that if your brand is multi-national, your recall, and thus your overall sentiment-analysis accuracy, are going to suffer without automation.
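The effect of language coverage on recall can be sketched with some simple arithmetic. The mention counts, the language mix, and the 0.8 within-language recall below are all invented for illustration; they are not measurements for Toshiba or any other brand, and the function name is hypothetical.

```python
def recall_with_coverage(mentions_by_language, monitored_languages,
                         per_language_recall=1.0):
    """Overall recall when only some languages are monitored.

    mentions_by_language: dict of language code -> number of relevant mentions
    monitored_languages:  set of language codes the monitoring actually covers
    per_language_recall:  assumed share of mentions found within a covered language
    """
    total = sum(mentions_by_language.values())
    if total == 0:
        return 0.0
    found = sum(count * per_language_recall
                for lang, count in mentions_by_language.items()
                if lang in monitored_languages)
    return found / total


# Hypothetical mention volumes for a multinational brand (made-up numbers).
mentions = {"en": 400, "fr": 150, "ja": 300, "es": 150}

english_only = recall_with_coverage(mentions, {"en"}, per_language_recall=0.8)
all_languages = recall_with_coverage(mentions, set(mentions), per_language_recall=0.8)
print(f"English only: {english_only:.2f}, all languages: {all_languages:.2f}")
```

Under these assumptions, a system that finds 80% of what it looks at still recalls only about 32% of relevant mentions when it reads English alone, against about 80% when every language is covered.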
