Online comments don't fall neatly into like or dislike, so it takes nuance to understand them.
Online comments don't fall neatly into "positive" and "negative" buckets. There's a range of consumer sentiment that challenges even the most sophisticated natural language processing technologies. At last month's Sentiment Analysis Symposium, Catherine van Zuylen, VP of products at Attensity, a social analytics software vendor, provided this list of difficult comment-analysis problems:
False negatives: The words "crying" and "crap" suggest negativity, but then there is "I was crying with joy" or "Holy crap! This is great." Here's where simplistic tools might be fooled.
Relative sentiment: "I bought a Honda Accord." Great for Honda but bad for Toyota.
Compound sentiment: Doing work for movie studios, Attensity has had to make sense of comments such as "I loved the trailer but hated the movie." Big mobile phone companies encounter mixed messages such as "I love the phone but hate the network."
Conditional sentiment: "If someone doesn't call me back, I'm never doing business with them again." Or "I was really pissed, but then they gave me a refund."
Scoring sentiment: Vendors are expected to measure relative sentiment, but how positive is "I like it" versus "I really like it" versus "I love it"?
Sentiment modifiers: "I bought an iPhone today :-)" or "Gotta love the cable company ;-<". Emoticons are straightforward, but what words are they connected to?
International sentiment: Japanese users have their own emoticons, such as (;_;) for crying. Italians tend to be far more effusive and grandiose, whereas Brits are generally drier and more reserved, making the relative scoring challenges mentioned earlier all the more complicated.
Sophisticated systems can be optimized to handle these kinds of problems, van Zuylen says. But analyst Seth Grimes says no amount of tuning will lead to perfection, so it's best to focus the extra effort on developing insight about and acting on the majority of clear-cut sentiments.
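
Several of the problems on van Zuylen's list can be reproduced with a toy word-counting scorer. The Python sketch below is purely illustrative: the lexicon, its weights, and the function names are assumptions made up for this example, not Attensity's method or any vendor's API. It shows how a context-free tally misreads "crying with joy," how splitting on "but" recovers the two halves of a compound comment, and how a flat lexicon cannot tell "like" from "really like."

```python
import re

# Hypothetical toy lexicon; words and weights are assumptions for illustration.
LEXICON = {
    "love": 2, "loved": 2, "like": 1, "great": 2, "joy": 2,
    "hate": -2, "hated": -2, "crying": -2, "crap": -2,
}

def naive_score(text):
    """Sum lexicon weights over every word, ignoring all context."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)

def clause_scores(text):
    """Score each clause separately, splitting the comment on the word 'but'."""
    clauses = re.split(r"\bbut\b", text, flags=re.IGNORECASE)
    return [(clause.strip(), naive_score(clause)) for clause in clauses]

if __name__ == "__main__":
    # Negative-sounding words cancel the positive ones, hiding a happy comment.
    print(naive_score("I was crying with joy"))
    print(naive_score("Holy crap! This is great."))

    # Compound sentiment: the whole comment nets out to zero,
    # while the clause-level view keeps both opinions.
    comment = "I loved the trailer but hated the movie."
    print(naive_score(comment))
    print(clause_scores(comment))

    # Scoring sentiment: the intensifier "really" is invisible to the lexicon.
    for phrase in ("I like it", "I really like it", "I love it"):
        print(phrase, naive_score(phrase))
```

Under these assumed weights, both "crying with joy" and "Holy crap! This is great." come out neutral, and "I like it" scores the same as "I really like it." Splitting on "but" is only one crude heuristic; it does nothing for conditional or relative sentiment, which is consistent with Grimes's point that tuning never reaches perfection.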