Sentiment Analysis & the English Language
12 November 2010

Currently I am working on Sentiment Analysis, so I will probably post a series of smaller posts on issues that I come across. Today I was looking at YouGov’s website and public opinions of Nick Clegg, the social anthropologist currently serving as deputy prime minister. Here is a snapshot of the opinions at the time, and you can see that most of them are classed as negative:

Most? I would say all… but apparently whatever system YouGov use for sentiment analysis cannot cope with idioms. And shooting yourself in the foot is not exactly a tricky one to identify, I should think.

But this raises a more complex issue: there are many ways to express opinions, attitudes, judgements, etc. in language. This is a much larger problem than counting the number of ‘positive’ and ‘negative’ words in a text. To begin with, words in isolation rarely have a meaning; opinions are usually subjective; and then there’s irony and sarcasm.

Yes, Clegg did really well when he supported the Tories on tuition fees…
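To make the word-counting problem concrete, here is a minimal Python sketch of such a counter. The tiny lexicon, the function name naive_sentiment and the first test sentence are all made up for illustration; this is an assumption about how a simple system might work, not a reconstruction of whatever YouGov use.

```python
import re

# Hypothetical toy lexicons, invented for this example only.
POSITIVE = {"good", "great", "well", "support", "supported"}
NEGATIVE = {"bad", "awful", "poor", "violence", "unreasonable"}


def naive_sentiment(text):
    """Label a text by counting lexicon hits, ignoring all context."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# An idiom: none of the individual words is in the lexicon.
print(naive_sentiment("Clegg has shot himself in the foot"))  # neutral

# Sarcasm: the surface words are all favourable.
print(naive_sentiment(
    "Yes, Clegg did really well when he supported the Tories on tuition fees"
))  # positive
```

Both results are wrong to a human reader: the idiom is clearly negative, and the sarcastic remark means the opposite of what its words say. A bigger lexicon would not help with either.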

Continuing on this theme, here’s another issue: the assumption that the scope of a sentiment is the whole text. Here’s an opinion (positive) from the same site about student protests:

I completely support the right to protest; however, violence is unreasonable.

This is somewhat positive, supporting the students, but in the second clause there is an additional judgement condemning the violent incidents that happened at the demonstration. This seems to suggest that the proper carrier of attitude should be the clause, rather than the sentence, let alone the text. Not everything is just black and white.
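A rough Python sketch of the difference, assuming a toy lexicon and a very crude clause splitter (punctuation only, which real clause segmentation would have to improve on). The names score, label and clause_sentiments are hypothetical helpers, not part of any library.

```python
import re

# Hypothetical toy lexicons, invented for this example only.
POSITIVE = {"support", "good", "great"}
NEGATIVE = {"violence", "unreasonable", "bad"}


def score(text):
    """Positive lexicon hits minus negative lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(value):
    """Turn a numeric score into a coarse label."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


def clause_sentiments(text):
    """Split on ; . ! ? as a stand-in for clause boundaries, then label each part."""
    clauses = [c.strip() for c in re.split(r"[;.!?]+", text) if c.strip()]
    return [(clause, label(score(clause))) for clause in clauses]


opinion = "I completely support the right to protest; however, violence is unreasonable."

# One label for the whole text hides the mix of attitudes.
print(label(score(opinion)))  # negative

# One label per clause keeps both judgements visible.
for clause, sentiment in clause_sentiments(opinion):
    print(sentiment, "|", clause)
# positive | I completely support the right to protest
# negative | however, violence is unreasonable
```

With this toy lexicon the whole-text label comes out negative, while the site classed the opinion as positive; either way, a single label loses the fact that the writer supports one thing and condemns another.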



1. Sheng - 12 November 2010

Hey Oliver,

Yes, Alec Go and Richa Bhayani discussed a similar problem in their paper “Exploiting the Unique Characteristics of Tweets for Sentiment Analysis”. They define it as sarcasm, and they did not find a very effective method to solve this problem.


