What are the most challenging issues in Sentiment Analysis (opinion mining)?
Opinion mining (sentiment analysis) is a fairly recent subtask of natural language processing. Some compare it to text classification; others take a deeper view of it. What do you think are the most challenging issues in sentiment analysis (opinion mining)? Can you name a few?
The key challenges for sentiment analysis are:
1) Named Entity Recognition - What is the person actually talking about, e.g. is 300 Spartans a group of Greeks or a movie?
2) Anaphora Resolution - the problem of resolving what a pronoun or a noun phrase refers to. "We watched the movie and went to dinner; it was awful." What does "it" refer to?
3) Parsing - What are the subject and object of the sentence, and which one does the verb and/or adjective actually refer to?
4) Sarcasm - If you don't know the author, you have no idea whether 'bad' means bad or good.
5) Twitter - abbreviations, lack of capitals, poor spelling, poor punctuation, poor grammar, ...
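As a rough illustration of point 5, here is a minimal sketch of the kind of normalization that tweets usually need before any sentiment step. The abbreviation table and the function name `normalize_tweet` are made up for this example, and a real pipeline would need far more than this.

```python
import re

# Hypothetical, hand-picked abbreviation table (an assumption for this sketch);
# a real system would need a much larger one.
ABBREVIATIONS = {"u": "you", "gr8": "great", "luv": "love", "b4": "before"}

def normalize_tweet(text: str) -> str:
    """Lowercase, drop URLs and @mentions, squeeze repeated letters, expand abbreviations."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)        # drop URLs
    text = re.sub(r"@\w+", " ", text)                # drop user mentions
    text = re.sub(r"([a-z])\1{2,}", r"\1\1", text)   # "sooooo" -> "soo"
    tokens = re.findall(r"[a-z0-9']+", text)         # crude tokenizer, discards punctuation
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)

print(normalize_tweet("@bob OMG u were GR8 in that movie!!! sooooo good http://t.co/xyz"))
# omg you were great in that movie soo good
```

Even after this cleanup, "omg" and "soo" are still not dictionary words, which is part of why this kind of text is hard.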
I agree with Hightechrider that those are areas where sentiment analysis accuracy can improve. I would also add that sentiment analysis is mostly done on closed-domain text. Attempts on open-domain text usually end up with very poor accuracy, F1, or whatever measure you use, or else they are only pseudo-open-domain because they look at just a few grammatical constructions. So I would say that topic-sensitive sentiment analysis, which can identify context and make decisions based on it, is an exciting area for research (and industry products).
I'd also expand his fifth point from Twitter to other social media sites (e.g. Facebook, YouTube), where short, ungrammatical utterances are commonplace.
I think the answer is the complexity of language, together with mistakes in grammar and spelling. There is a vast number of ways people express their opinions; for example, sarcasm can be wrongly interpreted as extremely positive sentiment.
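To make the sarcasm problem concrete, here is a small sketch of a naive word-counting scorer. The lexicon and the function name `lexicon_score` are invented for illustration and do not come from any real library.

```python
import re

# Toy polarity lexicon, made up for illustration only.
LEXICON = {"great": 1, "love": 1, "wonderful": 1, "awful": -1, "hate": -1, "broken": -1}

def lexicon_score(text: str) -> int:
    """Sum word polarities: above 0 reads as positive, below 0 as negative."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)

sarcastic = "Great, my phone died again. I just love this wonderful battery!"
print(lexicon_score(sarcastic))  # 3, i.e. strongly "positive" for an obvious complaint
```

A scorer like this has no way of knowing that the author means the opposite of what the words say.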
The question may be too generic, because there are several types of sentiment analysis (document level, sentence level, comparative sentiment analysis, etc.) and each type has some specific problems.
Generally speaking, I agree with the answer by @Ian Mercer, and I would add 3 other issues:
- How to detect more in-depth sentiment or emotion. Positive versus negative is a very simple analysis; one of the challenges is extracting emotions, such as how much hate, happiness, or sadness there is in the opinion.
- How to detect which object the opinion is positive about and which object it is negative about. For example, "She beat him!" expresses a positive sentiment for her and a negative sentiment for him at the same time.
- How to analyze very subjective sentences or paragraphs. Sometimes it is very hard even for humans to agree on the sentiment of such highly subjective texts. Imagine how hard it is for a computer...
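For the second issue, here is a minimal sketch of what per-target output could look like for a sentence of the form "X beat Y". The verb table and the function name `target_sentiment` are assumptions made up for this example, and it only handles bare three-word sentences; anything more complex would need a real parser.

```python
import re
from typing import Dict

# Hypothetical table: verb -> (polarity for the subject, polarity for the object).
VERB_ROLES = {"beat": (1, -1), "defeated": (1, -1)}

def target_sentiment(sentence: str) -> Dict[str, int]:
    """Handle only bare 'SUBJECT VERB OBJECT' sentences; return a polarity per target."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    if len(tokens) != 3 or tokens[1] not in VERB_ROLES:
        return {}  # out of scope for this toy rule
    subject, verb, obj = tokens
    subject_polarity, object_polarity = VERB_ROLES[verb]
    return {subject: subject_polarity, obj: object_polarity}

print(target_sentiment("She beat him!"))  # {'she': 1, 'him': -1}
```

The point is only that a single label per sentence is not enough here: the result has to be keyed by the target of the opinion.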
Although this is a somewhat old question, let me add a note specific to Arabic sentiment analysis. Arabic has morphological complexity and dialectal variety that require more advanced preprocessing and lexicon building than English needs.
Please refer to:
- "https://www.researchgate.net/publication/280042139_Survey_on_Arabic_Sentiment_Analysis_in_Twitter"
- "https://link.springer.com/chapter/10.1007/978-3-642-35326-0_14"