Text Sentiment Analysis Reading Notes

Zhankui He / 25 Mar 2018

Opinion and emotion mining are classic tasks within Sentiment Analysis. The survey clarifies the definitions of opinion and emotion, and reviews the methods and resources available for these tasks.

Definitions

  • Sentiment Analysis: mining opinions, sentiments, and emotions based on observations of people’s actions that can be captured using their writings, facial expressions, speech, music, movements, and so on.
  • Emotion: discrete and consistent responses to internal or external events that have a particular significance for the organism; emotion has short-term duration.
  • Feeling: a subjective representation of emotions, private to the individual experiencing them; like emotion, it has short-term duration.
  • Mood: a diffuse affective state that, compared to emotion, is usually less intense but lasts longer.
  • Affect: an encompassing term used to describe the topics of emotion, feelings, and moods together.

Levels of Classification

  • Document Level: The whole document, whether short or long, is the atomic unit of input, and the goal is to determine the polarity of the document as a whole. This is the simplest sentiment analysis task.
  • Sentence Level: The objective is to determine the polarity of a single sentence. The surrounding context is important.
  • Aspect Level: This category, also known as feature-based opinion mining, encompasses the study of discovering opinion polarities about a specific aspect of a product or service.
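
To make the three levels concrete, here is a minimal sketch that scores the same review at each granularity. It is not taken from the survey: the toy lexicon, the regex-based sentence splitter and the keyword matching for aspects are all assumptions chosen for illustration, and the sentence-level scorer ignores the surrounding context that a real system would need.

```python
import re

# Hypothetical toy lexicon (word -> polarity score), assumed for illustration only.
LEXICON = {"great": 1, "love": 1, "sharp": 1, "poor": -1, "terrible": -1, "slow": -1}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def polarity(text):
    """Sum the lexicon scores of all tokens and map the total to a label."""
    score = sum(LEXICON.get(tok, 0) for tok in tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def split_sentences(document):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def document_level(document):
    """One label for the whole document."""
    return polarity(document)


def sentence_level(document):
    """One label per sentence, each scored in isolation."""
    return [(s, polarity(s)) for s in split_sentences(document)]


def aspect_level(document, aspects):
    """One label per aspect, using only the sentences that mention the aspect."""
    result = {}
    for aspect in aspects:
        mentions = [s for s in split_sentences(document) if aspect in tokenize(s)]
        result[aspect] = polarity(" ".join(mentions)) if mentions else None
    return result


review = "The screen is sharp and I love it. The battery is terrible. Shipping was slow."
doc_label = document_level(review)
sentence_labels = sentence_level(review)
aspect_labels = aspect_level(review, ["screen", "battery", "price"])
```

With this toy lexicon the positive and negative words cancel out, so the document-level label is neutral, while the sentence and aspect levels expose the mixed opinions (positive about the screen, negative about the battery). That gap is the reason the finer levels exist.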

Supervised Methods

  • General-purpose Machine Learning: SVM, Naive Bayes, etc.

  • Representation Learning: deep learning models such as RNNs and CNNs.

  • Feature Engineering:

  1. Presence-Based and Frequency-Based Features.
  2. Unigram and N-Gram Features.
  3. Part of Speech.
  4. Syntax.
  5. Negation.
  6. Topic-Oriented Features.
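
Some of the features listed above can be sketched with the standard library alone. The snippet below is a rough illustration, not the survey's method: it covers presence-based vs. frequency-based features, unigrams and n-grams, and a simple negation marker. The `NEGATORS` list, the `NOT_` prefix and the rule that negation ends at the next punctuation mark are assumptions; part-of-speech, syntax and topic features are left out because they need a tagger, a parser or a topic model.

```python
import re
from collections import Counter

NEGATORS = {"not", "no", "never"}  # assumed, deliberately small negation list
PUNCTUATION = set(".!?,;")


def tokenize(text):
    """Lowercase the text; keep words and single punctuation marks as tokens."""
    return re.findall(r"[a-z']+|[.!?,;]", text.lower())


def mark_negation(tokens):
    """Prefix words that follow a negator with NOT_ until the next punctuation mark.

    Negators and punctuation marks themselves are dropped from the output.
    """
    out, negated = [], False
    for tok in tokens:
        if tok in PUNCTUATION:
            negated = False
            continue
        if tok in NEGATORS:
            negated = True
            continue
        out.append("NOT_" + tok if negated else tok)
    return out


def ngrams(tokens, n):
    """All contiguous n-grams, joined with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, max_n=2, binary=False):
    """Return a feature dict of n-grams up to max_n.

    binary=False gives frequency-based features (counts),
    binary=True gives presence-based features (0/1).
    """
    tokens = mark_negation(tokenize(text))
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    if binary:
        return {feat: 1 for feat in counts}
    return dict(counts)


frequency = extract_features("good good movie", max_n=1)
presence = extract_features("good good movie", max_n=1, binary=True)
negated = extract_features("Not good, but fun.", max_n=2, binary=True)
```

Here `frequency` counts "good" twice while `presence` only records that it occurs, and `negated` contains `NOT_good` rather than `good`, so a classifier such as Naive Bayes or an SVM can treat the two as different features.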

Unsupervised Methods

  • Lexicon Expansion: A lexicon is a dictionary of words, each associated with a score indicating its degree of polarity. For better performance, however, one may need to build a lexicon suited to the domain in question.

    Lexicon expansion builds such a lexicon automatically; the two main approaches are “corpus-based lexicon expansion” and “dictionary-based lexicon expansion”.

  • Domain Adaptation: adapting a classifier trained on the source domain so that it is useful for the target domain. This is done in two steps: feature clustering and alignment.

  • Other Methods:

  1. Bootstrapping: The general idea is to use an initial classifier, pre-trained on another dataset, to label the target dataset, and then to use this newly labeled dataset to train a new classifier.
  2. Belief Network Modelling
  3. Combining lexical and machine-learning methods: Lexical and learning methods can be combined so that each compensates for the other's weaknesses.
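
As a rough illustration of dictionary-based lexicon expansion, the sketch below starts from a few seed words with known polarity and propagates the scores through synonym and antonym links. Everything here is an assumption made for the example: the miniature `SYNONYMS` and `ANTONYMS` tables stand in for a real dictionary resource, and the breadth-first rule with a hop limit is just one simple way to propagate polarity, not necessarily the one described in the survey.

```python
from collections import deque

# Hypothetical miniature dictionary, assumed for illustration only.
SYNONYMS = {
    "good": ["fine", "nice"],
    "nice": ["pleasant"],
    "bad": ["poor", "awful"],
    "awful": ["dreadful"],
}
ANTONYMS = {"good": ["bad"], "pleasant": ["unpleasant"]}


def expand_lexicon(seeds, synonyms, antonyms, max_hops=2):
    """Breadth-first propagation of polarity from seed words.

    Synonyms inherit the polarity of the current word, antonyms get the
    opposite one. A word keeps the first score it receives, i.e. the one
    reached by the shortest path from a seed. Words at max_hops are not
    expanded further.
    """
    lexicon = dict(seeds)
    queue = deque((word, 0) for word in seeds)
    while queue:
        word, hops = queue.popleft()
        if hops == max_hops:
            continue
        neighbours = [(w, lexicon[word]) for w in synonyms.get(word, [])]
        neighbours += [(w, -lexicon[word]) for w in antonyms.get(word, [])]
        for neighbour, score in neighbours:
            if neighbour not in lexicon:
                lexicon[neighbour] = score
                queue.append((neighbour, hops + 1))
    return lexicon


seeds = {"good": 1, "bad": -1}
lexicon = expand_lexicon(seeds, SYNONYMS, ANTONYMS)
wider_lexicon = expand_lexicon(seeds, SYNONYMS, ANTONYMS, max_hops=3)
```

With two hops the two seeds grow into an eight-word lexicon; raising `max_hops` to 3 also reaches "unpleasant" through the antonym link. The hop limit is a design choice: polarity tends to drift as the path from the seed gets longer, so a small limit trades coverage for reliability.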

© MMXIII — MMXVI by (Aaron) Zhankui He from Hugo Sereno Ferreira.
Content available under Creative Commons (BY-NC-SA) unless otherwise noted.
This site is hosted at Github Pages and created with Jekyll.