

Open the Visualization
Analysis of cast of characters (NNP) in Tom Sawyer

Purpose and Relevance

This demonstration builds on the POS (part-of-speech) tagging performed by UIMA in the previous demonstration (UIMA and SEASR). We use a simple graph-based thesaurus to perform sentiment analysis on a body of text. As before, the underlying purpose of this demonstration is to show how two different frameworks can be combined to solve a problem quickly.

Overview

In this demonstration we use a service (currently under development) that allows graph-like traversal of a thesaurus. Given any two words (adjectives only, for now), we attempt to find a path between them using only synonyms. For example, to get from “delightful” to “rainy” using only a thesaurus (without antonyms or colloquialisms), you would follow the path ['delightful', 'fair', 'balmy', 'moist', 'rainy'], which has a path length of 4 (four synonym links).
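The path search described above can be sketched as a breadth-first search over a synonym graph. This is a minimal illustration, not the SynNet service itself: the `SYNONYMS` dictionary is a hand-made toy graph and `synonym_path` is a hypothetical helper name.

```python
from collections import deque

# Toy synonym graph (illustrative only, not real SynNet data).
SYNONYMS = {
    "delightful": ["fair", "charming"],
    "fair": ["delightful", "balmy"],
    "charming": ["delightful"],
    "balmy": ["fair", "moist"],
    "moist": ["balmy", "rainy"],
    "rainy": ["moist"],
}


def synonym_path(graph, start, goal):
    """Return the shortest synonym path from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # maps each visited word to its predecessor
    queue = deque([start])
    while queue:
        word = queue.popleft()
        for neighbor in graph.get(word, []):
            if neighbor in previous:
                continue
            previous[neighbor] = word
            if neighbor == goal:
                # Walk the predecessor links back to the start word.
                path = [neighbor]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


path = synonym_path(SYNONYMS, "delightful", "rainy")
print(path, "length:", len(path) - 1)
```

Breadth-first search is used here because every synonym link counts as one step, so the first path found is also a shortest one.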

Each pair of words in the path can also be assigned a metric that describes how strong the link between the two words is. One metric is the number of common synonyms the two words share. In the above example, “delightful” and “fair” share the following set of synonyms:
['fair', 'charming', 'pleasant', 'attractive', 'enchanting', 'lovely', 'ravishing', 'delicate'].
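As a rough sketch, the shared-synonym metric amounts to a set intersection. The thesaurus entries below are invented for illustration and are much smaller than the real ones; `shared_synonyms` is a hypothetical helper name.

```python
# Toy thesaurus entries (illustrative only, not real SynNet data).
THESAURUS = {
    "delightful": {"fair", "charming", "pleasant", "lovely", "enjoyable"},
    "fair": {"delightful", "charming", "pleasant", "lovely", "balmy"},
}


def shared_synonyms(thesaurus, word_a, word_b):
    """Return the set of synonyms that both words list."""
    return set(thesaurus.get(word_a, ())) & set(thesaurus.get(word_b, ()))


common = shared_synonyms(THESAURUS, "delightful", "fair")
print(sorted(common), "link strength:", len(common))
```

A larger intersection suggests a stronger link between the two words.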

Another metric might be whether the pair is symmetric, that is, whether “delightful” offers “fair” as a synonym and “fair” offers “delightful” as a synonym. This Synonym Network (SynNet) also allows researchers to attach new labels to a collection of words (e.g., synonyms). SynNet lets us attempt a crude form of sentiment analysis using a bag-of-words model for document classification. The service could be used in conjunction with WordNet for more robust sentiment analysis.
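The symmetry test mentioned above is a two-way membership check. The sketch below assumes the thesaurus is available as a mapping from each word to the synonyms it lists; the entries and the name `is_symmetric` are hypothetical.

```python
# Toy thesaurus entries (illustrative only, not real SynNet data).
THESAURUS = {
    "delightful": {"fair"},
    "fair": {"delightful", "balmy"},
    "balmy": {"moist"},
}


def is_symmetric(thesaurus, word_a, word_b):
    """True when each word lists the other as a synonym."""
    return (word_b in thesaurus.get(word_a, ())
            and word_a in thesaurus.get(word_b, ()))


print(is_symmetric(THESAURUS, "delightful", "fair"))  # both directions listed
print(is_symmetric(THESAURUS, "fair", "balmy"))       # only one direction listed
```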

Sentiment analysis involves classifying text based on its sentiment. It is typically used to determine the attitude of a speaker or writer, or to decide whether a review is positive or negative. Sentiment analysis is currently getting a lot of attention, and it is a difficult problem with many challenges. Our contribution is another metric that could be used to determine sentiment within a body of text: to measure which emotion label an adjective is closest to, we use the path length from each adjective in the text to the nearest emotion label.

In this demonstration, we ask: What emotion is being conveyed by the writer? We attempt to classify a body of text into one of the six core emotions described by Parrott: Love, Joy, Surprise, Anger, Sadness, and Fear.

The following diagram shows the breakdown:


Figure 1. Parrott (2001) emotions
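The path-length metric can be sketched as a breadth-first search that stops at the first emotion label it reaches. The sketch assumes the emotion labels appear as nodes in the synonym graph; the graph below is a hand-made toy and `nearest_emotion` is a hypothetical name, not part of SynNet.

```python
from collections import deque

EMOTIONS = ("love", "joy", "surprise", "anger", "sadness", "fear")

# Toy synonym graph (illustrative only, not real SynNet data).
GRAPH = {
    "cheerful": ["glad", "sunny"],
    "glad": ["cheerful", "joy"],
    "sunny": ["cheerful"],
    "joy": ["glad"],
    "gloomy": ["dismal"],
    "dismal": ["gloomy", "sadness"],
    "sadness": ["dismal"],
}


def nearest_emotion(graph, adjective, emotions=EMOTIONS):
    """Return (emotion, path_length) for the closest label, or None."""
    targets = set(emotions)
    distance = {adjective: 0}
    queue = deque([adjective])
    while queue:
        word = queue.popleft()
        if word in targets:
            return word, distance[word]
        for neighbor in graph.get(word, []):
            if neighbor not in distance:
                distance[neighbor] = distance[word] + 1
                queue.append(neighbor)
    return None  # no emotion label is reachable from this adjective


print(nearest_emotion(GRAPH, "cheerful"))
print(nearest_emotion(GRAPH, "gloomy"))
```

When two labels are equally close, this sketch returns whichever one the search happens to reach first; a real implementation would need an explicit tie-breaking rule.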

Process

Once the UIMA process finishes the POS tagging (as shown in UIMA and SEASR), we extract all the adjectives and calculate the shortest path from each adjective to one of the emotions. The resulting dataset is then processed by a component based on the Flare ActionScript visualization library. This visualization component will be available for similar types of analysis. For this demonstration, we provide an embedded visualization.
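The adjective extraction step can be sketched as a filter over tagged tokens. This assumes the tagger output is available as (word, tag) pairs using Penn Treebank tags, where JJ, JJR, and JJS mark adjectives; the actual UIMA output format may differ, and `extract_adjectives` is a hypothetical name.

```python
# Penn Treebank adjective tags: base, comparative, superlative.
ADJECTIVE_TAGS = {"JJ", "JJR", "JJS"}


def extract_adjectives(tagged_tokens):
    """Return the lowercased adjectives from (word, tag) pairs."""
    return [word.lower() for word, tag in tagged_tokens
            if tag in ADJECTIVE_TAGS]


# Hand-tagged example sentence (illustrative only).
tagged = [
    ("The", "DT"), ("gloomy", "JJ"), ("house", "NN"),
    ("seemed", "VBD"), ("Darker", "JJR"), ("than", "IN"),
    ("before", "RB"),
]
print(extract_adjectives(tagged))
```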

Data Input and Manipulation

As in the previous demo, Frequent Pattern Mining from UIMA Data, the window size of the text affects the analysis. We chose a window size of either 10 or 20 paragraphs. Because the SynNet service and the visualization components are both under development, please let us know if you have any ideas on how to improve them.
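Windowing by paragraph can be sketched as follows. The sketch assumes paragraphs are separated by blank lines and that windows do not overlap; the demonstration does not say whether its windows slide or overlap, so treat both as assumptions. `paragraph_windows` is a hypothetical name.

```python
import re


def paragraph_windows(text, size=10):
    """Split text into consecutive, non-overlapping windows of paragraphs."""
    # Paragraphs are assumed to be separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [paragraphs[i:i + size] for i in range(0, len(paragraphs), size)]


# 25 short placeholder paragraphs yield windows of 10, 10, and 5.
sample = "\n\n".join(f"Paragraph {i}." for i in range(25))
print([len(window) for window in paragraph_windows(sample, size=10)])
```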

Visualization of Results

This URL will bring up the visualization shown above comparing the two texts:
1. Turn of the Screw
2. Joyful Hearts

Only the adjectives were used to label each window of 10 paragraphs with one of the six emotions outlined by Parrott.
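One simple way to turn per-adjective results into a window label is a bag-of-words vote: count the nearest emotion of each adjective and keep the most frequent one. The voting rule is an assumption made for illustration, as are the `emotion_of` mapping and the name `label_window`; the demonstration does not specify how the adjectives in a window are combined.

```python
from collections import Counter


def label_window(adjectives, emotion_of):
    """Return the most frequent emotion among the window's adjectives."""
    votes = Counter(emotion_of[adj] for adj in adjectives if adj in emotion_of)
    if not votes:
        return None  # no adjective in this window maps to an emotion
    return votes.most_common(1)[0][0]


# Hypothetical adjective-to-emotion mapping (illustrative only).
emotion_of = {"cheerful": "joy", "glad": "joy", "gloomy": "sadness"}
window_adjectives = ["cheerful", "glad", "gloomy", "purple"]
print(label_window(window_adjectives, emotion_of))
```

Adjectives with no reachable emotion label, such as "purple" here, are simply ignored by the vote.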

Data Type Restrictions

We use UIMA to load documents and therefore depend on the formats it supports (text, XML, PDF, HTML, etc.).

Scale Limitations

The visualization currently shows only one text document at a time.

References

  1. Sentiment analysis, http://en.wikipedia.org/wiki/Sentiment_analysis
  2. Emotions, http://en.wikipedia.org/wiki/List_of_emotions
  3. Parrott 2001, http://changingminds.org/explanations/emotions/basic%20emotions.htm
  4. WordNet, http://wordnet.princeton.edu
  5. Flare ActionScript library, http://flare.prefuse.org
