Measuring the emotional content of librar* blogs

Posted by Dave Pattern on April 15, 2009

I’m going to be adding some features this week which measure the emotional content of the blogs using the “affective norms for English words” (ANEW) list. The list contains just over a thousand words, along with a measure of their “pleasure” and “arousal” values (both measured from 0 to 10). The values were derived from a series of studies carried out in the late 1990s.

As each word in the list has two values, they can be plotted on a chart. If you plot “pleasure” horizontally and “arousal” vertically, then the words that have similar values cluster together.

If you use the image below as a guide, then words which carry a negative emotion (i.e. do not generate pleasure) tend towards the left of the chart (red). The more negative the word is, the more to the left it will be. Conversely, words with a strong positive emotion tend to the right (green). Words in the middle are more neutral.

The “arousal” value measures the “strength” of the emotion. Words that evoke a stronger emotional response are further down (i.e. boxes 7, 8 & 9). Words that don’t are nearer the top (boxes 1, 2 & 3).

[Image: ec1]

As you’d probably expect, boxes 1, 3 and 8 don’t contain many words. Box 1 would contain strongly negative words that don’t have an emotional “punch”, box 3 would contain strongly positive words that are equally calm, and box 8 would contain words that are neutral but that evoke a strong emotional response.

To give you an idea, here are some of the words that would appear in each box:

1] negative pleasure, low arousal — bored, dreary, messy, and overcast
2] neutral pleasure, low arousal — paper, table, chin, umbrella, quiet, and nonchalant
3] positive pleasure, low arousal — relaxed, sleep, bird, secure, cozy, butterfly, and pillow
4] negative pleasure, medium arousal — funeral, sad, misery, jail, toothache, fraud, and infection
5] neutral pleasure, medium arousal — hammer, boxer, trumpet, alien, industry, and army
6] positive pleasure, medium arousal — rainbow, luxury, paradise, liberty, reward, and family
7] negative pleasure, high arousal — rape, murder, bomb, terrorist, anger, and danger
8] neutral pleasure, high arousal — lion, masturbate, alert, curious, tease, and aggressive
9] positive pleasure, high arousal — erotic, desire, rollercoaster, orgasm, miracle, joy, and kiss
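To make the box numbering concrete, here is a minimal Python sketch of how a word's two values could be mapped to one of the nine boxes. The cut-off points (splitting the 0 to 10 range into equal thirds) and the function name `box_number` are assumptions made for illustration; the real boundaries between "negative", "neutral" and "positive" may well lie elsewhere.

```python
def box_number(pleasure: float, arousal: float,
               low: float = 10 / 3, high: float = 20 / 3) -> int:
    """Map a (pleasure, arousal) pair on a 0-10 scale to a box from 1 to 9.

    The `low` and `high` thresholds are hypothetical: they simply split
    the scale into three equal bands.
    """
    def band(value: float) -> int:
        # 0 = lowest third, 1 = middle third, 2 = highest third
        if value < low:
            return 0
        if value < high:
            return 1
        return 2

    column = band(pleasure)  # 0 = negative, 1 = neutral, 2 = positive
    row = band(arousal)      # 0 = low, 1 = medium, 2 = high arousal
    return row * 3 + column + 1
```

With these assumed thresholds, a neutral word with a high arousal value, say `box_number(5.0, 9.0)`, lands in box 8.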

When the words are plotted on the chart, you get a shape that’s not unlike a map of Australia (if you’re reading this blog post Kathryn, I hope it doesn’t make you homesick!)…

[Image: scatter]

Let’s look at a couple of examples of what happens if we apply this to library blogs. First of all, the “In the Library with the Lead Pipe” blog (click for the full-sized image):

[Image: example1]

There are 4 things shown on the image:

1) The overall scatter of words in the ANEW list is shown as small blue dots. This is shown simply as a guide to indicate the overall shape (as per the previous image that resembled the map of Australia).

2) The average emotional content of each blog post is shown as a small green cross. This is calculated by finding all occurrences of ANEW words in the blog post and averaging their positions on the chart. Therefore, if a blog post contained lots of strongly negative content, you would expect the green cross to be towards the bottom-left.

3) The average emotional content of all the blog posts is shown as a larger red cross. This is calculated as before, but is the average for all of the content on the blog. Therefore, if a blog contained lots of posts with strongly positive content, you would expect the red cross to be towards the bottom-right.

4) Word usage frequency is indicated by the transparent circles. This gives an indication of the type of words being used on the blog. Larger circles indicate that words with the same pleasure & arousal values have been used more frequently.
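The calculations behind the crosses and circles can be sketched roughly as follows. This is only an illustration in Python, not the code that runs on the site: the function names are hypothetical, the ratings dictionary passed in is whatever word list you have to hand, and it assumes the blog-wide average is taken over every matched word rather than over the per-post averages. It also matches whole words only.

```python
import re
from collections import Counter

def anew_hits(text, ratings):
    """Return the (pleasure, arousal) pair for each rated word found in text.

    `ratings` maps a lowercase word to its (pleasure, arousal) tuple.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return [ratings[word] for word in words if word in ratings]

def mean_point(points):
    """Average a list of (pleasure, arousal) pairs; None if there are none."""
    if not points:
        return None
    count = len(points)
    mean_pleasure = sum(p for p, _ in points) / count
    mean_arousal = sum(a for _, a in points) / count
    return (mean_pleasure, mean_arousal)

def summarise_blog(posts, ratings):
    """Compute the three things drawn on top of the blue dots."""
    hits_per_post = [anew_hits(post, ratings) for post in posts]
    # One average position per post (the small green crosses).
    post_means = [mean_point(hits) for hits in hits_per_post]
    # Average over every matched word on the blog (the larger red cross).
    all_hits = [point for hits in hits_per_post for point in hits]
    blog_mean = mean_point(all_hits)
    # How often each (pleasure, arousal) position was used (circle sizes).
    circle_sizes = Counter(all_hits)
    return post_means, blog_mean, circle_sizes
```

A post with no matching words gets `None` instead of a cross position, so it would simply be left off the chart.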

Taking all of the above into account, you can see that “In the Library with the Lead Pipe” tends not to have strongly emotional blog posts. However, where emotional words are used, they are mostly positive.

Here’s a second example, this time for the “MaisonBisson.com” blog…

[Image: example2]

Although the average emotional content (red cross) for the entire blog is similar to “In the Library…”, Casey is obviously a much more emotional blogger. The blog posts (green crosses) are scattered more widely, as is the variation in the emotional words (circles) being used. The circles towards the bottom-left indicate some strongly negative emotional content.

4 Comments to Measuring the emotional content of librar* blogs

Dave Pattern
April 15, 2009

The blog pages should now be displaying an image of the emotional content.

Kathryn Greenhill
April 16, 2009

Wasn’t homesick until someone pointed out the Australian shape in the middle of an otherwise positive pleasure, medium arousal post….

…although that analogy would put my house in the “negative pleasure, high arousal” section, so maybe it makes me less likely to want to hurry home…

Does each blob on the graph represent a single word repeated over and over – or just those with the same degree of affect? Just wondering, because I have a great big blob of something around the rainbow/paradise region and was wondering what it can be…

Dave Pattern
April 16, 2009

:-)

Yep — the vast majority of the blobs will represent the usage of one of the ANEW words (only “sinful” and “quarrel” appear to have the same pleasure and arousal values). I’m also matching on word stems, so “sinfully” and “quarrelsome” would count towards the blob size.

The great big blob represents the word “people”, with 38 occurrences since Dec 08.

The next largest blobs are: “learn”, “present”, “person”, “computer” and “tool”

[...] me to look at DLTJ with a critical and curious eye. The first was the work by David Pattern in Measuring the emotional content of librar* blogs. The second was a post by Leslie Carr on the effect of Google users in finding information. ANEW [...]
